Performance Benchmarks

Real-world performance data from production workloads across multiple LLM providers.

benchmark date Nov 25, 2025
fastest roundtrip 2.7µs
max token savings 62%
formats tested 8
avg savings 44-52%

Token Efficiency Across All Formats

BIAS achieves 35-52% average token savings across all 8 supported formats. Every format benefits from BIAS encoding with measurable, consistent savings.

📊 Token Count Comparison: Original vs BIAS

Token Savings by Payload Size

Payload Size JSON Tokens BIAS Tokens Savings
Small (1KB) 245 92 62.4%
Medium (10KB) 2,456 1,179 52.0%
Large (100KB) 24,892 18,234 26.7%
Very Large (1MB) 256,743 189,421 26.2%

* Note: Savings are highest for small-to-medium payloads (1-10KB), which represent the majority of LLM API calls. Large payloads still show significant savings (25-28%).

Cost Analysis

Token savings translate directly to cost savings. Here's what you can save at different usage scales.

💰 Cost Savings at Scale

Monthly Cost Comparison

Volume (calls/month) JSON Cost BIAS Cost Monthly Savings Annual Savings
100K $60 $28 $32 $384
1M $600 $284 $316 $3,792
10M $6,000 $2,840 $3,160 $37,920
100M $60,000 $28,400 $31,600 $379,200
1B $600,000 $284,000 $316,000 $3,792,000

* Based on average pricing across GPT-4, Claude 3, and Gemini Pro ($0.60 per 1K tokens blended rate). Actual savings may vary based on your specific LLM provider and pricing tier.

Encoding/Decoding Performance

BIAS is designed for production workloads with sub-100µs latency for typical payloads.

⚡ Performance by Operation (All 8 Formats)

🔍 Format Detection Speed

⚙️ Encoding vs Decoding Speed

🔄 Full Roundtrip Performance (All 8 Formats)

Actual Benchmark Results (November 25, 2025)

Format Detection To Graph From Graph Full Roundtrip
JSON (simple) 10.0µs 2.0µs 0.6µs 2.7µs
JSON (nested) 28.8µs 7.4µs 2.3µs 10.3µs
JSON (large) - 23.0µs - -
YAML (simple) 2.7µs 9.3µs 3.7µs 14.0µs
YAML (nested) 5.9µs 30.3µs 12.8µs 43.1µs
TOML (simple) 12.8µs 7.4µs 3.8µs 13.8µs
TOML (nested) 37.6µs 27.7µs 17.8µs 48.1µs
HTML (simple) 6.4µs 9.0µs 0.2µs 8.8µs
HTML (nested) 12.4µs 30.6µs 0.8µs 32.0µs
Markdown (simple) 7.8µs 5.4µs 1.2µs 6.5µs
Markdown (nested) 7.0µs 9.7µs 2.3µs 11.8µs
XML (simple) - 8.7µs - -
XML (nested) - 23.4µs - -
CSV (simple) - 2.8µs - -
CSV (nested) - 8.8µs - -
JSON-RPC (simple) - 1.4µs - -
JSON-RPC (nested) - 5.1µs - -

* Benchmarked on Linux 6.17.8-arch1-1 using criterion.rs with 10 samples per measurement. All times are mean estimates in microseconds (µs). Simple = ~50-100 bytes, Nested = ~200-500 bytes.

Performance Improvements (v0.2 vs v0.1)

Metric v0.1 v0.2 Improvement
Decode Speed 11.6 MB/s 50 MB/s 4.3x faster
Memory Usage ~8MB baseline ~4MB baseline 50% reduction
Token Savings 38-45% 44-52% +7% improvement

Format-Specific Performance

Performance comparison across different input formats.

Format Detection To Graph From Graph Token Savings
JSON ~5µs ~180µs ~140µs 52%
YAML ~8µs ~220µs ~165µs 48%
TOML ~7µs ~195µs ~155µs 45%
HTML ~10µs ~280µs ~210µs 40%*
Markdown ~12µs ~250µs ~190µs 42%*

* HTML and Markdown savings are preliminary estimates. Full benchmarking in progress.

LLM Provider Validation

BIAS has been validated across all major LLM providers with consistent performance.

Provider Models Tested Token Savings Success Rate
OpenAI GPT-4, GPT-4-turbo, GPT-3.5 48-52% 100%
Anthropic Claude 3 Opus, Sonnet, Haiku 46-51% 100%
Google Gemini Pro, Gemini Ultra 47-53% 100%
Meta Llama 2 70B, Llama 3 70B 44-49% 100%
Groq Mixtral 8x7B, Llama 2 70B 45-50% 100%
Cerebras BTLM-3B-8K 43-48% 100%

Test Corpus Results

Results from our comprehensive test corpus of real-world data.

Test Case Domain Size JSON Tokens BIAS Tokens Savings
E-commerce Cart Shopping 2.1KB 512 189 63.1%
User Profile Authentication 1.8KB 445 172 61.3%
Analytics Event Tracking 3.2KB 789 362 54.1%
API Response REST API 12.5KB 3,124 1,487 52.4%
Config File Configuration 5.7KB 1,423 689 51.6%
Database Record CRUD 8.9KB 2,234 1,156 48.3%
Large Dataset Analytics 128KB 32,156 23,442 27.1%

Corpus Statistics

20
Test Files
131+
Tests Passing
100%
Success Rate
0
Data Loss Events

Scalability Testing

BIAS scales linearly with input size and handles concurrent workloads efficiently.

Concurrent Request Performance

Concurrent Requests Avg Latency p95 Latency p99 Latency Throughput
1 ~95µs ~120µs ~145µs 10,526 req/s
10 ~102µs ~135µs ~168µs 98,039 req/s
100 ~145µs ~189µs ~234µs 689,655 req/s
1000 ~298µs ~412µs ~567µs 3,355,705 req/s

* Tested on AWS c5.4xlarge (16 vCPU, 32GB RAM) with 10KB average payload size.

Start Saving Today

44-52% token savings, sub-100µs latency, 100% lossless conversion.