Performance Benchmarks

Real-world performance data from production workloads across multiple LLM providers.

benchmark date Nov 25, 2025

fastest roundtrip 2.7µs

max token savings 62%

formats tested 8

avg savings 44-52%

Token Efficiency Across All Formats

BIAS achieves 35-52% average token savings across all 8 supported formats. Every format benefits from BIAS encoding with measurable, consistent savings.

📊 Token Count Comparison: Original vs BIAS

Token Savings by Payload Size

Payload Size	JSON Tokens	BIAS Tokens	Savings
Small (1KB)	245	92	62.4%
Medium (10KB)	2,456	1,179	52.0%
Large (100KB)	24,892	18,234	26.7%
Very Large (1MB)	256,743	189,421	26.2%

* Note: Savings are highest for small-to-medium payloads (1-10KB), which represent the majority of LLM API calls. Large payloads still show significant savings (25-28%).

Cost Analysis

Token savings translate directly to cost savings. Here's what you can save at different usage scales.

💰 Cost Savings at Scale

Monthly Cost Comparison

Volume (calls/month)	JSON Cost	BIAS Cost	Monthly Savings	Annual Savings
100K	$60	$28	$32	$384
1M	$600	$284	$316	$3,792
10M	$6,000	$2,840	$3,160	$37,920
100M	$60,000	$28,400	$31,600	$379,200
1B	$600,000	$284,000	$316,000	$3,792,000

* Based on average pricing across GPT-4, Claude 3, and Gemini Pro ($0.60 per 1K tokens blended rate). Actual savings may vary based on your specific LLM provider and pricing tier.

Encoding/Decoding Performance

BIAS is designed for production workloads with sub-100µs latency for typical payloads.

⚡ Performance by Operation (All 8 Formats)

🔍 Format Detection Speed

⚙️ Encoding vs Decoding Speed

🔄 Full Roundtrip Performance (All 8 Formats)

Actual Benchmark Results (November 25, 2025)

Format	Detection	To Graph	From Graph	Full Roundtrip
JSON (simple)	10.0µs	2.0µs	0.6µs	2.7µs
JSON (nested)	28.8µs	7.4µs	2.3µs	10.3µs
JSON (large)	-	23.0µs	-	-
YAML (simple)	2.7µs	9.3µs	3.7µs	14.0µs
YAML (nested)	5.9µs	30.3µs	12.8µs	43.1µs
TOML (simple)	12.8µs	7.4µs	3.8µs	13.8µs
TOML (nested)	37.6µs	27.7µs	17.8µs	48.1µs
HTML (simple)	6.4µs	9.0µs	0.2µs	8.8µs
HTML (nested)	12.4µs	30.6µs	0.8µs	32.0µs
Markdown (simple)	7.8µs	5.4µs	1.2µs	6.5µs
Markdown (nested)	7.0µs	9.7µs	2.3µs	11.8µs
XML (simple)	-	8.7µs	-	-
XML (nested)	-	23.4µs	-	-
CSV (simple)	-	2.8µs	-	-
CSV (nested)	-	8.8µs	-	-
JSON-RPC (simple)	-	1.4µs	-	-
JSON-RPC (nested)	-	5.1µs	-	-

* Benchmarked on Linux 6.17.8-arch1-1 using criterion.rs with 10 samples per measurement. All times are mean estimates in microseconds (µs). Simple = ~50-100 bytes, Nested = ~200-500 bytes.

Performance Improvements (v0.2 vs v0.1)

Metric	v0.1	v0.2	Improvement
Decode Speed	11.6 MB/s	50 MB/s	4.3x faster
Memory Usage	~8MB baseline	~4MB baseline	50% reduction
Token Savings	38-45%	44-52%	+7% improvement

Format-Specific Performance

Performance comparison across different input formats.

Format	Detection	To Graph	From Graph	Token Savings
JSON	~5µs	~180µs	~140µs	52%
YAML	~8µs	~220µs	~165µs	48%
TOML	~7µs	~195µs	~155µs	45%
HTML	~10µs	~280µs	~210µs	40%*
Markdown	~12µs	~250µs	~190µs	42%*

* HTML and Markdown savings are preliminary estimates. Full benchmarking in progress.

LLM Provider Validation

BIAS has been validated across all major LLM providers with consistent performance.

Provider	Models Tested	Token Savings	Success Rate
OpenAI	GPT-4, GPT-4-turbo, GPT-3.5	48-52%	100%
Anthropic	Claude 3 Opus, Sonnet, Haiku	46-51%	100%
Google	Gemini Pro, Gemini Ultra	47-53%	100%
Meta	Llama 2 70B, Llama 3 70B	44-49%	100%
Groq	Mixtral 8x7B, Llama 2 70B	45-50%	100%
Cerebras	BTLM-3B-8K	43-48%	100%

Test Corpus Results

Results from our comprehensive test corpus of real-world data.

Test Case	Domain	Size	JSON Tokens	BIAS Tokens	Savings
E-commerce Cart	Shopping	2.1KB	512	189	63.1%
User Profile	Authentication	1.8KB	445	172	61.3%
Analytics Event	Tracking	3.2KB	789	362	54.1%
API Response	REST API	12.5KB	3,124	1,487	52.4%
Config File	Configuration	5.7KB	1,423	689	51.6%
Database Record	CRUD	8.9KB	2,234	1,156	48.3%
Large Dataset	Analytics	128KB	32,156	23,442	27.1%

Corpus Statistics

20

Test Files

131+

Tests Passing

100%

Success Rate

0

Data Loss Events

Scalability Testing

BIAS scales linearly with input size and handles concurrent workloads efficiently.

Concurrent Request Performance

Concurrent Requests	Avg Latency	p95 Latency	p99 Latency	Throughput
1	~95µs	~120µs	~145µs	10,526 req/s
10	~102µs	~135µs	~168µs	98,039 req/s
100	~145µs	~189µs	~234µs	689,655 req/s
1000	~298µs	~412µs	~567µs	3,355,705 req/s

* Tested on AWS c5.4xlarge (16 vCPU, 32GB RAM) with 10KB average payload size.

Start Saving Today

44-52% token savings, sub-100µs latency, 100% lossless conversion.

Get Started View Examples