Real-World Benchmark: JSON vs TOON Token Savings

Measured on real public datasets, not synthetic data. Date: 2026-04-11 | Tokenizer: tiktoken cl100k_base | Cost model: GPT-4o $2.50/1M input tokens

Datasets

Dataset	Source	Rows	Columns	Domain
MovieLens ml-latest-small	GroupLens	9,742 movies	7 (id, title, genres, year, rating, count, tags)	Entertainment
SF Restaurant Health Scores	DataSF	53,973 inspections	9 (name, address, zip, date, score, type, violation, risk, neighborhood)	Public Health

Key Findings

Token Savings Summary

Dataset	Query Type	Rows	JSON Tokens	TOON Tokens	Savings
MovieLens	Full columns	50	3,684	2,938	20.2%
MovieLens	Metadata only (numeric)	100	2,258	1,364	39.6%
MovieLens	Filtered (Comedy 2000s)	50	2,413	1,544	36.0%
Restaurant	Full columns	50	3,482	2,132	38.8%
Restaurant	High risk violations	50	3,437	2,076	39.6%
Restaurant	Metadata (4 cols)	100	3,147	2,036	35.3%

Takeaway: TOON saves roughly 20-40% of tokens on the real structured data in the summary queries above. Savings are highest on metadata-heavy queries (short values, many columns) and lowest on text-heavy content (long tag strings), where the detailed results show 15.6%.
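The Savings column is the relative reduction in token count against the JSON baseline. A minimal sketch of that calculation follows; `savings_pct` is a hypothetical helper name, and the inputs are copied from the summary table.

```python
def savings_pct(json_tokens: int, toon_tokens: int) -> float:
    """Percentage of tokens saved by TOON relative to the JSON baseline."""
    if json_tokens <= 0:
        raise ValueError("json_tokens must be positive")
    return round(100 * (json_tokens - toon_tokens) / json_tokens, 1)


# (JSON tokens, TOON tokens) pairs taken from the summary table
print(savings_pct(3684, 2938))  # MovieLens, full columns -> 20.2
print(savings_pct(2258, 1364))  # MovieLens, metadata only -> 39.6
```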

Detailed Results

MovieLens

Query	Rows	JSON tok	TOON tok	Savings	JSON cost	TOON cost
Top 10 rated (7 cols)	10	695	559	19.6%	$0.0017	$0.0014
Top 20 rated (7 cols)	20	1,339	1,050	21.6%	$0.0033	$0.0026
Top 50 rated (7 cols)	50	3,684	2,938	20.2%	$0.0092	$0.0073
Top 100 rated (7 cols)	100	6,540	5,019	23.3%	$0.0163	$0.0125
Top 200 rated (7 cols)	200	11,817	8,749	26.0%	$0.0295	$0.0219
ID+title+rating (3 cols)	50	1,263	965	23.6%	$0.0032	$0.0024
Metadata: ID+year+rating+count (4 cols)	100	2,258	1,364	39.6%	$0.0056	$0.0034
Comedy 2000s (7 cols)	50	2,413	1,544	36.0%	$0.0060	$0.0039
Text-heavy tags (7 cols)	49	4,668	3,940	15.6%	$0.0117	$0.0098

Restaurant Inspections

Query	Rows	JSON tok	TOON tok	Savings	JSON cost	TOON cost
10 low-score (9 cols)	10	689	438	36.4%	$0.0017	$0.0011
20 low-score (9 cols)	20	1,405	873	37.9%	$0.0035	$0.0022
50 low-score (9 cols)	50	3,482	2,132	38.8%	$0.0087	$0.0053
100 violations (9 cols)	100	7,071	4,326	38.8%	$0.0177	$0.0108
200 violations (9 cols)	200	14,140	8,625	39.0%	$0.0353	$0.0216
Name+score+violation (3 cols)	50	1,324	867	34.5%	$0.0033	$0.0022
High risk only (9 cols)	50	3,437	2,076	39.6%	$0.0086	$0.0052
Metadata (4 cols)	100	3,147	2,036	35.3%	$0.0079	$0.0051
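The cost columns follow directly from the token counts and the stated price of $2.50 per 1M input tokens. A small sketch, assuming a flat per-token price with no caching or batch discounts (`input_cost_usd` is a hypothetical helper name):

```python
PRICE_PER_MILLION_USD = 2.50  # GPT-4o input price used throughout this benchmark


def input_cost_usd(tokens: int, price_per_million: float = PRICE_PER_MILLION_USD) -> float:
    """Input cost in USD for a given token count at a flat per-token price."""
    return tokens * price_per_million / 1_000_000


# "Top 50 rated (7 cols)" row from the MovieLens table
print(f"JSON: ${input_cost_usd(3684):.4f}")  # JSON: $0.0092
print(f"TOON: ${input_cost_usd(2938):.4f}")  # TOON: $0.0073
```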

Scaling: How Savings Grow with Row Count

MovieLens (7 columns)

Rows	JSON Tokens	TOON Tokens	Savings %	Bytes Saved
5	255	194	23.9%	299
10	632	495	21.7%	659
20	1,190	900	24.4%	1,365
50	3,391	2,644	22.0%	3,501
100	6,306	4,789	24.1%	7,064
200	11,646	8,578	26.3%	14,218
500	26,674	18,927	29.0%	35,638

Restaurant Inspections (9 columns)

Rows	JSON Tokens	TOON Tokens	Savings %	Bytes Saved
5	361	251	30.5%	709
10	723	473	34.6%	1,563
20	1,431	905	36.8%	3,273
50	3,541	2,190	38.2%	8,403
100	7,071	4,326	38.8%	16,952
200	14,140	8,625	39.0%	34,052
500	35,663	21,787	38.9%	85,330

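Pattern: Savings generally increase with row count and level off near a ceiling that depends on the dataset. The restaurant data plateaus around 39% from 100 rows onward; MovieLens fluctuates between about 22% and 24% up to 100 rows and reaches 29.0% at 500. More columns with shorter values mean higher savings.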
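One way to see why savings level off is a toy cost model: JSON pays a roughly constant number of tokens per row, while the tabular format pays a one-time header plus a smaller per-row cost. The sketch below is only an illustration under that assumption. The parameter values are made up for the example and are not fitted to the benchmark, and real data will deviate because row content varies (as the MovieLens rows do).

```python
def modeled_savings(rows: int, header: float, toon_per_row: float, json_per_row: float) -> float:
    """Fractional savings under a fixed-header, constant-per-row cost model."""
    json_total = rows * json_per_row
    toon_total = header + rows * toon_per_row
    return 1 - toon_total / json_total


def savings_ceiling(toon_per_row: float, json_per_row: float) -> float:
    """Limit of modeled_savings as the row count grows and the header is amortized away."""
    return 1 - toon_per_row / json_per_row


# Illustrative parameters only (hypothetical, not measured)
HEADER, TOON_ROW, JSON_ROW = 15, 44, 72
for n in (5, 50, 500):
    print(n, f"{modeled_savings(n, HEADER, TOON_ROW, JSON_ROW):.1%}")
print("ceiling", f"{savings_ceiling(TOON_ROW, JSON_ROW):.1%}")
```

In this model the gap to the ceiling shrinks as `header / (rows * json_per_row)`, which is why additional rows matter less and less once the header is amortized.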

Agent Workflow Simulation

A typical AI agent queries a database at each reasoning step. Over a 20-step workflow (50 rows per query):

MovieLens (7 columns)

Format	Total Tokens	Total Cost	Savings
JSON	73,680	$0.1842	-
TOON	58,760	$0.1469	14,920 tokens / $0.037

Restaurant (9 columns)

Format	Total Tokens	Total Cost	Savings
JSON	69,640	$0.1741	-
TOON	42,640	$0.1066	27,000 tokens / $0.068

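Each 50-row restaurant query saves 1,350 tokens (3,482 vs 2,132), so at scale (1,000 queries/day) the restaurant dataset alone would save about 1.35M tokens/day (about $3.38/day, roughly $100/month at $2.50 per 1M tokens).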
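The workflow totals are the per-query token counts multiplied by the number of steps. A minimal sketch of that projection, assuming every step issues the same 50-row query (`workflow_totals` is a hypothetical helper, not part of the benchmark scripts):

```python
def workflow_totals(steps: int, json_per_query: int, toon_per_query: int,
                    price_per_million: float = 2.50) -> dict:
    """Total tokens and savings when the same query is issued at every step."""
    json_total = steps * json_per_query
    toon_total = steps * toon_per_query
    saved = json_total - toon_total
    return {
        "json_tokens": json_total,
        "toon_tokens": toon_total,
        "saved_tokens": saved,
        "saved_usd": saved * price_per_million / 1_000_000,
    }


# Restaurant dataset, 50 rows per query, 20-step workflow
print(workflow_totals(20, 3482, 2132))
```

Changing `steps` to a daily query count gives the corresponding per-day projection under the same flat-price assumption.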

What Drives the Difference

Factor	Effect on Savings	Why
More columns	Higher savings	TOON writes field names once; JSON repeats per row
Short values	Higher savings	Structure overhead dominates content
More rows	Higher savings (to a ceiling)	Fixed header cost amortized over more rows
Long text values	Lower savings	Content tokens dominate, structure is small fraction
Numeric-only data	Highest savings (35-40%)	Minimal content, maximum structural redundancy
Mixed text+numeric	Moderate savings (20-30%)	Balanced content and structure
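The first three rows of this table can be made concrete by counting structural characters, that is, everything that is not a value. The sketch below uses characters as a rough proxy for tokens (the two do not map linearly) and assumes compact JSON and the `[N,]{fields}:` tabular layout with two-space row indentation used in the examples in this document. Quotes around string values are left out of both counts.

```python
def json_overhead_chars(fields: list[str], n_rows: int) -> int:
    """Structural characters in a compact JSON array of flat objects."""
    # Per record: two braces, each key with its quotes and colon, and commas between fields
    per_record = 2 + sum(len(f) + 3 for f in fields) + (len(fields) - 1)
    # Outer brackets plus commas between records
    return 2 + n_rows * per_record + (n_rows - 1)


def tabular_overhead_chars(fields: list[str], n_rows: int) -> int:
    """Structural characters in the tabular layout: one header, then delimited rows."""
    header = len(f"[{n_rows},]{{{','.join(fields)}}}:")
    per_row = 1 + 2 + (len(fields) - 1)  # newline, indent, delimiters
    return header + n_rows * per_row


fields = ["movie_id", "title", "genres", "year", "avg_rating", "num_ratings"]
for n in (5, 50, 500):
    print(n, json_overhead_chars(fields, n), tabular_overhead_chars(fields, n))
```

For these six field names, JSON spends 69 structural characters per record, while the tabular layout spends 8 per row after a one-time header, which is the redundancy the table attributes to repeated field names.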

Side-by-Side Example

JSON (207 tokens, 653 bytes)

[{"movie_id":318,"title":"Shawshank Redemption, The (1994)","genres":"Crime, Drama","year":1994,"avg_rating":4.43,"num_ratings":317},{"movie_id":858,"title":"Godfather, The (1972)","genres":"Crime, Drama","year":1972,"avg_rating":4.29,"num_ratings":192},{"movie_id":2959,"title":"Fight Club (1999)","genres":"Action, Crime, Drama, Thriller","year":1999,"avg_rating":4.27,"num_ratings":218},{"movie_id":1221,"title":"Godfather: Part II, The (1974)","genres":"Crime, Drama","year":1974,"avg_rating":4.26,"num_ratings":129},{"movie_id":48516,"title":"Departed, The (2006)","genres":"Crime, Drama, Thriller","year":2006,"avg_rating":4.25,"num_ratings":107}]

TOON (157 tokens, 396 bytes), 24.2% fewer tokens

[5,]{movie_id,title,genres,year,avg_rating,num_ratings}:
  318,"Shawshank Redemption, The (1994)","Crime, Drama",1994,4.43,317
  858,"Godfather, The (1972)","Crime, Drama",1972,4.29,192
  2959,Fight Club (1999),"Action, Crime, Drama, Thriller",1999,4.27,218
  1221,"Godfather: Part II, The (1974)","Crime, Drama",1974,4.26,129
  48516,"Departed, The (2006)","Crime, Drama, Thriller",2006,4.25,107

JSON (165 tokens, 756 bytes)

[{"business_name":"Lollipot","inspection_score":45,"violation_description":"Unclean or degraded floors walls or ceilings","risk_category":"Low Risk"},{"business_name":"Lollipot","inspection_score":45,"violation_description":"Improper thawing methods","risk_category":"Moderate Risk"},{"business_name":"Lollipot","inspection_score":45,"violation_description":"High risk food holding temperature","risk_category":"High Risk"},{"business_name":"Lollipot","inspection_score":45,"violation_description":"Unclean or unsanitary food contact surfaces","risk_category":"High Risk"},{"business_name":"Lollipot","inspection_score":45,"violation_description":"Inadequate food safety knowledge or lack of certified food safety manager","risk_category":"Moderate Risk"}]

TOON (119 tokens, 423 bytes), 27.9% fewer tokens

[5,]{business_name,inspection_score,violation_description,risk_category}:
  Lollipot,45,Unclean or degraded floors walls or ceilings,Low Risk
  Lollipot,45,Improper thawing methods,Moderate Risk
  Lollipot,45,High risk food holding temperature,High Risk
  Lollipot,45,Unclean or unsanitary food contact surfaces,High Risk
  Lollipot,45,Inadequate food safety knowledge or lack of certified food safety manager,Moderate Risk
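To make the layout above concrete, here is a minimal encoder sketch that reproduces it for flat rows. This is not Seamless-RAG's `encode_tabular()`; the function names are hypothetical, and the quoting rule (quote a string only when it contains the delimiter, a double quote, a newline, or leading/trailing whitespace) is an assumption inferred from the examples, so the real encoder may differ.

```python
import json


def _cell(value, delimiter: str = ",") -> str:
    """Render one cell, quoting strings only when needed (assumed rule)."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    needs_quotes = (
        text == ""
        or delimiter in text
        or '"' in text
        or "\n" in text
        or text != text.strip()
    )
    return json.dumps(text, ensure_ascii=False) if needs_quotes else text


def encode_tabular_sketch(rows: list[dict]) -> str:
    """Encode uniform flat dicts as a header line followed by indented rows."""
    if not rows:
        raise ValueError("need at least one row to derive the field list")
    fields = list(rows[0])
    lines = [f"[{len(rows)},]{{{','.join(fields)}}}:"]
    for row in rows:
        lines.append("  " + ",".join(_cell(row[f]) for f in fields))
    return "\n".join(lines)


rows = [
    {"movie_id": 318, "title": "Shawshank Redemption, The (1994)", "year": 1994},
    {"movie_id": 2959, "title": "Fight Club (1999)", "year": 1999},
]
print(encode_tabular_sketch(rows))
```

The field list is taken from the first row, so the sketch only suits result sets where every row has the same keys, as SQL query results do.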

Methodology

Tokenizer: tiktoken cl100k_base (the GPT-4 / GPT-3.5 Turbo encoding; GPT-4o itself uses o200k_base, so the counts approximate GPT-4o usage)
JSON encoding: json.dumps(data, separators=(",", ":")) — compact, no whitespace
TOON encoding: Seamless-RAG encode_tabular() — TOON v3 tabular format
Cost model: GPT-4o input pricing at $2.50 per 1M tokens
Data: Real query results from MariaDB, not synthetic
Reproducibility: Raw results saved in datasets/benchmark_results.json; import script in datasets/import_datasets.py
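The measurement steps listed above can be sketched as follows. This is an assumed reconstruction, not the project's benchmark script: `to_compact_json`, `count_tokens`, and `compare` are hypothetical names, and the TOON text is passed in already encoded because the exact signature of `encode_tabular()` is not shown here. `tiktoken` is a third-party package (`pip install tiktoken`).

```python
import json


def to_compact_json(rows: list[dict]) -> str:
    """Compact JSON as described in the methodology: no whitespace between tokens."""
    return json.dumps(rows, separators=(",", ":"))


def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Count tokens with tiktoken; imported lazily so the JSON helper works without it."""
    import tiktoken

    return len(tiktoken.get_encoding(encoding_name).encode(text))


def compare(rows: list[dict], toon_text: str) -> tuple[int, int, float]:
    """Return (JSON tokens, TOON tokens, savings %) for one query result."""
    json_tokens = count_tokens(to_compact_json(rows))
    toon_tokens = count_tokens(toon_text)
    savings = round(100 * (json_tokens - toon_tokens) / json_tokens, 1)
    return json_tokens, toon_tokens, savings
```

Passing `encoding_name="o200k_base"` to `count_tokens` would repeat the count with the encoding GPT-4o uses; the absolute numbers would then differ from the tables in this report.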