Real-World Benchmark: JSON vs TOON Token Savings

Measured on real public datasets, not synthetic data. Date: 2026-04-11 | Tokenizer: tiktoken cl100k_base | Cost model: GPT-4o $2.50/1M input tokens

Datasets

| Dataset | Source | Rows | Columns | Domain |
|---|---|---|---|---|
| MovieLens ml-latest-small | GroupLens | 9,742 movies | 7 (id, title, genres, year, rating, count, tags) | Entertainment |
| SF Restaurant Health Scores | DataSF | 53,973 inspections | 9 (name, address, zip, date, score, type, violation, risk, neighborhood) | Public Health |

Key Findings

Token Savings Summary

| Dataset | Query Type | Rows | JSON Tokens | TOON Tokens | Savings |
|---|---|---|---|---|---|
| MovieLens | Full columns | 50 | 3,684 | 2,938 | 20.2% |
| MovieLens | Metadata only (numeric) | 100 | 2,258 | 1,364 | 39.6% |
| MovieLens | Filtered (Comedy 2000s) | 50 | 2,413 | 1,544 | 36.0% |
| Restaurant | Full columns | 50 | 3,482 | 2,132 | 38.8% |
| Restaurant | High risk violations | 50 | 3,437 | 2,076 | 39.6% |
| Restaurant | Metadata (4 cols) | 100 | 3,147 | 2,036 | 35.3% |

Takeaway: TOON saves roughly 20-40% of tokens on these real structured-data queries, dropping to about 16% on the most text-heavy query. Savings are highest on metadata-heavy queries (short values, many columns) and lowest on text-heavy content (long tag strings).

Detailed Results

MovieLens

| Query | Rows | JSON tok | TOON tok | Savings | JSON cost | TOON cost |
|---|---|---|---|---|---|---|
| Top 10 rated (7 cols) | 10 | 695 | 559 | 19.6% | $0.0017 | $0.0014 |
| Top 20 rated (7 cols) | 20 | 1,339 | 1,050 | 21.6% | $0.0033 | $0.0026 |
| Top 50 rated (7 cols) | 50 | 3,684 | 2,938 | 20.2% | $0.0092 | $0.0073 |
| Top 100 rated (7 cols) | 100 | 6,540 | 5,019 | 23.3% | $0.0163 | $0.0125 |
| Top 200 rated (7 cols) | 200 | 11,817 | 8,749 | 26.0% | $0.0295 | $0.0219 |
| ID+title+rating (3 cols) | 50 | 1,263 | 965 | 23.6% | $0.0032 | $0.0024 |
| Metadata: ID+year+rating+count (4 cols) | 100 | 2,258 | 1,364 | 39.6% | $0.0056 | $0.0034 |
| Comedy 2000s (7 cols) | 50 | 2,413 | 1,544 | 36.0% | $0.0060 | $0.0039 |
| Text-heavy tags (7 cols) | 49 | 4,668 | 3,940 | 15.6% | $0.0117 | $0.0098 |

Restaurant Inspections

| Query | Rows | JSON tok | TOON tok | Savings | JSON cost | TOON cost |
|---|---|---|---|---|---|---|
| 10 low-score (9 cols) | 10 | 689 | 438 | 36.4% | $0.0017 | $0.0011 |
| 20 low-score (9 cols) | 20 | 1,405 | 873 | 37.9% | $0.0035 | $0.0022 |
| 50 low-score (9 cols) | 50 | 3,482 | 2,132 | 38.8% | $0.0087 | $0.0053 |
| 100 violations (9 cols) | 100 | 7,071 | 4,326 | 38.8% | $0.0177 | $0.0108 |
| 200 violations (9 cols) | 200 | 14,140 | 8,625 | 39.0% | $0.0353 | $0.0216 |
| Name+score+violation (3 cols) | 50 | 1,324 | 867 | 34.5% | $0.0033 | $0.0022 |
| High risk only (9 cols) | 50 | 3,437 | 2,076 | 39.6% | $0.0086 | $0.0052 |
| Metadata (4 cols) | 100 | 3,147 | 2,036 | 35.3% | $0.0079 | $0.0051 |
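
The Savings and cost columns in the two tables above can be recomputed from the raw token counts. The following is a minimal sketch, assuming savings is defined as 1 - TOON/JSON and cost is the token count times the $2.50 per 1M input tokens stated in this report. The function names are illustrative and are not taken from the benchmark scripts.

```python
PRICE_PER_MILLION_USD = 2.50  # GPT-4o input price as stated in this report


def savings_pct(json_tokens: int, toon_tokens: int) -> float:
    """Percentage of tokens saved by TOON relative to JSON."""
    return (1 - toon_tokens / json_tokens) * 100


def cost_usd(tokens: int, price_per_million: float = PRICE_PER_MILLION_USD) -> float:
    """Input cost in USD for a given number of tokens."""
    return tokens * price_per_million / 1_000_000


# MovieLens "Top 50 rated" row: 3,684 JSON tokens vs 2,938 TOON tokens
print(f"{savings_pct(3684, 2938):.1f}%")  # 20.2%
print(f"${cost_usd(3684):.4f}")           # $0.0092
print(f"${cost_usd(2938):.4f}")           # $0.0073
```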

Scaling: How Savings Grow with Row Count

MovieLens (7 columns)

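| Rows | JSON Tokens | TOON Tokens | Savings % | Bytes Saved |
|---|---|---|---|---|
| 5 | 255 | 194 | 23.9% | 299 |
| 10 | 632 | 495 | 21.7% | 659 |
| 20 | 1,190 | 900 | 24.4% | 1,365 |
| 50 | 3,391 | 2,644 | 22.0% | 3,501 |
| 100 | 6,306 | 4,789 | 24.1% | 7,064 |
| 200 | 11,646 | 8,578 | 26.3% | 14,218 |
| 500 | 26,674 | 18,927 | 29.0% | 35,638 |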

Restaurant Inspections (9 columns)

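| Rows | JSON Tokens | TOON Tokens | Savings % | Bytes Saved |
|---|---|---|---|---|
| 5 | 361 | 251 | 30.5% | 709 |
| 10 | 723 | 473 | 34.6% | 1,563 |
| 20 | 1,431 | 905 | 36.8% | 3,273 |
| 50 | 3,541 | 2,190 | 38.2% | 8,403 |
| 100 | 7,071 | 4,326 | 38.8% | 16,952 |
| 200 | 14,140 | 8,625 | 39.0% | 34,052 |
| 500 | 35,663 | 21,787 | 38.9% | 85,330 |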

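Pattern: Savings generally increase with row count and level off near each dataset's natural ceiling (about 39% for the restaurant data; MovieLens fluctuates at small row counts and is still climbing at 500 rows). More columns with shorter values yield higher savings.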
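
One way to see why savings level off is a simple amortization model. The sketch below assumes TOON costs a fixed header plus a constant number of tokens per row, while JSON costs a constant number of tokens per row. The three parameter values are illustrative figures eyeballed from the restaurant table above. They are assumptions, not measurements.

```python
def modeled_savings_pct(rows: int, header_tokens: float,
                        toon_row_tokens: float, json_row_tokens: float) -> float:
    """Savings under a 'fixed header + per-row cost' model."""
    toon_total = header_tokens + rows * toon_row_tokens
    json_total = rows * json_row_tokens
    return (1 - toon_total / json_total) * 100


# Illustrative parameters (assumed, loosely fitted to the restaurant table)
HEADER, TOON_ROW, JSON_ROW = 33, 43.5, 71.3

for n in (5, 10, 20, 50, 100, 200, 500):
    print(n, f"{modeled_savings_pct(n, HEADER, TOON_ROW, JSON_ROW):.1f}%")

# As rows grow, the header term vanishes and savings approach this ceiling
ceiling = (1 - TOON_ROW / JSON_ROW) * 100
print("ceiling", f"{ceiling:.1f}%")
```

Under this model the header cost is spread over more rows as the result set grows, so the savings curve rises quickly at first and then flattens just below the ceiling set by the per-row ratio.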

Agent Workflow Simulation

A typical AI agent queries a database at each reasoning step. Over a 20-step workflow (50 rows per query):

MovieLens (7 columns)

| Format | Total Tokens | Total Cost | Savings |
|---|---|---|---|
| JSON | 73,680 | $0.1842 | - |
| TOON | 58,760 | $0.1469 | 14,920 tokens / $0.037 |

Restaurant (9 columns)

| Format | Total Tokens | Total Cost | Savings |
|---|---|---|---|
| JSON | 69,640 | $0.1741 | - |
| TOON | 42,640 | $0.1066 | 27,000 tokens / $0.068 |

At scale (1,000 queries/day at 50 rows each), the restaurant dataset alone would save ~1.35M tokens/day (1,350 tokens per query), or about $3.38/day (~$100/month).
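
The workflow totals above are per-query token counts multiplied by the number of queries. A minimal sketch of that arithmetic follows, assuming every query returns the same 50-row result and using the report's $2.50 per 1M token price. The helper name `workflow_totals` is illustrative.

```python
PRICE_PER_MILLION_USD = 2.50  # GPT-4o input price as stated in this report


def workflow_totals(json_tokens_per_query: int, toon_tokens_per_query: int,
                    queries: int) -> dict:
    """Total tokens and savings when the same query result is sent `queries` times."""
    json_total = json_tokens_per_query * queries
    toon_total = toon_tokens_per_query * queries
    saved = json_total - toon_total
    return {
        "json_tokens": json_total,
        "toon_tokens": toon_total,
        "tokens_saved": saved,
        "usd_saved": saved * PRICE_PER_MILLION_USD / 1_000_000,
    }


# Restaurant, 50 rows per query (3,482 JSON vs 2,132 TOON tokens), 20 steps
result = workflow_totals(3482, 2132, queries=20)
print(result["json_tokens"], result["toon_tokens"])  # 69640 42640
print(result["tokens_saved"])                        # 27000
print(f"${result['usd_saved']:.4f}")                 # $0.0675
```

Passing a larger `queries` value projects the same per-query saving to a daily or monthly volume.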

What Drives the Difference

| Factor | Effect on Savings | Why |
|---|---|---|
| More columns | Higher savings | TOON writes field names once; JSON repeats them per row |
| Short values | Higher savings | Structure overhead dominates content |
| More rows | Higher savings (to a ceiling) | Fixed header cost amortized over more rows |
| Long text values | Lower savings | Content tokens dominate; structure is a small fraction |
| Numeric-only data | Highest savings (35-40%) | Minimal content, maximum structural redundancy |
| Mixed text+numeric | Moderate savings (20-30%) | Balanced content and structure |

Side-by-Side Example

MovieLens, JSON (207 tokens, 653 bytes)

[{"movie_id":318,"title":"Shawshank Redemption, The (1994)","genres":"Crime, Drama","year":1994,"avg_rating":4.43,"num_ratings":317},{"movie_id":858,"title":"Godfather, The (1972)","genres":"Crime, Drama","year":1972,"avg_rating":4.29,"num_ratings":192},{"movie_id":2959,"title":"Fight Club (1999)","genres":"Action, Crime, Drama, Thriller","year":1999,"avg_rating":4.27,"num_ratings":218},{"movie_id":1221,"title":"Godfather: Part II, The (1974)","genres":"Crime, Drama","year":1974,"avg_rating":4.26,"num_ratings":129},{"movie_id":48516,"title":"Departed, The (2006)","genres":"Crime, Drama, Thriller","year":2006,"avg_rating":4.25,"num_ratings":107}]

TOON (157 tokens, 396 bytes) — 24.2% fewer tokens

[5,]{movie_id,title,genres,year,avg_rating,num_ratings}:
  318,"Shawshank Redemption, The (1994)","Crime, Drama",1994,4.43,317
  858,"Godfather, The (1972)","Crime, Drama",1972,4.29,192
  2959,Fight Club (1999),"Action, Crime, Drama, Thriller",1999,4.27,218
  1221,"Godfather: Part II, The (1974)","Crime, Drama",1974,4.26,129
  48516,"Departed, The (2006)","Crime, Drama, Thriller",2006,4.25,107

Restaurant Inspections, JSON (165 tokens, 756 bytes)

[{"business_name":"Lollipot","inspection_score":45,"violation_description":"Unclean or degraded floors walls or ceilings","risk_category":"Low Risk"},{"business_name":"Lollipot","inspection_score":45,"violation_description":"Improper thawing methods","risk_category":"Moderate Risk"},{"business_name":"Lollipot","inspection_score":45,"violation_description":"High risk food holding temperature","risk_category":"High Risk"},{"business_name":"Lollipot","inspection_score":45,"violation_description":"Unclean or unsanitary food contact surfaces","risk_category":"High Risk"},{"business_name":"Lollipot","inspection_score":45,"violation_description":"Inadequate food safety knowledge or lack of certified food safety manager","risk_category":"Moderate Risk"}]

TOON (119 tokens, 423 bytes) — 27.9% fewer tokens

[5,]{business_name,inspection_score,violation_description,risk_category}:
  Lollipot,45,Unclean or degraded floors walls or ceilings,Low Risk
  Lollipot,45,Improper thawing methods,Moderate Risk
  Lollipot,45,High risk food holding temperature,High Risk
  Lollipot,45,Unclean or unsanitary food contact surfaces,High Risk
  Lollipot,45,Inadequate food safety knowledge or lack of certified food safety manager,Moderate Risk
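
The tabular layout in the examples above can be reproduced with a few lines of Python. This is a simplified sketch, not Seamless-RAG's encode_tabular(): the name `encode_tabular_sketch` and its quoting rule (quote a string only when it contains a comma or a double quote) are assumptions inferred from the examples. It ignores edge cases such as nested values, missing keys, or strings that look like numbers.

```python
import json


def _cell(value) -> str:
    """Render one value; strings are quoted only if they contain ',' or '"'."""
    if isinstance(value, str):
        if "," in value or '"' in value:
            return json.dumps(value, ensure_ascii=False)
        return value
    return json.dumps(value)  # numbers, booleans, None as JSON literals


def encode_tabular_sketch(rows: list[dict]) -> str:
    """Write field names once in a header, then one comma-separated line per row."""
    fields = list(rows[0])  # assumes every row has the same keys in the same order
    field_list = ",".join(fields)
    header = f"[{len(rows)},]{{{field_list}}}:"
    lines = ["  " + ",".join(_cell(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


rows = [
    {"business_name": "Lollipot", "inspection_score": 45,
     "violation_description": "Improper thawing methods",
     "risk_category": "Moderate Risk"},
    {"business_name": "Lollipot", "inspection_score": 45,
     "violation_description": "High risk food holding temperature",
     "risk_category": "High Risk"},
]
toon_text = encode_tabular_sketch(rows)
json_text = json.dumps(rows, separators=(",", ":"))
print(toon_text)
print(len(json_text.encode("utf-8")), "JSON bytes vs",
      len(toon_text.encode("utf-8")), "TOON bytes")
```

The byte difference comes almost entirely from the field names: JSON repeats all four keys in every object, while the tabular form states them once in the header.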

Methodology

  • Tokenizer: tiktoken cl100k_base (the GPT-4 / GPT-3.5 Turbo encoding; GPT-4o itself uses o200k_base, so the counts here approximate GPT-4o usage)
  • JSON encoding: json.dumps(data, separators=(",", ":")) — compact, no whitespace
  • TOON encoding: Seamless-RAG encode_tabular() — TOON v3 tabular format
  • Cost model: GPT-4o input pricing at $2.50 per 1M tokens
  • Data: Real query results from MariaDB, not synthetic
  • Reproducibility: Raw results saved in datasets/benchmark_results.json; import script in datasets/import_datasets.py
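
The measurement steps listed above can be sketched as follows. This is a hedged outline rather than the benchmark's actual script: the helper names are hypothetical, and the token counter is passed in as a function so that the comparison logic runs without tiktoken installed. The only tiktoken calls used are `tiktoken.get_encoding()` and `Encoding.encode()`.

```python
import json
from typing import Callable


def compact_json(data) -> str:
    """Compact JSON with no whitespace, matching the methodology above."""
    return json.dumps(data, separators=(",", ":"))


def tiktoken_counter(encoding_name: str = "cl100k_base") -> Callable[[str], int]:
    """Build a token counter backed by tiktoken (third-party: pip install tiktoken)."""
    import tiktoken  # imported lazily so the rest of the module has no dependency

    enc = tiktoken.get_encoding(encoding_name)
    return lambda text: len(enc.encode(text))


def compare(json_text: str, toon_text: str,
            count_tokens: Callable[[str], int]) -> dict:
    """Token counts, percentage saved, and bytes saved for one query result."""
    json_tok = count_tokens(json_text)
    toon_tok = count_tokens(toon_text)
    return {
        "json_tokens": json_tok,
        "toon_tokens": toon_tok,
        "savings_pct": round((1 - toon_tok / json_tok) * 100, 1),
        "bytes_saved": len(json_text.encode("utf-8")) - len(toon_text.encode("utf-8")),
    }
```

With tiktoken installed, a call would look like `compare(compact_json(rows), toon_text, tiktoken_counter())`, where `rows` is a list of dicts returned by the query and `toon_text` is whatever string the TOON encoder produced for the same rows.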