
TOON Format: Token-Oriented Object Notation Complete Guide

📅 December 10, 2025 ⏱️ 4 min read 👁️ 149 views 🏷️ Data Formats

TOON, short for Token-Oriented Object Notation, is a compact data format designed for working efficiently with Large Language Models. While testing large prompts with structured data, I noticed that standard JSON quickly inflated token counts. TOON addresses that problem by reducing tokens by roughly 30 to 60 percent while representing the same underlying data.

What is TOON

TOON represents the JSON data model using minimal syntax and predictable structure. It stays readable for humans but removes unnecessary characters that increase token usage. I first experimented with TOON while optimizing prompt payloads for LLM APIs, where token limits and cost were becoming real constraints.

Key Benefits of TOON

  • Lower token usage, which directly reduces API costs
  • Faster model processing due to smaller inputs
  • Lossless conversion back to JSON when needed
  • Readable structure that is easier to debug than minified JSON
  • Schema clarity that helps models interpret fields correctly

JSON vs TOON Comparison

While working with user lists and product catalogs, I often ran into token limit warnings when using JSON. The same data expressed in TOON consistently stayed well under limits.

JSON Format

{
  "users": [
    {"id": 1, "name": "Alice", "email": "alice@example.com", "active": true},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "active": false},
    {"id": 3, "name": "Charlie", "email": "charlie@example.com", "active": true}
  ]
}

TOON Format

users [3] {id, name, email, active}
1, Alice, alice@example.com, true
2, Bob, bob@example.com, false
3, Charlie, charlie@example.com, true

The biggest mistake I made early on was assuming the structural overhead was negligible. It is not: repeated keys, braces, quotes, and colons add up fast, and TOON removes most of that overhead.
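
To see the effect concretely, you can count tokens for both renderings yourself. A minimal sketch in Python, assuming the tiktoken library (pip install tiktoken) and its cl100k_base encoding as a representative tokenizer; exact counts vary by model, but the trend holds:

import tiktoken

json_text = '''{"users": [
  {"id": 1, "name": "Alice", "email": "alice@example.com", "active": true},
  {"id": 2, "name": "Bob", "email": "bob@example.com", "active": false},
  {"id": 3, "name": "Charlie", "email": "charlie@example.com", "active": true}
]}'''

toon_text = '''users [3] {id, name, email, active}
1, Alice, alice@example.com, true
2, Bob, bob@example.com, false
3, Charlie, charlie@example.com, true'''

# cl100k_base is the encoding used by several OpenAI chat models
enc = tiktoken.get_encoding("cl100k_base")
print("JSON tokens:", len(enc.encode(json_text)))
print("TOON tokens:", len(enc.encode(toon_text)))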

TOON Syntax Fundamentals

Objects Using Indentation

Instead of braces, TOON relies on indentation. At first, I introduced indentation errors while converting nested objects manually. Using automated conversion fixed those issues.

{
  "person": {
    "name": "John",
    "age": 30
  }
}

becomes:

person
  name: John
  age: 30

Tabular Arrays

Uniform arrays become compact tables. The most common error I encountered here was inconsistent object keys, which breaks the table layout. Validating the JSON first helped avoid this; see the check sketched after this example.

products [2] {id, title, price}
1, Book, 10
2, Pen, 2
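
A minimal pre-check in Python, using a hypothetical is_uniform helper, catches inconsistent keys before conversion:

def is_uniform(items):
    # Every element must be a dict with the same keys in the same order,
    # the precondition for a TOON tabular block
    if not items or not all(isinstance(item, dict) for item in items):
        return False
    keys = list(items[0].keys())
    return all(list(item.keys()) == keys for item in items)

products = [{"id": 1, "title": "Book", "price": 10},
            {"id": 2, "title": "Pen", "price": 2}]
assert is_uniform(products)  # safe to emit as a table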

When TOON Makes Sense

Scenario                            Recommendation
Large uniform datasets              TOON works extremely well
LLM prompts with structured data    Strong choice
Highly irregular nested objects     JSON may be clearer
Configuration storage               YAML is often better

Converting JSON to TOON in JavaScript

When converting programmatically, my most frequent error was feeding invalid JSON. Running validation first using https://jsonformatterspro.com saved a lot of debugging time.

function jsonToToon(json, indent = '') {
  // Uniform arrays of objects collapse into a single tabular block
  if (Array.isArray(json) && isUniformObjectArray(json)) {
    return toonTable('', json, indent);
  }
  if (typeof json === 'object' && json !== null) {
    let out = '';
    for (const key in json) {
      const val = json[key];
      if (Array.isArray(val) && isUniformObjectArray(val)) {
        // Keep the key on the same line as the header: users [3] {id, ...}
        out += toonTable(indent + key + ' ', val, indent);
      } else if (typeof val === 'object' && val !== null) {
        // Nested objects use indentation instead of braces
        out += indent + key + '\n' + jsonToToon(val, indent + '  ');
      } else {
        out += indent + key + ': ' + formatValue(val) + '\n';
      }
    }
    return out;
  }
  return formatValue(json);
}

function isUniformObjectArray(arr) {
  // Every element must be an object with the same keys in the same order
  if (!arr.length || typeof arr[0] !== 'object' || arr[0] === null) return false;
  const keys = JSON.stringify(Object.keys(arr[0]));
  return arr.every(item =>
    typeof item === 'object' && item !== null &&
    JSON.stringify(Object.keys(item)) === keys
  );
}

function toonTable(prefix, arr, indent) {
  // Emit "[N] {field, field}" followed by one comma-separated row per item
  const keys = Object.keys(arr[0]);
  let out = prefix + '[' + arr.length + '] {' + keys.join(', ') + '}\n';
  arr.forEach(row => {
    out += indent + keys.map(k => formatValue(row[k])).join(', ') + '\n';
  });
  return out;
}

function formatValue(val) {
  // Quote strings only when they contain delimiters that would break a row
  if (typeof val === 'string' && /[,\n]/.test(val)) return '"' + val + '"';
  return String(val);
}

Converting TOON to JSON in Python

Parsing errors usually came from malformed rows or mismatched column counts. I now validate row lengths before parsing.

import re

def toon_to_json(text):
    """Parse a TOON tabular block like 'users [3] {id, name}' back into a dict."""
    lines = text.strip().split('\n')
    result = {}
    header = re.match(r'(\w+)\s*\[(\d+)\]\s*\{([^}]+)\}', lines[0])
    if header:
        key = header.group(1)
        count = int(header.group(2))
        fields = [f.strip() for f in header.group(3).split(',')]
        items = []
        for line in lines[1:count + 1]:
            values = [v.strip() for v in line.split(',')]
            # Validate row width before building the object; mismatched
            # column counts are the most common source of parsing errors
            if len(values) != len(fields):
                raise ValueError(
                    f'Expected {len(fields)} values, got {len(values)}: {line!r}')
            items.append({field: parse_value(value)
                          for field, value in zip(fields, values)})
        result[key] = items
    return result

def parse_value(val):
    """Convert a TOON scalar token to its Python equivalent."""
    if val == 'true':
        return True
    if val == 'false':
        return False
    if val == 'null':
        return None
    try:
        return int(val)
    except ValueError:
        pass
    try:
        return float(val)
    except ValueError:
        pass
    return val.strip('"')
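
A quick round-trip check using the users table from earlier:

import json

toon = '''users [3] {id, name, email, active}
1, Alice, alice@example.com, true
2, Bob, bob@example.com, false
3, Charlie, charlie@example.com, true'''

print(json.dumps(toon_to_json(toon), indent=2))
# {"users": [{"id": 1, "name": "Alice", "email": "alice@example.com", "active": true}, ...]}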

LLM Cost Impact

During testing with large datasets, token reduction translated directly into lower usage costs and fewer prompt truncation issues. This mattered most for batch processing and retrieval augmented generation workflows.

Records    JSON Tokens    TOON Tokens    Approx Savings
100        5,000          2,000          Moderate
1,000      50,000         20,000         High
10,000     500,000        200,000        Very High
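
To translate the table into dollars, multiply the token delta by your model's input price. A back-of-envelope sketch, assuming a hypothetical price of $2.50 per million input tokens (substitute your model's actual rate):

PRICE_PER_MILLION = 2.50  # hypothetical input price, USD per 1M tokens

rows = [(100, 5_000, 2_000), (1_000, 50_000, 20_000), (10_000, 500_000, 200_000)]
for records, json_tokens, toon_tokens in rows:
    saved = json_tokens - toon_tokens
    dollars = saved / 1_000_000 * PRICE_PER_MILLION
    print(f"{records:>6} records: {saved:>7} tokens saved ≈ ${dollars:.4f} per request")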

🔧 Try Our Free TOON Converter

Convert your JSON to TOON format instantly and see your token savings in real-time!


Conclusion

TOON is not a replacement for every data format, but it solves a specific and increasingly common problem. When working with LLMs at scale, reducing token usage without losing structure makes a measurable difference. After testing it across multiple projects, I now reach for TOON whenever structured data meets AI prompts.

🏷️ Tags:
toon, json, llm, token optimization, data format, ai
