
TOON Format: Token-Oriented Object Notation Complete Guide

📅 December 10, 2025 ⏱️ 4 min read 👁️ 149 views 🏷️ Data Formats

TOON, short for Token-Oriented Object Notation, is a compact data format designed for working efficiently with Large Language Models. While testing large prompts with structured data, I noticed that standard JSON quickly inflated token counts. TOON addresses that problem by reducing tokens by roughly 30 to 60 percent while representing the same underlying data.

What is TOON

TOON represents the JSON data model using minimal syntax and predictable structure. It stays readable for humans but removes unnecessary characters that increase token usage. I first experimented with TOON while optimizing prompt payloads for LLM APIs, where token limits and cost were becoming real constraints.

Key Benefits of TOON

  • Lower token usage, which directly reduces API costs
  • Faster model processing due to smaller inputs
  • Lossless conversion back to JSON when needed
  • Readable structure that is easier to debug than minified JSON
  • Schema clarity that helps models interpret fields correctly

JSON vs TOON Comparison

While working with user lists and product catalogs, I often ran into token limit warnings when using JSON. The same data expressed in TOON consistently stayed well under limits.

JSON Format

{
  "users": [
    {"id": 1, "name": "Alice", "email": "alice@example.com", "active": true},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "active": false},
    {"id": 3, "name": "Charlie", "email": "charlie@example.com", "active": true}
  ]
}

TOON Format

users [3] {id, name, email, active}
1, Alice, alice@example.com, true
2, Bob, bob@example.com, false
3, Charlie, charlie@example.com, true

The biggest mistake I made early on was assuming the structural overhead was negligible. It is not: repeated keys, braces, quotes, and colons add up fast, and TOON removes most of that overhead.
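
To see the effect concretely, you can count tokens for both renderings yourself. A minimal sketch in Python, assuming the tiktoken library (pip install tiktoken) and its cl100k_base encoding as a representative tokenizer; exact counts vary by model, but the trend holds:

import tiktoken

json_text = '''{"users": [
  {"id": 1, "name": "Alice", "email": "alice@example.com", "active": true},
  {"id": 2, "name": "Bob", "email": "bob@example.com", "active": false},
  {"id": 3, "name": "Charlie", "email": "charlie@example.com", "active": true}
]}'''

toon_text = '''users [3] {id, name, email, active}
1, Alice, alice@example.com, true
2, Bob, bob@example.com, false
3, Charlie, charlie@example.com, true'''

# cl100k_base is the encoding used by several OpenAI chat models
enc = tiktoken.get_encoding("cl100k_base")
print("JSON tokens:", len(enc.encode(json_text)))
print("TOON tokens:", len(enc.encode(toon_text)))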

TOON Syntax Fundamentals

Objects Using Indentation

Instead of braces, TOON relies on indentation. At first, I introduced indentation errors while converting nested objects manually. Using automated conversion fixed those issues.

{
  "person": {
    "name": "John",
    "age": 30
  }
}

becomes:

person
  name: John
  age: 30

Tabular Arrays

Uniform arrays become compact tables. The most common error I encountered here was inconsistent object keys, which breaks the table layout. Validating the JSON first helped avoid this; see the check sketched after this example.

products [2] {id, title, price}
1, Book, 10
2, Pen, 2
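
A minimal pre-check in Python, using a hypothetical is_uniform helper, catches inconsistent keys before conversion:

def is_uniform(items):
    # Every element must be a dict with the same keys in the same order,
    # the precondition for a TOON tabular block
    if not items or not all(isinstance(item, dict) for item in items):
        return False
    keys = list(items[0].keys())
    return all(list(item.keys()) == keys for item in items)

products = [{"id": 1, "title": "Book", "price": 10},
            {"id": 2, "title": "Pen", "price": 2}]
assert is_uniform(products)  # safe to emit as a table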

When TOON Makes Sense

Scenario                            Recommendation
Large uniform datasets              TOON works extremely well
LLM prompts with structured data    Strong choice
Highly irregular nested objects     JSON may be clearer
Configuration storage               YAML is often better

Converting JSON to TOON in JavaScript

When converting programmatically, my most frequent error was feeding invalid JSON. Running validation first using https://jsonformatterspro.com saved a lot of debugging time.

function jsonToToon(json, indent = '') {
  // Uniform arrays of objects collapse into a single tabular block
  if (Array.isArray(json) && isUniformObjectArray(json)) {
    return toonTable('', json, indent);
  }
  if (typeof json === 'object' && json !== null) {
    let out = '';
    for (const key in json) {
      const val = json[key];
      if (Array.isArray(val) && isUniformObjectArray(val)) {
        // Keep the key on the same line as the header: users [3] {id, ...}
        out += toonTable(indent + key + ' ', val, indent);
      } else if (typeof val === 'object' && val !== null) {
        // Nested objects use indentation instead of braces
        out += indent + key + '\n' + jsonToToon(val, indent + '  ');
      } else {
        out += indent + key + ': ' + formatValue(val) + '\n';
      }
    }
    return out;
  }
  return formatValue(json);
}

function isUniformObjectArray(arr) {
  // Every element must be an object with the same keys in the same order
  if (!arr.length || typeof arr[0] !== 'object' || arr[0] === null) return false;
  const keys = JSON.stringify(Object.keys(arr[0]));
  return arr.every(item =>
    typeof item === 'object' && item !== null &&
    JSON.stringify(Object.keys(item)) === keys
  );
}

function toonTable(prefix, arr, indent) {
  // Emit "[N] {field, field}" followed by one comma-separated row per item
  const keys = Object.keys(arr[0]);
  let out = prefix + '[' + arr.length + '] {' + keys.join(', ') + '}\n';
  arr.forEach(row => {
    out += indent + keys.map(k => formatValue(row[k])).join(', ') + '\n';
  });
  return out;
}

function formatValue(val) {
  // Quote strings only when they contain delimiters that would break a row
  if (typeof val === 'string' && /[,\n]/.test(val)) return '"' + val + '"';
  return String(val);
}

Converting TOON to JSON in Python

Parsing errors usually came from malformed rows or mismatched column counts. I now validate row lengths before parsing.

import re

def toon_to_json(text):
    """Parse a TOON tabular block like 'users [3] {id, name}' back into a dict."""
    lines = text.strip().split('\n')
    result = {}
    header = re.match(r'(\w+)\s*\[(\d+)\]\s*\{([^}]+)\}', lines[0])
    if header:
        key = header.group(1)
        count = int(header.group(2))
        fields = [f.strip() for f in header.group(3).split(',')]
        items = []
        for line in lines[1:count + 1]:
            values = [v.strip() for v in line.split(',')]
            # Validate row width before building the object; mismatched
            # column counts are the most common source of parsing errors
            if len(values) != len(fields):
                raise ValueError(
                    f'Expected {len(fields)} values, got {len(values)}: {line!r}')
            items.append({field: parse_value(value)
                          for field, value in zip(fields, values)})
        result[key] = items
    return result

def parse_value(val):
    """Convert a TOON scalar token to its Python equivalent."""
    if val == 'true':
        return True
    if val == 'false':
        return False
    if val == 'null':
        return None
    try:
        return int(val)
    except ValueError:
        pass
    try:
        return float(val)
    except ValueError:
        pass
    return val.strip('"')
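
A quick round-trip check using the users table from earlier:

import json

toon = '''users [3] {id, name, email, active}
1, Alice, alice@example.com, true
2, Bob, bob@example.com, false
3, Charlie, charlie@example.com, true'''

print(json.dumps(toon_to_json(toon), indent=2))
# {"users": [{"id": 1, "name": "Alice", "email": "alice@example.com", "active": true}, ...]}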

LLM Cost Impact

During testing with large datasets, token reduction translated directly into lower usage costs and fewer prompt truncation issues. This mattered most for batch processing and retrieval augmented generation workflows.

Records    JSON Tokens    TOON Tokens    Approx Savings
100        5,000          2,000          Moderate
1,000      50,000         20,000         High
10,000     500,000        200,000        Very High
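
To translate the table into dollars, multiply the token delta by your model's input price. A back-of-envelope sketch, assuming a hypothetical price of $2.50 per million input tokens (substitute your model's actual rate):

PRICE_PER_MILLION = 2.50  # hypothetical input price, USD per 1M tokens

rows = [(100, 5_000, 2_000), (1_000, 50_000, 20_000), (10_000, 500_000, 200_000)]
for records, json_tokens, toon_tokens in rows:
    saved = json_tokens - toon_tokens
    dollars = saved / 1_000_000 * PRICE_PER_MILLION
    print(f"{records:>6} records: {saved:>7} tokens saved ≈ ${dollars:.4f} per request")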

🔧 Try Our Free TOON Converter

Convert your JSON to TOON format instantly and see your token savings in real-time!


Conclusion

TOON is not a replacement for every data format, but it solves a specific and increasingly common problem. When working with LLMs at scale, reducing token usage without losing structure makes a measurable difference. After testing it across multiple projects, I now reach for TOON whenever structured data meets AI prompts.

🏷️ Tags:
toon, json, llm, token optimization, data format, ai
