kentaro/toon_ex

Toon


TOON (Token-Oriented Object Notation) encoder and decoder for Elixir.

TOON is a compact data format optimized for LLM token efficiency, achieving 30-60% token reduction compared to JSON while maintaining readability.
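To give a rough feel for where the savings come from, the sketch below compares the character counts of the same data written as JSON and as TOON. Character count is only a crude proxy for LLM tokens, and `SizeSketch` is a hypothetical helper for illustration, not part of this library.

```elixir
defmodule SizeSketch do
  # Hypothetical helper, not part of the Toon API.
  # Returns how many percent fewer characters `toon` uses than `json`.
  def char_saving(json, toon) when is_binary(json) and is_binary(toon) do
    Float.round((1 - String.length(toon) / String.length(json)) * 100, 1)
  end
end

# The same two-user table, written by hand in both formats.
json = ~s({"users":[{"name":"Alice","age":30},{"name":"Bob","age":25}]})
toon = "users[2]{age,name}:\n  30,Alice\n  25,Bob"

IO.puts("Character saving: #{SizeSketch.char_saving(json, toon)}%")
```

Most of the difference comes from the tabular form stating the field names once in the header instead of repeating them in every row.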

🎯 Specification Compliance

This implementation is tested against the official TOON specification v1.3.3 (2025-10-31) using the official test fixtures.

Test Fixtures: toon-format/spec@b9c71f7

Compliance Status:

  • ✅ 100% (306/306 tests passing)
  • ✅ Decoder: 100% (160/160 tests)
  • ✅ Encoder: 100% (146/146 tests)

Tests validate semantic equivalence (the encoder output and the expected output must decode to the same data structure), so the results do not depend on Elixir 1.19's automatic key sorting.

Features

  • 🎯 Token Efficient: 30-60% fewer tokens than JSON
  • 📖 Human Readable: Indentation-based structure like YAML
  • 🔧 Three Array Formats: Inline, tabular, and list formats
  • ✅ Spec Compliant: Tested against official TOON v1.3 specification
  • 🛡️ Type Safe: Full Dialyzer support with comprehensive typespecs
  • 🔌 Protocol Support: Custom encoding via Toon.Encoder protocol
  • 📊 Telemetry: Built-in instrumentation for monitoring

Installation

Add toon to your list of dependencies in mix.exs:

def deps do
  [
    {:toon, "~> 0.3.0"}
  ]
end

Quick Start

Encoding

# Simple object
Toon.encode!(%{"name" => "Alice", "age" => 30})
# => "age: 30\nname: Alice"

# Nested object
Toon.encode!(%{"user" => %{"name" => "Bob"}})
# => "user:\n  name: Bob"

# Arrays
Toon.encode!(%{"tags" => ["elixir", "toon"]})
# => "tags[2]: elixir,toon"

Decoding

Toon.decode!("name: Alice\nage: 30")
# => %{"name" => "Alice", "age" => 30}

Toon.decode!("tags[2]: a,b")
# => %{"tags" => ["a", "b"]}

# With options
Toon.decode!("user:\n    name: Alice", indent_size: 4)
# => %{"user" => %{"name" => "Alice"}}

Comprehensive Examples

Primitives

Toon.encode!(nil)            # => "null"
Toon.encode!(true)           # => "true"
Toon.encode!(42)             # => "42"
Toon.encode!(3.14)           # => "3.14"
Toon.encode!("hello")        # => "hello"
Toon.encode!("hello world")  # => "\"hello world\"" (auto-quoted)

Objects

# Simple objects
Toon.encode!(%{"name" => "Alice", "age" => 30})
# =>
# age: 30
# name: Alice

# Nested objects
Toon.encode!(%{
  "user" => %{
    "name" => "Bob",
    "email" => "[email protected]"
  }
})
# =>
# user:
#   email: [email protected]
#   name: Bob

Arrays

# Inline arrays (primitives)
Toon.encode!(%{"tags" => ["elixir", "toon", "llm"]})
# => "tags[3]: elixir,toon,llm"

# Tabular arrays (uniform objects)
Toon.encode!(%{
  "users" => [
    %{"name" => "Alice", "age" => 30},
    %{"name" => "Bob", "age" => 25}
  ]
})
# => "users[2]{age,name}:\n  30,Alice\n  25,Bob"

# List-style arrays (mixed or nested)
Toon.encode!(%{
  "items" => [
    %{"type" => "book", "title" => "Elixir Guide"},
    %{"type" => "video", "duration" => 120}
  ]
})
# => "items[2]:\n  - title: \"Elixir Guide\"\n    type: book\n  - duration: 120\n    type: video"
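The choice between the three array formats shown above can be pictured as a small classification step. The following is a minimal sketch under the assumption that inline is used for lists of primitives, tabular for lists of maps sharing the same keys with primitive values, and list style for everything else. `ArrayFormatSketch` is a hypothetical module for illustration and is not how the library is necessarily implemented.

```elixir
defmodule ArrayFormatSketch do
  # Hypothetical classifier, not part of the Toon API.
  # Returns :inline, :tabular, or :list for a given Elixir list.
  def classify(list) when is_list(list) do
    cond do
      # All primitives (an empty list also lands here).
      Enum.all?(list, &primitive?/1) -> :inline
      # Every element is a map with the same keys and primitive values.
      uniform_objects?(list) -> :tabular
      # Mixed or nested content.
      true -> :list
    end
  end

  defp primitive?(value) do
    is_nil(value) or is_boolean(value) or is_number(value) or is_binary(value)
  end

  defp uniform_objects?([first | _] = list) when is_map(first) do
    keys = first |> Map.keys() |> Enum.sort()

    Enum.all?(list, fn item ->
      is_map(item) and
        Enum.sort(Map.keys(item)) == keys and
        Enum.all?(Map.values(item), &primitive?/1)
    end)
  end

  defp uniform_objects?(_other), do: false
end
```

Applied to the three examples above, this sketch classifies the `tags` list as `:inline`, the `users` list as `:tabular`, and the `items` list as `:list`.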

Encoding Options

# Custom delimiters
Toon.encode!(%{"tags" => ["a", "b", "c"]}, delimiter: "\t")
# => "tags[3\t]: a\tb\tc"

Toon.encode!(%{"values" => [1, 2, 3]}, delimiter: "|")
# => "values[3|]: 1|2|3"

# Length markers
Toon.encode!(%{"tags" => ["a", "b", "c"]}, length_marker: "#")
# => "tags[#3]: a,b,c"

# Custom indentation
Toon.encode!(%{"user" => %{"name" => "Alice"}}, indent: 4)
# => "user:\n    name: Alice"

Decoding Options

# Atom keys
Toon.decode!("name: Alice", keys: :atoms)
# => %{name: "Alice"}

# Custom indent size
Toon.decode!("user:\n    name: Alice", indent_size: 4)
# => %{"user" => %{"name" => "Alice"}}

# Strict mode (default: true)
Toon.decode!("  name: Alice", strict: false)  # Accepts non-standard indentation
# => %{"name" => "Alice"}

Specification Compliance

This implementation is tested against the official TOON specification v1.3.

Test Results

$ mix test
306 tests, 0 failures

All official TOON specification tests passing (100%)

Fully Supported Features

Decoder (100% compliant):

  • ✅ All primitive types (strings, numbers, booleans, null)
  • ✅ Nested objects with arbitrary depth
  • ✅ All three array formats (inline, tabular, list)
  • ✅ Custom delimiters (comma, tab, pipe)
  • ✅ Quoted strings with escapes (\\, \", \n, \r, \t)
  • ✅ Leading zero handling ("05" → string, not number)
  • ✅ Strict mode validation (indentation, blank lines, array lengths)
  • ✅ Root primitives, arrays, and objects
  • ✅ Unicode support (emoji, multi-byte characters)
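The leading-zero rule in the list above can be illustrated with a small scalar classifier. This is a simplified sketch that handles only `null`, booleans, and integers (floats and quoted strings are deliberately left out). `ScalarSketch` is a hypothetical module and does not reflect the decoder's actual code.

```elixir
defmodule ScalarSketch do
  # Hypothetical helper, not part of the Toon API.
  # Classifies an unquoted token; integers only, floats are out of scope here.
  def parse("null"), do: nil
  def parse("true"), do: true
  def parse("false"), do: false

  def parse(token) when is_binary(token) do
    # A token counts as an integer only if it is "0" or starts with 1-9,
    # so "05" falls through and stays a string.
    if Regex.match?(~r/\A-?(0|[1-9]\d*)\z/, token) do
      String.to_integer(token)
    else
      token
    end
  end
end
```

Keeping tokens such as `"05"` as strings avoids silently losing the leading zero in values like postal codes or identifiers.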

Encoder (100% compliant):

  • ✅ All primitive types with proper quoting
  • ✅ Nested objects with correct indentation
  • ✅ All three array formats (inline, tabular, list)
  • ✅ Custom delimiters and length markers
  • ✅ Escape sequences
  • ✅ Number normalization (-0 → 0, proper precision)
  • ✅ Root primitives, arrays, and objects
  • ✅ Delimiter-aware quoting
  • ✅ Complex nested structures (arrays in list items, etc.)

Testing Approach

Tests use semantic equivalence checking: both encoder output and expected output are decoded and compared. This ensures correctness while accommodating Elixir 1.19's automatic map key sorting (outputs may differ in key order but decode to identical data structures).
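The semantic equivalence check described above can be sketched as a tiny helper. `EquivalenceSketch` is a hypothetical name, and the decoder is passed in as a function so the idea stays independent of any particular setup. With this library available, `&Toon.decode!/1` would be the natural argument.

```elixir
defmodule EquivalenceSketch do
  # Hypothetical test helper, not part of the Toon API.
  # `decode` is any one-argument function that turns a TOON string into a term,
  # for example &Toon.decode!/1 when the :toon dependency is installed.
  def equivalent?(actual, expected, decode) when is_function(decode, 1) do
    # Maps compare equal regardless of the key order in the source text.
    decode.(actual) == decode.(expected)
  end
end

# Intended usage (assumes :toon is available in the project):
# EquivalenceSketch.equivalent?("age: 30\nname: Alice", "name: Alice\nage: 30", &Toon.decode!/1)
```

Comparing decoded terms rather than raw strings means a fixture that lists `name` before `age` still matches output that lists `age` first.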

Testing

The test suite uses official TOON specification fixtures:

# Run all tests against official spec fixtures
mix test

# Run only fixture-based tests
mix test test/toon/fixtures_test.exs

Test fixtures are loaded from the toon-format/spec repository via git submodule.

TOON Specification

This implementation follows TOON Specification v1.3.

TypeScript Version

This is an Elixir port of the reference implementation: toon-format/toon.

Contributing

Contributions are welcome! Please ensure all official specification tests pass before submitting PRs.

Author

Kentaro Kuribayashi

License

MIT License - see LICENSE.
