This repository was archived by the owner on Nov 3, 2025. It is now read-only.
🦀 rtoon is the official Rust implementation of the Token-Oriented Object Notation (TOON), a compact, human-readable, token-efficient format designed for AI and prompt-driven workflows.

shreyasbhat0/rtoon

🚀 Project Moved

This repository (rtoon) has been moved to the toon-format organization and is now maintained there as the official Rust implementation of TOON (Token-Oriented Object Notation).

👉 Please use toon-format/toon-rust for the latest source code, documentation, and updates.

Thank you for supporting this project! - Shreyas K S

🦀 RToon

Rust implementation of TOON (Token-Oriented Object Notation)

A compact, token-efficient format for structured data in LLM applications


Token-Oriented Object Notation is a compact, human-readable format designed for passing structured data to Large Language Models with significantly reduced token usage. This is a Rust implementation of the TOON specification.

Tip

Think of TOON as a translation layer: use JSON programmatically, convert to TOON for LLM input.

Why TOON?

AI is becoming cheaper and more accessible, but larger context windows also invite larger data inputs. LLM tokens still cost money, and standard JSON is verbose and token-expensive.

JSON vs TOON Comparison

JSON (verbose, token-heavy):

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

TOON (compact, token-efficient):

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

TOON conveys the same information with typically 30–60% fewer tokens! 🎉
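To make the tabular layout in the comparison concrete, here is a minimal, std-only sketch that builds the same header and rows by hand. The function `to_tabular` is a hypothetical helper for illustration and is not part of rtoon; it assumes the comma delimiter and two-space indentation, and it does no quoting or escaping.

```rust
/// Hypothetical helper (not part of rtoon): renders uniform rows in the
/// tabular layout shown above. No quoting or escaping is attempted.
fn to_tabular(key: &str, fields: &[&str], rows: &[Vec<String>]) -> String {
    // Header: key[row count]{field list}:
    let mut out = format!("{}[{}]{{{}}}:", key, rows.len(), fields.join(","));
    for row in rows {
        // Each row is indented by two spaces and comma-delimited.
        out.push_str("\n  ");
        out.push_str(&row.join(","));
    }
    out
}

fn main() {
    let rows = vec![
        vec!["1".to_string(), "Alice".to_string(), "admin".to_string()],
        vec!["2".to_string(), "Bob".to_string(), "user".to_string()],
    ];
    println!("{}", to_tabular("users", &["id", "name", "role"], &rows));
}
```

The field names appear once in the header rather than once per row, which is where most of the savings over JSON come from.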

Key Features

  • 💸 Token-efficient: typically 30–60% fewer tokens than JSON
  • 🤿 LLM-friendly guardrails: explicit lengths and fields enable validation
  • 🍱 Minimal syntax: removes redundant punctuation (braces, brackets, most quotes)
  • 📐 Indentation-based structure: like YAML, uses whitespace instead of braces
  • 🧺 Tabular arrays: declare keys once, stream data as rows
  • 🔄 Round-trip support: encode and decode with full fidelity
  • 🛡️ Type-safe: integrates seamlessly with serde_json::Value
  • ⚙️ Customizable: delimiter (comma/tab/pipe), length markers, and indentation

Installation

Add to your Cargo.toml:

[dependencies]
rtoon = "0.1.3"
serde_json = "1.0"

Quick Start

use rtoon::encode_default;
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let data = json!({
        "user": {
            "id": 123,
            "name": "Ada",
            "tags": ["reading", "gaming"],
            "active": true
        }
    });

    let toon = encode_default(&data)?;
    println!("{}", toon);
    Ok(())
}

Output:

user:
  active: true
  id: 123
  name: Ada
  tags[2]: reading,gaming

Examples

📝 Note: All examples in this section are taken from the examples/ directory. Run cargo run --example examples to see them in action.

Objects

Simple objects encode as key-value pairs:

use rtoon::encode_default;
use serde_json::json;

let data = json!({
    "id": 123,
    "name": "Ada",
    "active": true
});
println!("{}", encode_default(&data).unwrap());

Output:

active: true
id: 123
name: Ada

Nested objects use indentation:

let nested = json!({
    "user": { "id": 123, "name": "Ada" }
});
println!("{}", encode_default(&nested).unwrap());

Output:

user:
  id: 123
  name: Ada
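Since nesting is carried entirely by leading whitespace, the depth of a line can be recovered from its indentation. The sketch below is a std-only illustration of that idea; `indent_depth` is a hypothetical helper, not rtoon's parser, and it assumes space-based indentation with a fixed width (2 by default in the encoder).

```rust
/// Hypothetical helper (not part of rtoon): returns the nesting depth of a
/// line for a given indent width, or an error for tabs or uneven indentation.
fn indent_depth(line: &str, indent_size: usize) -> Result<usize, String> {
    // Collect the leading run of spaces and tabs.
    let leading: String = line
        .chars()
        .take_while(|c| *c == ' ' || *c == '\t')
        .collect();
    if leading.contains('\t') {
        return Err("tabs are not allowed for indentation".to_string());
    }
    let spaces = leading.len();
    if indent_size == 0 || spaces % indent_size != 0 {
        return Err(format!(
            "{} leading spaces is not a multiple of {}",
            spaces, indent_size
        ));
    }
    Ok(spaces / indent_size)
}

fn main() {
    for line in ["user:", "  id: 123", "  name: Ada"] {
        println!("{:?} -> {:?}", line, indent_depth(line, 2));
    }
}
```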

Primitive Arrays

Primitive arrays are inline with count and delimiter-separated values:

use rtoon::encode_default;
use serde_json::json;

let data = json!({ "tags": ["admin", "ops", "dev"] });
println!("{}", encode_default(&data).unwrap());

Output:

tags[3]: admin,ops,dev

Arrays of Objects (Tabular)

When arrays contain uniform objects with the same keys and primitive-only values, they're encoded in tabular format for maximum token efficiency:

use rtoon::encode_default;
use serde_json::json;

let data = json!({
    "items": [
        { "sku": "A1", "qty": 2, "price": 9.99 },
        { "sku": "B2", "qty": 1, "price": 14.5 }
    ]
});
println!("{}", encode_default(&data).unwrap());

Output:

items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5

Tabular arrays can be nested:

let nested = json!({
    "items": [
        {
            "users": [
                { "id": 1, "name": "Ada" },
                { "id": 2, "name": "Bob" }
            ],
            "status": "active"
        }
    ]
});
println!("{}", encode_default(&nested).unwrap());

Output:

items[1]:
  status: active
  users[2]{id,name}:
    1,Ada
    2,Bob

Arrays of Arrays

When arrays contain other primitive arrays, they're expanded as list items:

use rtoon::encode_default;
use serde_json::json;

let data = json!({
    "pairs": [[1, 2], [3, 4]]
});
println!("{}", encode_default(&data).unwrap());

Output:

pairs[2]:
  - [2]: 1,2
  - [2]: 3,4

Mixed Arrays

Non-uniform arrays (containing primitives, objects, or nested arrays) use the expanded list format:

use rtoon::encode_default;
use serde_json::json;

let mixed = json!({
    "items": [1, {"a": 1}, "text"]
});
println!("{}", encode_default(&mixed).unwrap());

Output:

items[3]:
  - 1
  - a: 1
  - text

Objects in list format place the first field on the hyphen line:

let list_objects = json!({
    "items": [
        {"id": 1, "name": "First"},
        {"id": 2, "name": "Second", "extra": true}
    ]
});
println!("{}", encode_default(&list_objects).unwrap());

Output:

items[2]:
  - id: 1
    name: First
  - id: 2
    name: Second
    extra: true

Custom Delimiters

Use tab or pipe delimiters to avoid quoting and save more tokens:

use rtoon::{encode, EncodeOptions, Delimiter};
use serde_json::json;

let data = json!({
    "items": [
        {"sku": "A1", "name": "Widget", "qty": 2, "price": 9.99},
        {"sku": "B2", "name": "Gadget", "qty": 1, "price": 14.5}
    ]
});

// Tab delimiter (\t)
let tab = encode(&data, &EncodeOptions::new().with_delimiter(Delimiter::Tab)).unwrap();
println!("{}", tab);

// Pipe delimiter (|)
let pipe = encode(&data, &EncodeOptions::new().with_delimiter(Delimiter::Pipe)).unwrap();
println!("{}", pipe);

Length Markers

Prefix array lengths with a marker character for clarity:

use rtoon::{encode, EncodeOptions};
use serde_json::json;

let data = json!({
    "tags": ["reading", "gaming", "coding"],
    "items": [
        {"sku": "A1", "qty": 2, "price": 9.99},
        {"sku": "B2", "qty": 1, "price": 14.5}
    ]
});

let opts = EncodeOptions::new().with_length_marker('#');
println!("{}", encode(&data, &opts).unwrap());

Output:

items[#2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
tags[#3]: reading,gaming,coding
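The effect of the marker on a header can be sketched without the library. The helper `inline_array` below is hypothetical (it is not an rtoon function) and only mirrors the inline primitive-array line from the output above, assuming the comma delimiter and values that need no quoting.

```rust
/// Hypothetical helper (not part of rtoon): formats an inline primitive
/// array, optionally prefixing the length with a marker character.
fn inline_array(key: &str, values: &[&str], marker: Option<char>) -> String {
    let len = match marker {
        // With a marker, the length is written as e.g. "#3".
        Some(m) => format!("{}{}", m, values.len()),
        // Without one, the bare count is used.
        None => values.len().to_string(),
    };
    format!("{}[{}]: {}", key, len, values.join(","))
}

fn main() {
    let tags = ["reading", "gaming", "coding"];
    println!("{}", inline_array("tags", &tags, Some('#')));
    println!("{}", inline_array("tags", &tags, None));
}
```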

Empty Containers & Root Forms

Empty arrays and objects are supported:

use rtoon::encode_default;
use serde_json::json;

// Empty array
let empty_items = json!({ "items": [] });
println!("{}", encode_default(&empty_items).unwrap());

// Root array
let root_array = json!(["x", "y"]);
println!("{}", encode_default(&root_array).unwrap());

Output:

items[0]:

[2]: x,y

Empty objects at root encode to empty output.

Round-Trip Encoding

TOON supports full round-trip encoding and decoding:

use rtoon::{decode_default, encode_default};
use serde_json::json;

let original = json!({
    "product": "Widget",
    "price": 29.99,
    "stock": 100,
    "categories": ["tools", "hardware"]
});

let encoded = encode_default(&original).unwrap();
let decoded = decode_default(&encoded).unwrap();

assert_eq!(original, decoded);
println!("Round-trip successful!");

Strict Mode Decoding

Strict mode enforces array counts, indentation, and delimiter consistency:

use rtoon::{decode, DecodeOptions};

// Malformed: header says 2 rows, but only 1 provided
let malformed = "items[2]{id,name}:\n  1,Ada";

let opts = DecodeOptions::new().with_strict(true);
match decode(malformed, &opts) {
    Ok(_) => println!("Unexpectedly decoded"),
    Err(err) => println!("Strict decode error: {}", err),
}

Strict mode (default) checks:

  • Array counts must match declared lengths
  • Indentation must be exact multiples of indent size
  • Tabs cannot be used for indentation
  • Invalid escape sequences cause errors
  • Missing colons after keys cause errors
  • Blank lines inside arrays/tabular rows cause errors
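The first check in the list, declared length versus actual rows, can be illustrated with a short std-only sketch. `check_declared_rows` is a hypothetical function and not how rtoon implements strict mode; it assumes a single tabular block with no length marker and, unlike strict mode, it skips blank lines instead of rejecting them.

```rust
/// Hypothetical helper (not rtoon's implementation): checks that a tabular
/// block has exactly as many rows as its header declares.
fn check_declared_rows(toon: &str) -> Result<usize, String> {
    let mut lines = toon.lines();
    let header = lines.next().ok_or("empty input")?;
    // The declared length sits between '[' and the next ']' in the header.
    let open = header.find('[').ok_or("missing '[' in header")?;
    let close = header[open..].find(']').ok_or("missing ']' in header")? + open;
    let declared: usize = header[open + 1..close]
        .parse()
        .map_err(|_| "length is not a number".to_string())?;
    // Every remaining non-blank line counts as one row.
    let actual = lines.filter(|l| !l.trim().is_empty()).count();
    if actual == declared {
        Ok(actual)
    } else {
        Err(format!("expected {} rows, found {}", declared, actual))
    }
}

fn main() {
    // Same malformed input as in the strict-mode example above.
    let malformed = "items[2]{id,name}:\n  1,Ada";
    println!("{:?}", check_declared_rows(malformed));
}
```

Because the count is stated up front, a truncated or padded table is detectable without inspecting the row contents.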

API Reference

Encoding Functions

pub fn encode(value: &serde_json::Value, options: &EncodeOptions) -> ToonResult<String>
pub fn encode_default(value: &serde_json::Value) -> ToonResult<String>
pub fn encode_object(value: &serde_json::Value, options: &EncodeOptions) -> ToonResult<String>
pub fn encode_array(value: &serde_json::Value, options: &EncodeOptions) -> ToonResult<String>

Decoding Functions

pub fn decode(input: &str, options: &DecodeOptions) -> ToonResult<serde_json::Value>
pub fn decode_default(input: &str) -> ToonResult<serde_json::Value>
pub fn decode_strict(input: &str) -> ToonResult<serde_json::Value>
pub fn decode_strict_with_options(input: &str, options: &DecodeOptions) -> ToonResult<serde_json::Value>
pub fn decode_no_coerce(input: &str) -> ToonResult<serde_json::Value>
pub fn decode_no_coerce_with_options(input: &str, options: &DecodeOptions) -> ToonResult<serde_json::Value>

EncodeOptions

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Indent {
    Spaces(usize),  // Number of spaces per indent level
    Tabs,           // Use tabs for indentation
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EncodeOptions {
    pub delimiter: Delimiter,         // default: Delimiter::Comma
    pub length_marker: Option<char>,   // default: None
    pub indent: Indent,               // default: Indent::Spaces(2)
}

impl EncodeOptions {
    pub fn new() -> Self
    pub fn with_delimiter(self, delimiter: Delimiter) -> Self
    pub fn with_length_marker(self, marker: char) -> Self
    pub fn with_indent(self, style: Indent) -> Self
    pub fn with_spaces(self, count: usize) -> Self 
    pub fn with_tabs(self) -> Self 
}

Example:

use rtoon::{encode, EncodeOptions, Delimiter};

let opts = EncodeOptions::new()
    .with_delimiter(Delimiter::Tab)
    .with_length_marker('#')
    .with_spaces(4);

// Or: pipe delimiter with tab indentation
let opts = EncodeOptions::new()
    .with_delimiter(Delimiter::Pipe)
    .with_length_marker('#')
    .with_tabs();

// Or: set the indent style explicitly (3 spaces per level)
use rtoon::types::Indent;
let opts = EncodeOptions::new()
    .with_indent(Indent::Spaces(3));

DecodeOptions

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DecodeOptions {
    pub delimiter: Option<Delimiter>,  // auto-detect if None
    pub strict: bool,                 // default: true
}

impl DecodeOptions {
    pub fn new() -> Self
    pub fn with_strict(self, strict: bool) -> Self
    pub fn with_delimiter(self, delimiter: Delimiter) -> Self
}

Example:

use rtoon::{decode, DecodeOptions, Delimiter};

let opts = DecodeOptions::new()
    .with_strict(true)
    .with_delimiter(Delimiter::Pipe);

Delimiter

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Delimiter {
    Comma,  // ","
    Tab,    // "\t" (U+0009)
    Pipe,   // "|"
}

Error Handling

All functions return ToonResult<T>, which is Result<T, ToonError>. The error type provides detailed information about parsing or encoding failures:

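use rtoon::{decode_default, ToonError};

// Any TOON text works here; this sample is a primitive array
let input = "tags[3]: admin,ops,dev";

match decode_default(input) {
    Ok(value) => println!("Success: {}", value),
    Err(ToonError::ParseError(msg)) => eprintln!("Parse error: {}", msg),
    Err(ToonError::ValidationError(msg)) => eprintln!("Validation error: {}", msg),
    // Catch-all for the remaining error variants
    Err(other) => eprintln!("Other error: {}", other),
}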

Format Overview

  • Objects: key: value with 2-space indentation for nesting
  • Primitive arrays: inline with count, e.g., tags[3]: a,b,c
  • Arrays of objects: tabular header, e.g., items[2]{id,name}:\n ...
  • Mixed arrays: list format with - prefix
  • Quoting: only when necessary (special chars, ambiguity, keywords like true, null)
  • Root forms: objects (default), arrays, or primitives

For complete format specification, see SPEC.md.
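The quoting bullet above can be approximated in a few lines. The sketch below is a simplified guess at the kind of test involved, not the rule set from SPEC.md or from rtoon: the function `needs_quotes` and its exact set of special characters are assumptions made for illustration.

```rust
/// Hypothetical approximation (not rtoon's actual rule set): decides whether
/// a string value would need quotes to be read back as the same string.
fn needs_quotes(value: &str, delimiter: char) -> bool {
    // Keywords would otherwise be read back as booleans or null.
    let is_keyword = matches!(value, "true" | "false" | "null");
    // Numeric-looking strings would otherwise be read back as numbers.
    let looks_numeric = value.parse::<f64>().is_ok();
    // Assumed set of structural characters; the real list is in the spec.
    let has_special = value
        .chars()
        .any(|c| c == delimiter || matches!(c, ':' | '"' | '\\' | '\n'));
    // Empty strings and padded strings are ambiguous without quotes.
    value.is_empty() || value != value.trim() || is_keyword || looks_numeric || has_special
}

fn main() {
    for v in ["Ada", "true", "42", "a,b", ""] {
        println!("{:?} needs quotes: {}", v, needs_quotes(v, ','));
    }
}
```

Note that the delimiter is an input: a value containing a comma needs quotes under the comma delimiter but not under tab or pipe, which is the saving the Custom Delimiters section refers to.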

Specification

This implementation follows the TOON Specification v1.2. The specification defines:

  • Data model and normalization rules
  • Encoding and decoding semantics
  • Header syntax and delimiter scoping
  • Quoting rules and escaping
  • Strict mode validation requirements

Refer to SPEC.md for complete details.

Running Examples

Run the consolidated examples:

cargo run --example examples

This executes examples/main.rs, which invokes all parts under examples/parts/:

  • arrays.rs: Primitive array encoding
  • arrays_of_arrays.rs: Nested primitive arrays
  • objects.rs: Simple and nested objects
  • tabular.rs: Tabular array encoding
  • delimiters.rs: Custom delimiter usage
  • mixed_arrays.rs: Mixed/non-uniform arrays
  • length_marker.rs: Length marker examples
  • empty_and_root.rs: Edge cases and root forms
  • round_trip.rs: Encoding and decoding verification
  • decode_strict.rs: Strict mode validation

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

🤝 How to Contribute
  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

MIT © 2025

Built with ❤️ in Rust
