Project Moved
This repository (rtoon) has been moved to the toon-format organization and is now maintained there as the official Rust implementation of TOON (Token-Oriented Object Notation). Please use toon-format/toon-rust for the latest source code, documentation, and updates.
Thank you for supporting this project!
Shreyas K S
Rust implementation of TOON (Token-Oriented Object Notation)
A compact, token-efficient format for structured data in LLM applications
Token-Oriented Object Notation is a compact, human-readable format designed for passing structured data to Large Language Models with significantly reduced token usage. This is a Rust implementation of the TOON specification.
Tip
Think of TOON as a translation layer: use JSON programmatically, convert to TOON for LLM input.
- Why TOON?
- Key Features
- Installation
- Quick Start
- Examples
- API Reference
- Format Overview
- Specification
- Running Examples
- Contributing
- License
- See Also
AI is becoming cheaper and more accessible, and larger context windows allow for larger data inputs. LLM tokens still cost money, however, and standard JSON is verbose and token-expensive.
Token efficiency comparison:
JSON (verbose, token-heavy):
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}
TOON (compact, token-efficient):
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
TOON conveys the same information with typically 30-60% fewer tokens!
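As a rough sanity check on the comparison above, the sketch below measures the byte length of the two renderings. Byte count is only a proxy: real token counts depend on the tokenizer, and the `savings` helper and the two constants are illustrative names, not part of the crate. The JSON is written without whitespace so the comparison does not favor TOON unfairly, and the two-space row indentation in the TOON string is an assumption based on the default indent.

```rust
/// The same two-user table as compact JSON and as TOON (illustrative constants).
const JSON: &str =
    r#"{"users":[{"id":1,"name":"Alice","role":"admin"},{"id":2,"name":"Bob","role":"user"}]}"#;
const TOON: &str = "users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user";

/// Fraction of bytes saved by `compact` relative to `verbose`.
/// This is a byte-level proxy, not a token count.
fn savings(verbose: &str, compact: &str) -> f64 {
    1.0 - compact.len() as f64 / verbose.len() as f64
}
```

Calling `savings(JSON, TOON)` on these strings gives roughly 0.40 (52 bytes versus 86), which is in line with the range quoted above, though a tokenizer-based measurement may differ.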
- Token-efficient: typically 30-60% fewer tokens than JSON
- LLM-friendly guardrails: explicit lengths and fields enable validation
- Minimal syntax: removes redundant punctuation (braces, brackets, most quotes)
- Indentation-based structure: like YAML, uses whitespace instead of braces
- Tabular arrays: declare keys once, stream data as rows
- Round-trip support: encode and decode with full fidelity
- Type-safe: integrates seamlessly with serde_json::Value
- Customizable: delimiter (comma/tab/pipe), length markers, and indentation
Add to your Cargo.toml:
[dependencies]
rtoon = "0.1.3"
serde_json = "1.0"
use rtoon::encode_default;
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let data = json!({
        "user": {
            "id": 123,
            "name": "Ada",
            "tags": ["reading", "gaming"],
            "active": true
        }
    });
    let toon = encode_default(&data)?;
    println!("{}", toon);
    Ok(())
}
Output:
user:
  active: true
  id: 123
  name: Ada
  tags[2]: reading,gaming
Note: All examples in this section are taken from the examples/ directory. Run cargo run --example examples to see them in action.
Simple objects encode as key-value pairs:
use rtoon::encode_default;
use serde_json::json;
let data = json!({
    "id": 123,
    "name": "Ada",
    "active": true
});
println!("{}", encode_default(&data).unwrap());
Output:
active: true
id: 123
name: Ada
Nested objects use indentation:
let nested = json!({
    "user": { "id": 123, "name": "Ada" }
});
println!("{}", encode_default(&nested).unwrap());
Output:
user:
  id: 123
  name: Ada
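To make the indentation rule concrete, here is a minimal std-only sketch of how nested objects could be written out with two spaces per level. `Node`, `write_node`, and `render` are hypothetical names for illustration; the crate itself works on serde_json::Value and may be implemented quite differently.

```rust
/// A tiny stand-in for a JSON value (hypothetical; not the crate's type).
enum Node {
    Leaf(String),
    Object(Vec<(String, Node)>),
}

/// Appends one line per field, indenting two spaces per nesting level.
fn write_node(node: &Node, depth: usize, out: &mut Vec<String>) {
    if let Node::Object(fields) = node {
        for (key, child) in fields {
            let pad = "  ".repeat(depth);
            match child {
                // Primitive field: `key: value` on a single line.
                Node::Leaf(v) => out.push(format!("{}{}: {}", pad, key, v)),
                // Nested object: `key:` followed by its fields one level deeper.
                Node::Object(_) => {
                    out.push(format!("{}{}:", pad, key));
                    write_node(child, depth + 1, out);
                }
            }
        }
    }
}

/// Renders a root object as indentation-based text.
fn render(root: &Node) -> String {
    let mut lines = Vec::new();
    write_node(root, 0, &mut lines);
    lines.join("\n")
}
```

The sketch ignores arrays and quoting entirely; it only shows how depth maps to leading whitespace.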
Primitive arrays are inline with count and delimiter-separated values:
use rtoon::encode_default;
use serde_json::json;
let data = json!({ "tags": ["admin", "ops", "dev"] });
println!("{}", encode_default(&data).unwrap());
Output:
tags[3]: admin,ops,dev
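The inline form is simple enough to sketch by hand. The helper below is an illustrative assumption (`inline_array` is not a crate function) and skips quoting, so it is only correct for values that contain no delimiter or other special characters.

```rust
/// Formats a primitive array as `key[N]: v1,v2,...` (comma delimiter assumed).
/// An empty array yields just the header, e.g. `items[0]:`.
fn inline_array(key: &str, values: &[&str]) -> String {
    let header = format!("{}[{}]:", key, values.len());
    if values.is_empty() {
        header
    } else {
        format!("{} {}", header, values.join(","))
    }
}
```

The declared count in brackets is what later lets a decoder check that no values were dropped.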
When arrays contain uniform objects with the same keys and primitive-only values, they're encoded in tabular format for maximum token efficiency:
use rtoon::encode_default;
use serde_json::json;
let data = json!({
    "items": [
        { "sku": "A1", "qty": 2, "price": 9.99 },
        { "sku": "B2", "qty": 1, "price": 14.5 }
    ]
});
println!("{}", encode_default(&data).unwrap());
Output:
items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
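The uniformity condition can be sketched as follows. This is a simplified illustration, not the crate's logic: `Row`, `shared_fields`, and `encode_table` are hypothetical names, values are already strings, and rows must list the same keys in the same order (the real encoder may be more lenient about ordering).

```rust
/// One object, as ordered (key, value) pairs with primitive string values.
type Row<'a> = Vec<(&'a str, &'a str)>;

/// Returns the shared field list if every row has the same keys in the same order.
fn shared_fields<'a>(rows: &[Row<'a>]) -> Option<Vec<&'a str>> {
    let first: Vec<&str> = rows.first()?.iter().map(|(k, _)| *k).collect();
    let uniform = rows.iter().all(|row| {
        row.len() == first.len() && row.iter().zip(&first).all(|((k, _), f)| k == f)
    });
    if uniform { Some(first) } else { None }
}

/// Encodes uniform rows as a tabular block, or returns None if they are not uniform.
fn encode_table(key: &str, rows: &[Row<'_>]) -> Option<String> {
    let fields = shared_fields(rows)?;
    // Header declares the row count and the field names once.
    let mut out = format!("{}[{}]{{{}}}:", key, rows.len(), fields.join(","));
    for row in rows {
        let cells: Vec<&str> = row.iter().map(|(_, v)| *v).collect();
        out.push_str("\n  ");
        out.push_str(&cells.join(","));
    }
    Some(out)
}
```

When `encode_table` returns None, an encoder would fall back to the expanded list format described later in this section.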
Tabular arrays can be nested:
let nested = json!({
    "items": [
        {
            "users": [
                { "id": 1, "name": "Ada" },
                { "id": 2, "name": "Bob" }
            ],
            "status": "active"
        }
    ]
});
println!("{}", encode_default(&nested).unwrap());
Output:
items[1]:
  status: active
  users[2]{id,name}:
    1,Ada
    2,Bob
When arrays contain other primitive arrays, they're expanded as list items:
use rtoon::encode_default;
use serde_json::json;
let data = json!({
    "pairs": [[1, 2], [3, 4]]
});
println!("{}", encode_default(&data).unwrap());
Output:
pairs[2]:
  - [2]: 1,2
  - [2]: 3,4
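A minimal sketch of this expansion, assuming integer elements, the comma delimiter, and two-space indentation. `nested_arrays` is an illustrative helper, not a crate function.

```rust
/// Expands an array of integer arrays into list items, each with its own count.
/// Intended for non-empty inner arrays; quoting is not handled.
fn nested_arrays(key: &str, arrays: &[Vec<i64>]) -> String {
    // Outer header carries the number of inner arrays.
    let mut out = format!("{}[{}]:", key, arrays.len());
    for inner in arrays {
        let cells: Vec<String> = inner.iter().map(|n| n.to_string()).collect();
        // Each inner array becomes a `- [N]: ...` list item.
        out.push_str(&format!("\n  - [{}]: {}", inner.len(), cells.join(",")));
    }
    out
}
```

Note that every level carries its own length, so both the outer and inner counts can be validated independently.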
Non-uniform arrays (containing primitives, objects, or nested arrays) use the expanded list format:
use rtoon::encode_default;
use serde_json::json;
let mixed = json!({
    "items": [1, {"a": 1}, "text"]
});
println!("{}", encode_default(&mixed).unwrap());
Output:
items[3]:
  - 1
  - a: 1
  - text
Objects in list format place the first field on the hyphen line:
let list_objects = json!({
    "items": [
        {"id": 1, "name": "First"},
        {"id": 2, "name": "Second", "extra": true}
    ]
});
println!("{}", encode_default(&list_objects).unwrap());
Output:
items[2]:
  - id: 1
    name: First
  - id: 2
    name: Second
    extra: true
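The "first field on the hyphen line" rule can be illustrated with a small helper. `list_item` is a hypothetical name, and the sketch handles only flat objects with primitive values, rendered relative to the list item's own indentation level.

```rust
/// Renders one object as a list item: the first field follows `- `,
/// and the remaining fields are aligned under it with two spaces.
fn list_item(fields: &[(&str, &str)]) -> String {
    let mut lines = Vec::new();
    for (i, (key, value)) in fields.iter().enumerate() {
        let prefix = if i == 0 { "- " } else { "  " };
        lines.push(format!("{}{}: {}", prefix, key, value));
    }
    lines.join("\n")
}
```

Aligning the later fields with the first key (rather than with the hyphen) is what keeps the item readable as a single object.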
Use tab or pipe delimiters to avoid quoting and save more tokens:
use rtoon::{encode, EncodeOptions, Delimiter};
use serde_json::json;
let data = json!({
    "items": [
        {"sku": "A1", "name": "Widget", "qty": 2, "price": 9.99},
        {"sku": "B2", "name": "Gadget", "qty": 1, "price": 14.5}
    ]
});
// Tab delimiter (\t)
let tab = encode(&data, &EncodeOptions::new().with_delimiter(Delimiter::Tab)).unwrap();
println!("{}", tab);
// Pipe delimiter (|)
let pipe = encode(&data, &EncodeOptions::new().with_delimiter(Delimiter::Pipe)).unwrap();
println!("{}", pipe);
Prefix array lengths with a marker character for clarity:
use rtoon::{encode, EncodeOptions};
use serde_json::json;
let data = json!({
    "tags": ["reading", "gaming", "coding"],
    "items": [
        {"sku": "A1", "qty": 2, "price": 9.99},
        {"sku": "B2", "qty": 1, "price": 14.5}
    ]
});
let opts = EncodeOptions::new().with_length_marker('#');
println!("{}", encode(&data, &opts).unwrap());
Output:
items[#2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
tags[#3]: reading,gaming,coding
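In effect, the marker is just an optional character placed before the count inside the brackets. A minimal sketch of that header formatting, where `array_header` is an illustrative helper rather than anything exported by the crate:

```rust
/// Builds the `key[N]` or `key[#N]` part of an array header.
fn array_header(key: &str, len: usize, marker: Option<char>) -> String {
    match marker {
        // With a marker, e.g. Some('#'), the count is prefixed: tags[#3]
        Some(m) => format!("{}[{}{}]", key, m, len),
        // Without one, only the bare count appears: tags[3]
        None => format!("{}[{}]", key, len),
    }
}
```

The `Option<char>` parameter mirrors the shape of the `length_marker` field shown in the API reference.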
Empty arrays and objects are supported:
use rtoon::encode_default;
use serde_json::json;
// Empty array
let empty_items = json!({ "items": [] });
println!("{}", encode_default(&empty_items).unwrap());
// Root array
let root_array = json!(["x", "y"]);
println!("{}", encode_default(&root_array).unwrap());
Output:
items[0]:
[2]: x,y
Empty objects at root encode to empty output.
TOON supports full round-trip encoding and decoding:
use rtoon::{decode_default, encode_default};
use serde_json::json;
let original = json!({
    "product": "Widget",
    "price": 29.99,
    "stock": 100,
    "categories": ["tools", "hardware"]
});
let encoded = encode_default(&original).unwrap();
let decoded = decode_default(&encoded).unwrap();
assert_eq!(original, decoded);
println!("Round-trip successful!");
Strict mode enforces array counts, indentation, and delimiter consistency:
use rtoon::{decode, DecodeOptions};
// Malformed: header declares 2 rows, but only 1 is provided
let malformed = "items[2]{id,name}:\n  1,Ada";
let opts = DecodeOptions::new().with_strict(true);
match decode(malformed, &opts) {
    Ok(_) => println!("Unexpectedly decoded"),
    Err(err) => println!("Strict decode error: {}", err),
}
Strict mode (default) checks:
- Array counts must match declared lengths
- Indentation must be exact multiples of indent size
- Tabs cannot be used for indentation
- Invalid escape sequences cause errors
- Missing colons after keys cause errors
- Blank lines inside arrays/tabular rows cause errors
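The first check in the list can be sketched without the crate. The code below is a simplified illustration under stated assumptions: the input is a single tabular block, the header uses the plain `[N]` or `[#N]` form, and every line after the header counts as a row. `declared_len` and `check_row_count` are hypothetical names, not the crate's API.

```rust
/// Parses the declared length out of a header such as `items[2]{id,name}:`.
/// A leading `#` length marker is accepted; other header forms return None.
fn declared_len(header: &str) -> Option<usize> {
    let open = header.find('[')?;
    let close = open + header[open..].find(']')?;
    header[open + 1..close].trim_start_matches('#').parse().ok()
}

/// Compares the declared length with the number of lines after the header.
fn check_row_count(doc: &str) -> Result<usize, String> {
    let mut lines = doc.lines();
    let header = lines.next().ok_or("empty input")?;
    let declared = declared_len(header).ok_or("header has no [N] length")?;
    let actual = lines.count();
    if actual == declared {
        Ok(actual)
    } else {
        Err(format!("expected {} rows, found {}", declared, actual))
    }
}
```

Run against the malformed input from the example above, `check_row_count` returns an error because the header promises two rows and only one follows. This is the kind of guardrail that makes truncated LLM output detectable.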
pub fn encode(value: &serde_json::Value, options: &EncodeOptions) -> ToonResult<String>
pub fn encode_default(value: &serde_json::Value) -> ToonResult<String>
pub fn encode_object(value: &serde_json::Value, options: &EncodeOptions) -> ToonResult<String>
pub fn encode_array(value: &serde_json::Value, options: &EncodeOptions) -> ToonResult<String>
pub fn decode(input: &str, options: &DecodeOptions) -> ToonResult<serde_json::Value>
pub fn decode_default(input: &str) -> ToonResult<serde_json::Value>
pub fn decode_strict(input: &str) -> ToonResult<serde_json::Value>
pub fn decode_strict_with_options(input: &str, options: &DecodeOptions) -> ToonResult<serde_json::Value>
pub fn decode_no_coerce(input: &str) -> ToonResult<serde_json::Value>
pub fn decode_no_coerce_with_options(input: &str, options: &DecodeOptions) -> ToonResult<serde_json::Value>
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Indent {
    Spaces(usize), // Number of spaces per indent level
    Tabs,          // Use tabs for indentation
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EncodeOptions {
    pub delimiter: Delimiter,        // default: Delimiter::Comma
    pub length_marker: Option<char>, // default: None
    pub indent: Indent,              // default: Indent::Spaces(2)
}
impl EncodeOptions {
    pub fn new() -> Self
    pub fn with_delimiter(self, delimiter: Delimiter) -> Self
    pub fn with_length_marker(self, marker: char) -> Self
    pub fn with_indent(self, style: Indent) -> Self
    pub fn with_spaces(self, count: usize) -> Self
    pub fn with_tabs(self) -> Self
}
Example:
use rtoon::{encode, EncodeOptions, Delimiter};
let opts = EncodeOptions::new()
    .with_delimiter(Delimiter::Tab)
    .with_length_marker('#')
    .with_spaces(4);
// Or
let opts = EncodeOptions::new()
    .with_delimiter(Delimiter::Pipe)
    .with_length_marker('#')
    .with_tabs();
// Or
use rtoon::types::Indent;
let opts = EncodeOptions::new()
    .with_indent(Indent::Spaces(3));
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DecodeOptions {
    pub delimiter: Option<Delimiter>, // auto-detect if None
    pub strict: bool,                 // default: true
}
impl DecodeOptions {
    pub fn new() -> Self
    pub fn with_strict(self, strict: bool) -> Self
    pub fn with_delimiter(self, delimiter: Delimiter) -> Self
}
Example:
use rtoon::{decode, DecodeOptions, Delimiter};
let opts = DecodeOptions::new()
    .with_strict(true)
    // with_delimiter takes a Delimiter (see the signature above), not an Option
    .with_delimiter(Delimiter::Pipe);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Delimiter {
    Comma, // ","
    Tab,   // "\t" (U+0009)
    Pipe,  // "|"
}
All functions return ToonResult<T>, which is Result<T, ToonError>. The error type provides detailed information about parsing or encoding failures:
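use rtoon::{decode_default, ToonError};
// Any TOON text to decode; this sample is for illustration
let input = "tags[3]: a,b,c";
match decode_default(input) {
    Ok(value) => println!("Success: {}", value),
    Err(ToonError::ParseError(msg)) => eprintln!("Parse error: {}", msg),
    Err(ToonError::ValidationError(msg)) => eprintln!("Validation error: {}", msg),
    // Catch-all for the remaining error variants
    Err(other) => eprintln!("Other error: {}", other),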
}
- Objects: key: value with 2-space indentation for nesting
- Primitive arrays: inline with count, e.g., tags[3]: a,b,c
- Arrays of objects: tabular header, e.g., items[2]{id,name}:\n ...
- Mixed arrays: list format with - prefix
- Quoting: only when necessary (special chars, ambiguity, keywords like true, null)
- Root forms: objects (default), arrays, or primitives
For complete format specification, see SPEC.md.
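To give a feel for the "quote only when necessary" rule, here is a deliberately simplified approximation. The exact rule set is defined in SPEC.md; `needs_quotes` is a hypothetical helper and the conditions below are assumptions chosen to match the categories listed above (special characters, ambiguity with numbers, and keywords). It may quote more values than the specification requires.

```rust
/// Approximates whether a string value must be quoted for a given delimiter.
/// Simplified sketch: see SPEC.md for the authoritative rules.
fn needs_quotes(value: &str, delimiter: char) -> bool {
    // Would be read back as a boolean or null.
    let keyword = matches!(value, "true" | "false" | "null");
    // Would be read back as a number.
    let numeric = value.parse::<f64>().is_ok();
    // Leading or trailing whitespace would be lost.
    let padded = value != value.trim();
    // Characters that collide with the active delimiter or the syntax itself.
    let special = value
        .chars()
        .any(|c| c == delimiter || matches!(c, ':' | '"' | '\\' | '\n'));
    value.is_empty() || keyword || numeric || padded || special
}
```

This also shows why a tab or pipe delimiter can save tokens: a value such as `a,b` needs quotes under the comma delimiter but not under `|`, since only the active delimiter counts as special.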
This implementation follows the TOON Specification v1.2. The specification defines:
- Data model and normalization rules
- Encoding and decoding semantics
- Header syntax and delimiter scoping
- Quoting rules and escaping
- Strict mode validation requirements
Refer to SPEC.md for complete details.
Run the consolidated examples:
cargo run --example examples
This executes examples/main.rs, which invokes all parts under examples/parts/:
- arrays.rs: Primitive array encoding
- arrays_of_arrays.rs: Nested primitive arrays
- objects.rs: Simple and nested objects
- tabular.rs: Tabular array encoding
- delimiters.rs: Custom delimiter usage
- mixed_arrays.rs: Mixed/non-uniform arrays
- length_marker.rs: Length marker examples
- empty_and_root.rs: Edge cases and root forms
- round_trip.rs: Encoding and decoding verification
- decode_strict.rs: Strict mode validation
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
How to Contribute
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
MIT © 2025
- Original JavaScript/TypeScript implementation: @byjohann/toon
Built with ❤️ in Rust