Collinear TraitMix Simulations

arXiv | Blog: TraitBasis | Benchmark: TauTrait

Collinear TraitMix is a product for sandbox testing of AI agents with simulations. With TraitMix, you can generate realistic, high-fidelity, multi-turn user interactions conditioned on user intents and user personas (traits and attributes). TraitMix can be used to evaluate agents, to create training data for long-horizon RL training, or to prototype product flows.

TraitMix is powered by TraitBasis, our research on persona steering. TraitBasis is a method for highly controllable generation that avoids two limitations of prompt-based approaches: the user persona fading as the number of turns grows, and the user forgetting the intent in a long context.

To run TraitMix you need a Collinear API key, which you can get by signing up here. Demo notebooks are in the examples/ directory.

Features

  • Generate multi-turn customer interactions directly from the example notebooks
  • Tune personas, intents, and sampling parameters through the JSON configs in examples/**/configs/
  • Evaluate agents against malicious personas, or benchmark them on standard benchmarks like BFCL and $\tau$-Bench
  • Call Together or any OpenAI-compatible endpoint by supplying your API keys
  • (Together example) Upload simulation outputs to Together and poll evaluation jobs entirely within the notebook flow
  • (Together example) Review persona summaries and scored transcripts inline to decide next changes
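The config-driven tuning mentioned above could look roughly like the sketch below. It assumes only that the configs are plain JSON files under examples/**/configs/. The helper names (`list_configs`, `load_config`) and the `temperature` key in the usage note are hypothetical and are not part of the TraitMix API; check the actual config files for the real field names.

```python
import json
from pathlib import Path


def list_configs(root="examples"):
    """Return every JSON file found under <root>/**/configs/, sorted by path."""
    return sorted(Path(root).glob("**/configs/*.json"))


def load_config(path, overrides=None):
    """Load one JSON config and apply shallow, top-level overrides.

    The overrides dict is an illustration only: its keys must match
    whatever the real config file defines.
    """
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    config.update(overrides or {})
    return config
```

In a notebook, this would be called as `load_config(path, {"temperature": 0.7})`, where `temperature` stands in for whichever sampling parameter the config actually exposes. The original file on disk is left unchanged, which keeps experiments reproducible.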

Example folders

  • examples/quick_start/: Minimal notebook to generate TraitMix rollouts.
  • examples/simulations_bfcl/: Agent rollouts with TraitMix applied to the Berkeley Function Calling Leaderboard (BFCL).
  • examples/simulations_rlvr/: Rollouts tailored for long-horizon RL training workflows, using examples similar to $\tau$-Bench and $\tau$-Trait.
  • examples/simulations_together_evals/: End-to-end testing for any model hosted on Together, using Together Evals.
  • examples/simulations_adversarial_persona/: Simulate realistic red-teaming of agents with malicious user personas.

Getting Started

Install & Configure

Install uv if you don't already have it, then prepare the environment:

curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv --python 3.12 --seed
source .venv/bin/activate
uv pip install --upgrade pip
uv pip install jupyterlab ipykernel together collinear nest_asyncio jinja2 --no-cache
uv pip install "openai>=1.13.3" "mistralai>=0.4.0" "anthropic>=0.26.1" "google-generativeai>=0.5.4" "tenacity>=8.3.0" "termcolor>=2.4.0" "numpy>=1.26.4" "litellm==1.41.0"
uv pip install tau-trait

If you plan to call Together endpoints, export your API key in the active shell:

export TOGETHER_API_KEY="YOUR_TOGETHER_API_KEY"
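Because Jupyter only sees variables exported before it starts, a quick check at the top of a notebook can catch a missing key early. The following is a minimal sketch using only the Python standard library. The helper name `require_api_key` is hypothetical and is not part of the repository or of the Together SDK.

```python
import os


def require_api_key(name="TOGETHER_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Rejects empty values and the literal "YOUR_..." placeholder
    used in the export command.
    """
    value = os.environ.get(name, "").strip()
    if not value or value.startswith("YOUR_"):
        raise RuntimeError(
            f"{name} is not set; export it in the shell before starting Jupyter."
        )
    return value
```

Calling `require_api_key()` in the first notebook cell fails fast with a readable message instead of surfacing later as an authentication error.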

Run the Notebooks

Start Jupyter from the project root:

uv run --with jupyter jupyter lab

Select the trait-basis kernel in each notebook and run the grouped import cell before generating simulations.
