Keep long conversations fast without compromising user experience.
Fastpaca provides full message history + context budgeting with compaction for LLM apps.
- Store messages in fastpaca and optionally archive to postgres.
- Set token budgets. Conversations stay within bounds.
- You control the latency/accuracy/cost tradeoff.
```
                      ╔═ fastpaca ═══════════════════════╗
╔══════════╗          ║                                  ║░     ╔═optional═╗
║          ║░         ║  ┏━━━━━━━━━━━┓    ┏━━━━━━━━━━━┓  ║░     ║          ║░
║  client  ║░──API───▶║  ┃  Message  ┃───▶┃  Context  ┃  ║░────▶║ postgres ║░
║          ║░         ║  ┃  History  ┃    ┃  Policy   ┃  ║░     ║          ║░
╚══════════╝░         ║  ┗━━━━━━━━━━━┛    ┗━━━━━━━━━━━┛  ║░     ╚══════════╝░
 ░░░░░░░░░░░░         ║                                  ║░      ░░░░░░░░░░░░
                      ╚══════════════════════════════════╝░
                       ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
```
Fastpaca enforces a per-conversation token budget before requests hit your LLM, without compromising user experience. The tension it resolves:
- Users want to see full conversation history when they talk to LLMs
- More messages = more tokens = higher cost
- Larger context = slower responses
- Eventually you hit the LLM's context limit
Fastpaca enforces per-conversation token budgets with deterministic compaction:
- Keep full history for users
- Compact context for the model
- Choose your policy (`last_n`, `skip_parts`, `manual`); see the sketch after this list
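For example, a minimal sketch of picking a policy when creating a context. The `budget` option matches the quickstart below; the `policy` option name and shape are assumptions, so check the SDK for the exact API:

```ts
import { createClient } from '@fastpaca/fastpaca';

const fastpaca = createClient({ baseUrl: 'http://localhost:4000/v1' });

// Assumption: the policy is selected per context; the option name may differ.
const ctx = await fastpaca.context('support-chat', {
  budget: 8_000,     // token budget enforced before the LLM call
  policy: 'last_n',  // or 'skip_parts' / 'manual'
});
```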
Example: `last_n` policy (keep recent messages)
Before (10 messages):

```ts
[
  { role: 'user', text: "What's the weather?" },
  { role: 'assistant', text: '...' },
  { role: 'user', text: 'Tell me about Paris' },
  { role: 'assistant', text: '...' },
  // ... 6 more exchanges
  { role: 'user', text: 'Book a flight to Paris' }
]
```

After `last_n` policy with a limited budget (3 messages):
```ts
[
  { role: 'user', text: 'Tell me about Paris' },
  { role: 'assistant', text: '...' },
  { role: 'user', text: 'Book a flight to Paris' }
]
```

Full history stays in storage. Only compact context goes to the model.
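As a sketch of the two read paths (`ctx.context()` matches the quickstart below; the full-history accessor is hypothetical and only stands in for whatever history API you expose to your UI):

```ts
// Compacted view: what the model sees, trimmed to the budget.
const { messages } = await ctx.context(); // e.g. the 3 messages above

// Full view: everything the user sees in the chat UI.
// Hypothetical accessor; fastpaca keeps the full history in storage.
const history = await ctx.history();
```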
Example: `skip_parts` policy (drop heavy content)
Before (assistant message with reasoning + tool results):

```ts
{
  role: 'assistant',
  parts: [
    { type: 'reasoning', text: '<3000 tokens of chain-of-thought>' },
    { type: 'tool_use', name: 'search', input: { /* ... */ } },
    { type: 'tool_result', content: '<5000 tokens of search results>' },
    { type: 'text', text: "Based on the search, here's the answer..." }
  ]
}
```

After `skip_parts` policy (keeps message structure, drops bulk):
```ts
{
  role: 'assistant',
  parts: [
    { type: 'text', text: "Based on the search, here's the answer..." }
  ]
}
```

Reasoning traces, tool results, and images are dropped; the final response is kept. In this example that removes roughly 8,000 tokens from a single message while preserving the conversation flow.
Tip: See the example for a closer look at how this works in a real chat app!
Start the container. Postgres is optional: without it, data persists in memory, with a bounded tail kept for message history.
```bash
docker run -d \
  -p 4000:4000 \
  -v fastpaca_data:/data \
  ghcr.io/fastpaca/context-store:latest
```

Use our TypeScript SDK:
```ts
import { createClient } from '@fastpaca/fastpaca';

const fastpaca = createClient({ baseUrl: 'http://localhost:4000/v1' });
const ctx = await fastpaca.context('demo', { budget: 1_000_000 });
await ctx.append({ role: 'user', parts: [{ type: 'text', text: 'Hi' }] });

// For your LLM
const { messages } = await ctx.context();
```
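From here, hand the compacted messages to your model client. A minimal sketch with the OpenAI SDK (the part-to-content mapping assumes text parts only; adapt it for tools, images, etc.):

```ts
import OpenAI from 'openai';

const openai = new OpenAI();

// Flatten fastpaca's parts-based messages into plain chat messages.
const chat = messages.map((m) => ({
  role: m.role,
  content: m.parts.map((p) => ('text' in p ? p.text : '')).join(''),
}));

const completion = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: chat,
});
```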
Good fit:
- Multi-turn conversations that grow unbounded
- Agent apps with heavy tool use and reasoning traces
- Apps that need full history retention + compact model context
- Scenarios where you want deterministic, policy-based compaction
Not a fit (yet):
- Single-turn Q&A (no conversation state to manage)
- Apps that need semantic compaction (we're deterministic, not embedding-based)
We kept rebuilding the same Redis + Postgres + pub/sub stack to manage conversation state and compaction. It was messy, hard to scale, and expensive to tune. Fastpaca turns that pattern into a single service you can drop in.
```bash
# Clone and set up
git clone https://github.com/fastpaca/context-store
cd context-store
mix setup        # install deps, create DB, run migrations

# Start server on http://localhost:4000
mix phx.server

# Run tests / precommit checks
mix test
mix precommit    # format, compile (warnings-as-errors), test
```

Storage is split into a hot and a cold tier:

- Hot (Raft): LLM context window + bounded message tail. Raft snapshots include these plus watermarks (`last_seq`, `archived_seq`).
- Cold (optional): Archiver persists full history to Postgres and acknowledges a high-water mark so Raft can trim older tail segments (a simplified sketch of this handoff follows).
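To make the watermark handoff concrete, a simplified model of the trim decision. This only illustrates the description above; it is not the actual implementation, which lives in Elixir:

```ts
// A tail segment can be trimmed once it is both archived
// (<= archived_seq) and outside the bounded tail window.
function trimmableThrough(
  lastSeq: number,     // highest sequence written
  archivedSeq: number, // high-water mark acked by the archiver
  tailSize: number     // bounded tail kept hot for message history
): number {
  return Math.min(archivedSeq, lastSeq - tailSize);
}
```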
We welcome pull requests. Before opening one:
- Run `mix precommit` (format, compile, test)
- Add tests for new behaviour
- Update docs if you change runtime behaviour or message flow
If you use a coding agent, make sure it follows AGENTS.md/CLAUDE.md and review all output carefully.