Traceforge scaffolds production‑grade, multi‑agent orchestration into any repository and adds AI‑assisted SRS and phase planning. Built for agentic development with Claude Code, plus optional Codex/Gemini per role.
🚀 Value Proposition (Drop‑in for Claude Code)
- 🧠 Real multi‑level orchestration via a tiny runner + hooks (overcomes single‑level subagent limits)
- 🧪 Guardrailed engineer ⇄ QA loops with policy gates, retries, and traceable evidence
- 🧩 Plug‑in backends per role (Claude/Codex/Gemini) — vendor‑agnostic by design
- 🗂️ Deterministic outputs: `.claude/*`, `.pm/*`, and `docs/SRS.md` for end‑to‑end traceability

```shell
npx traceforge@latest init --stack golang \
  --backend-map orchestrator=claude,engineer=codex,qa=claude \
  --evidence-root ./evidence
```

This generates `.claude/*` agents, hooks, drivers, and a Python runner that spawns child sessions on the selected backends.
It also creates `docs/SRS.md` as a starter Software Requirements Specification (SRS) to support traceability.
Traceforge emulates multi‑level orchestration inside Claude Code using hooks and a tiny runner, then enforces a methodology with engineer/QA cycles, gates, and evidence. Per‑role backend mapping (Claude/Codex/Gemini) lets you pick the best tool for each role without vendor lock‑in.
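
To make the per-role mapping concrete, here is a minimal sketch of how a `--backend-map` value such as `orchestrator=claude,engineer=codex,qa=claude` could be parsed and validated. The function name `parse_backend_map` and the fallback to `claude` for unmapped roles are assumptions made for illustration; this is not Traceforge's actual parser.

```python
# Hypothetical helper for illustration; not Traceforge's real implementation.
KNOWN_ROLES = ("orchestrator", "engineer", "qa")
KNOWN_BACKENDS = ("claude", "codex", "gemini")


def parse_backend_map(spec: str, default: str = "claude") -> dict[str, str]:
    """Parse 'role=backend,role=backend' into a complete role -> backend dict.

    Roles missing from the spec fall back to `default` (an assumed behavior).
    """
    mapping = {role: default for role in KNOWN_ROLES}
    for pair in filter(None, (p.strip() for p in spec.split(","))):
        role, sep, backend = pair.partition("=")
        role, backend = role.strip(), backend.strip()
        if not sep or role not in KNOWN_ROLES:
            raise ValueError(f"unknown or malformed role entry: {pair!r}")
        if backend not in KNOWN_BACKENDS:
            raise ValueError(f"unknown backend for {role}: {backend!r}")
        mapping[role] = backend
    return mapping


print(parse_backend_map("orchestrator=claude,engineer=codex,qa=claude"))
# {'orchestrator': 'claude', 'engineer': 'codex', 'qa': 'claude'}
```

Rejecting unknown roles and backends early keeps a typo in the flag from silently routing a role to the wrong tool.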
The diagrams below contrast native Claude Code's single‑level delegation with Traceforge’s drop‑in orchestration.

```mermaid
sequenceDiagram
  autonumber
  participant U as User
  participant B as Base Model
  participant S as Subagent
  U->>B: "Do project X"
  B->>S: "Implement + QA please"
  Note over B,S: Subagents cannot delegate further
  S-->>B: Implementation attempt
  B->>S: "Now run QA"
  S-->>B: QA attempt (mixed consistency)
  B-->>U: Result (fragile, manual chaining)
```

```mermaid
sequenceDiagram
  autonumber
  participant U as User
  participant B as Base Model
  participant O as Orchestrator (Traceforge)
  participant E as Engineer Agent
  participant Q as QA Agent
  U->>B: "Execute Phase GO-1"
  B->>O: "Run phase plan"
  O->>E: STORY 1.1 context + policies
  E-->>O: status, commit, evidence paths
  O->>Q: verify tests, coverage, scans
  Q-->>O: verdict, findings
  alt RED
    O->>E: remediation pack (diff, failing tests)
    E-->>O: fix commit + notes
    O->>Q: re-verify
  end
  O-->>B: Phase summary (GREEN) + evidence
```

ASCII fallback
Native: User -> Base -> Subagent (no further delegation). QA often chained manually.
Traceforge: User -> Base -> Orchestrator -> Engineer <-> QA loops -> Base. Deterministic gates + evidence.
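
The engineer ⇄ QA loop in the diagrams above can be sketched as a small retry loop. This is an illustration under stated assumptions rather than the runner's real code: `Verdict`, `StoryResult`, `run_story`, and the default of two retries are hypothetical names and values, and the engineer and QA roles are plain callables standing in for child sessions.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Verdict:
    green: bool
    findings: list[str] = field(default_factory=list)


@dataclass
class StoryResult:
    status: str              # "GREEN" or "STOP"
    attempts: int
    history: list[Verdict]


def run_story(
    engineer: Callable[[list[str]], str],  # takes QA findings, returns a commit id
    qa: Callable[[str], Verdict],          # takes a commit id, returns a verdict
    max_retries: int = 2,                  # assumed policy value
) -> StoryResult:
    """Alternate engineer and QA until QA is GREEN or retries run out."""
    history: list[Verdict] = []
    findings: list[str] = []
    for attempt in range(1, max_retries + 2):  # first try plus retries
        commit = engineer(findings)            # findings act as the remediation pack
        verdict = qa(commit)
        history.append(verdict)
        if verdict.green:
            return StoryResult("GREEN", attempt, history)
        findings = verdict.findings
    return StoryResult("STOP", max_retries + 1, history)


# Demo with stub roles: QA rejects the first commit and accepts the second.
_commits: list[str] = []


def _stub_engineer(findings: list[str]) -> str:
    _commits.append(f"commit-{len(_commits) + 1}")
    return _commits[-1]


def _stub_qa(commit: str) -> Verdict:
    if commit == "commit-1":
        return Verdict(False, ["failing test"])
    return Verdict(True)


result = run_story(_stub_engineer, _stub_qa)
print(result.status, result.attempts)  # GREEN 2
```

Keeping every verdict in `history` is what makes the loop auditable: a STOP result still carries the findings from each failed attempt.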
- `init [target]`: scaffold the kit into a repository (`.claude/`, `.pm/`, `docs/SRS.md`)
- `add-stack [target] <name>`: add a stack using skeleton templates
- `doctor [target]`: validate orchestration files/drivers/hooks/agents
- `upgrade [target]`: reapply common templates and, optionally, a `--stack`
- `gen-srs [target]`: generate or refine an SRS (interactive and/or AI-assisted)
- `gen-phase [target] --stack <name>`: generate a phase file (optionally AI-assisted)

init options:
- `--stack <golang|dotnet>`
- `--backend-map orchestrator=claude,engineer=codex,qa=claude`
- `--evidence-root <path>` (default: `./evidence`)
- `--dry-run`: print the plan only
- `--force`: overwrite existing files

upgrade options:
- `--stack <name>`: also reapply the named stack templates
- `--backend-map`, `--evidence-root`, `--dry-run`, `--force`

- Node.js >= 18
- Optional: CLIs for configured backends (claude, codex, gemini) and their API keys
Hooks run with your environment privileges. Review changes in PRs and scope environment variables carefully.
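
One way to scope environment variables is to pass each child process an explicit allowlist instead of the full parent environment. The sketch below assumes hypothetical names (`scoped_env`, `BASE_ALLOWLIST`, `BACKEND_KEYS`), and the key names for the Claude and Gemini backends are guesses to check against each CLI's documentation; only `OPENAI_API_KEY` appears elsewhere in this README.

```python
import os

# Illustrative allowlists; adjust to what each backend CLI actually needs.
BASE_ALLOWLIST = {"PATH", "HOME", "LANG"}
BACKEND_KEYS = {
    "claude": {"ANTHROPIC_API_KEY"},  # assumed variable name
    "codex": {"OPENAI_API_KEY"},
    "gemini": {"GEMINI_API_KEY"},     # assumed variable name
}


def scoped_env(backend: str, environ=None) -> dict[str, str]:
    """Return only the variables a child session for `backend` may see."""
    source = os.environ if environ is None else environ
    allowed = BASE_ALLOWLIST | BACKEND_KEYS.get(backend, set())
    return {key: value for key, value in source.items() if key in allowed}


# Demo with a fake parent environment.
parent = {
    "PATH": "/usr/bin",
    "OPENAI_API_KEY": "sk-test",
    "AWS_SECRET_ACCESS_KEY": "do-not-leak",
}
print(scoped_env("codex", parent))
# {'PATH': '/usr/bin', 'OPENAI_API_KEY': 'sk-test'}
```

The result can be passed as `env=` to `subprocess.run`, so a child session for one backend never sees another backend's key or unrelated secrets.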
MIT
- Build: `npm run build`
- Try locally: `node dist/index.js init --stack golang --dry-run`
- Add a stack: `node dist/index.js add-stack . dotnet`
- Doctor check: `node dist/index.js doctor .`
- Upgrade: `node dist/index.js upgrade . --stack golang --dry-run`
- Generate SRS: `node dist/index.js gen-srs . --project traceforge --interactive`

  SRS generation options:
  - `--project <name>`: required unless using `--interactive`
  - `--interactive`: prompt for fields
  - `--ai`: use an OpenAI-compatible endpoint (with `--model`, `--base-url`, and `--api-key` or env `OPENAI_API_KEY`)
- Generate Phase: `node dist/index.js gen-phase . --stack golang --ai`

  Phase generation options:
  - `--stack <name>`: stack name (e.g., golang)
  - `--id <PHASE-ID>`: override phase ID
  - `--title <string>`: title of the phase
  - `--ai`: request AI to propose story tasks
- Publish: `npm publish --access public`

- npmjs: publish unscoped `traceforge` (see `.github/workflows/release.yml`).
- GitHub Packages: requires a scope. CI can publish as `@karolswdev/traceforge`; see `PUBLISHING.md`.

- 🧭 Orchestrator reads the phase file and policies, builds a DAG of stories.
- 🧑‍💻 Engineer implements each story atomically (code → tests → traceability → commit).
- 🔍 QA verifies tests, coverage, linters, vuln/secret scans, and traceability; can minimally repair if policy allows.
- ♻️ Remediation loop runs (engineer ⇄ QA) within retry limits until GREEN or policy STOP.
- 🚦 Phase Gate: run regression, PR/merge, header flip; orchestrator writes log and summary.
- 🧾 Evidence: artifacts under `evidence/`, with orchestrator `log.md` and `summary.json`.

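
The first step in the list above, building a DAG of stories, can be illustrated with Python's standard `graphlib`. The story IDs, their dependencies, and the helper name `execution_batches` are made up for this sketch; the real phase file format is not shown here.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical phase: story id -> ids of the stories it depends on.
stories = {
    "1.1": set(),
    "1.2": {"1.1"},
    "1.3": {"1.1"},
    "1.4": {"1.2", "1.3"},
}


def execution_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group stories into batches whose dependencies are already complete."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the stories depend on each other in a loop
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a deterministic order
        batches.append(ready)
        sorter.done(*ready)
    return batches


print(execution_batches(stories))
# [['1.1'], ['1.2', '1.3'], ['1.4']]
```

Sorting each batch is a deliberate choice: it keeps the execution order stable across runs, which suits the deterministic, evidence-driven workflow described here.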
Under the hood, hooks block nested Task calls and redirect work to the runner/MCP, which spawns child sessions on the configured backends and returns structured results. You get deterministic behavior and clean audit trails.
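
A hook that blocks nested Task calls might look roughly like the sketch below. It assumes the hook receives a JSON event on stdin with a `tool_name` field and that exit status 2 blocks the tool call, which matches Claude Code's hook documentation as far as we know but should be verified against the current docs. `TRACEFORGE_DEPTH` is a hypothetical variable that a runner could set on child sessions; it is not a documented Traceforge setting.

```python
import io
import json
import os
import sys

ALLOW = 0
BLOCK = 2  # exit status assumed to mean "block this tool call"


def decide(event: dict, depth: int) -> tuple[int, str]:
    """Return (exit_status, message) for one hook event."""
    if event.get("tool_name") == "Task" and depth >= 1:
        return BLOCK, (
            f"Nested Task calls are blocked at depth {depth}; "
            "hand this work to the runner instead."
        )
    return ALLOW, ""


def main(stdin=sys.stdin, environ=os.environ) -> int:
    """Hook entry point: read the event, print any message to stderr, return the status."""
    event = json.load(stdin)
    # Hypothetical variable; a runner would set it when spawning child sessions.
    depth = int(environ.get("TRACEFORGE_DEPTH", "0"))
    status, message = decide(event, depth)
    if message:
        print(message, file=sys.stderr)
    return status


# Demo with in-memory events instead of a real hook invocation.
print(decide({"tool_name": "Task"}, depth=0)[0])  # 0
print(decide({"tool_name": "Task"}, depth=1)[0])  # 2
print(main(io.StringIO(json.dumps({"tool_name": "Bash"})), {"TRACEFORGE_DEPTH": "1"}))  # 0
```

In a real hook script the last line would be `sys.exit(main())`, so the exit status reaches the caller.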
- Orchestrated sub-orchestration: overcome single-level delegation with hooks and a runner.
- Evidence-driven: deterministic outputs and evidence paths for audits and CI.
- Multi-LLM by role: map orchestrator/engineer/qa to best-fit backends.
- Safe and idempotent: clear plan, dry-run, and non-destructive by default.
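
The "clear plan, dry-run, and non-destructive by default" behavior can be sketched as a plan step that never touches the disk, followed by an apply step that skips existing files unless forced. The function names `plan_writes` and `apply_plan`, and the template paths in the demo, are assumptions for illustration and do not reflect Traceforge's internals.

```python
import tempfile
from pathlib import Path


def plan_writes(target, templates: dict[str, str], force: bool = False):
    """Return (action, relative_path) pairs without writing anything."""
    plan = []
    for rel in sorted(templates):
        if not (Path(target) / rel).exists():
            action = "create"
        elif force:
            action = "overwrite"
        else:
            action = "skip"  # non-destructive default
        plan.append((action, rel))
    return plan


def apply_plan(target, templates: dict[str, str], plan, dry_run: bool = False):
    """Write the planned files; with dry_run=True nothing is written."""
    written = []
    for action, rel in plan:
        if dry_run or action == "skip":
            continue
        dest = Path(target) / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(templates[rel], encoding="utf-8")
        written.append(rel)
    return written


# Demo in a throwaway directory; the template contents are placeholders.
templates = {"docs/SRS.md": "# SRS\n", ".pm/phase.md": "# Phase\n"}
with tempfile.TemporaryDirectory() as tmp:
    first = plan_writes(tmp, templates)
    print(first)                     # [('create', '.pm/phase.md'), ('create', 'docs/SRS.md')]
    apply_plan(tmp, templates, first)
    print(plan_writes(tmp, templates))  # [('skip', '.pm/phase.md'), ('skip', 'docs/SRS.md')]
```

Running the plan a second time yields only `skip` actions, which is the idempotency property: repeated runs leave local edits alone unless `force` is set.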

- `.claude/hooks/*`: guardrails and state injection
- `.claude/mcp/runner.py`: multi-LLM child-session runner (swappable backends)
- `.claude/agents/*`: orchestrator + stack engineer/qa prompts
- `.pm/*`: phase/story plan skeletons
- `docs/SRS.md`: live SRS (interactive and AI-assisted generation supported)

- Doctor: deeper environment and CLI verification
- Upgrade: 3-way merge of local changes with template updates
- Stack plugins: discover `traceforge-stack-*` via npm