
Conversation

@amikofalvy
Collaborator

Summary

Refactored AGENTS.md following HumanLayer's best practices for writing effective AI agent guidance files:

  • Reduced AGENTS.md from 442 lines to 96 lines (was 7x longer than recommended)
  • Created docs/agent-guides/ directory for progressive disclosure of detailed topics
  • Replaced embedded code examples with file references to prevent outdated snippets
  • Kept only universally applicable instructions in the main file

Progressive Disclosure Files Created

| File | Content |
| --- | --- |
| docs/agent-guides/testing-patterns.md | Vitest patterns, test structure, coverage requirements |
| docs/agent-guides/ui-development.md | Next.js/React patterns for agents-manage-ui |
| docs/agent-guides/documentation.md | MDX documentation guidelines for agents-docs |
| docs/agent-guides/database-migrations.md | Drizzle migration workflow and critical rules |
| docs/agent-guides/git-worktrees.md | Parallel development with git worktrees |
| docs/agent-guides/debugging.md | Jaeger tracing commands and debugging workflows |

Key Improvements

  1. Less is more: frontier models can reliably follow only ~150-200 instructions; the previous file likely exceeded this limit
  2. Universal applicability: Detailed task-specific content moved to separate files
  3. Pointers over copies: Code examples replaced with references to actual files
  4. Progressive disclosure: Agents only see detailed guides when they navigate to them

Test plan

  • Verify AI agents can still find essential commands in AGENTS.md
  • Verify links to docs/agent-guides/ files work correctly
  • Test that agents can follow progressive disclosure to find detailed information

@changeset-bot

changeset-bot bot commented Dec 2, 2025

⚠️ No Changeset found

Latest commit: 8d3ca96

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@vercel

vercel bot commented Dec 2, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| agents-docs | Ready | Preview | Comment | Dec 2, 2025 9:26pm |
| agents-manage-api | Ready | Preview | Comment | Dec 2, 2025 9:26pm |
| agents-manage-ui | Ready | Preview | Comment | Dec 2, 2025 9:26pm |
| agents-run-api | Ready | Preview | Comment | Dec 2, 2025 9:26pm |

@claude

claude bot commented Dec 2, 2025

Claude finished @amikofalvy's task


Staff Engineer Review Complete

I've conducted a comprehensive architectural review of this PR covering both the AGENTS.md refactoring and the context tracking evaluation system. Here's my analysis:

🎯 Overall Assessment: Strong Refactoring with Well-Architected Context Tracking

This PR successfully addresses two distinct objectives: progressive disclosure for agent guidance and sophisticated context analysis capabilities. Both are well-executed with thoughtful architectural decisions.


📋 AGENTS.md Refactoring Analysis

Excellent Progressive Disclosure Implementation

The refactoring from 442 to 96 lines following HumanLayer's best practices is architecturally sound:

  • Cognitive Load Reduction: Keeps the main file within the ~150-200 instruction limit that frontier models can reliably follow
  • Universal Applicability: Core AGENTS.md now contains only universally relevant commands and patterns
  • Smart Information Architecture: Task-specific content properly separated into docs/agent-guides/

Well-Structured Guide Files

The progressive disclosure files demonstrate good documentation patterns:

  • Consistent Structure: All guides follow similar format (Stack → Commands → Guidelines)
  • Practical Focus: Each guide provides immediately actionable information
  • Proper Scope: Each guide covers a cohesive domain (testing, UI, docs, etc.)

⚠️ Minor Naming Inconsistency

Issue: File is named Agents.md (PascalCase) but referenced as AGENTS.md in documentation.

Recommendation:

# Standardize on AGENTS.md convention
git mv Agents.md AGENTS.md

🔍 Context Tracking System Analysis

This is a sophisticated addition that brings significant observability value. The implementation demonstrates strong system design principles:

Excellent Token Estimation Strategy

agents-run-api/src/utils/token-estimator.ts:

  • Model Agnostic: 4-chars-per-token approximation works across OpenAI/Anthropic/Gemini
  • Performance Optimized: Simple string length calculation vs expensive tokenizer calls
  • Appropriate Precision: Perfect for relative comparisons between context components
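
For reference, the entire approach fits in a few lines. A hypothetical sketch of a character-based estimator (names assumed, not necessarily the actual contents of token-estimator.ts):

```typescript
// Hypothetical sketch: estimate token count from character length.
// ~4 characters per token is a common rough heuristic that holds
// reasonably well across OpenAI/Anthropic/Gemini tokenizers.
const CHARS_PER_TOKEN = 4;

export function estimateTokens(text: string): number {
  if (!text) return 0;
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}
```

Calling something like this once per context component during prompt assembly is cheap enough that no caching is strictly required.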

Strong Type Safety & Extensibility

ContextBreakdown Interface Design:

export interface ContextBreakdown {
  systemPromptTemplate: number;
  coreInstructions: number;
  agentPrompt: number;
  // ... 11 total components with clear semantic naming
  total: number;
}

  • Comprehensive Coverage: Captures all major context components
  • Clear Semantics: Field names clearly indicate their purpose
  • Future-Proof: Easy to extend with new components

Sophisticated UI Implementation

agents-manage-ui/src/components/traces/context-breakdown.tsx:

  • Progressive Visual Design: Stacked bar + detailed breakdown provides multiple levels of insight
  • Accessibility: Proper ARIA labeling, tooltips, and color coding
  • Performance: useMemo for expensive calculations
  • User Experience: Filters out segments <0.5% to avoid visual clutter
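
The <0.5% filtering is worth calling out because it is the kind of detail that keeps stacked bars legible. A hypothetical sketch of the idea (names assumed, not the actual component code):

```typescript
import { useMemo } from 'react';

interface Segment {
  label: string;
  tokens: number;
}

// Hypothetical hook: memoize the visible segments, dropping anything
// under 0.5% of the total so tiny slivers don't clutter the bar.
export function useVisibleSegments(segments: Segment[]): Segment[] {
  return useMemo(() => {
    const total = segments.reduce((sum, s) => sum + s.tokens, 0);
    if (total === 0) return [];
    return segments.filter((s) => s.tokens / total >= 0.005);
  }, [segments]);
}
```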

Clean Architectural Integration

System Integration Points:

  • SystemPromptBuilder: Returns AssembleResult with both prompt and breakdown
  • Phase1Config/Phase2Config: Implements token tracking throughout prompt assembly
  • UI Components: Seamlessly integrates breakdown display in trace viewer
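
The contract at the SystemPromptBuilder boundary is easy to picture. A hypothetical sketch (field names assumed, not verified against the actual AssembleResult type):

```typescript
// Hypothetical shape: the assembled prompt is returned together with its
// per-component token counts, so prompt assembly and the trace viewer
// consume the same data in a single pass.
interface AssembleResult {
  prompt: string;
  breakdown: Record<string, number>; // per-component counts plus a total
}
```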

🔧 Architectural Considerations & Recommendations

1. Token Estimation Accuracy Trade-offs

Current Approach: Character-based approximation (~4 chars/token)
Considerations:

  • ✅ Fast, model-agnostic, dependency-free
  • ⚠️ May be 15-25% off for some content types (code vs prose)

Recommendation: Current approach is excellent for the use case. Consider adding a configuration option for precision-sensitive scenarios:

interface TokenEstimatorConfig {
  method: 'character-based' | 'model-specific';
  model?: string; // For future model-specific tokenizers
}

2. Context Component Granularity

Current Breakdown: 11 components tracked individually
Consideration: Balance between granularity and usability

Recommendation: Current granularity is appropriate. Consider grouping for high-level views:

interface ContextSummary {
  instructions: number; // systemPromptTemplate + coreInstructions + agentPrompt
  tools: number; // toolsSection + artifactComponents
  history: number; // conversationHistory
  // ... etc
}

3. Performance & Memory Considerations

Current Implementation: Creates breakdown objects for every prompt assembly

Considerations:

  • Memory usage for high-volume scenarios
  • CPU overhead of token estimation

Recommendations:

  • ✅ Current implementation is efficient (simple calculations)
  • Consider adding feature flag for production environments where breakdown isn't needed
  • Potential optimization: Cache breakdown results for identical prompt components
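
If the caching optimization ever becomes worthwhile, a simple memoization keyed on the component text would be sufficient. A hypothetical sketch:

```typescript
// Hypothetical sketch: cache estimates for repeated components such as a
// static system prompt template. In a long-lived process you would bound
// this cache (e.g. with an LRU policy) rather than let it grow forever.
const tokenCache = new Map<string, number>();

function estimateTokensCached(text: string): number {
  const cached = tokenCache.get(text);
  if (cached !== undefined) return cached;
  const estimate = Math.ceil(text.length / 4);
  tokenCache.set(text, estimate);
  return estimate;
}
```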

🚨 Critical Items (None Found)

No critical architectural flaws or security issues identified. Both the documentation refactoring and context tracking implementation follow solid engineering principles.


🎯 Customer Impact Assessment

Positive Impacts:

  1. Developer Experience: Cleaner, more discoverable agent guidance
  2. Debugging Capabilities: Rich context analysis for troubleshooting token usage
  3. Performance Insights: Ability to optimize prompts based on component breakdown
  4. Maintainability: Progressive disclosure makes guidance easier to maintain

Risk Assessment: LOW

  • No breaking changes to APIs or data contracts
  • Additive features with graceful degradation
  • Backward compatible changes only

📈 Future Extensibility

The architecture supports several natural extensions:

  1. Token Budget Warnings: Alert when approaching model context limits
  2. Component Optimization: Suggest which components to trim when over budget
  3. Historical Analysis: Track context evolution over time
  4. A/B Testing: Compare token efficiency across prompt variations
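
The budget-warning idea in particular would fall out of the existing breakdown total almost for free. A hypothetical sketch (threshold and limit values are illustrative):

```typescript
// Hypothetical sketch: flag prompts that approach or exceed the model's
// context window, based on the breakdown's total token estimate.
function checkContextBudget(totalTokens: number, modelLimit: number, warnAt = 0.8) {
  const usage = totalTokens / modelLimit;
  if (usage >= 1) return { level: 'error' as const, usage };
  if (usage >= warnAt) return { level: 'warn' as const, usage };
  return { level: 'ok' as const, usage };
}

// e.g. checkContextBudget(breakdown.total, 200_000)
```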

Final Recommendation: APPROVE

This PR demonstrates excellent system architecture across both objectives:

  • AGENTS.md Refactoring: Thoughtful progressive disclosure following industry best practices
  • Context Tracking: Well-designed observability system with strong technical implementation

The only suggestion is the minor naming standardization for AGENTS.md. Otherwise, this is production-ready code that adds significant value to the platform.

Confidence Level: High - This is well-architected, thoroughly implemented, and follows TypeScript/React best practices throughout.

Refactored AGENTS.md from 442 lines to 96 lines following HumanLayer's
best practices for writing effective AI agent guidance files:

- Condensed to essential commands, architecture, and quick reference
- Created docs/agent-guides/ for progressive disclosure of detailed topics:
  - testing-patterns.md
  - ui-development.md
  - documentation.md
  - database-migrations.md
  - git-worktrees.md
  - debugging.md
- Replaced embedded code examples with file references
- Kept only universally applicable instructions in the main file

Reference: https://www.humanlayer.dev/blog/writing-a-good-claude-md

- Reduce AGENTS.md from 442 to 106 lines (76% reduction)
- Reduce .cursorrules from 202 to 39 lines (81% reduction)
- Add progressive disclosure files in docs/agent-guides/
- Add "Before Committing" section with Biome lint commands
- Replace embedded code examples with file references