@gulivan gulivan commented Sep 26, 2025

Summary by CodeRabbit

  • Bug Fixes

    • Improved token counting by building complete, service-aware tool representations (including parameters and sensible schema defaults) for more accurate estimates.
    • Tool names normalized per service context to align with external tokenization expectations.
    • Estimation errors now log a warning and gracefully fall back to 0 to avoid verification failures.
  • Chores

    • Applied a temporary adjustment factor to token estimates pending further tuning.

coderabbitai bot commented Sep 26, 2025

Walkthrough

This change updates the internal token counting in src/core/mcpClient.ts. The private method now accepts serverName and builds a Claude-facing tool representation: it adds $schema, moves inputSchema under parameters, applies object-schema defaults, and prefixes the tool name as mcp__{serverName}__{tool.name}. It then serializes the representation, counts its tokens, multiplies the count by 3, and falls back to 0 with a warning on errors.

Changes

Cohort / File(s) Summary
MCP tool token counting
src/core/mcpClient.ts
Signature changed to calculateToolTokens(tools, serverName). For each tool build a Claude-style wrapper (claudeToolRepresentation) with $schema, optional description, name prefixed mcp__{serverName}__{tool.name}, and move inputSchema into parameters applying object-schema defaults (type, properties, required, additionalProperties). Wrap representation in <function>…</function> for counting, multiply token result by 3 (temporary), and on errors emit a warning and use 0 tokens. Public call updated to pass server.name.
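
The counting step summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `estimateToolTokens`, `TokenCounter`, and the injected `warn` callback are hypothetical names, and the counter is passed in as a parameter because the project's real tokenizer is not shown in this thread.

```typescript
// Hypothetical counter signature; the project's real tokenizer is not shown here.
type TokenCounter = (text: string) => number;

// Temporary adjustment factor described in the summary (×3).
const ADJUSTMENT_FACTOR = 3;

function estimateToolTokens(
  representation: unknown,
  countTokens: TokenCounter,
  warn: (message: string) => void = console.warn,
): number {
  try {
    // Wrap the serialized representation in <function> tags before counting.
    const text = `<function>${JSON.stringify(representation)}</function>`;
    return countTokens(text) * ADJUSTMENT_FACTOR;
  } catch (error) {
    // Any failure (serialization or counting) degrades to 0 with a warning.
    const reason = error instanceof Error ? error.message : String(error);
    warn(`Token estimation failed, using 0: ${reason}`);
    return 0;
  }
}
```

Injecting the counter and the logger keeps the sketch self-contained and makes the fallback path easy to exercise with a counter that throws.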

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller
  participant MCPClient
  participant Builder as Tool Representation Builder
  participant Tokenizer

  Caller->>MCPClient: request token counts for tools (passes server.name)
  MCPClient->>Builder: for each tool -> build claudeToolRepresentation
  note right of Builder #DDF2E9: add `$schema`,\nmove inputSchema -> parameters,\napply object defaults,\nprefix name with mcp__{serverName}__
  Builder-->>MCPClient: complete tool representation
  MCPClient->>Tokenizer: wrap in <function>…</function>, serialize and count tokens (×3 adjustment)
  alt success
    Tokenizer-->>MCPClient: token count
  else error
    Tokenizer-->>MCPClient: error (MCPClient warns, uses 0)
  end
  MCPClient-->>Caller: return per-tool counts and total

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I twitch my whiskers, schemas bright,
$schema tucked beneath moonlight.
Parameters march in tidy rows,
Names prefixed where the river flows.
I count the tokens—hop!—and write. 🐇

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “feat: improve token calc” follows a conventional commit style and concisely indicates that the pull request enhances the token calculation logic, which is the main change introduced in the PR. It directly relates to the modifications in calculateToolTokens and the integration of serverName‐aware token counts. Although somewhat generic, it still conveys the primary intent of improving token calculation.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b39277 and 107a8d7.

📒 Files selected for processing (1)
  • src/core/mcpClient.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Write clean, maintainable code following TypeScript best practices
Use TypeScript for type safety when applicable
Follow consistent naming conventions
Add comments for complex logic
Maintain consistency with existing code style
Add appropriate error handling
Never expose sensitive information in code or logs
Follow security best practices for TypeScript
Validate all inputs
Use environment variables for configuration
Document all public APIs, functions, and classes with clear descriptions

Files:

  • src/core/mcpClient.ts

- Only apply object-specific defaults (properties, required, additionalProperties) when schema is actually an object type
- Correctly handle primitive types (string, number, array) without forcing object structure
- Support union types that include object
- Conditionally include description field only when it exists
- Prevents token count inflation for non-object schemas
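
As a rough illustration of the type check this commit describes, the sketch below decides whether object defaults would apply based on the `type` keyword alone. `TypedSchema` and `declaresObjectType` are hypothetical names. A schema with no `type` returns false here; the PR's handling of that case changes between review rounds, so it is deliberately left out.

```typescript
// Hypothetical minimal view of a JSON Schema: only the `type` keyword matters here.
interface TypedSchema {
  type?: string | string[];
}

// True when `type` is "object" or a union (array form) that includes "object".
function declaresObjectType(schema: TypedSchema): boolean {
  const { type } = schema;
  if (Array.isArray(type)) return type.includes("object");
  return type === "object";
}

// A few schema kinds and whether object defaults would be applied to them.
const samples: Array<[string, TypedSchema]> = [
  ["plain object", { type: "object" }],
  ["nullable object", { type: ["object", "null"] }],
  ["string", { type: "string" }],
  ["array", { type: "array" }],
];

for (const [label, schema] of samples) {
  console.log(`${label}: apply object defaults = ${declaresObjectType(schema)}`);
}
```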

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/core/mcpClient.ts (1)

83-87: Avoid pretty‑printed JSON to prevent token inflation

Pretty printing adds whitespace that inflates token counts. Minify the JSON for a closer estimate to what models actually tokenize, unless you know Claude receives pretty JSON.

Apply this diff:

-        const toolText = JSON.stringify(toolForCounting, null, 2);
+        const toolText = JSON.stringify(toolForCounting);

Please confirm whether Claude’s tool context is pretty‑printed; if so, keep the current formatting.
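
To make the whitespace overhead concrete, the sketch below serializes one made-up tool both ways and compares lengths. `sampleTool` is an invented example, and character counts are only a stand-in for token counts, since the actual tokenizer is not part of this snippet.

```typescript
// Invented example tool; not taken from the PR.
const sampleTool = {
  name: "mcp__demo__echo",
  description: "Echo the given text back to the caller.",
  parameters: {
    type: "object",
    properties: { text: { type: "string" } },
    required: ["text"],
  },
};

// Pretty-printed with two-space indentation versus minified.
const pretty = JSON.stringify(sampleTool, null, 2);
const minified = JSON.stringify(sampleTool);

// Both strings parse to the same value; only the whitespace differs.
const extraChars = pretty.length - minified.length;
console.log({ prettyLength: pretty.length, minifiedLength: minified.length, extraChars });
```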

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 107a8d7 and 6f85f8b.

📒 Files selected for processing (1)
  • src/core/mcpClient.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/core/mcpClient.ts (2)
src/types/index.ts (1)
  • MCPTool (189-193)
tests/__mocks__/contextcalc.js (2)
  • countTokens (4-13)
  • countTokens (4-13)
🔇 Additional comments (3)
src/core/mcpClient.ts (3)

43-44: Docstring clarity LGTM

Clearer intent for token calculation in Claude context.


79-81: Selective description inclusion LGTM

Including description only when defined avoids token bloat.


53-76: Non‑object schemas still receive object defaults when schema is present; refine guard

If a tool provides a non‑object schema (e.g., boolean, string), rawSchema becomes undefined, schemaType stays undefined, and the code applies object defaults. This regresses token accuracy for legitimate non‑object schemas and repeats the earlier concern. Apply defaults only when schema is missing OR the provided schema is actually an object (including unions that contain "object").

Apply this diff:

-        const rawSchema =
-          tool.inputSchema && typeof tool.inputSchema === "object"
-            ? tool.inputSchema
-            : undefined;
-        const schemaType = rawSchema?.type;
-        const parameters = {
-          $schema: "http://json-schema.org/draft-07/schema#",
-          ...(rawSchema ?? {}),
-        };
-
-        const isObjectSchema =
-          schemaType === "object" ||
-          (Array.isArray(schemaType) && schemaType.includes("object")) ||
-          schemaType === undefined;
-
-        // Only apply object-specific defaults when the schema is actually an object type
-        if (isObjectSchema) {
+        const hasSchema = tool.inputSchema !== undefined;
+        const rawSchema =
+          hasSchema &&
+          typeof tool.inputSchema === "object" &&
+          tool.inputSchema !== null &&
+          !Array.isArray(tool.inputSchema)
+            ? tool.inputSchema
+            : undefined;
+        const schemaType = rawSchema?.type;
+        const parameters: Record<string, any> = {
+          $schema: "http://json-schema.org/draft-07/schema#",
+          ...(rawSchema ?? {}),
+        };
+
+        const isObjectType =
+          schemaType === "object" ||
+          (Array.isArray(schemaType) && schemaType.includes("object")) ||
+          (schemaType === undefined && rawSchema !== undefined);
+        const applyObjectDefaults = !hasSchema || isObjectType;
+
+        // Only apply defaults when schema is missing OR actually an object
+        if (applyObjectDefaults) {
           if (!("type" in parameters)) parameters.type = "object";
           if (!("properties" in parameters)) parameters.properties = {};
           if (!("required" in parameters)) parameters.required = [];
           if (!("additionalProperties" in parameters))
             parameters.additionalProperties = false;
         }

As per coding guidelines: prefer explicit typing for complex objects.

gulivan and others added 2 commits September 26, 2025 14:54
Remove pretty-printing from token calculation to match how Claude actually receives the JSON schema. Pretty-printed JSON with whitespace inflates token counts and doesn't reflect the real context usage.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Update MCP tool token calculation to better match Claude's actual context representation:

- Add server prefixes to tool names (mcp__server__toolname format)
- Include function wrapper tags (<function>...</function>) in calculation
- Use Claude's actual JSON schema format for function definitions
- Escape descriptions properly in the JSON representation

Results in ~160 tokens per tool average (up from ~60), much closer to Claude's reported 396-468 range.
While still not exact, this provides users with much more realistic estimates of MCP tool context usage.
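
The naming, wrapping, and escaping steps listed in this commit message might look roughly like the sketch below. `ToolLike`, `prefixedToolName`, and `serializeForCounting` are hypothetical names rather than the PR's actual helpers; the escaping relies on `JSON.stringify`, which escapes quotes and newlines inside string values.

```typescript
// Hypothetical minimal tool shape for this sketch.
interface ToolLike {
  name: string;
  description?: string;
}

// Server-prefixed name in the mcp__server__toolname format.
function prefixedToolName(serverName: string, toolName: string): string {
  return `mcp__${serverName}__${toolName}`;
}

function serializeForCounting(
  serverName: string,
  tool: ToolLike,
  parameters: unknown,
): string {
  const representation = {
    name: prefixedToolName(serverName, tool.name),
    // Include the description only when the tool defines one.
    ...(tool.description !== undefined ? { description: tool.description } : {}),
    parameters,
  };
  // JSON.stringify escapes quotes and newlines in the description.
  return `<function>${JSON.stringify(representation)}</function>`;
}

console.log(
  serializeForCounting(
    "demo",
    { name: "echo", description: 'Say "hi"\nagain' },
    { type: "object" },
  ),
);
```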

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 6f85f8b and c7c5137.

📒 Files selected for processing (1)
  • src/core/mcpClient.ts (2 hunks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between c7c5137 and e7d3ef9.

📒 Files selected for processing (1)
  • src/core/mcpClient.ts (2 hunks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between e7d3ef9 and b4a63e7.

📒 Files selected for processing (1)
  • src/core/mcpClient.ts (2 hunks)

Comment on lines +53 to 91
         const rawSchema =
           tool.inputSchema && typeof tool.inputSchema === "object"
             ? tool.inputSchema
             : undefined;
         const schemaType = rawSchema?.type;
         const parameters = {
           $schema: "http://json-schema.org/draft-07/schema#",
           ...(rawSchema ?? {}),
         };

         const hasObjectHints =
           rawSchema !== undefined &&
           ["properties", "required", "additionalProperties", "patternProperties"].some(
             (key) => key in rawSchema
           );

         const isObjectSchema =
           rawSchema === undefined ||
           schemaType === "object" ||
           (Array.isArray(schemaType) && schemaType.includes("object")) ||
           hasObjectHints;

         // Only apply object-specific defaults when the schema is actually an object type
         if (isObjectSchema) {
           if (!("type" in parameters)) parameters.type = "object";
           if (!("properties" in parameters)) parameters.properties = {};
           if (!("required" in parameters)) parameters.required = [];
           if (!("additionalProperties" in parameters))
             parameters.additionalProperties = false;
         }

         // Create the prefixed tool name as it appears in Claude's context
         const prefixedToolName = `mcp__${serverName}__${tool.name}`;

         const toolForCounting = {
-          name: tool.name,
-          description: tool.description || '',
-          inputSchema: tool.inputSchema || {}
+          name: prefixedToolName,
+          ...(tool.description !== undefined ? { description: tool.description } : {}),
+          parameters,
         };

⚠️ Potential issue | 🟠 Major

Handle boolean schemas without forcing object defaults

If a tool advertises a boolean JSON Schema (true/false), we currently drop into the rawSchema === undefined branch and apply the object defaults (type: "object", properties: {}, etc.). That materially changes the schema Claude would see, bloats the serialized payload, and throws off the very token-count accuracy this PR is trying to improve.

Please preserve boolean schemas verbatim and only layer defaults when we truly have an object (or no schema at all). One way to do that:

-        const rawSchema =
-          tool.inputSchema && typeof tool.inputSchema === "object"
-            ? tool.inputSchema
-            : undefined;
-        const schemaType = rawSchema?.type;
-        const parameters = {
-          $schema: "http://json-schema.org/draft-07/schema#",
-          ...(rawSchema ?? {}),
-        };
-
-        const hasObjectHints =
-          rawSchema !== undefined &&
-          ["properties", "required", "additionalProperties", "patternProperties"].some(
-            (key) => key in rawSchema
-          );
-
-        const isObjectSchema =
-          rawSchema === undefined ||
-          schemaType === "object" ||
-          (Array.isArray(schemaType) && schemaType.includes("object")) ||
-          hasObjectHints;
-
-        // Only apply object-specific defaults when the schema is actually an object type
-        if (isObjectSchema) {
-          if (!("type" in parameters)) parameters.type = "object";
-          if (!("properties" in parameters)) parameters.properties = {};
-          if (!("required" in parameters)) parameters.required = [];
-          if (!("additionalProperties" in parameters))
-            parameters.additionalProperties = false;
-        }
+        const inputSchema = tool.inputSchema;
+        let parameters: unknown;
+
+        if (typeof inputSchema === "boolean") {
+          parameters = inputSchema;
+        } else {
+          const rawSchema =
+            inputSchema && typeof inputSchema === "object"
+              ? inputSchema
+              : undefined;
+          const schemaType = rawSchema?.type;
+          const parameterObject = {
+            $schema: "http://json-schema.org/draft-07/schema#",
+            ...(rawSchema ?? {}),
+          };
+
+          const hasObjectHints =
+            rawSchema !== undefined &&
+            ["properties", "required", "additionalProperties", "patternProperties"].some(
+              (key) => key in rawSchema
+            );
+
+          const isObjectSchema =
+            rawSchema === undefined ||
+            schemaType === "object" ||
+            (Array.isArray(schemaType) && schemaType.includes("object")) ||
+            hasObjectHints;
+
+          if (isObjectSchema) {
+            if (!("type" in parameterObject)) parameterObject.type = "object";
+            if (!("properties" in parameterObject)) parameterObject.properties = {};
+            if (!("required" in parameterObject)) parameterObject.required = [];
+            if (!("additionalProperties" in parameterObject))
+              parameterObject.additionalProperties = false;
+          }
+
+          parameters = parameterObject;
+        }

This keeps no-arg tools (or intentionally unsatisfiable ones) accurate while still giving object schemas the safe defaults.
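
A compact, standalone version of the idea in this suggestion is sketched below so it can be run in isolation. `SchemaObject`, `JsonSchema`, and `toParameters` are hypothetical names and the types are simplified; the logic follows the diff above rather than the merged code, which is not shown in this thread.

```typescript
type SchemaObject = { [key: string]: unknown };
type JsonSchema = boolean | SchemaObject;

// Keywords that suggest an object schema even when `type` is absent.
const OBJECT_HINT_KEYS = ["properties", "required", "additionalProperties", "patternProperties"];

function toParameters(inputSchema?: JsonSchema): JsonSchema {
  // Boolean schemas (true / false) are valid JSON Schema: keep them verbatim.
  if (typeof inputSchema === "boolean") return inputSchema;

  const parameters: SchemaObject = {
    $schema: "http://json-schema.org/draft-07/schema#",
    ...(inputSchema ?? {}),
  };

  const type = parameters["type"];
  const isObjectSchema =
    inputSchema === undefined ||
    type === "object" ||
    (Array.isArray(type) && type.includes("object")) ||
    OBJECT_HINT_KEYS.some((key) => key in parameters);

  // Layer defaults only for object schemas or when no schema was provided.
  if (isObjectSchema) {
    if (!("type" in parameters)) parameters["type"] = "object";
    if (!("properties" in parameters)) parameters["properties"] = {};
    if (!("required" in parameters)) parameters["required"] = [];
    if (!("additionalProperties" in parameters)) parameters["additionalProperties"] = false;
  }
  return parameters;
}
```

With this shape, `toParameters(true)` passes the boolean through unchanged, while `toParameters()` and `toParameters({ properties: {} })` both receive the object defaults.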

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
        const inputSchema = tool.inputSchema;
        let parameters: unknown;
        if (typeof inputSchema === "boolean") {
          parameters = inputSchema;
        } else {
          const rawSchema =
            inputSchema && typeof inputSchema === "object"
              ? inputSchema
              : undefined;
          const schemaType = rawSchema?.type;
          const parameterObject = {
            $schema: "http://json-schema.org/draft-07/schema#",
            ...(rawSchema ?? {}),
          };
          const hasObjectHints =
            rawSchema !== undefined &&
            ["properties", "required", "additionalProperties", "patternProperties"].some(
              (key) => key in rawSchema
            );
          const isObjectSchema =
            rawSchema === undefined ||
            schemaType === "object" ||
            (Array.isArray(schemaType) && schemaType.includes("object")) ||
            hasObjectHints;
          if (isObjectSchema) {
            if (!("type" in parameterObject)) parameterObject.type = "object";
            if (!("properties" in parameterObject)) parameterObject.properties = {};
            if (!("required" in parameterObject)) parameterObject.required = [];
            if (!("additionalProperties" in parameterObject))
              parameterObject.additionalProperties = false;
          }
          parameters = parameterObject;
        }
        // Create the prefixed tool name as it appears in Claude's context
        const prefixedToolName = `mcp__${serverName}__${tool.name}`;
        const toolForCounting = {
          name: prefixedToolName,
          ...(tool.description !== undefined ? { description: tool.description } : {}),
          parameters,
        };
🤖 Prompt for AI Agents
In src/core/mcpClient.ts around lines 53 to 91, the code treats a boolean JSON
Schema (true/false) like a missing schema and applies object defaults. Instead,
detect boolean schemas and preserve them verbatim: if typeof tool.inputSchema
=== "boolean", set parameters to tool.inputSchema (do not build an object with
$schema or spread). Otherwise derive rawSchema only when typeof tool.inputSchema
=== "object", so that hasObjectHints and isObjectSchema never see a boolean
schema, and apply the object-specific default fields only when isObjectSchema is
true and parameters is an object.

@gulivan gulivan merged commit 9de2e9d into main Sep 26, 2025
4 checks passed
@github-actions

🎉 This PR is included in version 1.4.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
