
Conversation

@katzdave (Collaborator) commented Dec 15, 2025

The main change is removing pricing.rs, which maintained model pricing via a cached config file refreshed from OpenRouter. Canonical models are used instead (same data source, but a much higher hit rate on name matching).

The client no longer fetches all models/prices and tries to maintain that state; there was even some UI buried in settings showing when the last pricing data refresh occurred. Instead, only targeted fetches for a specific (provider, model) pair are made.

The result is no meaningful difference in behavior, but many more models/providers are supported with pricing (Databricks!).

There is a slight loss in freshness, but follow-ups will keep the canonical models as up to date as possible.
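
For context, a minimal sketch of what the targeted client-side fetch looks like, assuming a POST endpoint that takes { provider, model } and returns { pricing: [...] }. The /config/pricing path is an assumption here, not necessarily the exact route in this PR; ModelCostInfo and the pricing?.[0] handling mirror the new pricing.ts discussed below.

import { getApiUrl } from '../config';

export interface ModelCostInfo {
  input_token_cost: number;
  output_token_cost: number;
  currency: string;
}

export async function fetchModelPricing(
  provider: string,
  model: string
): Promise<ModelCostInfo | null> {
  try {
    // POST the (provider, model) pair; the endpoint path is an assumption.
    const response = await fetch(getApiUrl('/config/pricing'), {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ provider, model }),
    });
    if (!response.ok) return null; // e.g. 404 when there is no canonical match
    const data = await response.json();
    // The backend returns { pricing: PricingData[] }; take the single targeted entry.
    return data.pricing?.[0] ?? null;
  } catch {
    return null;
  }
}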

@katzdave changed the title from "Massively simplify pricing code" to "Integrate pricing with canonical model" on Dec 16, 2025
) {
// Model/provider has changed, save the costs for the previous model
const prevKey = `${prevProviderRef.current}/${prevModelRef.current}`;
const handleModelChange = async () => {
@katzdave (Collaborator, Author) commented:

This is client side only and doesn't get persisted...

Filed: #6141

Will fix in a followup.

@katzdave marked this pull request as ready for review December 16, 2025 19:34
Copilot AI review requested due to automatic review settings December 16, 2025 19:34
Copilot AI (Contributor) left a comment

Pull request overview

This PR refactors the pricing system to use canonical models instead of maintaining a separate pricing cache. The key architectural change is moving from eagerly fetching and caching all model pricing data from OpenRouter to fetching pricing on-demand from the canonical model registry, which provides better model name matching (supporting providers like Databricks) at the cost of slightly less fresh data.

  • Removed the complex pricing.rs module with its disk cache and OpenRouter API integration
  • Simplified frontend pricing logic by removing the costDatabase session cache
  • Changed API from bulk fetch (configured_only flag) to targeted fetch (provider + model parameters)

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Summary per file:

  • ui/desktop/src/utils/pricing.ts: New simplified pricing utility using targeted backend fetch
  • ui/desktop/src/utils/costDatabase.ts: Removed complex session caching and OpenRouter model parsing logic
  • ui/desktop/src/hooks/useCostTracking.ts: Updated to use async fetch instead of sync cache lookup
  • ui/desktop/src/hooks/useAgent.ts: Removed pricing cache initialization
  • ui/desktop/src/components/settings/app/AppSettingsSection.tsx: Removed pricing refresh UI and status indicators
  • ui/desktop/src/components/bottom_menu/CostTracker.tsx: Simplified to fetch pricing directly without session cache
  • crates/goose/src/providers/pricing.rs: Removed entire OpenRouter pricing cache module
  • crates/goose/src/providers/mod.rs: Removed pricing module export
  • crates/goose/src/providers/canonical/mod.rs: Added helper to fetch canonical model with pricing
  • crates/goose-server/src/routes/config_management.rs: Changed pricing endpoint to targeted fetch using canonical models
  • crates/goose-server/src/commands/agent.rs: Removed pricing cache initialization on startup
  • crates/goose-cli/src/session/output.rs: Changed to sync pricing lookup from canonical models
  • crates/goose-cli/src/session/mod.rs: Removed pricing cache initialization

Comment on lines +35 to +81
    const handleModelChange = async () => {
      if (
        prevModelRef.current !== undefined &&
        prevProviderRef.current !== undefined &&
        (prevModelRef.current !== currentModel || prevProviderRef.current !== currentProvider)
      ) {
        // Model/provider has changed, save the costs for the previous model
        const prevKey = `${prevProviderRef.current}/${prevModelRef.current}`;

        // Get pricing info for the previous model
        const prevCostInfo = await fetchModelPricing(
          prevProviderRef.current,
          prevModelRef.current
        );

        if (prevCostInfo) {
          const prevInputCost =
            (sessionInputTokens || localInputTokens) * (prevCostInfo.input_token_cost || 0);
          const prevOutputCost =
            (sessionOutputTokens || localOutputTokens) * (prevCostInfo.output_token_cost || 0);
          const prevTotalCost = prevInputCost + prevOutputCost;

          // Save the accumulated costs for this model
          setSessionCosts((prev) => ({
            ...prev,
            [prevKey]: {
              inputTokens: sessionInputTokens || localInputTokens,
              outputTokens: sessionOutputTokens || localOutputTokens,
              totalCost: prevTotalCost,
            },
          }));
        }

        console.log(
          'Model changed from',
          `${prevProviderRef.current}/${prevModelRef.current}`,
          'to',
          `${currentProvider}/${currentModel}`,
          '- saved costs and restored session token counters'
        );
      }

      prevModelRef.current = currentModel || undefined;
      prevProviderRef.current = currentProvider || undefined;
    };

    handleModelChange();
Copilot AI commented Dec 16, 2025:

This async function inside useEffect doesn't properly handle race conditions or cleanup. If the effect runs again while a previous fetchModelPricing call is still pending, both calls will complete and potentially update state out of order. Consider using an abort controller or a flag to cancel or ignore stale requests.

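For illustration, one way to guard against this is a cancelled flag reset in the effect's cleanup. This is a trimmed-down sketch of the pattern, not the code merged in this PR; the real useCostTracking.ts tracks token counts and session costs that are omitted here.

import { useEffect, useRef } from 'react';
import { fetchModelPricing } from '../utils/pricing';

// Illustrative hook showing only the stale-request guard around the async fetch.
export function useModelCostGuard(currentProvider?: string, currentModel?: string) {
  const prevProviderRef = useRef<string | undefined>(undefined);
  const prevModelRef = useRef<string | undefined>(undefined);

  useEffect(() => {
    let cancelled = false;

    const handleModelChange = async () => {
      if (prevProviderRef.current && prevModelRef.current) {
        const prevCostInfo = await fetchModelPricing(
          prevProviderRef.current,
          prevModelRef.current
        );
        if (cancelled) return; // a newer effect run superseded this one
        if (prevCostInfo) {
          // ...save the accumulated costs for the previous model here...
        }
      }
      prevModelRef.current = currentModel;
      prevProviderRef.current = currentProvider;
    };

    handleModelChange();
    return () => {
      cancelled = true; // mark this run stale when deps change or on unmount
    };
  }, [currentProvider, currentModel]);
}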
Comment on lines 476 to 502
pub async fn get_pricing(
    Json(query): Json<PricingQuery>,
) -> Result<Json<PricingResponse>, StatusCode> {
    let canonical_model =
        maybe_get_canonical_model(&query.provider, &query.model).ok_or(StatusCode::NOT_FOUND)?;

    let mut pricing_data = Vec::new();

    if let (Some(input_cost), Some(output_cost)) = (
        canonical_model.pricing.prompt,
        canonical_model.pricing.completion,
    ) {
        pricing_data.push(PricingData {
            provider: query.provider.clone(),
            model: query.model.clone(),
            input_token_cost: input_cost,
            output_token_cost: output_cost,
            currency: "$".to_string(),
            context_length: Some(canonical_model.context_length as u32),
        });
    }

    Ok(Json(PricingResponse {
        pricing: pricing_data,
        source: "canonical".to_string(),
    }))
}
Copilot AI commented Dec 16, 2025:

The API can return 200 OK with an empty pricing array when a canonical model exists but has no pricing data (prompt or completion is None). This differs from returning 404 when the model isn't found. Consider returning 404 or a different status when pricing is unavailable to distinguish between "model not found" and "model found but no pricing".

@alexhancock (Collaborator) left a comment:

looks awesome - added one comment / question

@@ -0,0 +1,60 @@
import { getApiUrl } from '../config';

export interface ModelCostInfo {
@alexhancock (Collaborator) commented:

should we make this a type in the openapi api and get it from there instead? that way we could use the generated SDK for fetching it too.
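
For illustration, wiring this through the generated SDK might look roughly as follows; getPricing and the { data, error } result shape are assumptions about what the OpenAPI generator would emit into ui/desktop/src/api/sdk.gen.ts, not the actual generated names.

// Hypothetical wrapper over the generated client; substitute whatever function
// the OpenAPI generator actually produces for the pricing route.
import { getPricing, type PricingData } from '../api';

export async function fetchModelPricing(
  provider: string,
  model: string
): Promise<PricingData | null> {
  const { data, error } = await getPricing({ body: { provider, model } });
  if (error || !data) return null;
  return data.pricing?.[0] ?? null;
}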

@katzdave (Collaborator, Author) commented:

Done

@alexhancock (Collaborator) commented:

can't we remove this type now and use PricingData? https://github.com/block/goose/pull/6130/files#diff-270e779202976f7f9b473ced6ebce8debeba7c7d7e5085991d08d5a363f1f40fR530

that's what I was after

after that, LGTM for a merge!

@katzdave (Collaborator, Author) commented:

Ah misunderstood slightly. Done

…icing

* 'main' of github.com:block/goose: (35 commits)
  docs: skills (#6062)
  fix: add conditional configuration for GOOSE_BIN_DIR in PATH (#5940)
  Update dependencies to help in Fedora packaging (#5835)
  fix: make goose reviewer less bad (#6154)
  docs: create/edit recipe button (#6145)
  fix(google): Fix 400 Bad Request error with Gemini 3 thought signatures (#6035)
  fix: we don't need to warn about tool count when in code mode (#6149)
  deps: upgrade agent-client-protocol to 0.9.0 (#6109)
  fix(providers): fix for gemini-cli on windows to work around cmd's multiline prompt limitations #5911 (#5966)
  More slash commands (#5858)
  fix: MCP UI not rendering due to CallToolResult structure change (#6143)
  fix: display shell output as static text instead of spinner (#6041)
  fix : Custom providers with empty API keys show as configured in desktop (#6105)
  Add .agents/skills and ~/.config/agent/skills to skills discovery paths (#6139)
  fix: use instructions for system prompt and prompt for user message in subagents (#6121)
  Fix compaction loop for small models or large input (#5803)
  feat: Centralize theme management with ThemeContext (#6137)
  OpenRouter & Xai streaming (#5873)
  fix: resolve mcp-hermit cleanup path expansion issue (#5953)
  feat: add goose PR reviewer workflow (#6124)
  ...
Copilot AI review requested due to automatic review settings December 18, 2025 16:19
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.

Comment on lines +495 to +501
        pricing_data.push(PricingData {
            provider: query.provider.clone(),
            model: query.model.clone(),
            input_token_cost: input_cost,
            output_token_cost: output_cost,
            currency: "$".to_string(),
            context_length: Some(canonical_model.context_length as u32),
Copilot AI commented Dec 18, 2025:

The cast from usize to u32 could overflow on 64-bit systems if context_length exceeds u32::MAX. Consider using try_into() with proper error handling or validate that the value fits within u32 range.

Suggested change:

-        pricing_data.push(PricingData {
-            provider: query.provider.clone(),
-            model: query.model.clone(),
-            input_token_cost: input_cost,
-            output_token_cost: output_cost,
-            currency: "$".to_string(),
-            context_length: Some(canonical_model.context_length as u32),
+        let context_length = u32::try_from(canonical_model.context_length)
+            .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
+        pricing_data.push(PricingData {
+            provider: query.provider.clone(),
+            model: query.model.clone(),
+            input_token_cost: input_cost,
+            output_token_cost: output_cost,
+            currency: "$".to_string(),
+            context_length: Some(context_length),

Comment on lines +491 to 503
    if let (Some(input_cost), Some(output_cost)) = (
        canonical_model.pricing.prompt,
        canonical_model.pricing.completion,
    ) {
        pricing_data.push(PricingData {
            provider: query.provider.clone(),
            model: query.model.clone(),
            input_token_cost: input_cost,
            output_token_cost: output_cost,
            currency: "$".to_string(),
            context_length: Some(canonical_model.context_length as u32),
        });
    }
Copilot AI commented Dec 18, 2025:

When a model is found but has no pricing data (prompt or completion is None), the endpoint returns a 200 OK with an empty pricing array. The frontend at line 20 of pricing.ts expects pricing?.[0] ?? null, which correctly handles this. However, returning 404 NOT_FOUND when pricing data is unavailable would be more semantically correct and make error handling clearer.

Comment on lines +24 to +27
pub fn maybe_get_canonical_model(provider: &str, model: &str) -> Option<CanonicalModel> {
    let registry = CanonicalModelRegistry::bundled().ok()?;
    let canonical_id = map_to_canonical_model(provider, model, registry)?;
    registry.get(&canonical_id).cloned()
Copilot AI commented Dec 18, 2025:

The function calls CanonicalModelRegistry::bundled() twice - once on line 25 and implicitly again on line 27 when using the result. The registry should be stored in a variable after the first call to avoid the potential overhead of multiple lazy initialization checks and error conversions.

@katzdave merged commit 6a0b8c2 into main Dec 18, 2025
24 checks passed
@katzdave deleted the dkatz/canonical-pricing branch December 18, 2025 16:44
zanesq added a commit that referenced this pull request Dec 18, 2025
* 'main' of github.com:block/goose: (28 commits)
  Clean PR preview sites from gh-pages branch history (#6161)
  fix: make goose reviewer less sycophantic (#6171)
  revert /reply to previous behavior (replacing session history) when full conversation provided (#6058)
  chore: manually update version (#6166)
  Integrate pricing with canonical model (#6130)
  Regenerate canonical models when release branch is created. (#6127)
  fix: use correct parameter name in read_module handler (#6148)
  docs: blog for code mode MCP (#6126)
  test: add ACP integration test (#6150)
  docs: auto download updates (#6163)
  fix: respect default_enabled value of platform extensions (#6159)
  docs: skills (#6062)
  fix: add conditional configuration for GOOSE_BIN_DIR in PATH (#5940)
  Update dependencies to help in Fedora packaging (#5835)
  fix: make goose reviewer less bad (#6154)
  docs: create/edit recipe button (#6145)
  fix(google): Fix 400 Bad Request error with Gemini 3 thought signatures (#6035)
  fix: we don't need to warn about tool count when in code mode (#6149)
  deps: upgrade agent-client-protocol to 0.9.0 (#6109)
  fix(providers): fix for gemini-cli on windows to work around cmd's multiline prompt limitations #5911 (#5966)
  ...

# Conflicts:
#	ui/desktop/src/api/sdk.gen.ts
#	ui/desktop/src/hooks/useAgent.ts