Integrate pricing with canonical model #6130
Conversation
Pull request overview
This PR refactors the pricing system to use canonical models instead of maintaining a separate pricing cache. The key architectural change is moving from eagerly fetching and caching all model pricing data from OpenRouter to fetching pricing on-demand from the canonical model registry, which provides better model name matching (supporting providers like Databricks) at the cost of slightly less fresh data.
- Removed the complex pricing.rs module with its disk cache and OpenRouter API integration
- Simplified frontend pricing logic by removing the costDatabase session cache
- Changed API from bulk fetch (`configured_only` flag) to targeted fetch (provider + model parameters)
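The targeted fetch described above can be sketched as follows. This is an illustrative sketch only: the field names mirror the PR's `PricingData`, but the helper names (`buildPricingQuery`, `extractCost`) are assumptions, not the exact goose API.

```typescript
// Illustrative sketch of the targeted pricing lookup. One (provider, model)
// pair per request replaces the old bulk fetch with its `configured_only` flag.

interface PricingData {
  provider: string;
  model: string;
  input_token_cost: number;
  output_token_cost: number;
}

// Build the JSON body for a single targeted query.
function buildPricingQuery(provider: string, model: string): string {
  return JSON.stringify({ provider, model });
}

// A targeted query returns at most one entry; null means the model was
// found but carries no pricing.
function extractCost(pricing: PricingData[]): PricingData | null {
  return pricing[0] ?? null;
}
```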
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
Summary per file:
| File | Description |
|---|---|
| ui/desktop/src/utils/pricing.ts | New simplified pricing utility using targeted backend fetch |
| ui/desktop/src/utils/costDatabase.ts | Removed complex session caching and OpenRouter model parsing logic |
| ui/desktop/src/hooks/useCostTracking.ts | Updated to use async fetch instead of sync cache lookup |
| ui/desktop/src/hooks/useAgent.ts | Removed pricing cache initialization |
| ui/desktop/src/components/settings/app/AppSettingsSection.tsx | Removed pricing refresh UI and status indicators |
| ui/desktop/src/components/bottom_menu/CostTracker.tsx | Simplified to fetch pricing directly without session cache |
| crates/goose/src/providers/pricing.rs | Removed entire OpenRouter pricing cache module |
| crates/goose/src/providers/mod.rs | Removed pricing module export |
| crates/goose/src/providers/canonical/mod.rs | Added helper to fetch canonical model with pricing |
| crates/goose-server/src/routes/config_management.rs | Changed pricing endpoint to targeted fetch using canonical models |
| crates/goose-server/src/commands/agent.rs | Removed pricing cache initialization on startup |
| crates/goose-cli/src/session/output.rs | Changed to sync pricing lookup from canonical models |
| crates/goose-cli/src/session/mod.rs | Removed pricing cache initialization |
```diff
 const handleModelChange = async () => {
   if (
     prevModelRef.current !== undefined &&
     prevProviderRef.current !== undefined &&
     (prevModelRef.current !== currentModel || prevProviderRef.current !== currentProvider)
   ) {
     // Model/provider has changed, save the costs for the previous model
     const prevKey = `${prevProviderRef.current}/${prevModelRef.current}`;

     // Get pricing info for the previous model
-    const prevCostInfo = getCostForModel(prevProviderRef.current, prevModelRef.current);
+    const prevCostInfo = await fetchModelPricing(
+      prevProviderRef.current,
+      prevModelRef.current
+    );

     if (prevCostInfo) {
       const prevInputCost =
         (sessionInputTokens || localInputTokens) * (prevCostInfo.input_token_cost || 0);
       const prevOutputCost =
         (sessionOutputTokens || localOutputTokens) * (prevCostInfo.output_token_cost || 0);
       const prevTotalCost = prevInputCost + prevOutputCost;

       // Save the accumulated costs for this model
       setSessionCosts((prev) => ({
         ...prev,
         [prevKey]: {
           inputTokens: sessionInputTokens || localInputTokens,
           outputTokens: sessionOutputTokens || localOutputTokens,
           totalCost: prevTotalCost,
         },
       }));
     }

     console.log(
       'Model changed from',
       `${prevProviderRef.current}/${prevModelRef.current}`,
       'to',
       `${currentProvider}/${currentModel}`,
       '- saved costs and restored session token counters'
     );
   }
   prevModelRef.current = currentModel || undefined;
   prevProviderRef.current = currentProvider || undefined;
 };

 handleModelChange();
```
Copilot AI commented on Dec 16, 2025:
This async function inside useEffect doesn't properly handle race conditions or cleanup. If the effect runs again while a previous fetchModelPricing call is still pending, both calls will complete and potentially update state out of order. Consider using an abort controller or a flag to cancel or ignore stale requests.
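One way to address this, sketched below under the assumption that only the latest result should win. `makeLatest` is an illustrative helper, not part of the PR.

```typescript
// Sketch of a "latest request wins" wrapper: each call gets an increasing id,
// and a result is delivered only if no newer call has started since.
// Earlier in-flight calls resolve to null instead of updating state.
function makeLatest<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>
): (...args: A) => Promise<R | null> {
  let latest = 0; // id of the most recently started call
  return async (...args: A) => {
    const id = ++latest;
    const result = await fn(...args);
    return id === latest ? result : null; // stale result: drop it
  };
}
```

Inside a `useEffect`, the equivalent is a `cancelled` flag flipped in the effect's cleanup function, or an `AbortController` passed to the fetch.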
```diff
 pub async fn get_pricing(
     Json(query): Json<PricingQuery>,
 ) -> Result<Json<PricingResponse>, StatusCode> {
-    let configured_only = query.configured_only;
-
-    // If refresh requested (configured_only = false), refresh the cache
-    if !configured_only {
-        if let Err(e) = refresh_pricing().await {
-            tracing::error!("Failed to refresh pricing data: {}", e);
-        }
-    }
+    let canonical_model =
+        maybe_get_canonical_model(&query.provider, &query.model).ok_or(StatusCode::NOT_FOUND)?;

     let mut pricing_data = Vec::new();

-    if !configured_only {
-        // Get ALL pricing data from the cache
-        let all_pricing = get_all_pricing().await;
-
-        for (provider, models) in all_pricing {
-            for (model, pricing) in models {
-                pricing_data.push(PricingData {
-                    provider: provider.clone(),
-                    model: model.clone(),
-                    input_token_cost: pricing.input_cost,
-                    output_token_cost: pricing.output_cost,
-                    currency: "$".to_string(),
-                    context_length: pricing.context_length,
-                });
-            }
-        }
-    } else {
-        for (metadata, provider_type) in get_providers().await {
-            // Skip unconfigured providers if filtering
-            if !check_provider_configured(&metadata, provider_type) {
-                continue;
-            }
-
-            for model_info in &metadata.known_models {
-                // Handle OpenRouter models specially - they store full provider/model names
-                let (lookup_provider, lookup_model) = if metadata.name == "openrouter" {
-                    // For OpenRouter, parse the model name to extract real provider/model
-                    if let Some((provider, model)) = parse_model_id(&model_info.name) {
-                        (provider, model)
-                    } else {
-                        // Fallback if parsing fails
-                        (metadata.name.clone(), model_info.name.clone())
-                    }
-                } else {
-                    // For other providers, use names as-is
-                    (metadata.name.clone(), model_info.name.clone())
-                };
-
-                // Only get pricing from OpenRouter cache
-                if let Some(pricing) = get_model_pricing(&lookup_provider, &lookup_model).await {
-                    pricing_data.push(PricingData {
-                        provider: metadata.name.clone(),
-                        model: model_info.name.clone(),
-                        input_token_cost: pricing.input_cost,
-                        output_token_cost: pricing.output_cost,
-                        currency: "$".to_string(),
-                        context_length: pricing.context_length,
-                    });
-                }
-                // No fallback to hardcoded prices
-            }
-        }
-    }
-
-    tracing::debug!(
-        "Returning pricing for {} models{}",
-        pricing_data.len(),
-        if configured_only {
-            " (configured providers only)"
-        } else {
-            " (all cached models)"
-        }
-    );
+    if let (Some(input_cost), Some(output_cost)) = (
+        canonical_model.pricing.prompt,
+        canonical_model.pricing.completion,
+    ) {
+        pricing_data.push(PricingData {
+            provider: query.provider.clone(),
+            model: query.model.clone(),
+            input_token_cost: input_cost,
+            output_token_cost: output_cost,
+            currency: "$".to_string(),
+            context_length: Some(canonical_model.context_length as u32),
+        });
+    }

     Ok(Json(PricingResponse {
         pricing: pricing_data,
-        source: "openrouter".to_string(),
+        source: "canonical".to_string(),
     }))
 }
```
Copilot AI commented on Dec 16, 2025:
The API can return 200 OK with an empty pricing array when a canonical model exists but has no pricing data (prompt or completion is None). This differs from returning 404 when the model isn't found. Consider returning 404 or a different status when pricing is unavailable to distinguish between "model not found" and "model found but no pricing".
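On the client side, the distinction could be surfaced along these lines. The status codes follow the comment's suggestion; the `classify` helper and the union type are illustrative, not part of the PR.

```typescript
// Illustrative client-side handling of the three outcomes the comment
// distinguishes. Shapes are assumptions based on the PR, not the exact API.
interface PricingData {
  input_token_cost: number;
  output_token_cost: number;
}

type PricingLookup =
  | { kind: "not_found" }   // 404: no canonical model matched
  | { kind: "no_pricing" }  // 200 with an empty pricing array
  | { kind: "priced"; input: number; output: number };

function classify(status: number, pricing: PricingData[]): PricingLookup {
  if (status === 404) return { kind: "not_found" };
  const first = pricing[0];
  if (!first) return { kind: "no_pricing" };
  return {
    kind: "priced",
    input: first.input_token_cost,
    output: first.output_token_cost,
  };
}
```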
alexhancock left a comment:
looks awesome - added one comment / question
ui/desktop/src/utils/pricing.ts (Outdated)

```diff
@@ -0,0 +1,60 @@
+import { getApiUrl } from '../config';
+
+export interface ModelCostInfo {
```
should we make this a type in the openapi api and get it from there instead? that way we could use the generated SDK for fetching it too.
Done
can't we remove this type now and use PricingData? https://github.com/block/goose/pull/6130/files#diff-270e779202976f7f9b473ced6ebce8debeba7c7d7e5085991d08d5a363f1f40fR530
that's what I was after
after that, LGTM for a merge!
Ah misunderstood slightly. Done
```text
…icing
* 'main' of github.com:block/goose: (35 commits)
  docs: skills (#6062)
  fix: add conditional configuration for GOOSE_BIN_DIR in PATH (#5940)
  Update dependencies to help in Fedora packaging (#5835)
  fix: make goose reviewer less bad (#6154)
  docs: create/edit recipe button (#6145)
  fix(google): Fix 400 Bad Request error with Gemini 3 thought signatures (#6035)
  fix: we don't need to warn about tool count when in code mode (#6149)
  deps: upgrade agent-client-protocol to 0.9.0 (#6109)
  fix(providers): fix for gemini-cli on windows to work around cmd's multiline prompt limitations #5911 (#5966)
  More slash commands (#5858)
  fix: MCP UI not rendering due to CallToolResult structure change (#6143)
  fix: display shell output as static text instead of spinner (#6041)
  fix: Custom providers with empty API keys show as configured in desktop (#6105)
  Add .agents/skills and ~/.config/agent/skills to skills discovery paths (#6139)
  fix: use instructions for system prompt and prompt for user message in subagents (#6121)
  Fix compaction loop for small models or large input (#5803)
  feat: Centralize theme management with ThemeContext (#6137)
  OpenRouter & Xai streaming (#5873)
  fix: resolve mcp-hermit cleanup path expansion issue (#5953)
  feat: add goose PR reviewer workflow (#6124)
  ...
```
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
```rust
pricing_data.push(PricingData {
    provider: query.provider.clone(),
    model: query.model.clone(),
    input_token_cost: input_cost,
    output_token_cost: output_cost,
    currency: "$".to_string(),
    context_length: Some(canonical_model.context_length as u32),
```
Copilot AI commented on Dec 18, 2025:
The cast from usize to u32 could overflow on 64-bit systems if context_length exceeds u32::MAX. Consider using try_into() with proper error handling or validate that the value fits within u32 range.
Suggested change:

```diff
-pricing_data.push(PricingData {
-    provider: query.provider.clone(),
-    model: query.model.clone(),
-    input_token_cost: input_cost,
-    output_token_cost: output_cost,
-    currency: "$".to_string(),
-    context_length: Some(canonical_model.context_length as u32),
+let context_length = u32::try_from(canonical_model.context_length)
+    .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
+pricing_data.push(PricingData {
+    provider: query.provider.clone(),
+    model: query.model.clone(),
+    input_token_cost: input_cost,
+    output_token_cost: output_cost,
+    currency: "$".to_string(),
+    context_length: Some(context_length),
```
```rust
if let (Some(input_cost), Some(output_cost)) = (
    canonical_model.pricing.prompt,
    canonical_model.pricing.completion,
) {
    pricing_data.push(PricingData {
        provider: query.provider.clone(),
        model: query.model.clone(),
        input_token_cost: input_cost,
        output_token_cost: output_cost,
        currency: "$".to_string(),
        context_length: Some(canonical_model.context_length as u32),
    });
}
```
Copilot AI commented on Dec 18, 2025:
When a model is found but has no pricing data (prompt or completion is None), the endpoint returns a 200 OK with an empty pricing array. The frontend at line 20 of pricing.ts expects pricing?.[0] ?? null, which correctly handles this. However, returning 404 NOT_FOUND when pricing data is unavailable would be more semantically correct and make error handling clearer.
```rust
pub fn maybe_get_canonical_model(provider: &str, model: &str) -> Option<CanonicalModel> {
    let registry = CanonicalModelRegistry::bundled().ok()?;
    let canonical_id = map_to_canonical_model(provider, model, registry)?;
    registry.get(&canonical_id).cloned()
```
Copilot AI commented on Dec 18, 2025:
The function calls CanonicalModelRegistry::bundled() twice - once on line 25 and implicitly again on line 27 when using the result. The registry should be stored in a variable after the first call to avoid the potential overhead of multiple lazy initialization checks and error conversions.
```text
* 'main' of github.com:block/goose: (28 commits)
  Clean PR preview sites from gh-pages branch history (#6161)
  fix: make goose reviewer less sycophantic (#6171)
  revert /reply to previous behavior (replacing session history) when full conversation provided (#6058)
  chore: manually update version (#6166)
  Integrate pricing with canonical model (#6130)
  Regenerate canonical models when release branch is created. (#6127)
  fix: use correct parameter name in read_module handler (#6148)
  docs: blog for code mode MCP (#6126)
  test: add ACP integration test (#6150)
  docs: auto download updates (#6163)
  fix: respect default_enabled value of platform extensions (#6159)
  docs: skills (#6062)
  fix: add conditional configuration for GOOSE_BIN_DIR in PATH (#5940)
  Update dependencies to help in Fedora packaging (#5835)
  fix: make goose reviewer less bad (#6154)
  docs: create/edit recipe button (#6145)
  fix(google): Fix 400 Bad Request error with Gemini 3 thought signatures (#6035)
  fix: we don't need to warn about tool count when in code mode (#6149)
  deps: upgrade agent-client-protocol to 0.9.0 (#6109)
  fix(providers): fix for gemini-cli on windows to work around cmd's multiline prompt limitations #5911 (#5966)
  ...

# Conflicts:
#   ui/desktop/src/api/sdk.gen.ts
#   ui/desktop/src/hooks/useAgent.ts
```
The main change is removing pricing.rs, which maintained model pricing via a cached config file refreshed from OpenRouter. Canonical models are used instead (same data source, but a much higher hit rate on name matching).
We also stop fetching all of the models/prices into the client and trying to maintain that state; there was some UI buried in settings showing when the last pricing refresh occurred. Pricing is now fetched only via targeted lookups with (provider, model).
The result is no meaningful difference in behavior, but many more models/providers are supported with pricing (Databricks!).
There is a slight loss of freshness, but we will continue to follow up on keeping canonical models as up to date as possible.
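For illustration, the per-session cost math under the new flow stays a simple multiply-and-add over whatever the targeted fetch returns. The function name below is mine, not from the PR.

```typescript
// Session cost from per-token prices, mirroring the computation in
// useCostTracking.ts: tokens times unit cost, summed over input and output.
function sessionCost(
  inputTokens: number,
  outputTokens: number,
  inputTokenCost: number,
  outputTokenCost: number
): number {
  return inputTokens * inputTokenCost + outputTokens * outputTokenCost;
}
```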