Avoid auto-categorizing without user categories #261

jjmata · 2025-10-30T14:49:45Z

Summary

skip family auto-categorization when a family has no available categories
prevent the OpenAI provider from making auto-categorization requests without categories and log an error

Testing

bin/rails test test/models/family/auto_categorizer_test.rb

https://chatgpt.com/codex/tasks/task_e_69033d0891ec833299a3eb936ae10a8a

Summary by CodeRabbit

Bug Fixes
- Strengthened validation in auto-categorization to prevent processing when category data is unavailable.
- Enhanced error handling with improved logging for missing categories scenarios.

coderabbitai · 2025-10-30T14:50:15Z

Walkthrough

The PR introduces validation guards for category inputs before LLM processing. The OpenAI provider now validates that user_categories is non-empty and raises a specific error if blank. The AutoCategorizer extracts a sanitized categories_input variable, performs an early validation check, and uses this validated input for LLM calls instead of the original user input.

Changes

Cohort / File(s)	Summary
Category validation guards `app/models/provider/openai.rb`, `app/models/family/auto_categorizer.rb`	Added precondition checks to validate non-empty category inputs before LLM processing. OpenAI provider raises "No categories available for auto-categorization" error with family id logging. AutoCategorizer extracts categories_input, validates against empty state with early return, and uses sanitized input throughout LLM call and result mapping.

Sequence Diagram

sequenceDiagram
    actor User
    participant OpenAI as Provider::OpenAI
    participant AutoCategorizer as Family::AutoCategorizer
    participant LLM as Language Model

    User->>OpenAI: auto_categorize(transactions, user_categories)
    
    rect rgb(255, 240, 245)
    Note over OpenAI: New guard: validate categories
    alt user_categories blank?
        OpenAI->>OpenAI: Log error (family_id)
        OpenAI-->>User: Raise "No categories available" error
    end
    end
    
    OpenAI->>AutoCategorizer: call(transactions, user_categories_input)
    
    rect rgb(240, 248, 255)
    Note over AutoCategorizer: New guard: extract & validate
    AutoCategorizer->>AutoCategorizer: categories_input = user_categories_input
    alt categories_input empty?
        AutoCategorizer->>AutoCategorizer: Log error
        AutoCategorizer-->>OpenAI: Return early
    end
    end
    
    AutoCategorizer->>LLM: Call with categories_input
    LLM-->>AutoCategorizer: Categorization results
    
    AutoCategorizer->>AutoCategorizer: Map results using categories_input
    AutoCategorizer-->>OpenAI: Return enriched transactions
    OpenAI-->>User: Success

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Both changes follow an identical guard-clause pattern (validate input → log error → early exit)
Minimal logic density; primarily defensive programming
Changes are consistent and repetitive across files, reducing per-file reasoning overhead
No complex control flow or stateful interactions introduced

Poem

🐰 A guard at each gate, we now stand tall,
Checking each category before the call,
Empty? We catch it and warn right away,
No silent failures to ruin the day!
Safe inputs, safe outputs—hop hooray! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title "Avoid auto-categorizing without user categories" directly and accurately reflects the core intent of the pull request. The changes across both `app/models/family/auto_categorizer.rb` and `app/models/provider/openai.rb` consistently implement guards that prevent auto-categorization from proceeding when no user categories are available, which is exactly what the title describes. The title is concise, specific, and uses clear language that a teammate scanning PR history would easily understand as preventing unnecessary LLM calls and operations when categories are missing.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch codex/prevent-llm-calls-with-empty-categories

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

app/models/provider/openai.rb (1)

57-61: Validation guard works correctly but duplicates AutoCategorizer check.

The guard correctly prevents LLM calls with empty categories and provides clear error messaging. However, this validation duplicates the check in Family::AutoCategorizer (lines 21-24), which returns early when categories are empty. While defensive validation can be beneficial, consider whether both guards are necessary or if the provider should trust the caller's validation.

If you want to keep both validations (defensive programming), consider whether the provider needs the family parameter specifically for this error logging, or if the error message could be simplified.

app/models/family/auto_categorizer.rb (1)

19-24: Early return guard is well-placed.

The empty categories check correctly prevents unnecessary LLM provider calls and is the appropriate place for this business logic validation. The logging is helpful for debugging.

Note that this validation is duplicated in the provider (Provider::Openai lines 57-61), though with different behavior (early return here vs. exception there). Consider whether both guards are necessary.

Minor: The categories_input variable is just a direct assignment without transformation or sanitization (despite the AI summary suggesting otherwise). Consider whether this intermediate variable adds sufficient clarity, or if you could use user_categories_input directly in the guard and subsequent calls.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5eadfaa and 312abd0.

📒 Files selected for processing (2)

app/models/family/auto_categorizer.rb (2 hunks)
app/models/provider/openai.rb (1 hunks)

🧰 Additional context used

📓 Path-based instructions (7)

app/models/**/*.rb

📄 CodeRabbit inference engine (.cursor/rules/project-design.mdc)

Domain models should not call Provider::Registry directly; use a Provided concern within the model’s namespace to select providers and expose convenience methods

Use ActiveRecord validations for forms and complex domain constraints.

app/models/**/*.rb: Place business logic in POROs and model classes under app/models
Models should answer questions about themselves (e.g., prefer account.balance_series over service objects)
Implement complex validations and business logic with ActiveRecord validations
Model-level validations may mirror DB constraints but are not strictly required

Files:

app/models/family/auto_categorizer.rb
app/models/provider/openai.rb

app/**/*.rb

📄 CodeRabbit inference engine (AGENTS.md)

Place Rails application Ruby code (models, controllers, services, jobs, mailers, components) under app/

Files:

app/models/family/auto_categorizer.rb
app/models/provider/openai.rb

**/*.rb

📄 CodeRabbit inference engine (AGENTS.md)

Ruby style: 2-space indentation; snake_case for methods/variables; CamelCase for classes/modules

Files:

app/models/family/auto_categorizer.rb
app/models/provider/openai.rb

**/*.{rb,js,jsx,ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{rb,js,jsx,ts,tsx}: Make changes atomic, testable, and explain their impact briefly in code suggestions.
Respect existing tests and add tests when changing critical logic.

Files:

app/models/family/auto_categorizer.rb
app/models/provider/openai.rb

app/{helpers,models}/**/*.rb

📄 CodeRabbit inference engine (.cursor/rules/project-conventions.mdc)

Format currencies, numbers, and dates on the server side (Ruby) before sending to the client

Files:

app/models/family/auto_categorizer.rb
app/models/provider/openai.rb

app/{models,controllers,views}/**/*.{rb,erb}

📄 CodeRabbit inference engine (.cursor/rules/project-conventions.mdc)

Avoid N+1 queries

Files:

app/models/family/auto_categorizer.rb
app/models/provider/openai.rb

app/models/provider/**/*.rb

📄 CodeRabbit inference engine (.cursor/rules/project-design.mdc)

Concrete provider classes must be under the Provider:: namespace, inherit from Provider, wrap calls with with_provider_response, and raise on invalid/unavailable data

Files:

app/models/provider/openai.rb

🧬 Code graph analysis (2)

app/models/family/auto_categorizer.rb (2)

app/models/family/auto_merchant_detector.rb (2)

llm_provider (65-67)

transactions_input (82-92)

app/models/provider/openai.rb (1)

auto_categorize (54-84)

app/models/provider/openai.rb (1)

app/models/assistant/function.rb (1)

family (78-80)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: ci / test

🔇 Additional comments (2)

app/models/family/auto_categorizer.rb (2)

28-28: Consistent use of validated input.

Using categories_input here is consistent with the validation guard above. The parameter is guaranteed to be non-empty at this point.

40-40: Consistent variable usage.

Good consistency using categories_input here to match the variable passed to the provider on line 28.

jjmata · 2025-10-30T21:18:32Z

Can you take a look, @sokie?

sokie

Good to go

…from-upstream * 'main' of https://github.com/we-promise/sure: (60 commits) Add delay after disabling AI Assistant Avoid auto-categorizing without user categories (we-promise#261) Localization for an accountable group and sidebar (we-promise#257) Recurring transactions (we-promise#271) Lunch flow improvements (we-promise#268) Update ai.md (we-promise#263) Fix "invisible" merchants (we-promise#262) Lunchflow integration (we-promise#259) Build up to 0.6.5-alpha.6 Update AI model recommendations section Update AI assistant documentation with version caution Add support for dynamic config UI (we-promise#256) Feature/yahoo finance (we-promise#123) Update pages/dashboard locales (we-promise#255) Providers factory (we-promise#250) Add onboarding state selector for self-hosted signup (we-promise#251) Added a clickable drop down arrow that reveals list of instructions for Brand Fetch Client ID (we-promise#246) Added a clickable drop down arrow that reveals list of instructions for Twelve Data Secret Key (we-promise#247) Add runServices for db and redis in devcontainer Add PikaPods ... # Conflicts: # .devcontainer/docker-compose.yml # app/controllers/concerns/self_hostable.rb # app/models/provider/twelve_data.rb # app/views/pages/redis_configuration_error.html.erb # config/cable.yml # config/database.yml # config/initializers/sidekiq.rb

Avoid auto-categorizing without user categories

312abd0

jjmata added the codex Touched by OpenAI Codex somehow label Oct 30, 2025 — with ChatGPT Codex Connector

coderabbitai bot reviewed Oct 30, 2025

View reviewed changes

jjmata self-assigned this Oct 30, 2025

jjmata requested a review from sokie October 30, 2025 21:18

sokie approved these changes Nov 1, 2025

View reviewed changes

jjmata merged commit 0327552 into main Nov 1, 2025
6 checks passed

luckyPipewrench pushed a commit to luckyPipewrench/sure that referenced this pull request Nov 2, 2025

Avoid auto-categorizing without user categories (we-promise#261)

e4e7041

alessiocappa pushed a commit to alessiocappa/sure that referenced this pull request Nov 16, 2025

Avoid auto-categorizing without user categories (we-promise#261)

57681c5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Avoid auto-categorizing without user categories #261

Avoid auto-categorizing without user categories #261

Uh oh!

jjmata commented Oct 30, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 30, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

jjmata commented Oct 30, 2025

Uh oh!

sokie left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Avoid auto-categorizing without user categories #261

Avoid auto-categorizing without user categories #261

Uh oh!

Conversation

jjmata commented Oct 30, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

jjmata commented Oct 30, 2025

Uh oh!

sokie left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

jjmata commented Oct 30, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 30, 2025 •

edited

Loading