
Conversation


@jxnl jxnl commented Jul 11, 2025

Fixes an issue where response_model=None would fail for the Google GenAI provider with the error "Models.generate_content() got an unexpected keyword argument 'messages'".

The bug was in instructor/process_response.py: the response_model=None code path handled the Cohere and Anthropic modes but was missing GenAI mode handling. This fix adds the proper message-to-contents conversion for GenAI modes.

Fixes #1690

Generated with Claude Code


Important

Fixes handling of response_model=None for GenAI modes in instructor/process_response.py and adds tests to verify the fix.

  • Bug Fix:
    • Fixes handling of response_model=None for GenAI modes in instructor/process_response.py.
    • Converts messages to contents and extracts system message for GenAI modes.
  • Testing:
    • Adds test_fix_local.py to demonstrate the fix for response_model=None.
    • Adds test_response_model_none.py in tests/llm/test_genai/ to verify GenAI client behavior with response_model=None.

This description was created by Ellipsis for a08a408.

Fixes issue where response_model=None would fail for the Google GenAI provider
with the error "Models.generate_content() got an unexpected keyword argument 'messages'".

The bug was in instructor/process_response.py where the response_model=None
code path (lines 1094-1121) handled Cohere and Anthropic modes but was
missing GenAI mode handling. When response_model is provided, GenAI modes
properly convert OpenAI-style messages to GenAI-style contents, but this
conversion was missing for the response_model=None case.

The fix adds GenAI mode handling that:
1. Converts OpenAI-style messages to GenAI-style contents using convert_to_genai_messages()
2. Extracts multimodal content using extract_genai_multimodal_content()
3. Handles system messages by extracting them and creating proper config
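In isolation, steps 1 and 3 amount to something like the following sketch. The helper name here is hypothetical; the real convert_to_genai_messages() in instructor.utils also covers multimodal parts and other message shapes:

```python
def split_genai_messages(messages):
    """Hypothetical simplification of the conversion the fix performs:
    map OpenAI-style chat messages to GenAI-style contents, pulling the
    system message out so it can go into the config instead."""
    contents, system = [], None
    for msg in messages:
        if msg["role"] == "system":
            system = msg["content"]  # GenAI takes this via config, not contents
            continue
        # GenAI uses "model" where the OpenAI format uses "assistant"
        role = "model" if msg["role"] == "assistant" else "user"
        contents.append({"role": role, "parts": [{"text": msg["content"]}]})
    return contents, system

contents, system = split_genai_messages([
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hello"},
])
```

The key point is that GenAI's generate_content() never sees a messages kwarg at all: user and assistant turns become contents, and the system message travels separately.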

Fixes #1690

Co-authored-by: Jason Liu <[email protected]>

cloudflare-workers-and-pages bot commented Jul 11, 2025

Deploying with Cloudflare Workers

✅ Deployment successful! Project instructor at commit a95e3ae (Jul 11 2025, 07:55 PM UTC).


claude bot commented Jul 11, 2025

Claude finished @jxnl's task


PR Review Analysis - Complete ✅

I've completed a comprehensive review of the PR that fixes GenAI modes when response_model=None. Here's my analysis:

Code Quality and Best Practices

Strengths:

  • Clear Problem Statement: The PR directly addresses the reported issue where response_model=None failed for Google GenAI provider
  • Consistent with Existing Patterns: The fix follows the same pattern used for Anthropic and Cohere modes in the same function
  • Good Code Organization: The fix is placed in the logical location within the existing response_model=None handling block
  • Proper Import Management: Uses conditional imports (from instructor.utils import ...) and imports google.genai.types only when needed

Code Structure:

  • The implementation mirrors the existing Anthropic handling pattern (lines 1105-1112), maintaining consistency
  • Proper separation of concerns: message conversion, multimodal content extraction, and system message handling are done sequentially
  • Clean resource management: removes messages after converting to contents

Correctness and Bug Analysis

The Fix Addresses the Root Cause:

  • Original Issue: GenAI modes were missing from the response_model=None code path, causing messages to be passed to Models.generate_content() which expects contents
  • Solution: Converts OpenAI-style messages to GenAI-style contents and properly handles system messages

Implementation Details:

  • ✅ Handles both Mode.GENAI_TOOLS and Mode.GENAI_STRUCTURED_OUTPUTS
  • ✅ Converts messages using convert_to_genai_messages(messages)
  • ✅ Processes multimodal content with extract_genai_multimodal_content()
  • ✅ Extracts system messages and creates proper GenerateContentConfig
  • ✅ Removes messages parameter to avoid conflicts
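At the kwargs level, the behavior those checkmarks describe can be sketched as follows. This is a hedged, dict-based approximation with a hypothetical function name; the actual code builds a google.genai types.GenerateContentConfig rather than a plain dict:

```python
def prepare_genai_call_kwargs(kwargs):
    """Hypothetical sketch of the response_model=None branch: replace the
    OpenAI-style 'messages' kwarg with GenAI-style 'contents', and move
    any system message into a config entry if one is not already set."""
    messages = kwargs.pop("messages", [])  # removed to avoid the original error
    contents, system = [], None
    for msg in messages:
        if msg["role"] == "system":
            system = msg["content"]
            continue
        role = "model" if msg["role"] == "assistant" else "user"
        contents.append({"role": role, "parts": [{"text": msg["content"]}]})
    kwargs["contents"] = contents
    if system is not None and "config" not in kwargs:
        # stands in for types.GenerateContentConfig(system_instruction=system)
        kwargs["config"] = {"system_instruction": system}
    return kwargs

call_kwargs = prepare_genai_call_kwargs({
    "messages": [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Hello"},
    ]
})
```

Guarding the config creation on "config" not in kwargs mirrors the review's point that a caller-supplied config should not be silently overwritten.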

Test Coverage Assessment

Comprehensive Testing:

  • Unit Tests: test_response_model_none.py includes parameterized tests for both GenAI modes
  • Integration Test: Tests both simple messages and messages with system prompts
  • Demonstration Script: test_fix_local.py provides a practical example showing before/after behavior

Test Quality:

  • Tests verify the function doesn't raise errors (addresses the original bug)
  • Validates return type expectations (raw response, not Pydantic model)
  • Covers edge cases like system message handling

Performance Considerations

  • No Performance Regression: The fix only affects the response_model=None code path for GenAI modes
  • Efficient Processing: Reuses existing utility functions rather than duplicating logic
  • Minimal Overhead: Only imports GenAI types when actually needed

Security Considerations

  • No Security Issues: The changes only involve message format conversion and don't introduce any security vulnerabilities
  • Input Validation: Uses existing validated utility functions for message conversion
  • No Data Exposure: Maintains the same data flow patterns as existing code

🔍 Minor Observations

Potential Areas for Improvement:

  1. Line 1127: Consider handling potential exceptions from extract_genai_system_message() (though this follows existing patterns)
  2. Import Style: The conditional import on line 1129 could be moved to the top of the elif block for consistency
  3. Code Comments: The inline comment on line 1114 could be more specific about what conversion is happening

Edge Case Considerations:

  • ✅ Handles empty messages list
  • ✅ Handles missing system messages
  • ✅ Handles multimodal content properly
  • ⚠️ Potential Issue: If convert_to_genai_messages() fails, there's no fallback handling (but this mirrors existing code patterns)

Overall Assessment

Verdict: Excellent Fix ⭐⭐⭐⭐⭐

This is a well-implemented bug fix that:

  • Directly addresses the reported issue
  • Follows established code patterns
  • Includes comprehensive test coverage
  • Maintains backward compatibility
  • Introduces no regressions

Recommendations:

  1. Approve and Merge: This fix is ready for production
  2. Consider adding error handling around utility function calls in a future PR
  3. The fix properly handles the GenAI provider's unique message format requirements

Files Changed:

  • instructor/process_response.py:1113-1136 - Core fix implementation
  • test_fix_local.py - Demonstration script
  • tests/llm/test_genai/test_response_model_none.py - Unit tests


claude bot commented Jul 11, 2025

Claude finished @jxnl's task


PR Review Analysis

Reviewing PR for code quality, potential bugs, performance, security, and test coverage.

Todo List:

  • Read and analyze instructor/process_response.py changes
  • Review tests/llm/test_genai/test_response_model_none.py
  • Evaluate fix completeness and correctness
  • Check for potential edge cases or bugs
  • Assess test coverage adequacy
  • Provide comprehensive review feedback

Code Review Results

Fix Quality & Correctness

The fix correctly addresses the reported issue where response_model=None would fail for GenAI modes with the error "Models.generate_content() got an unexpected keyword argument 'messages'".

Key changes in instructor/process_response.py:1113-1136:

  • Adds proper GenAI mode handling in the response_model=None code path
  • Follows established patterns used by other providers (Cohere, Anthropic)
  • Correctly converts OpenAI-style messages to GenAI-style contents using existing utility functions
  • Properly handles system message extraction and configuration

Code Quality & Best Practices

  • Consistency: The implementation follows the same pattern as existing provider handlers
  • Reusability: Uses existing utility functions (convert_to_genai_messages, extract_genai_system_message, etc.)
  • Clean Code: Logical structure with clear comments explaining the transformation
  • Type Safety: Proper imports and type handling

Test Coverage

The test file tests/llm/test_genai/test_response_model_none.py provides good coverage:

  • Both Modes: Tests both Mode.GENAI_TOOLS and Mode.GENAI_STRUCTURED_OUTPUTS
  • System Messages: Includes test case with system message extraction
  • Response Validation: Verifies that raw response is returned (not a Pydantic model)
  • Real API Calls: Follows the codebase pattern of using real API calls instead of mocking

⚠️ Minor Observations

  1. Import Location: The from google.genai import types import is done inline (line 1129). While this works, it might be cleaner to move it to the top of the file or consolidate with other GenAI imports for better organization.

  2. Duplicate Logic: The GenAI message conversion logic is duplicated between the response_model=None path and the regular GenAI mode handlers. Consider extracting this into a shared utility function to reduce code duplication.

  3. Error Handling: The fix doesn't include explicit error handling for the GenAI conversion functions. While the existing utility functions should handle edge cases, consider adding try-catch blocks for robustness.

Performance & Security

  • Performance: No performance concerns - the fix uses existing optimized utility functions
  • Security: No security issues identified - the code doesn't introduce any new security vulnerabilities

🎯 Edge Cases Considered

  • Empty Messages: The existing utility functions handle empty message arrays
  • System Message Variations: Both inline system messages and separate system kwargs are handled
  • Multimodal Content: The fix properly processes multimodal content using extract_genai_multimodal_content

Recommendation: APPROVE

This is a solid bug fix that:

  • Correctly resolves the reported issue
  • Follows established codebase patterns
  • Includes appropriate test coverage
  • Maintains code quality standards
  • Introduces no breaking changes

The minor observations above are suggestions for future improvements but don't block this fix from being merged.


@github-actions github-actions bot added labels bug (Something isn't working), python (Pull requests that update python code), and size:M (This PR changes 30-99 lines, ignoring generated files) on Jul 11, 2025
@jxnl jxnl merged commit 187517b into main Jul 11, 2025
9 of 14 checks passed
@jxnl jxnl deleted the claude/issue-1690-20250711_163543 branch July 11, 2025 19:45

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed everything up to a08a408 in 1 minute and 59 seconds. Click for details.
  • Reviewed 166 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
1. instructor/process_response.py:1113
  • Draft comment:
    Nice fix for handling GenAI modes when response_model is None. The branch cleanly converts messages to GenAI contents, extracts multimodal content, and sets up the config if a system message is found. Consider adding a brief note on the rationale for only updating config if 'system' isn’t already provided.
  • Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 85%. The comment is purely informative and doesn't suggest any clear code changes. It's mostly just describing what the code does, which is already clear from the code comments. The suggestion to add a note about the rationale is a request for explanation rather than a concrete code improvement. According to the rules, we should not make purely informative comments or ask for explanations. The code does seem to have a specific logic around system messages — maybe there is a potential bug or improvement needed here that I'm missing? No, the system message handling is straightforward and well-commented. If there was a bug, the comment doesn't identify it. The comment is just descriptive and asking for more documentation. The comment should be deleted as it is purely informative and doesn't suggest any concrete code changes. The code is already well-commented and the logic is clear.
2. test_fix_local.py:24
  • Draft comment:
    Test script looks functional. For better robustness, consider adding assertions to verify that the response structure (e.g., presence of 'contents' and absence of 'messages') meets expectations in addition to printing the result.
  • Reason this comment was not posted:
Confidence changes required: 80% <= threshold 85%.
3. tests/llm/test_genai/test_response_model_none.py:13
  • Draft comment:
    Good use of parameterized tests to verify behavior for both GENAI_TOOLS and GENAI_STRUCTURED_OUTPUTS modes. The tests correctly assert that the raw response is returned and isn’t parsed as a Pydantic model.
  • Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 85%. This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply praises the use of parameterized tests without offering any constructive criticism or specific guidance.

Workflow ID: wflow_DhCrfStP580RI5Vq




Development

Successfully merging this pull request may close these issues.

Gemini does not work with response_model = None
