@betatim
Member

@betatim betatim commented Aug 27, 2025

The HTML repr only works properly if the fitted attributes are synced to the CPU model, because we delegate building the HTML repr to the CPU model. The advantage of delegating is that users get exactly what they expect. The downside is that we need to sync the fitted attributes so that the fitted/not-fitted detection works correctly. In theory we could dig into how the detection is done and sync only the one attribute needed to pass the test. I think that is a premature optimisation, and it relies on the internals of scikit-learn, which makes it fragile.
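A minimal sketch of the delegation described above, in plain Python so that it runs without cuML or scikit-learn. Every name here (`looks_fitted`, `CpuModel`, `Proxy`, `_sync_attrs_to_cpu`, `html_repr`) is hypothetical and not the cuML or scikit-learn API, and the fitted check is only a rough stand-in that assumes "fitted" means "has a public attribute ending in an underscore".

```python
# Hypothetical stand-ins only: none of these names are cuML or scikit-learn APIs.

def looks_fitted(estimator):
    """Rough stand-in for a fitted check: any public attribute ending in '_'."""
    return any(
        name.endswith("_") and not name.startswith("_")
        for name in vars(estimator)
    )


class CpuModel:
    """Plays the role of the CPU estimator that builds the HTML repr."""

    def __init__(self, n_estimators=100):
        self.n_estimators = n_estimators

    def html_repr(self):
        status = "Fitted" if looks_fitted(self) else "Not fitted"
        return f"<div>{type(self).__name__} ({status})</div>"


class Proxy:
    """Plays the role of the proxy whose fitted state lives elsewhere."""

    def __init__(self, n_estimators=100):
        self._cpu = CpuModel(n_estimators)
        self._fitted_attrs = {}  # fitted state that has not reached the CPU model

    def fit(self, X, y):
        self._fitted_attrs = {
            "n_features_in_": len(X[0]),
            "classes_": sorted(set(y)),
        }
        return self

    def _sync_attrs_to_cpu(self):
        # Copy every fitted attribute so the CPU model sees the fitted state.
        for name, value in self._fitted_attrs.items():
            setattr(self._cpu, name, value)

    def html_repr(self, sync=True):
        # The repr is delegated to the CPU model; sync first so the
        # fitted/not-fitted detection sees the fitted attributes.
        if sync:
            self._sync_attrs_to_cpu()
        return self._cpu.html_repr()


proxy = Proxy().fit([[1.0, 2.0], [3.0, 4.0]], [0, 1])
unsynced = proxy.html_repr(sync=False)
synced = proxy.html_repr()
```

In this sketch, skipping the sync leaves the CPU model looking unfitted even though `fit` was called on the proxy, which mirrors the symptom reported in the linked issue.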

closes #7145

@betatim requested a review from a team as a code owner Aug 27, 2025 10:03
@betatim requested review from divyegala and jcrist Aug 27, 2025 10:03
@github-actions bot added the "Cython / Python" label Aug 27, 2025
@betatim added the "improvement" and "non-breaking" labels Aug 27, 2025
Member

@jcrist jcrist left a comment

Seems fine to me. If we wanted a cheap optimization, syncing only something common and cheap like n_features_in_ would be sufficient in this case (just enough to make sklearn detect that the CPU model is fitted).

Alternatively, we could attempt to make _repr_html_ run on the proxy itself, but sklearn calls so many private methods defined in Base to make that work that it's not clear whether reorganizing to run on the proxy itself would work without causing other issues.

Anyone saving a model for later use (or accessing a fitted attribute) will require a CPU sync anyway, so I'm not too worried about the perf.
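For comparison, the cheap optimization mentioned in this comment could look roughly like the sketch below. It is plain Python with hypothetical names (`looks_fitted`, `sync_attrs`, `gpu_state`), not cuML code, and it assumes the fitted check only needs one public attribute ending in an underscore to be present.

```python
from types import SimpleNamespace

# Hypothetical stand-ins only: none of these names are cuML or scikit-learn APIs.

def looks_fitted(estimator):
    """Rough stand-in for a fitted check: any public attribute ending in '_'."""
    return any(
        name.endswith("_") and not name.startswith("_")
        for name in vars(estimator)
    )


def sync_attrs(fitted_attrs, cpu_model, only=None):
    """Copy fitted attributes onto cpu_model; `only` restricts which names are copied."""
    names = list(fitted_attrs) if only is None else [n for n in only if n in fitted_attrs]
    for name in names:
        setattr(cpu_model, name, fitted_attrs[name])
    return cpu_model


# Made-up fitted state standing in for what lives on the GPU side.
gpu_state = {
    "n_features_in_": 4,
    "classes_": [0, 1],
    "estimators_": ["tree"] * 100,
}

# Partial sync: just enough for the fitted check to pass.
cpu_model = sync_attrs(gpu_state, SimpleNamespace(n_estimators=100), only=("n_features_in_",))
```

The partial sync is enough for the stand-in check, but the CPU model is then missing `classes_` and `estimators_`, so the approach depends on exactly how the detection is implemented.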

@betatim
Member Author

betatim commented Aug 28, 2025

/merge

@rapids-bot rapids-bot bot merged commit a577263 into rapidsai:branch-25.10 Aug 28, 2025
137 of 139 checks passed
@betatim betatim deleted the fix-html-repr branch August 28, 2025 08:34
Development

Successfully merging this pull request may close these issues.

Version 25.08 classify RandomForestClassifier "Not fitted", which 25.06 classify as "fitted"

2 participants