Fallback to CPU for sparse inputs for KMeans #6448
- Introduced a new method in KMeans to dispatch to the CPU implementation when sparse arrays are detected during fitting.
- Updated the is_sparse function to use cupyx's and scipy's issparse functions for better compatibility.
- Introduced a new test to verify that KMeans correctly dispatches to CPU when fitting with sparse input.
- Ensured that the model's attributes and predictions are validated as numpy arrays when using sparse data.
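For readers unfamiliar with the sparse check these commits refer to, here is a minimal sketch of an `is_sparse`-style helper. It is not the code from this PR: the helper body, the optional-import pattern, and the `cupy_sparse` alias are assumptions for illustration. Only `scipy.sparse.issparse` and `cupyx.scipy.sparse.issparse` are real library functions.

```python
import numpy as np
import scipy.sparse

try:
    # Assumption: CuPy is treated as optional in this sketch.
    import cupyx.scipy.sparse as cupy_sparse
except ImportError:
    cupy_sparse = None


def is_sparse(X):
    """Return True if X is a SciPy or CuPy sparse container (illustrative helper)."""
    # issparse covers both the sparse matrix and sparse array classes in SciPy.
    if scipy.sparse.issparse(X):
        return True
    # Fall through to the CuPy check only when CuPy could be imported.
    return cupy_sparse is not None and cupy_sparse.issparse(X)


dense = np.eye(3)
print(is_sparse(scipy.sparse.csr_matrix(dense)))
print(is_sparse(scipy.sparse.csr_array(dense)))
print(is_sparse(dense))
```

Recent SciPy releases restrict `isspmatrix` to the sparse matrix classes, so a check built on it can miss sparse array inputs such as `csr_array`, which is presumably why the sparse_array test case is affected.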
This is not taking advantage of some of the existing infrastructure for this: cuml/python/cuml/cuml/internals/base.pyx Lines 783 to 797 in 6774818
viclafargue
left a comment
LGTM, just minor recommendations
python/cuml/cuml/tests/accel/estimators_hyperparams/test_accel_kmeans.py
Co-authored-by: Victor Lafargue <[email protected]>
/merge
jcrist
left a comment
This PR adds support for handling sparse input arrays in the KMeans algorithm by dispatching to CPU implementation when sparse arrays are detected during fitting. It also updates the sparse array detection utilities to be more robust and consistent across the codebase.
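The dispatch idea can be sketched with a toy estimator. Everything below is hypothetical: `ToyKMeans`, its `backend_` and `center_` attributes, and the single-argument `_should_dispatch_cpu(X)` signature are stand-ins, not cuML's actual classes or method signature, and the "fit" is only a column mean so the example stays self-contained.

```python
import numpy as np
import scipy.sparse


class ToyKMeans:
    """Hypothetical stand-in used to illustrate CPU fallback; not cuML's KMeans."""

    def _should_dispatch_cpu(self, X):
        # Assumed rule for this sketch: sparse input is unsupported on the GPU path.
        return scipy.sparse.issparse(X)

    def fit(self, X):
        # Record which path was chosen so the decision can be inspected afterwards.
        self.backend_ = "cpu" if self._should_dispatch_cpu(X) else "gpu"
        # Placeholder "model": the column means, always returned as a NumPy array.
        self.center_ = np.asarray(X.mean(axis=0)).ravel()
        return self


X_dense = np.array([[1.0, 0.0], [3.0, 4.0]])
X_sparse = scipy.sparse.csr_matrix(X_dense)

print(ToyKMeans().fit(X_sparse).backend_)
print(ToyKMeans().fit(X_dense).backend_)
```

Keeping the decision in one small predicate means the fit path itself does not need to know why an input is routed to the CPU, only that it is.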
Fixes scikit-learn test `test_kmeans_results[float64-lloyd-sparse_array]` in combination with #6442.
Changes
- Added `_should_dispatch_cpu` method to KMeans to handle sparse input arrays
- Updated `is_sparse` utility function to use `issparse` instead of `isspmatrix` for better compatibility
- Updated `input_utils.py` to use the new `issparse` method
Testing