
Conversation

@jcrist jcrist commented Oct 29, 2025

This PR includes several improvements to Ridge. I started out trying to fix one issue, but the changes were all so coupled that it's a bit of a large PR. Hopefully it's still understandable though.

  • Moves the SVD solver to pure Python. This simplifies maintenance and improves flexibility. The new Python-based solver is as fast as or faster than the old C++ version, which makes sense since it's (mostly) just a thin layer over cuBLAS and cuSOLVER calls.
  • Adds support for multi-target regression to Ridge. Fixes [FEA] Support multi-target for cuML ridge regression #4412.
  • Adds support for X with more features than samples to Ridge with solver="svd". Fixes [BUG] Ridge(solver='svd') fails on wide matrices (n_features > n_samples) with CUSOLVER_STATUS_INVALID_VALUE instead of CPU fallback #7198.
  • Adds support for array-like alpha, improving our sklearn compatibility. Fixes a long-standing TODO in our docs (see the sketch after this list).
  • Adds a copy_X parameter, mirroring the sklearn equivalent. This lets the solver mutate X instead of making a copy, reducing memory usage.
  • Fixes a bug where the solver would accidentally mutate y and sample_weight in some configurations, and adds a test.
  • Moves the Ridge tests out into their own file; test_linear_models.py was getting a bit untenable.
  • Vastly improves testing coverage and hygiene.
  • Cleans up docstrings to match conventions.
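
A quick sketch of how the new capabilities fit together (based on the features described above; the shapes and values are illustrative only):

import numpy as np
import cuml

rng = np.random.default_rng(42)
X = rng.standard_normal((100, 500)).astype(np.float32)  # wide input: n_features > n_samples
y = rng.standard_normal((100, 3)).astype(np.float32)    # three regression targets

# An array-like alpha applies one penalty per target, and copy_X=False lets
# the solver reuse X's memory instead of making a defensive copy.
model = cuml.Ridge(alpha=[0.1, 1.0, 10.0], solver="svd", copy_X=False)
model.fit(X, y)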

It does include one breaking change, which I think we want to make based on other conversations. Previously, if a user explicitly passed solver="eig" but the "eig" solver couldn't handle the inputs, it would warn and fall back to "svd". We now error in this case, alerting the user that the solver they specified isn't supported. This better matches sklearn conventions, and is something I think we'll want to apply to the LinearRegression implementation as well (see #7355 (comment)). With the default of solver="auto" we'll still fall back to "svd" in those cases; we only fail if the user explicitly requested a solver that doesn't support the input types.
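
For example (a minimal sketch; single-feature X is used here as an input the "eig" solver historically couldn't handle, which is an assumption based on older cuML behavior, and the exact exception type isn't pinned down):

import numpy as np
import cuml

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 1)).astype(np.float32)  # single-feature input (assumed unsupported by "eig")
y = rng.standard_normal(1000).astype(np.float32)

# solver="auto" still quietly picks a solver that supports the input.
cuml.Ridge(solver="auto").fit(X, y)

# An explicitly requested solver that can't handle the input now raises
# instead of warning and falling back to "svd".
try:
    cuml.Ridge(solver="eig").fit(X, y)
except Exception as exc:  # exact exception type not confirmed here
    print(type(exc).__name__, exc)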

Fixes #4412. Fixes #7198. Follow-up to #7330.

@jcrist jcrist self-assigned this Oct 29, 2025
@jcrist jcrist requested a review from a team as a code owner October 29, 2025 21:44
@jcrist jcrist added the Cython / Python Cython or Python issue label Oct 29, 2025
@jcrist jcrist requested a review from dantegd October 29, 2025 21:44
@jcrist jcrist added improvement Improvement / enhancement to an existing function breaking Breaking change algo: linear-model labels Oct 29, 2025

jcrist commented Oct 29, 2025

A quick benchmark comparing the new cupy-based SVD solver vs the old C++ one:

bench.py
from time import perf_counter
from itertools import product

import cuml
from cuml.datasets import make_regression

N_RUNS = 5
N_FEATURES = [10, 100, 1000]
N_SAMPLES = [1000, 10000, 50000]

for fit_intercept in [True, False]:
    print(f"# {fit_intercept = }")
    for n_features, n_samples in product(N_FEATURES, N_SAMPLES):
        X, y = make_regression(
            n_features=n_features, n_samples=n_samples, random_state=42
        )

        start = perf_counter()
        for _ in range(N_RUNS):
            cuml.Ridge(fit_intercept=fit_intercept, solver="svd").fit(X, y)
        duration = (perf_counter() - start) / N_RUNS

        print(f"- {X.shape}: {duration * 1e3:.3f} ms")

On main

# fit_intercept = True
- (1000, 10): 6.573 ms
- (10000, 10): 3.945 ms
- (50000, 10): 7.908 ms
- (1000, 100): 7.959 ms
- (10000, 100): 12.996 ms
- (50000, 100): 24.516 ms
- (1000, 1000): 106.750 ms
- (10000, 1000): 172.791 ms
- (50000, 1000): 287.953 ms
# fit_intercept = False
- (1000, 10): 3.127 ms
- (10000, 10): 2.083 ms
- (50000, 10): 4.297 ms
- (1000, 100): 7.796 ms
- (10000, 100): 10.251 ms
- (50000, 100): 21.500 ms
- (1000, 1000): 122.850 ms
- (10000, 1000): 165.714 ms
- (50000, 1000): 264.219 ms

On this PR

# fit_intercept = True
- (1000, 10): 10.041 ms
- (10000, 10): 3.649 ms
- (50000, 10): 3.522 ms
- (1000, 100): 7.930 ms
- (10000, 100): 8.796 ms
- (50000, 100): 16.406 ms
- (1000, 1000): 105.039 ms
- (10000, 1000): 157.908 ms
- (50000, 1000): 221.359 ms
# fit_intercept = False
- (1000, 10): 1.854 ms
- (10000, 10): 1.035 ms
- (50000, 10): 2.516 ms
- (1000, 100): 6.641 ms
- (10000, 100): 8.255 ms
- (50000, 100): 14.055 ms
- (1000, 1000): 104.022 ms
- (10000, 1000): 153.018 ms
- (50000, 1000): 215.664 ms

@jcrist jcrist requested a review from csadorf October 29, 2025 21:51
@jcrist jcrist force-pushed the ridge-improvements branch 2 times, most recently from 87b97fc to 121033e Compare October 30, 2025 19:16
- "sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,estimator=Ridge(),param_grid={'alpha':[0.1,1.0]})-check_complex_data]"
- "sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,estimator=Ridge(),param_grid={'alpha':[0.1,1.0]})-check_dtype_object]"
- "sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,estimator=Ridge(),param_grid={'alpha':[0.1,1.0]})-check_estimators_nan_inf]"
- "sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,estimator=Ridge(),param_grid={'alpha':[0.1,1.0]})-check_fit1d]"
jcrist (Member Author) left a comment

Even though we're now more compatible with sklearn, we only see new xfails rather than any newly passing tests. The new failures are due to the following:

  • Several test_search_cv tests that previously fell back (because the input data is wider than it is long) are now accelerated. These tests just check for nice errors when nan/inf appear in the data; we'll fix them uniformly across all estimators when we revamp data ingestion. Nothing to do here now.
  • Some test_solver_consistency tests that check that the different solvers produce similar enough results. We're very close, but don't quite match within the tolerances used in these tests.
  • A test for n_iter. Previously we'd fall back for the fit combination used; we now run accelerated but lack the attribute.
  • A test for array-like alpha. We support all the cases used, but don't match their error message test, since their error message has a bug in it (the incorrect values are accidentally flipped).

# column 2D y. The sklearn 1.6+ behavior is what we implement in
# cuml.Ridge, here we adjust the shape of `coef_` after the fit to
# match the older behavior. This will also trickle down to change the
# output shape of `predict` to match the older behavior transparently.
jcrist (Member Author) left a comment

This is gross, but isolated to cuml.accel so I'm ok with it.

@csadorf csadorf left a comment

Great work!

csadorf commented Oct 31, 2025

I ran comprehensive benchmarks comparing the nightly build against this PR (commit 121033e79) across 47 different configurations.

Performance Results

This PR delivers significant performance improvements across the board with no regressions:

Configuration                          Nightly     This PR     Improvement
basic_svd_1000x10_intercept=True       1.58ms      1.14ms      -28%
basic_svd_10000x10_intercept=True      1.58ms      1.11ms      -30%
basic_svd_10000x100_intercept=True     9.11ms      8.05ms      -12%
basic_svd_10000x1000_intercept=True    157.40ms    145.71ms    -7%
basic_svd_50000x1000_intercept=True    258.30ms    209.23ms    -19%
basic_eig_10000x1000_intercept=True    26.73ms     26.88ms     +1% (negligible)

Performance Summary by Problem Size

  • Small matrices (1000×10): 28-30% faster
  • Medium matrices (10000×100): 12% faster
  • Large matrices (50000×1000): 19% faster
  • Eigendecomposition solver: Performance maintained (within measurement variance)
  • Overall: No regressions detected across any of the 47 configurations

New Capabilities Benchmarked

This PR also enables 8 previously unsupported configurations, all with competitive performance:

Feature                    Example Configuration                                        Performance
Multi-target regression    multitarget_n=10 (10 output columns)                         7.54ms fit
Array-like alpha           array_alpha_n=10 (10 targets, per-target regularization)     8.58ms fit
Wide matrices (SVD)        wide_svd_500x2000 (500 samples, 2000 features)               44.72ms fit

Detailed Benchmark Results

Nightly Build Results

Version: rapidsai-nightly/linux-64::cuml-25.12.00a90-cuda13_py313_251024_b62d68ec

====================================================================================================
Ridge Regression Benchmark Suite
====================================================================================================
cuML version: 25.12.00a90
sklearn version: 1.7.2
Version label: rapidsai-nightly/linux-64::cuml-25.12.00a90-cuda13_py313_251024_b62d68ec
Number of runs per benchmark: 5
Include sklearn: False
====================================================================================================

Running comprehensive benchmark (47 configurations)
[1/47] Running: basic_svd_1000x10_intercept=True... fit=1.58ms, predict=0.17ms
[2/47] Running: basic_svd_1000x10_intercept=False... fit=1.32ms, predict=0.17ms
[3/47] Running: basic_svd_10000x10_intercept=True... fit=1.58ms, predict=0.17ms
[4/47] Running: basic_svd_10000x10_intercept=False... fit=1.35ms, predict=0.17ms
[5/47] Running: basic_svd_50000x10_intercept=True... fit=3.67ms, predict=0.18ms
[6/47] Running: basic_svd_50000x10_intercept=False... fit=3.65ms, predict=0.19ms
[7/47] Running: basic_eig_1000x100_intercept=True... fit=2.43ms, predict=0.17ms
[8/47] Running: basic_eig_1000x100_intercept=False... fit=2.21ms, predict=5.37ms
[9/47] Running: basic_svd_1000x100_intercept=True... fit=6.71ms, predict=0.17ms
[10/47] Running: basic_svd_1000x100_intercept=False... fit=6.57ms, predict=0.16ms
[11/47] Running: basic_eig_10000x100_intercept=True... fit=3.28ms, predict=0.18ms
[12/47] Running: basic_eig_10000x100_intercept=False... fit=3.28ms, predict=0.18ms
[13/47] Running: basic_svd_10000x100_intercept=True... fit=9.11ms, predict=0.18ms
[14/47] Running: basic_svd_10000x100_intercept=False... fit=9.15ms, predict=0.19ms
[15/47] Running: basic_eig_50000x100_intercept=True... fit=10.22ms, predict=0.18ms
[16/47] Running: basic_eig_50000x100_intercept=False... fit=10.65ms, predict=0.18ms
[17/47] Running: basic_svd_50000x100_intercept=True... fit=20.19ms, predict=6.44ms
[18/47] Running: basic_svd_50000x100_intercept=False... fit=19.49ms, predict=0.18ms
[19/47] Running: basic_eig_1000x1000_intercept=True... fit=13.41ms, predict=0.30ms
[20/47] Running: basic_eig_1000x1000_intercept=False... fit=13.17ms, predict=0.17ms
[21/47] Running: basic_svd_1000x1000_intercept=True... fit=101.29ms, predict=0.18ms
[22/47] Running: basic_svd_1000x1000_intercept=False... fit=100.81ms, predict=0.18ms
[23/47] Running: basic_eig_10000x1000_intercept=True... fit=26.73ms, predict=0.20ms
[24/47] Running: basic_eig_10000x1000_intercept=False... fit=26.57ms, predict=0.20ms
[25/47] Running: basic_svd_10000x1000_intercept=True... fit=157.40ms, predict=0.21ms
[26/47] Running: basic_svd_10000x1000_intercept=False... fit=159.93ms, predict=0.21ms
[27/47] Running: basic_eig_50000x1000_intercept=True... fit=79.35ms, predict=0.67ms
[28/47] Running: basic_eig_50000x1000_intercept=False... fit=81.19ms, predict=0.47ms
[29/47] Running: basic_svd_50000x1000_intercept=True... fit=258.30ms, predict=0.47ms
[30/47] Running: basic_svd_50000x1000_intercept=False... fit=257.32ms, predict=0.47ms
[31/47] Running: dtype_float64_eig... fit=11.55ms, predict=0.17ms
[32/47] Running: dtype_float64_svd... fit=20.75ms, predict=0.17ms
[33/47] Running: weighted_eig... fit=3.36ms, predict=0.17ms
[34/47] Running: weighted_svd... fit=9.14ms, predict=0.18ms
[35/47] Running: multitarget_n=2... ERROR: Expected 1 columns but got 2 columns.
[36/47] Running: multitarget_n=5... ERROR: Expected 1 columns but got 5 columns.
[37/47] Running: multitarget_n=10... ERROR: Expected 1 columns but got 10 columns.
[38/47] Running: array_alpha_n=3... ERROR: Expected 1 columns but got 3 columns.
[39/47] Running: array_alpha_n=10... ERROR: Expected 1 columns but got 10 columns.
[40/47] Running: wide_eig_100x200... fit=11.18ms, predict=0.17ms
[41/47] Running: wide_eig_100x1000... fit=10.64ms, predict=0.16ms
[42/47] Running: wide_eig_500x2000... fit=39.74ms, predict=0.17ms
[43/47] Running: wide_svd_100x200... ERROR: cuSOLVER error encountered
[44/47] Running: wide_svd_100x1000... ERROR: cuSOLVER error encountered
[45/47] Running: wide_svd_500x2000... ERROR: cuSOLVER error encountered
[46/47] Running: single_feature_svd... fit=1.01ms, predict=0.18ms
[47/47] Running: normalize_eig... fit=3.32ms, predict=0.18ms

====================================================================================================
BENCHMARK SUMMARY
Version: rapidsai-nightly/linux-64::cuml-25.12.00a90-cuda13_py313_251024_b62d68ec
====================================================================================================

cuML Results (39 benchmarks):
----------------------------------------------------------------------------------------------------
Name                                               Fit (ms)        Predict (ms)    Memory (MB)    
----------------------------------------------------------------------------------------------------
single_feature_svd                                     1.01 ±  0.01      0.18 ±  0.02          N/A
basic_svd_1000x10_intercept=False                      1.32 ±  0.02      0.17 ±  0.02          N/A
basic_svd_10000x10_intercept=False                     1.35 ±  0.04      0.17 ±  0.02          N/A
basic_svd_10000x10_intercept=True                      1.58 ±  0.22      0.17 ±  0.02          N/A
basic_svd_1000x10_intercept=True                       1.58 ±  0.40      0.17 ±  0.02          N/A
basic_eig_1000x100_intercept=False                     2.21 ±  0.04      5.37 ± 10.40          N/A
basic_eig_1000x100_intercept=True                      2.43 ±  0.19      0.17 ±  0.03          N/A
basic_eig_10000x100_intercept=False                    3.28 ±  0.03      0.18 ±  0.02          N/A
basic_eig_10000x100_intercept=True                     3.28 ±  0.03      0.18 ±  0.03          N/A
normalize_eig                                          3.32 ±  0.02      0.18 ±  0.02          N/A
weighted_eig                                           3.36 ±  0.01      0.17 ±  0.02          N/A
basic_svd_50000x10_intercept=False                     3.65 ±  0.02      0.19 ±  0.03          N/A
basic_svd_50000x10_intercept=True                      3.67 ±  0.03      0.18 ±  0.03          N/A
basic_svd_1000x100_intercept=False                     6.57 ±  0.02      0.16 ±  0.03          N/A
basic_svd_1000x100_intercept=True                      6.71 ±  0.05      0.17 ±  0.03          N/A
basic_svd_10000x100_intercept=True                     9.11 ±  0.05      0.18 ±  0.03          N/A
weighted_svd                                           9.14 ±  0.03      0.18 ±  0.02          N/A
basic_svd_10000x100_intercept=False                    9.15 ±  0.04      0.19 ±  0.03          N/A
basic_eig_50000x100_intercept=True                    10.22 ±  0.08      0.18 ±  0.04          N/A
wide_eig_100x1000                                     10.64 ±  0.02      0.16 ±  0.02          N/A
basic_eig_50000x100_intercept=False                   10.65 ±  0.07      0.18 ±  0.04          N/A
wide_eig_100x200                                      11.18 ± 17.55      0.17 ±  0.02          N/A
dtype_float64_eig                                     11.55 ±  1.11      0.17 ±  0.03          N/A
basic_eig_1000x1000_intercept=False                   13.17 ±  0.08      0.17 ±  0.03          N/A
basic_eig_1000x1000_intercept=True                    13.41 ±  0.38      0.30 ±  0.26          N/A
basic_svd_50000x100_intercept=False                   19.49 ±  0.04      0.18 ±  0.04          N/A
basic_svd_50000x100_intercept=True                    20.19 ±  0.09      6.44 ± 12.49          N/A
dtype_float64_svd                                     20.75 ±  0.41      0.17 ±  0.03          N/A
basic_eig_10000x1000_intercept=False                  26.57 ±  0.19      0.20 ±  0.04          N/A
basic_eig_10000x1000_intercept=True                   26.73 ±  0.31      0.20 ±  0.04          N/A
wide_eig_500x2000                                     39.74 ±  0.07      0.17 ±  0.03          N/A
basic_eig_50000x1000_intercept=True                   79.35 ±  1.94      0.67 ±  0.19          N/A
basic_eig_50000x1000_intercept=False                  81.19 ±  1.26      0.47 ±  0.05          N/A
basic_svd_1000x1000_intercept=False                  100.81 ±  1.12      0.18 ±  0.05          N/A
basic_svd_1000x1000_intercept=True                   101.29 ±  1.34      0.18 ±  0.05          N/A
basic_svd_10000x1000_intercept=True                  157.40 ±  1.42      0.21 ±  0.05          N/A
basic_svd_10000x1000_intercept=False                 159.93 ±  1.26      0.21 ±  0.05          N/A
basic_svd_50000x1000_intercept=False                 257.32 ±  1.60      0.47 ±  0.04          N/A
basic_svd_50000x1000_intercept=True                  258.30 ±  2.70      0.47 ±  0.05          N/A

Errors (8):
----------------------------------------------------------------------------------------------------
multitarget_n=2: Expected 1 columns but got 2 columns.
multitarget_n=5: Expected 1 columns but got 5 columns.
multitarget_n=10: Expected 1 columns but got 10 columns.
array_alpha_n=3: Expected 1 columns but got 3 columns.
array_alpha_n=10: Expected 1 columns but got 10 columns.
wide_svd_100x200: cuSOLVER error (CUSOLVER_STATUS_INVALID_VALUE)
wide_svd_100x1000: cuSOLVER error (CUSOLVER_STATUS_INVALID_VALUE)
wide_svd_500x2000: cuSOLVER error (CUSOLVER_STATUS_INVALID_VALUE)

Results saved to: ridge_benchmark_results.json

Benchmark complete!
PR #7410 Results

Version: 121033e79 (this PR)

====================================================================================================
Ridge Regression Benchmark Suite
====================================================================================================
cuML version: 25.12.00
sklearn version: 1.7.2
Version label: 121033e79
Number of runs per benchmark: 5
Include sklearn: False
====================================================================================================

Running comprehensive benchmark (47 configurations)
[1/47] Running: basic_svd_1000x10_intercept=True... fit=1.14ms, predict=0.16ms
[2/47] Running: basic_svd_1000x10_intercept=False... fit=0.87ms, predict=0.16ms
[3/47] Running: basic_svd_10000x10_intercept=True... fit=1.11ms, predict=0.16ms
[4/47] Running: basic_svd_10000x10_intercept=False... fit=0.86ms, predict=0.15ms
[5/47] Running: basic_svd_50000x10_intercept=True... fit=2.92ms, predict=0.17ms
[6/47] Running: basic_svd_50000x10_intercept=False... fit=2.43ms, predict=0.17ms
[7/47] Running: basic_eig_1000x100_intercept=True... fit=2.81ms, predict=0.16ms
[8/47] Running: basic_eig_1000x100_intercept=False... fit=1.87ms, predict=0.16ms
[9/47] Running: basic_svd_1000x100_intercept=True... fit=6.37ms, predict=0.16ms
[10/47] Running: basic_svd_1000x100_intercept=False... fit=6.12ms, predict=0.16ms
[11/47] Running: basic_eig_10000x100_intercept=True... fit=3.61ms, predict=0.17ms
[12/47] Running: basic_eig_10000x100_intercept=False... fit=2.41ms, predict=0.17ms
[13/47] Running: basic_svd_10000x100_intercept=True... fit=8.05ms, predict=0.17ms
[14/47] Running: basic_svd_10000x100_intercept=False... fit=7.85ms, predict=0.17ms
[15/47] Running: basic_eig_50000x100_intercept=True... fit=10.75ms, predict=3.09ms
[16/47] Running: basic_eig_50000x100_intercept=False... fit=4.02ms, predict=0.17ms
[17/47] Running: basic_svd_50000x100_intercept=True... fit=14.25ms, predict=0.17ms
[18/47] Running: basic_svd_50000x100_intercept=False... fit=13.41ms, predict=0.16ms
[19/47] Running: basic_eig_1000x1000_intercept=True... fit=13.42ms, predict=0.16ms
[20/47] Running: basic_eig_1000x1000_intercept=False... fit=12.27ms, predict=0.16ms
[21/47] Running: basic_svd_1000x1000_intercept=True... fit=99.93ms, predict=0.17ms
[22/47] Running: basic_svd_1000x1000_intercept=False... fit=99.31ms, predict=0.18ms
[23/47] Running: basic_eig_10000x1000_intercept=True... fit=26.88ms, predict=0.19ms
[24/47] Running: basic_eig_10000x1000_intercept=False... fit=19.74ms, predict=0.19ms
[25/47] Running: basic_svd_10000x1000_intercept=True... fit=145.71ms, predict=0.19ms
[26/47] Running: basic_svd_10000x1000_intercept=False... fit=144.36ms, predict=0.19ms
[27/47] Running: basic_eig_50000x1000_intercept=True... fit=80.75ms, predict=0.46ms
[28/47] Running: basic_eig_50000x1000_intercept=False... fit=32.08ms, predict=0.44ms
[29/47] Running: basic_svd_50000x1000_intercept=True... fit=209.23ms, predict=0.44ms
[30/47] Running: basic_svd_50000x1000_intercept=False... fit=205.05ms, predict=0.46ms
[31/47] Running: dtype_float64_eig... fit=11.45ms, predict=0.17ms
[32/47] Running: dtype_float64_svd... fit=18.82ms, predict=6.99ms
[33/47] Running: weighted_eig... fit=3.76ms, predict=0.17ms
[34/47] Running: weighted_svd... fit=8.65ms, predict=0.17ms
[35/47] Running: multitarget_n=2... fit=7.88ms, predict=0.17ms
[36/47] Running: multitarget_n=5... fit=7.45ms, predict=0.17ms
[37/47] Running: multitarget_n=10... fit=7.54ms, predict=0.16ms
[38/47] Running: array_alpha_n=3... fit=7.83ms, predict=0.16ms
[39/47] Running: array_alpha_n=10... fit=8.58ms, predict=0.19ms
[40/47] Running: wide_eig_100x200... fit=2.46ms, predict=0.16ms
[41/47] Running: wide_eig_100x1000... fit=10.93ms, predict=0.16ms
[42/47] Running: wide_eig_500x2000... fit=39.98ms, predict=0.17ms
[43/47] Running: wide_svd_100x200... fit=5.74ms, predict=0.16ms
[44/47] Running: wide_svd_100x1000... fit=6.35ms, predict=0.16ms
[45/47] Running: wide_svd_500x2000... fit=44.72ms, predict=0.16ms
[46/47] Running: single_feature_svd... fit=0.87ms, predict=0.15ms
[47/47] Running: normalize_eig... fit=3.59ms, predict=2.52ms

====================================================================================================
BENCHMARK SUMMARY
Version: 121033e79
====================================================================================================

cuML Results (47 benchmarks):
----------------------------------------------------------------------------------------------------
Name                                               Fit (ms)        Predict (ms)    Memory (MB)    
----------------------------------------------------------------------------------------------------
basic_svd_10000x10_intercept=False                     0.86 ±  0.01      0.15 ±  0.01          N/A
basic_svd_1000x10_intercept=False                      0.87 ±  0.01      0.16 ±  0.01          N/A
single_feature_svd                                     0.87 ±  0.01      0.15 ±  0.01          N/A
basic_svd_10000x10_intercept=True                      1.11 ±  0.05      0.16 ±  0.01          N/A
basic_svd_1000x10_intercept=True                       1.14 ±  0.11      0.16 ±  0.02          N/A
basic_eig_1000x100_intercept=False                     1.87 ±  0.05      0.16 ±  0.02          N/A
basic_eig_10000x100_intercept=False                    2.41 ±  0.03      0.17 ±  0.03          N/A
basic_svd_50000x10_intercept=False                     2.43 ±  0.02      0.17 ±  0.02          N/A
wide_eig_100x200                                       2.46 ±  0.01      0.16 ±  0.01          N/A
basic_eig_1000x100_intercept=True                      2.81 ±  0.47      0.16 ±  0.03          N/A
basic_svd_50000x10_intercept=True                      2.92 ±  0.01      0.17 ±  0.02          N/A
normalize_eig                                          3.59 ±  0.02      2.52 ±  4.68          N/A
basic_eig_10000x100_intercept=True                     3.61 ±  0.04      0.17 ±  0.03          N/A
weighted_eig                                           3.76 ±  0.01      0.17 ±  0.02          N/A
basic_eig_50000x100_intercept=False                    4.02 ±  0.09      0.17 ±  0.02          N/A
wide_svd_100x200                                       5.74 ±  0.01      0.16 ±  0.02          N/A
basic_svd_1000x100_intercept=False                     6.12 ±  0.00      0.16 ±  0.01          N/A
wide_svd_100x1000                                      6.35 ±  0.01      0.16 ±  0.02          N/A
basic_svd_1000x100_intercept=True                      6.37 ±  0.04      0.16 ±  0.02          N/A
multitarget_n=5                                        7.45 ±  0.01      0.17 ±  0.02          N/A
multitarget_n=10                                       7.54 ±  0.09      0.16 ±  0.02          N/A
array_alpha_n=3                                        7.83 ±  0.01      0.16 ±  0.02          N/A
basic_svd_10000x100_intercept=False                    7.85 ±  0.02      0.17 ±  0.02          N/A
multitarget_n=2                                        7.88 ±  0.01      0.17 ±  0.02          N/A
basic_svd_10000x100_intercept=True                     8.05 ±  0.01      0.17 ±  0.02          N/A
array_alpha_n=10                                       8.58 ±  1.82      0.19 ±  0.02          N/A
weighted_svd                                           8.65 ±  0.00      0.17 ±  0.02          N/A
basic_eig_50000x100_intercept=True                    10.75 ±  0.08      3.09 ±  5.79          N/A
wide_eig_100x1000                                     10.93 ±  0.10      0.16 ±  0.03          N/A
dtype_float64_eig                                     11.45 ±  1.11      0.17 ±  0.04          N/A
basic_eig_1000x1000_intercept=False                   12.27 ±  0.05      0.16 ±  0.02          N/A
basic_svd_50000x100_intercept=False                   13.41 ±  0.13      0.16 ±  0.02          N/A
basic_eig_1000x1000_intercept=True                    13.42 ±  0.09      0.16 ±  0.03          N/A
basic_svd_50000x100_intercept=True                    14.25 ±  0.05      0.17 ±  0.02          N/A
dtype_float64_svd                                     18.82 ±  0.03      6.99 ± 13.64          N/A
basic_eig_10000x1000_intercept=False                  19.74 ±  8.54      0.19 ±  0.02          N/A
basic_eig_10000x1000_intercept=True                   26.88 ±  0.15      0.19 ±  0.03          N/A
basic_eig_50000x1000_intercept=False                  32.08 ±  0.22      0.44 ±  0.02          N/A
wide_eig_500x2000                                     39.98 ±  0.10      0.17 ±  0.04          N/A
wide_svd_500x2000                                     44.72 ±  0.37      0.16 ±  0.02          N/A
basic_eig_50000x1000_intercept=True                   80.75 ±  0.51      0.46 ±  0.04          N/A
basic_svd_1000x1000_intercept=False                   99.31 ±  1.30      0.18 ±  0.05          N/A
basic_svd_1000x1000_intercept=True                    99.93 ±  1.74      0.17 ±  0.02          N/A
basic_svd_10000x1000_intercept=False                 144.36 ±  1.29      0.19 ±  0.02          N/A
basic_svd_10000x1000_intercept=True                  145.71 ±  2.63      0.19 ±  0.02          N/A
basic_svd_50000x1000_intercept=False                 205.05 ±  1.61      0.46 ±  0.05          N/A
basic_svd_50000x1000_intercept=True                  209.23 ±  1.67      0.44 ±  0.02          N/A

Results saved to: ridge_benchmark_results.json

Benchmark complete!

Summary

Performance Improvements:

  • 7-30% speedup on SVD solver across multiple problem sizes
  • Maintained performance on eigendecomposition solver
  • Zero regressions across 47 benchmark configurations

Benchmark Environment

  • GPU: NVIDIA RTX A6000 49GB
  • CUDA Version: 13.0
  • Driver Version: 580.95.05
  • cuML versions:
    • Nightly: 25.12.00a90
    • This PR: 25.12.00 (commit 121033e79)
  • sklearn version: 1.7.2
  • Runs per benchmark: 5 iterations each
  • Total configurations tested: 47

@csadorf csadorf force-pushed the ridge-improvements branch from 61c2a98 to 121033e Compare October 31, 2025 20:18
- Move SVD solver to pure python, simplifying maintenance and improving
  flexibility.
- Add support for multi-target inputs
- Add support for array-like `alpha`
- Add support for `n_features > n_samples` in SVD solver
- Add `copy_X` parameter, allowing the solver to mutate X instead of
  making a copy.
- Fix bug where solver would accidentally mutate y and sample_weight in
  some cases, and add a test.
- Improve testing coverage
- Cleanup docstring a bit
@jcrist jcrist force-pushed the ridge-improvements branch from 121033e to 4d0e61d Compare October 31, 2025 20:43

@csadorf csadorf left a comment

🚢


jcrist commented Oct 31, 2025

/merge

@rapids-bot rapids-bot bot merged commit 1e4db5d into rapidsai:main Oct 31, 2025
101 checks passed
@jcrist jcrist deleted the ridge-improvements branch October 31, 2025 22:00
vardhan30016 pushed a commit to vardhan30016/cuml that referenced this pull request Nov 7, 2025
Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Simon Adorf (https://github.com/csadorf)

URL: rapidsai#7410