
Conversation

@jcrist jcrist commented Oct 29, 2025

This PR includes several improvements to Ridge. I started out trying to fix one issue, but the changes were all so coupled that it's a bit of a large PR. Hopefully it's still understandable though.

  • Moves the SVD solver to pure Python. This simplifies maintenance and improves flexibility. The new Python-based solver is as fast as or faster than the old C++ version, which makes sense since it's (mostly) just a thin layer over cuBLAS and cuSOLVER calls.
  • Adds support for multi-target regression to Ridge. Fixes [FEA] Support multi-target for cuML ridge regression #4412.
  • Adds support for X with more features than samples to Ridge with solver="svd". Fixes [BUG] Ridge(solver='svd') fails on wide matrices (n_features > n_samples) with CUSOLVER_STATUS_INVALID_VALUE instead of CPU fallback #7198.
  • Adds support for array-like alpha, improving our sklearn compatibility. Fixes a long-standing TODO in our docs (see the sketch after this list).
  • Adds a copy_X parameter, mirroring the sklearn equivalent. This lets the solver mutate X instead of making a copy, reducing memory usage.
  • Fixes a bug where the solver would accidentally mutate y and sample_weight in some configurations, and adds a test.
  • Moves the Ridge tests out into their own file; test_linear_models.py was getting a bit untenable.
  • Vastly improves testing coverage and hygiene.
  • Cleans up docstrings to match conventions.
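
A quick sketch of how the new capabilities fit together (based on the features described above; the shapes and values are illustrative only):

import numpy as np
import cuml

rng = np.random.default_rng(42)
X = rng.standard_normal((100, 500)).astype(np.float32)  # wide input: n_features > n_samples
y = rng.standard_normal((100, 3)).astype(np.float32)    # three regression targets

# An array-like alpha applies one penalty per target, and copy_X=False lets
# the solver reuse X's memory instead of making a defensive copy.
model = cuml.Ridge(alpha=[0.1, 1.0, 10.0], solver="svd", copy_X=False)
model.fit(X, y)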

It does include one breaking change, which I think we want to make based on other conversations. Previously, if a user explicitly passed solver="eig" but the "eig" solver couldn't handle the inputs, it would warn and fall back to "svd". We now error in this case, alerting the user that the solver they specified isn't supported. This better matches sklearn conventions, and is something I think we'll want to apply to the LinearRegression implementation as well (see #7355 (comment)). With the default of solver="auto" we'll still fall back to "svd" in those cases; we only fail if the user explicitly requested a solver that doesn't support the input types.
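
For example (a minimal sketch; single-feature X is used here as an input the "eig" solver historically couldn't handle, which is an assumption based on older cuML behavior, and the exact exception type isn't pinned down):

import numpy as np
import cuml

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 1)).astype(np.float32)  # single-feature input (assumed unsupported by "eig")
y = rng.standard_normal(1000).astype(np.float32)

# solver="auto" still quietly picks a solver that supports the input.
cuml.Ridge(solver="auto").fit(X, y)

# An explicitly requested solver that can't handle the input now raises
# instead of warning and falling back to "svd".
try:
    cuml.Ridge(solver="eig").fit(X, y)
except Exception as exc:  # exact exception type not confirmed here
    print(type(exc).__name__, exc)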

Fixes #4412. Fixes #7198. Follow-up to #7330.

@jcrist jcrist self-assigned this Oct 29, 2025
@jcrist jcrist requested a review from a team as a code owner October 29, 2025 21:44
@jcrist jcrist added the Cython / Python Cython or Python issue label Oct 29, 2025
@jcrist jcrist requested a review from dantegd October 29, 2025 21:44
@jcrist jcrist added improvement Improvement / enhancement to an existing function breaking Breaking change algo: linear-model labels Oct 29, 2025

jcrist commented Oct 29, 2025

A quick benchmark comparing the new cupy-based SVD solver vs the old C++ one:

bench.py
from time import perf_counter
from itertools import product

import cuml
from cuml.datasets import make_regression

N_RUNS = 5
N_FEATURES = [10, 100, 1000]
N_SAMPLES = [1000, 10000, 50000]

for fit_intercept in [True, False]:
    print(f"# {fit_intercept = }")
    for n_features, n_samples in product(N_FEATURES, N_SAMPLES):
        X, y = make_regression(
            n_features=n_features, n_samples=n_samples, random_state=42
        )

        start = perf_counter()
        for _ in range(N_RUNS):
            cuml.Ridge(fit_intercept=fit_intercept, solver="svd").fit(X, y)
        duration = (perf_counter() - start) / N_RUNS

        print(f"- {X.shape}: {duration * 1e3:.3f} ms")

On main

# fit_intercept = True
- (1000, 10): 6.573 ms
- (10000, 10): 3.945 ms
- (50000, 10): 7.908 ms
- (1000, 100): 7.959 ms
- (10000, 100): 12.996 ms
- (50000, 100): 24.516 ms
- (1000, 1000): 106.750 ms
- (10000, 1000): 172.791 ms
- (50000, 1000): 287.953 ms
# fit_intercept = False
- (1000, 10): 3.127 ms
- (10000, 10): 2.083 ms
- (50000, 10): 4.297 ms
- (1000, 100): 7.796 ms
- (10000, 100): 10.251 ms
- (50000, 100): 21.500 ms
- (1000, 1000): 122.850 ms
- (10000, 1000): 165.714 ms
- (50000, 1000): 264.219 ms

On this PR

# fit_intercept = True
- (1000, 10): 10.041 ms
- (10000, 10): 3.649 ms
- (50000, 10): 3.522 ms
- (1000, 100): 7.930 ms
- (10000, 100): 8.796 ms
- (50000, 100): 16.406 ms
- (1000, 1000): 105.039 ms
- (10000, 1000): 157.908 ms
- (50000, 1000): 221.359 ms
# fit_intercept = False
- (1000, 10): 1.854 ms
- (10000, 10): 1.035 ms
- (50000, 10): 2.516 ms
- (1000, 100): 6.641 ms
- (10000, 100): 8.255 ms
- (50000, 100): 14.055 ms
- (1000, 1000): 104.022 ms
- (10000, 1000): 153.018 ms
- (50000, 1000): 215.664 ms

@jcrist jcrist requested a review from csadorf October 29, 2025 21:51
@jcrist jcrist force-pushed the ridge-improvements branch 2 times, most recently from 87b97fc to 121033e Compare October 30, 2025 19:16
- "sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,estimator=Ridge(),param_grid={'alpha':[0.1,1.0]})-check_complex_data]"
- "sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,estimator=Ridge(),param_grid={'alpha':[0.1,1.0]})-check_dtype_object]"
- "sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,estimator=Ridge(),param_grid={'alpha':[0.1,1.0]})-check_estimators_nan_inf]"
- "sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,estimator=Ridge(),param_grid={'alpha':[0.1,1.0]})-check_fit1d]"
jcrist (Member Author) left a comment

Even though we're now more compatible with sklearn, we only see new xfails rather than any newly passing tests. The new failures are due to the following:

  • Several test_search_cv tests that previously fell back (because the input data is wider than it is long) are now accelerated. These tests just check for nice errors when nan/inf appear in the data; we'll fix them uniformly across all estimators when we revamp data ingestion. Nothing to do here now.
  • Some test_solver_consistency tests that check that the different solvers produce similar enough results. We're very close, but don't quite match within the tolerances used in these tests.
  • A test for n_iter. Previously we'd fall back for the fit combination used; we now run accelerated but lack the attribute.
  • A test for array-like alpha. We support all the cases used, but don't match their error message test, since their error message has a bug in it (the incorrect values are accidentally flipped).

# column 2D y. The sklearn 1.6+ behavior is what we implement in
# cuml.Ridge, here we adjust the shape of `coef_` after the fit to
# match the older behavior. This will also trickle down to change the
# output shape of `predict` to match the older behavior transparently.
jcrist (Member Author) left a comment

This is gross, but isolated to cuml.accel so I'm ok with it.

@csadorf csadorf left a comment

Great work!

csadorf commented Oct 31, 2025

I ran comprehensive benchmarks comparing the nightly build against this PR (commit 121033e79) across 47 different configurations.

Performance Results

This PR delivers significant performance improvements across the board with no regressions:

Configuration                          Nightly     This PR     Improvement
basic_svd_1000x10_intercept=True       1.58ms      1.14ms      -28%
basic_svd_10000x10_intercept=True      1.58ms      1.11ms      -30%
basic_svd_10000x100_intercept=True     9.11ms      8.05ms      -12%
basic_svd_10000x1000_intercept=True    157.40ms    145.71ms    -7%
basic_svd_50000x1000_intercept=True    258.30ms    209.23ms    -19%
basic_eig_10000x1000_intercept=True    26.73ms     26.88ms     +1% (negligible)

Performance Summary by Problem Size

  • Small matrices (1000×10): 28-30% faster
  • Medium matrices (10000×100): 12% faster
  • Large matrices (50000×1000): 19% faster
  • Eigendecomposition solver: Performance maintained (within measurement variance)
  • Overall: No regressions detected across any of the 47 configurations

New Capabilities Benchmarked

This PR also enables 8 previously unsupported configurations, all with competitive performance:

Feature                    Example Configuration                                        Performance
Multi-target regression    multitarget_n=10 (10 output columns)                         7.54ms fit
Array-like alpha           array_alpha_n=10 (10 targets, per-target regularization)     8.58ms fit
Wide matrices (SVD)        wide_svd_500x2000 (500 samples, 2000 features)               44.72ms fit

Detailed Benchmark Results

Nightly Build Results

Version: rapidsai-nightly/linux-64::cuml-25.12.00a90-cuda13_py313_251024_b62d68ec

====================================================================================================
Ridge Regression Benchmark Suite
====================================================================================================
cuML version: 25.12.00a90
sklearn version: 1.7.2
Version label: rapidsai-nightly/linux-64::cuml-25.12.00a90-cuda13_py313_251024_b62d68ec
Number of runs per benchmark: 5
Include sklearn: False
====================================================================================================

Running comprehensive benchmark (47 configurations)
[1/47] Running: basic_svd_1000x10_intercept=True... fit=1.58ms, predict=0.17ms
[2/47] Running: basic_svd_1000x10_intercept=False... fit=1.32ms, predict=0.17ms
[3/47] Running: basic_svd_10000x10_intercept=True... fit=1.58ms, predict=0.17ms
[4/47] Running: basic_svd_10000x10_intercept=False... fit=1.35ms, predict=0.17ms
[5/47] Running: basic_svd_50000x10_intercept=True... fit=3.67ms, predict=0.18ms
[6/47] Running: basic_svd_50000x10_intercept=False... fit=3.65ms, predict=0.19ms
[7/47] Running: basic_eig_1000x100_intercept=True... fit=2.43ms, predict=0.17ms
[8/47] Running: basic_eig_1000x100_intercept=False... fit=2.21ms, predict=5.37ms
[9/47] Running: basic_svd_1000x100_intercept=True... fit=6.71ms, predict=0.17ms
[10/47] Running: basic_svd_1000x100_intercept=False... fit=6.57ms, predict=0.16ms
[11/47] Running: basic_eig_10000x100_intercept=True... fit=3.28ms, predict=0.18ms
[12/47] Running: basic_eig_10000x100_intercept=False... fit=3.28ms, predict=0.18ms
[13/47] Running: basic_svd_10000x100_intercept=True... fit=9.11ms, predict=0.18ms
[14/47] Running: basic_svd_10000x100_intercept=False... fit=9.15ms, predict=0.19ms
[15/47] Running: basic_eig_50000x100_intercept=True... fit=10.22ms, predict=0.18ms
[16/47] Running: basic_eig_50000x100_intercept=False... fit=10.65ms, predict=0.18ms
[17/47] Running: basic_svd_50000x100_intercept=True... fit=20.19ms, predict=6.44ms
[18/47] Running: basic_svd_50000x100_intercept=False... fit=19.49ms, predict=0.18ms
[19/47] Running: basic_eig_1000x1000_intercept=True... fit=13.41ms, predict=0.30ms
[20/47] Running: basic_eig_1000x1000_intercept=False... fit=13.17ms, predict=0.17ms
[21/47] Running: basic_svd_1000x1000_intercept=True... fit=101.29ms, predict=0.18ms
[22/47] Running: basic_svd_1000x1000_intercept=False... fit=100.81ms, predict=0.18ms
[23/47] Running: basic_eig_10000x1000_intercept=True... fit=26.73ms, predict=0.20ms
[24/47] Running: basic_eig_10000x1000_intercept=False... fit=26.57ms, predict=0.20ms
[25/47] Running: basic_svd_10000x1000_intercept=True... fit=157.40ms, predict=0.21ms
[26/47] Running: basic_svd_10000x1000_intercept=False... fit=159.93ms, predict=0.21ms
[27/47] Running: basic_eig_50000x1000_intercept=True... fit=79.35ms, predict=0.67ms
[28/47] Running: basic_eig_50000x1000_intercept=False... fit=81.19ms, predict=0.47ms
[29/47] Running: basic_svd_50000x1000_intercept=True... fit=258.30ms, predict=0.47ms
[30/47] Running: basic_svd_50000x1000_intercept=False... fit=257.32ms, predict=0.47ms
[31/47] Running: dtype_float64_eig... fit=11.55ms, predict=0.17ms
[32/47] Running: dtype_float64_svd... fit=20.75ms, predict=0.17ms
[33/47] Running: weighted_eig... fit=3.36ms, predict=0.17ms
[34/47] Running: weighted_svd... fit=9.14ms, predict=0.18ms
[35/47] Running: multitarget_n=2... ERROR: Expected 1 columns but got 2 columns.
[36/47] Running: multitarget_n=5... ERROR: Expected 1 columns but got 5 columns.
[37/47] Running: multitarget_n=10... ERROR: Expected 1 columns but got 10 columns.
[38/47] Running: array_alpha_n=3... ERROR: Expected 1 columns but got 3 columns.
[39/47] Running: array_alpha_n=10... ERROR: Expected 1 columns but got 10 columns.
[40/47] Running: wide_eig_100x200... fit=11.18ms, predict=0.17ms
[41/47] Running: wide_eig_100x1000... fit=10.64ms, predict=0.16ms
[42/47] Running: wide_eig_500x2000... fit=39.74ms, predict=0.17ms
[43/47] Running: wide_svd_100x200... ERROR: cuSOLVER error encountered
[44/47] Running: wide_svd_100x1000... ERROR: cuSOLVER error encountered
[45/47] Running: wide_svd_500x2000... ERROR: cuSOLVER error encountered
[46/47] Running: single_feature_svd... fit=1.01ms, predict=0.18ms
[47/47] Running: normalize_eig... fit=3.32ms, predict=0.18ms

====================================================================================================
BENCHMARK SUMMARY
Version: rapidsai-nightly/linux-64::cuml-25.12.00a90-cuda13_py313_251024_b62d68ec
====================================================================================================

cuML Results (39 benchmarks):
----------------------------------------------------------------------------------------------------
Name                                               Fit (ms)        Predict (ms)    Memory (MB)    
----------------------------------------------------------------------------------------------------
single_feature_svd                                     1.01 ±  0.01      0.18 ±  0.02          N/A
basic_svd_1000x10_intercept=False                      1.32 ±  0.02      0.17 ±  0.02          N/A
basic_svd_10000x10_intercept=False                     1.35 ±  0.04      0.17 ±  0.02          N/A
basic_svd_10000x10_intercept=True                      1.58 ±  0.22      0.17 ±  0.02          N/A
basic_svd_1000x10_intercept=True                       1.58 ±  0.40      0.17 ±  0.02          N/A
basic_eig_1000x100_intercept=False                     2.21 ±  0.04      5.37 ± 10.40          N/A
basic_eig_1000x100_intercept=True                      2.43 ±  0.19      0.17 ±  0.03          N/A
basic_eig_10000x100_intercept=False                    3.28 ±  0.03      0.18 ±  0.02          N/A
basic_eig_10000x100_intercept=True                     3.28 ±  0.03      0.18 ±  0.03          N/A
normalize_eig                                          3.32 ±  0.02      0.18 ±  0.02          N/A
weighted_eig                                           3.36 ±  0.01      0.17 ±  0.02          N/A
basic_svd_50000x10_intercept=False                     3.65 ±  0.02      0.19 ±  0.03          N/A
basic_svd_50000x10_intercept=True                      3.67 ±  0.03      0.18 ±  0.03          N/A
basic_svd_1000x100_intercept=False                     6.57 ±  0.02      0.16 ±  0.03          N/A
basic_svd_1000x100_intercept=True                      6.71 ±  0.05      0.17 ±  0.03          N/A
basic_svd_10000x100_intercept=True                     9.11 ±  0.05      0.18 ±  0.03          N/A
weighted_svd                                           9.14 ±  0.03      0.18 ±  0.02          N/A
basic_svd_10000x100_intercept=False                    9.15 ±  0.04      0.19 ±  0.03          N/A
basic_eig_50000x100_intercept=True                    10.22 ±  0.08      0.18 ±  0.04          N/A
wide_eig_100x1000                                     10.64 ±  0.02      0.16 ±  0.02          N/A
basic_eig_50000x100_intercept=False                   10.65 ±  0.07      0.18 ±  0.04          N/A
wide_eig_100x200                                      11.18 ± 17.55      0.17 ±  0.02          N/A
dtype_float64_eig                                     11.55 ±  1.11      0.17 ±  0.03          N/A
basic_eig_1000x1000_intercept=False                   13.17 ±  0.08      0.17 ±  0.03          N/A
basic_eig_1000x1000_intercept=True                    13.41 ±  0.38      0.30 ±  0.26          N/A
basic_svd_50000x100_intercept=False                   19.49 ±  0.04      0.18 ±  0.04          N/A
basic_svd_50000x100_intercept=True                    20.19 ±  0.09      6.44 ± 12.49          N/A
dtype_float64_svd                                     20.75 ±  0.41      0.17 ±  0.03          N/A
basic_eig_10000x1000_intercept=False                  26.57 ±  0.19      0.20 ±  0.04          N/A
basic_eig_10000x1000_intercept=True                   26.73 ±  0.31      0.20 ±  0.04          N/A
wide_eig_500x2000                                     39.74 ±  0.07      0.17 ±  0.03          N/A
basic_eig_50000x1000_intercept=True                   79.35 ±  1.94      0.67 ±  0.19          N/A
basic_eig_50000x1000_intercept=False                  81.19 ±  1.26      0.47 ±  0.05          N/A
basic_svd_1000x1000_intercept=False                  100.81 ±  1.12      0.18 ±  0.05          N/A
basic_svd_1000x1000_intercept=True                   101.29 ±  1.34      0.18 ±  0.05          N/A
basic_svd_10000x1000_intercept=True                  157.40 ±  1.42      0.21 ±  0.05          N/A
basic_svd_10000x1000_intercept=False                 159.93 ±  1.26      0.21 ±  0.05          N/A
basic_svd_50000x1000_intercept=False                 257.32 ±  1.60      0.47 ±  0.04          N/A
basic_svd_50000x1000_intercept=True                  258.30 ±  2.70      0.47 ±  0.05          N/A

Errors (8):
----------------------------------------------------------------------------------------------------
multitarget_n=2: Expected 1 columns but got 2 columns.
multitarget_n=5: Expected 1 columns but got 5 columns.
multitarget_n=10: Expected 1 columns but got 10 columns.
array_alpha_n=3: Expected 1 columns but got 3 columns.
array_alpha_n=10: Expected 1 columns but got 10 columns.
wide_svd_100x200: cuSOLVER error (CUSOLVER_STATUS_INVALID_VALUE)
wide_svd_100x1000: cuSOLVER error (CUSOLVER_STATUS_INVALID_VALUE)
wide_svd_500x2000: cuSOLVER error (CUSOLVER_STATUS_INVALID_VALUE)

Results saved to: ridge_benchmark_results.json

Benchmark complete!
PR #7410 Results

Version: 121033e79 (this PR)

====================================================================================================
Ridge Regression Benchmark Suite
====================================================================================================
cuML version: 25.12.00
sklearn version: 1.7.2
Version label: 121033e79
Number of runs per benchmark: 5
Include sklearn: False
====================================================================================================

Running comprehensive benchmark (47 configurations)
[1/47] Running: basic_svd_1000x10_intercept=True... fit=1.14ms, predict=0.16ms
[2/47] Running: basic_svd_1000x10_intercept=False... fit=0.87ms, predict=0.16ms
[3/47] Running: basic_svd_10000x10_intercept=True... fit=1.11ms, predict=0.16ms
[4/47] Running: basic_svd_10000x10_intercept=False... fit=0.86ms, predict=0.15ms
[5/47] Running: basic_svd_50000x10_intercept=True... fit=2.92ms, predict=0.17ms
[6/47] Running: basic_svd_50000x10_intercept=False... fit=2.43ms, predict=0.17ms
[7/47] Running: basic_eig_1000x100_intercept=True... fit=2.81ms, predict=0.16ms
[8/47] Running: basic_eig_1000x100_intercept=False... fit=1.87ms, predict=0.16ms
[9/47] Running: basic_svd_1000x100_intercept=True... fit=6.37ms, predict=0.16ms
[10/47] Running: basic_svd_1000x100_intercept=False... fit=6.12ms, predict=0.16ms
[11/47] Running: basic_eig_10000x100_intercept=True... fit=3.61ms, predict=0.17ms
[12/47] Running: basic_eig_10000x100_intercept=False... fit=2.41ms, predict=0.17ms
[13/47] Running: basic_svd_10000x100_intercept=True... fit=8.05ms, predict=0.17ms
[14/47] Running: basic_svd_10000x100_intercept=False... fit=7.85ms, predict=0.17ms
[15/47] Running: basic_eig_50000x100_intercept=True... fit=10.75ms, predict=3.09ms
[16/47] Running: basic_eig_50000x100_intercept=False... fit=4.02ms, predict=0.17ms
[17/47] Running: basic_svd_50000x100_intercept=True... fit=14.25ms, predict=0.17ms
[18/47] Running: basic_svd_50000x100_intercept=False... fit=13.41ms, predict=0.16ms
[19/47] Running: basic_eig_1000x1000_intercept=True... fit=13.42ms, predict=0.16ms
[20/47] Running: basic_eig_1000x1000_intercept=False... fit=12.27ms, predict=0.16ms
[21/47] Running: basic_svd_1000x1000_intercept=True... fit=99.93ms, predict=0.17ms
[22/47] Running: basic_svd_1000x1000_intercept=False... fit=99.31ms, predict=0.18ms
[23/47] Running: basic_eig_10000x1000_intercept=True... fit=26.88ms, predict=0.19ms
[24/47] Running: basic_eig_10000x1000_intercept=False... fit=19.74ms, predict=0.19ms
[25/47] Running: basic_svd_10000x1000_intercept=True... fit=145.71ms, predict=0.19ms
[26/47] Running: basic_svd_10000x1000_intercept=False... fit=144.36ms, predict=0.19ms
[27/47] Running: basic_eig_50000x1000_intercept=True... fit=80.75ms, predict=0.46ms
[28/47] Running: basic_eig_50000x1000_intercept=False... fit=32.08ms, predict=0.44ms
[29/47] Running: basic_svd_50000x1000_intercept=True... fit=209.23ms, predict=0.44ms
[30/47] Running: basic_svd_50000x1000_intercept=False... fit=205.05ms, predict=0.46ms
[31/47] Running: dtype_float64_eig... fit=11.45ms, predict=0.17ms
[32/47] Running: dtype_float64_svd... fit=18.82ms, predict=6.99ms
[33/47] Running: weighted_eig... fit=3.76ms, predict=0.17ms
[34/47] Running: weighted_svd... fit=8.65ms, predict=0.17ms
[35/47] Running: multitarget_n=2... fit=7.88ms, predict=0.17ms
[36/47] Running: multitarget_n=5... fit=7.45ms, predict=0.17ms
[37/47] Running: multitarget_n=10... fit=7.54ms, predict=0.16ms
[38/47] Running: array_alpha_n=3... fit=7.83ms, predict=0.16ms
[39/47] Running: array_alpha_n=10... fit=8.58ms, predict=0.19ms
[40/47] Running: wide_eig_100x200... fit=2.46ms, predict=0.16ms
[41/47] Running: wide_eig_100x1000... fit=10.93ms, predict=0.16ms
[42/47] Running: wide_eig_500x2000... fit=39.98ms, predict=0.17ms
[43/47] Running: wide_svd_100x200... fit=5.74ms, predict=0.16ms
[44/47] Running: wide_svd_100x1000... fit=6.35ms, predict=0.16ms
[45/47] Running: wide_svd_500x2000... fit=44.72ms, predict=0.16ms
[46/47] Running: single_feature_svd... fit=0.87ms, predict=0.15ms
[47/47] Running: normalize_eig... fit=3.59ms, predict=2.52ms

====================================================================================================
BENCHMARK SUMMARY
Version: 121033e79
====================================================================================================

cuML Results (47 benchmarks):
----------------------------------------------------------------------------------------------------
Name                                               Fit (ms)        Predict (ms)    Memory (MB)    
----------------------------------------------------------------------------------------------------
basic_svd_10000x10_intercept=False                     0.86 ±  0.01      0.15 ±  0.01          N/A
basic_svd_1000x10_intercept=False                      0.87 ±  0.01      0.16 ±  0.01          N/A
single_feature_svd                                     0.87 ±  0.01      0.15 ±  0.01          N/A
basic_svd_10000x10_intercept=True                      1.11 ±  0.05      0.16 ±  0.01          N/A
basic_svd_1000x10_intercept=True                       1.14 ±  0.11      0.16 ±  0.02          N/A
basic_eig_1000x100_intercept=False                     1.87 ±  0.05      0.16 ±  0.02          N/A
basic_eig_10000x100_intercept=False                    2.41 ±  0.03      0.17 ±  0.03          N/A
basic_svd_50000x10_intercept=False                     2.43 ±  0.02      0.17 ±  0.02          N/A
wide_eig_100x200                                       2.46 ±  0.01      0.16 ±  0.01          N/A
basic_eig_1000x100_intercept=True                      2.81 ±  0.47      0.16 ±  0.03          N/A
basic_svd_50000x10_intercept=True                      2.92 ±  0.01      0.17 ±  0.02          N/A
normalize_eig                                          3.59 ±  0.02      2.52 ±  4.68          N/A
basic_eig_10000x100_intercept=True                     3.61 ±  0.04      0.17 ±  0.03          N/A
weighted_eig                                           3.76 ±  0.01      0.17 ±  0.02          N/A
basic_eig_50000x100_intercept=False                    4.02 ±  0.09      0.17 ±  0.02          N/A
wide_svd_100x200                                       5.74 ±  0.01      0.16 ±  0.02          N/A
basic_svd_1000x100_intercept=False                     6.12 ±  0.00      0.16 ±  0.01          N/A
wide_svd_100x1000                                      6.35 ±  0.01      0.16 ±  0.02          N/A
basic_svd_1000x100_intercept=True                      6.37 ±  0.04      0.16 ±  0.02          N/A
multitarget_n=5                                        7.45 ±  0.01      0.17 ±  0.02          N/A
multitarget_n=10                                       7.54 ±  0.09      0.16 ±  0.02          N/A
array_alpha_n=3                                        7.83 ±  0.01      0.16 ±  0.02          N/A
basic_svd_10000x100_intercept=False                    7.85 ±  0.02      0.17 ±  0.02          N/A
multitarget_n=2                                        7.88 ±  0.01      0.17 ±  0.02          N/A
basic_svd_10000x100_intercept=True                     8.05 ±  0.01      0.17 ±  0.02          N/A
array_alpha_n=10                                       8.58 ±  1.82      0.19 ±  0.02          N/A
weighted_svd                                           8.65 ±  0.00      0.17 ±  0.02          N/A
basic_eig_50000x100_intercept=True                    10.75 ±  0.08      3.09 ±  5.79          N/A
wide_eig_100x1000                                     10.93 ±  0.10      0.16 ±  0.03          N/A
dtype_float64_eig                                     11.45 ±  1.11      0.17 ±  0.04          N/A
basic_eig_1000x1000_intercept=False                   12.27 ±  0.05      0.16 ±  0.02          N/A
basic_svd_50000x100_intercept=False                   13.41 ±  0.13      0.16 ±  0.02          N/A
basic_eig_1000x1000_intercept=True                    13.42 ±  0.09      0.16 ±  0.03          N/A
basic_svd_50000x100_intercept=True                    14.25 ±  0.05      0.17 ±  0.02          N/A
dtype_float64_svd                                     18.82 ±  0.03      6.99 ± 13.64          N/A
basic_eig_10000x1000_intercept=False                  19.74 ±  8.54      0.19 ±  0.02          N/A
basic_eig_10000x1000_intercept=True                   26.88 ±  0.15      0.19 ±  0.03          N/A
basic_eig_50000x1000_intercept=False                  32.08 ±  0.22      0.44 ±  0.02          N/A
wide_eig_500x2000                                     39.98 ±  0.10      0.17 ±  0.04          N/A
wide_svd_500x2000                                     44.72 ±  0.37      0.16 ±  0.02          N/A
basic_eig_50000x1000_intercept=True                   80.75 ±  0.51      0.46 ±  0.04          N/A
basic_svd_1000x1000_intercept=False                   99.31 ±  1.30      0.18 ±  0.05          N/A
basic_svd_1000x1000_intercept=True                    99.93 ±  1.74      0.17 ±  0.02          N/A
basic_svd_10000x1000_intercept=False                 144.36 ±  1.29      0.19 ±  0.02          N/A
basic_svd_10000x1000_intercept=True                  145.71 ±  2.63      0.19 ±  0.02          N/A
basic_svd_50000x1000_intercept=False                 205.05 ±  1.61      0.46 ±  0.05          N/A
basic_svd_50000x1000_intercept=True                  209.23 ±  1.67      0.44 ±  0.02          N/A

Results saved to: ridge_benchmark_results.json

Benchmark complete!

Summary

Performance Improvements:

  • 7-30% speedup on SVD solver across multiple problem sizes
  • Maintained performance on eigendecomposition solver
  • Zero regressions across 47 benchmark configurations

Benchmark Environment

  • GPU: NVIDIA RTX A6000 49GB
  • CUDA Version: 13.0
  • Driver Version: 580.95.05
  • cuML versions:
    • Nightly: 25.12.00a90
    • This PR: 25.12.00 (commit 121033e79)
  • sklearn version: 1.7.2
  • Runs per benchmark: 5 iterations each
  • Total configurations tested: 47

@csadorf csadorf force-pushed the ridge-improvements branch from 61c2a98 to 121033e Compare October 31, 2025 20:18
- Move SVD solver to pure python, simplifying maintenance and improving
  flexibility.
- Add support for multi-target inputs
- Add support for array-like `alpha`
- Add support for `n_features > n_samples` in SVD solver
- Add `copy_X` parameter, allowing the solver to mutate X instead of
  making a copy.
- Fix bug where solver would accidentally mutate y and sample_weight in
  some cases, and add a test.
- Improve testing coverage
- Cleanup docstring a bit
@jcrist jcrist force-pushed the ridge-improvements branch from 121033e to 4d0e61d Compare October 31, 2025 20:43

@csadorf csadorf left a comment

🚢


jcrist commented Oct 31, 2025

/merge

@rapids-bot rapids-bot bot merged commit 1e4db5d into rapidsai:main Oct 31, 2025
101 checks passed
@jcrist jcrist deleted the ridge-improvements branch October 31, 2025 22:00
vardhan30016 pushed a commit to vardhan30016/cuml that referenced this pull request Nov 7, 2025
Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Simon Adorf (https://github.com/csadorf)

URL: rapidsai#7410