NBTest is an automated assertion-generation tool designed specifically for machine learning notebooks. It identifies ML-specific metrics (e.g., dataset statistics, model performance) and generates statistically robust assertions to help you catch subtle regressions and inconsistencies that earlier tools such as nbval may miss.
NBTest includes three main components:

- `nbtest-gen`: A CLI tool that analyzes a Jupyter notebook and automatically generates ML-related assertions.
- `nbtest-plugin`: A custom Pytest plugin that executes assertions embedded in `.ipynb` files.
- `nbtest-lab-extension`: A JupyterLab extension to toggle assertion checking interactively during notebook development.
You can toggle assertions on or off depending on your workflow:
- During development, keep assertions off to avoid interruption.
- After development, enable assertions to verify consistency with previous runs.
JupyterLab NBTest Plugin UI
The NBTest assertions appear in the right-hand side panel of JupyterLab.
You can hide this panel with the Hide Assertion Editor button or enable/disable assertions using the Toggle Assertions button.
When all assertions pass, a green notification bar labeled Assertions Passed is displayed.
Running NBTest with Pytest
NBTest can also be executed through pytest on a notebook that includes NBTest assertions; an example of the resulting pytest output appears later in this section.
Install all components from TestPyPI using:
pip install --extra-index-url https://test.pypi.org/simple/ nbtest-plugin
pip install --extra-index-url https://test.pypi.org/simple/ nbtest-gen
pip install --extra-index-url https://test.pypi.org/simple/ nbtest-lab-extension
nbtest-gen <notebook_path> -n <num_iterations> -o <output_dir>

- `notebook_path`: Path to the `.ipynb` file.
- `num_iterations`: Number of times the notebook is run to collect statistics (suggested: 10-30).
- `output_dir`: Output directory for logs and the assertion-augmented notebook.
Example:
nbtest-gen ./spaceship-titanic.ipynb -n 10 -o ./outputs
The generated notebook will be:
./outputs/spaceship-titanic_chebyshev_0.95_10.ipynb
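The filename encodes the generation settings: the bounding method (`chebyshev`), the confidence level (`0.95`), and the number of iterations (`10`). As a rough sketch of how a Chebyshev-style tolerance could be derived from repeated runs (an assumption about the general approach, not NBTest's exact implementation):

```python
import math
import statistics

def chebyshev_tolerance(samples, confidence=0.95):
    """Tolerance k * sigma such that, by Chebyshev's inequality
    P(|X - mu| >= k*sigma) <= 1/k^2, a value falls within
    mean +/- k*sigma with probability at least `confidence`."""
    k = math.sqrt(1.0 / (1.0 - confidence))  # 1/k^2 = 1 - confidence
    return k * statistics.pstdev(samples)

# e.g., an R^2 score observed over 10 notebook runs (illustrative numbers)
scores = [0.81, 0.83, 0.82, 0.80, 0.82, 0.83, 0.81, 0.82, 0.82, 0.81]
atol = chebyshev_tolerance(scores)
print(atol)  # tolerance an assertion could use as its atol
```

Because Chebyshev's inequality is distribution-free, the resulting bound is valid no matter how the metric fluctuates across runs, at the cost of being wider than a Gaussian-based interval.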
pytest -v --nbtest <notebook_with_assertions>
Example:
pytest -v --nbtest ./outputs/spaceship-titanic_chebyshev_0.95_10.ipynb
Launch the notebook:
jupyter-lab ./outputs/spaceship-titanic_chebyshev_0.95_10.ipynb
Use the Toggle Asserts button in the top-right corner to switch NBTest status: ON/OFF.

- NBTest status: OFF — Assertions will be ignored during execution.
- NBTest status: ON — Assertions are enforced during cell execution.
You can add assertions manually using `nbtest`. Supported types include:

- `assert_equal(a, b, err_msg='', type='', test_id='')`
  - `a`, `b`: Values to compare.
  - `err_msg`: Optional error message.
  - `type`, `test_id`: Optional metadata.
  - Example: `nbtest.assert_equal([1, 2, 3], [1, 2, 3])`
- `assert_allclose(a, b, rtol=1e-07, atol=0, err_msg='', type='', test_id='')`
  - `a`, `b`: Numeric values or arrays to compare.
  - `rtol`, `atol`: Relative and absolute tolerance.
  - `err_msg`, `type`, `test_id`: Optional metadata.
  - Example: `nbtest.assert_allclose(np.array([0.1, 0.2]), np.array([0.1, 0.20000001]), rtol=1e-5)`
- `assert_true(a, msg=None, type='', test_id='')`
  - Example: `nbtest.assert_true(3 > 1)`
- `assert_false(a, msg=None, type='', test_id='')`
  - Example: `nbtest.assert_false(1 > 3)`
- `assert_nanvar(a, b, rtol=1e-07, atol=0, err_msg='', test_id='')`
  - Assert that the variance of numeric columns in a DataFrame matches expected values (ignoring NaNs).
  - `a`: Pandas DataFrame.
  - `b`: Expected variance value or array.
  - Example: `nbtest.assert_nanvar(df, expected_var)`
- `assert_nanmean(a, b, rtol=1e-07, atol=0, err_msg='', test_id='')`
  - Assert that the mean of numeric columns in a DataFrame matches expected values (ignoring NaNs).
  - `a`: Pandas DataFrame.
  - `b`: Expected mean value or array.
  - Example: `nbtest.assert_nanmean(df, expected_mean)`
- `assert_column_types(a, b, err_msg='', test_id='')`
  - Assert that column data types match the expected list.
  - `a`: Pandas DataFrame.
  - `b`: List of expected data type strings.
  - Example: `nbtest.assert_column_types(df, ['int64', 'float64', 'object'])`
- `assert_column_names(a, b, err_msg='', test_id='')`
  - Assert that column names match the expected list.
  - `a`: Pandas DataFrame.
  - `b`: List of expected column names (sorted).
  - Example: `nbtest.assert_column_names(df, ['age', 'height', 'weight'])`
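The values these assertions compare against can be reproduced by hand. The sketch below assumes that `assert_allclose` follows NumPy's closeness rule (`|a - b| <= atol + rtol * |b|`) and that the nan-aware statistics are computed over all numeric cells (both are assumptions about NBTest's internals, not documented behavior):

```python
import numpy as np
import pandas as pd

# NumPy-style closeness rule assumed for assert_allclose:
a, b = np.array([0.1, 0.2]), np.array([0.1, 0.20000001])
rtol, atol = 1e-5, 0.0
close = bool((np.abs(a - b) <= atol + rtol * np.abs(b)).all())

# DataFrame-level values assumed for the nan-aware and column assertions:
df = pd.DataFrame({
    'age':    [25, 30, np.nan, 40],
    'height': [1.70, 1.65, 1.80, np.nan],
    'name':   ['a', 'b', 'c', 'd'],
})
numeric = df.select_dtypes(include='number').to_numpy(dtype=float)

print(close)                        # elementwise closeness of the two arrays
print(np.nanmean(numeric))          # mean over all numeric cells, NaNs ignored
print(np.nanvar(numeric))           # variance over all numeric cells, NaNs ignored
print([str(t) for t in df.dtypes])  # what assert_column_types compares
print(sorted(df.columns))           # what assert_column_names compares (sorted)
```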
In this section, we walk through a real-world example using NBTest on a Kaggle notebook. (No GPU required!)
We'll use the notebook at `./nbtest_demo/house-prices-advanced-regression-techniques.ipynb`.
cd nbtest_demo
conda env create -f ./environment.yml
conda activate nbtest_demo
The `nbtest_demo` environment includes all required packages (i.e., `nbtest-gen`, `nbtest-plugin`, and `nbtest-lab-extension`).
Test the notebook to ensure it executes without error:
jupyter execute ./house-prices-advanced-regression-techniques.ipynb --output=./house-prices-advanced-regression-techniques_output.ipynb
Run the following command:
nbtest-gen ./house-prices-advanced-regression-techniques.ipynb -n 10 -o ./outputs
It may take around 10 minutes—go grab a coffee ☕.
After completion, your new notebook will be located at:
./outputs/house-prices-advanced-regression-techniques_chebyshev_0.95_10.ipynb
Check if the current notebook passes the generated assertions:
pytest -v --nbtest ./outputs/house-prices-advanced-regression-techniques_chebyshev_0.95_10.ipynb
You should see something like this in the terminal output, indicating all assertions passed.
============================= test session starts ==============================
platform linux -- Python 3.9.22, pytest-8.3.5, pluggy-1.6.0 -- /home/usr/bin/python3.9
cachedir: .pytest_cache
rootdir: /home/user/
configfile: pyproject.toml
plugins: anyio-4.9.0, nbtest_plugin-0.1.6
collecting ... collected 62 items
outputs/house-prices-advanced-regression-techniques_chebyshev_0.95_10.ipynb::4 PASSED [ 1%]
outputs/house-prices-advanced-regression-techniques_chebyshev_0.95_10.ipynb::3 PASSED [ 3%]
outputs/house-prices-advanced-regression-techniques_chebyshev_0.95_10.ipynb::2 PASSED [ 4%]
...
outputs/house-prices-advanced-regression-techniques_chebyshev_0.95_10.ipynb::61 PASSED [100%]
============================= 62 passed in 43.67s ==============================
Want to debug in real time? Launch the notebook with:
jupyter-lab ./outputs/house-prices-advanced-regression-techniques_chebyshev_0.95_10.ipynb
In the top-right corner, you’ll see:
- Toggle Asserts icon
- NBTest status: ON/OFF indicator
Set NBTest status: ON. Now assertions will execute as part of cell execution.
Try running the first 3 cells:
train_df = pd.read_csv('./input/train.csv')
nbtest.assert_nanvar(train_df, np.float64(1010233951.9129187), atol=np.float64(5.331201499700043e-07), test_id='4')
nbtest.assert_nanmean(train_df, np.float64(5457.877675397229), atol=np.float64(0.0), test_id='3')
nbtest.assert_column_types(train_df, ['int64', ..., 'int64'], test_id='2')
nbtest.assert_column_names(train_df, ['1stFlrSF', ..., 'YrSold'], test_id='1')
nbtest.assert_equal(train_df.shape, (1460, 81), test_id='0')
They should all pass—unless you've made unintended changes.
Suppose we modify the notebook and mistakenly drop a row:
train_df = pd.read_csv('./input/train.csv')
train_df = train_df.drop(index=1)
nbtest.assert_nanvar(train_df, np.float64(1010233951.9129187), atol=np.float64(5.331201499700043e-07), test_id='4')
Now rerun the cell. NBTest will catch this discrepancy immediately. This is particularly useful for debugging pipelines or validating preprocessing logic.
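The same effect can be seen on a tiny synthetic frame (a stand-in for `train.csv`): dropping a row shifts the overall mean, and with `atol=0.0` even a tiny shift fails the check.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1.0, 2.0, 3.0, 4.0]})

baseline_mean = np.nanmean(df.to_numpy())   # 2.5, as recorded at generation time
dropped = df.drop(index=1)                  # mistakenly drop a row
new_mean = np.nanmean(dropped.to_numpy())   # (1 + 3 + 4) / 3

# With atol=0.0, an assert_nanmean-style check demands exact agreement:
print(abs(new_mean - baseline_mean) <= 0.0)  # False: the regression is caught
```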
NBTest also checks model metrics. For example:
nbtest_tmpvar_18 = r2_score(y_test, y_pred)
nbtest.assert_allclose(nbtest_tmpvar_18, np.float64(0.8208617481485916), atol=np.float64(0.3493966396957795), test_id='61')
Even slight regressions in prediction performance can be flagged early using such assertions.
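To make the mechanics concrete, here is a minimal reimplementation of R^2 (the same `1 - SS_res / SS_tot` formula that scikit-learn's `r2_score` computes) checked against a hypothetical recorded value; the numbers are illustrative, not taken from the demo notebook:

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_test = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.8, 5.1, 7.2, 8.9]
score = r2(y_test, y_pred)

# The generated check compares against the recorded score within atol:
recorded, atol = 0.985, 0.35
print(abs(score - recorded) <= atol)  # True: still within tolerance
```

If a code change pushed the score outside `recorded ± atol`, the assertion would fail and surface the regression immediately.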
You can insert custom assertions directly:
import nbtest
import math
nbtest.assert_equal(round(math.pi, 2), 3.14)
nbtest.assert_true(math.pi == 3) # This will fail
When you execute this with assertions OFF, nothing happens:
jupyter execute ./outputs/house-prices-advanced-regression-techniques_chebyshev_0.95_10.ipynb --output=./outputs/house-prices-advanced-regression-techniques_chebyshev_0.95_10_customized_assertions.ipynb
When assertions are ON (i.e., running using pytest):
pytest -v --nbtest ./outputs/house-prices-advanced-regression-techniques_chebyshev_0.95_10.ipynb
You'll see the failure logged clearly—just like a standard Pytest assertion.
NBTest can generate several kinds of assertions.

Data assertions: checks related to dataset structure, such as variance, means, column types/names, and shape. Example:
train_df = pd.read_csv('./input/train.csv')
nbtest.assert_nanvar(train_df, np.float64(1010233951.9129187), atol=np.float64(5.331201499700043e-07), test_id='4')
nbtest.assert_nanmean(train_df, np.float64(5457.877675397229), atol=np.float64(0.0), test_id='3')
nbtest.assert_column_types(train_df, ['int64', ..., 'int64'], test_id='2')
nbtest.assert_column_names(train_df, ['1stFlrSF', ..., 'YrSold'], test_id='1')
nbtest.assert_equal(train_df.shape, (1460, 81), test_id='0')
Model assertions: checks that a model's configuration matches expectations. Example:
regressor = RandomForestRegressor(n_estimators=10, random_state=0)
nbtest.assert_equal(
    {k: v for k, v in regressor.get_params().items()
     if k != 'random_state'
     and not (hasattr(v, '__module__') and v.__module__.startswith('sklearn'))},
    {'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'squared_error',
     'max_depth': None, 'max_features': 1.0, 'max_leaf_nodes': None,
     'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1,
     'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0,
     'monotonic_cst': None, 'n_estimators': 10, 'n_jobs': None,
     'oob_score': False, 'verbose': 0, 'warm_start': False},
    test_id='59')
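The dict comprehension in the generated assertion snapshots the estimator's hyperparameters while skipping `random_state` and any nested scikit-learn objects. The same filter applied to a plain dict (a hypothetical stand-in for `get_params()`, used here to keep the sketch free of a scikit-learn dependency):

```python
class FakeEstimator:
    """Stand-in for a nested sklearn object (anything whose __module__
    starts with 'sklearn')."""
    __module__ = 'sklearn.tree'

params = {
    'n_estimators': 10,
    'random_state': 0,                  # excluded: varies across runs by design
    'base_estimator': FakeEstimator(),  # excluded: sklearn object, not a plain value
    'bootstrap': True,
}

snapshot = {
    k: v for k, v in params.items()
    if k != 'random_state'
    and not (hasattr(v, '__module__') and v.__module__.startswith('sklearn'))
}
print(snapshot)  # only the plain, run-stable hyperparameters remain
```

Filtering this way keeps the snapshot stable across runs and comparable with `assert_equal`, since nested estimator objects would never compare equal by value.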
Verifies key performance metrics like accuracy, R^2, loss, etc.
Example:
nbtest.assert_allclose(r2_score(y_test, y_pred), 0.82, atol=0.35)
This project is licensed under the MIT License.
Questions or feedback? Reach out to Elaine Yao [[email protected]]
Happy Testing!