Modanovo is a de novo peptide sequencing tool for peptides carrying post-translational modifications (PTMs), built on top of Casanovo (v4.0.0).
We recommend using a fresh conda environment with Python 3.10:

    conda create --name modanovo-env python=3.10
    conda activate modanovo-env

Install the dependencies:
- PyTorch (pick the command matching your CUDA/CPU setup from the PyTorch site; generic example):

      pip3 install torch

- Depthcharge-MS (pinned commit):

      pip install git+https://github.com/wfondrie/depthcharge.git@bd2861f

- Clone this repository:

      git clone https://github.com/gagneurlab/Modanovo.git
      cd Modanovo

- Install Modanovo:

      pip install .

For development (editable install):

    pip install -e .

Modanovo supports three modes: training, evaluation, and inference.
Train a new model from scratch:

    modanovo train -c <config_path> -p <val_paths> <train_paths>

Fine-tune from pretrained Casanovo weights:

    modanovo train -c <config_path> -m <model_path> -p <val_paths> <train_paths>

Here, <model_path> points to the pretrained Casanovo v4.0.0 weights.

Evaluate a trained model on validation/test spectra:

    modanovo evaluate -c <config_path> -m <model_path> -p <val_paths>

Run Modanovo in inference mode:

    modanovo sequence -c <config_path> -m <model_path> -o <out_path>

This writes peptide sequence predictions in .mzTab format.
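Since predictions are written as mzTab, a small reader can help when inspecting results. The following is a minimal sketch, not part of Modanovo: it assumes only the general mzTab convention that a tab-separated `PSH` line names the PSM columns and each `PSM` line holds one prediction. The function name `read_mztab_psms` and the example rows are hypothetical, and the exact columns Modanovo writes are not specified here.

```python
import io


def read_mztab_psms(handle):
    """Collect PSM rows from an mzTab text stream as a list of dicts.

    Assumes the generic mzTab layout: one 'PSH' header line followed by
    'PSM' data lines, all tab-separated. Other sections (e.g. 'MTD') are skipped.
    """
    header = None
    rows = []
    for raw in handle:
        fields = raw.rstrip("\n").split("\t")
        if fields[0] == "PSH":
            header = fields[1:]  # column names for the PSM section
        elif fields[0] == "PSM" and header is not None:
            rows.append(dict(zip(header, fields[1:])))
    return rows


# Hand-written illustration only; this is not real Modanovo output.
example = (
    "MTD\tmzTab-version\t1.0.0\n"
    "PSH\tsequence\tPSM_ID\tsearch_engine_score[1]\n"
    "PSM\tPEPTIDEK\t1\t0.93\n"
)
psms = read_mztab_psms(io.StringIO(example))
print(psms[0]["sequence"], psms[0]["search_engine_score[1]"])
```

To read an actual output file, pass an open file handle instead of the in-memory example, e.g. `read_mztab_psms(open("outputs/predictions.mztab"))`.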
Assuming you’ve installed Modanovo and have a model checkpoint:

    # from the repo root
    modanovo sequence \
      -c modanovo/config.yaml \
      -m path/to/casanovo_or_modanovo_weights.ckpt \
      -o outputs/predictions.mztab

Make sure that the residues defined in the configuration are compatible with the model weights. Leaving the expanded_residues entry in the configuration file empty uses Casanovo's tokens. By default, the fine-tuning residues are those from the MULTI-PTM dataset in PROSPECT-PTM.
- Default configuration: modanovo/config.yaml
- Example spectra file: data_utils/example_data.mgf
- Train/val/test splits used during development: https://huggingface.co/datasets/gagneurlab/Modanovo-development-dataset
- Model weights for the model fine-tuned to cover 19 amino acid-PTM combinations: https://huggingface.co/gagneurlab/Modanovo-model
- Compatible with Casanovo v4.0.0 weights and formats.
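The example spectra file listed above is in MGF format. Before running a model, it can be useful to check how many spectra a file contains and whether its fields look sensible. The following is a minimal sketch under the usual MGF conventions (`BEGIN IONS` / `END IONS` blocks, `KEY=value` parameter lines, and whitespace-separated m/z and intensity pairs). The helper `read_mgf` and the example spectra are hypothetical and are not taken from the Modanovo code base.

```python
import io


def read_mgf(handle):
    """Parse an MGF text stream into a list of {'params', 'peaks'} dicts."""
    spectra = []
    current = None
    for raw in handle:
        line = raw.strip()
        if not line:
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line:
                # Parameter line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                current["params"][key] = value
            else:
                # Peak line: m/z followed by intensity
                parts = line.split()
                if len(parts) >= 2:
                    current["peaks"].append((float(parts[0]), float(parts[1])))
    return spectra


# Hand-written illustration only; not the contents of example_data.mgf.
example_mgf = """BEGIN IONS
TITLE=spectrum_1
PEPMASS=500.25
CHARGE=2+
100.0 10.0
200.0 20.0
END IONS
BEGIN IONS
TITLE=spectrum_2
PEPMASS=640.8
CHARGE=3+
150.5 5.0
END IONS
"""
spectra = read_mgf(io.StringIO(example_mgf))
print(len(spectra), "spectra;", len(spectra[0]["peaks"]), "peaks in the first")
```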
- Casanovo: Yilmaz, Melih, William E. Fondrie, Wout Bittremieux, et al. 2024. “Sequence-to-Sequence Translation from Mass Spectra to Peptides with a Transformer Model.” Nature Communications 15 (1): 6427.
- PROSPECT-PTM: Gabriel, Wassim, Omar Shouman, Ayla Schroeder, Florian Boessl, and Mathias Wilhelm. 2024. “PROSPECT PTMs: Rich Labeled Tandem Mass Spectrometry Dataset of Modified Peptides for Machine Learning in Proteomics.” Advances in Neural Information Processing Systems 37.
If you use Modanovo in your research, please cite:
FIXME