This repository contains code for the paper HMD-AMP.
First, clone the repository and create an environment with conda.
git clone https://github.com/ml4bio/HMD-AMP.git
conda create -n HMD-AMP python=3.8 -y
conda activate HMD-AMP
PyTorch installation differs between operating systems. For Linux, use the following command.
pip install torch torchvision torchaudio
Then, install the following packages.
pip install deep-forest
pip install scikit-learn==1.3.0
pip install pandas==1.2.0
pip install biopython==1.83
pip install fair-esm==2.0.0
pip install numpy==1.19.5
You can find the training script in script.
The training data can be obtained from Zenodo.
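Before training, it may be worth confirming that the package versions pinned in the installation step are the ones actually installed in the active environment. The following is a minimal sketch using only the standard library; the helper name check_versions is hypothetical and not part of this repository, and torch and deep-forest are left out because no version is pinned for them above.

```python
from importlib import metadata

# Versions copied from the pip commands in the installation step.
PINNED = {
    "scikit-learn": "1.3.0",
    "pandas": "1.2.0",
    "biopython": "1.83",
    "fair-esm": "2.0.0",
    "numpy": "1.19.5",
}


def check_versions(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for each mismatch.

    `found` is None when the package is not installed.
    `get_version` is injectable so the check can be tested without
    touching the real environment.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

If the script prints nothing, every pinned package matches.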
prediction.py contains the script for predicting AMPs and their target groups.
If you prefer not to train the model yourself, we also provide the fine-tuned protein language model and trained classifiers for direct use.
Fine-tuned protein language model: ft_parts.pth
Trained classifier: clsmodel/
First, download and decompress the above model files. Then, in prediction.py, assign sequences_file_path the path of your FASTA file, ftmodel_save_path the path of the fine-tuned protein language model, and clsmodel_save_path the path of the trained classifier folder.
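Since the script reads its sequences from the file assigned to sequences_file_path, a quick sanity check of that FASTA file can catch formatting problems early. The sketch below is an assumption-laden illustration: read_fasta is a hypothetical standard-library helper, written independently of however prediction.py parses its input.

```python
def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples.

    Multi-line sequences are joined and blank lines are ignored.
    Raises ValueError if sequence data appears before any '>' header.
    """
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is None:
                raise ValueError("sequence data found before the first '>' header")
            else:
                chunks.append(line)
    # Flush the final record.
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

For example, len(read_fasta(sequences_file_path)) gives the number of sequences that will be submitted for prediction, and an empty list indicates that the file contains no FASTA records.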
Fine-tuned protein language model: ft_parts.pth
Trained classifier: clsmodel/
First, download and decompress the above model files. We suggest organizing them in the following directory structure:
Model
├── Gram+
│   ├── Fine-tuned model
│   └── Trained classifier folder
│       └── ...
├── Gram-
│   ├── Fine-tuned model
│   └── Trained classifier folder
│       └── ...
├── Mammalian_Cell
│   ├── Fine-tuned model
│   └── Trained classifier folder
│       └── ...
├── Virus
│   ├── Fine-tuned model
│   └── Trained classifier folder
│       └── ...
├── Fungus
│   ├── Fine-tuned model
│   └── Trained classifier folder
│       └── ...
└── Cancer
    ├── Fine-tuned model
    └── Trained classifier folder
        └── ...
You can then modify the corresponding paths in prediction.py to match your layout (the example below assumes the root folder is named model):
# specify the path of Fine-tuned model
target_ftmodel_save_path = f'model/{target}/ft_parts.pth'
# specify the path of Trained classifier folder
target_clsmodel_save_path = f'model/{target}/clsmodel'
Finally, run:
python prediction.py
This gives you the prediction results for the sequences.
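If the script fails to load a model, one likely cause is a target folder that does not match the suggested layout. The following sketch checks each target folder against the path templates shown above. It assumes the root folder is named model and that each target folder contains ft_parts.pth and clsmodel, as in those templates; the helper names target_paths and incomplete_targets are hypothetical and not part of prediction.py.

```python
from pathlib import Path

# Target group folder names taken from the suggested directory layout.
TARGETS = ["Gram+", "Gram-", "Mammalian_Cell", "Virus", "Fungus", "Cancer"]


def target_paths(root, target):
    """Build the fine-tuned model path and classifier folder path for one target."""
    base = Path(root) / target
    return base / "ft_parts.pth", base / "clsmodel"


def incomplete_targets(root="model"):
    """Return the targets whose model file or classifier folder is missing."""
    missing = []
    for target in TARGETS:
        ft_path, cls_path = target_paths(root, target)
        if not (ft_path.is_file() and cls_path.is_dir()):
            missing.append(target)
    return missing


if __name__ == "__main__":
    for target in incomplete_targets("model"):
        print(f"Missing model files for target: {target}")
```

An empty result means every target folder has both the fine-tuned model file and the classifier folder in place. On case-sensitive file systems, the root folder name must match the one used in the paths exactly.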