A platform for quick and easy development of deep learning networks for recognition and detection in videos. Includes popular models like C3D and SSD.
Check out our wiki!
Activity recognition results:

| Model Architecture | Dataset | ViP Accuracy (%) |
|---|---|---|
| I3D | HMDB51 (Split 1) | 72.75 |
| C3D | HMDB51 (Split 1) | 50.14 ± 0.777 |
| C3D | UCF101 (Split 1) | 80.40 ± 0.399 |

Object detection results:

| Model Architecture | Dataset | ViP Accuracy (%) |
|---|---|---|
| SSD300 | VOC2007 | 76.58 |

Video object grounding results:

| Model Architecture | Dataset | ViP Accuracy (%) |
|---|---|---|
| DVSA (+fw, obj) | YC2-BB (Validation) | 30.09 |
fw: framewise weighting, obj: object interaction
Please cite ViP when releasing any work that uses this platform: https://arxiv.org/abs/1910.02793

```bibtex
@article{ganesh2019vip,
  title={ViP: Video Platform for PyTorch},
  author={Ganesh, Madan Ravi and Hofesmann, Eric and Louis, Nathan and Corso, Jason},
  journal={arXiv preprint arXiv:1910.02793},
  year={2019}
}
```
Supported datasets:

| Dataset | Task(s) |
|---|---|
| HMDB51 | Activity Recognition |
| UCF101 | Activity Recognition |
| ImageNetVID | Video Object Detection |
| MSCOCO 2014 | Object Detection, Keypoints |
| VOC2007 | Object Detection, Classification |
| YC2-BB | Video Object Grounding |
| DHF1K | Video Saliency Prediction |

Implemented models:

| Model | Task(s) |
|---|---|
| C3D | Activity Recognition |
| I3D | Activity Recognition |
| SSD300 | Object Detection |
| DVSA (+fw, obj) | Video Object Grounding |

Requirements:

- Python 3.6
- CUDA 9.0
- (Suggested) virtualenv

```bash
# Set up Python3 virtual environment
virtualenv -p python3.6 --no-site-packages vip
source vip/bin/activate

# Clone ViP repository
git clone https://github.com/MichiganCOG/ViP
cd ViP

# Install requirements and model weights
./install.sh
```
Run train.py and eval.py to train or test any implemented model. The parameters of every experiment are specified in its config.yaml file.
Use the --cfg_file command line argument to point to a different config YAML file.
Additionally, any config parameter can be overridden with a command line argument, as in the example below.
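For instance, assuming the C3D training config contains a learning-rate key (treat `--lr` here as a hypothetical placeholder; check the actual key names in the model's config.yaml), an override might look like:

```bash
# Hypothetical override: replace `lr` with whichever key your config.yaml defines
python train.py --cfg_file models/c3d/config_train.yaml --lr 0.0001
```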
Run eval.py with the argument --cfg_file pointing to the desired model config YAML file.
Ex: From the root directory of ViP, evaluate the action recognition network C3D on HMDB51:

```bash
python eval.py --cfg_file models/c3d/config_test.yaml
```
Run train.py with the argument --cfg_file pointing to the desired model config YAML file.
Ex: From the root directory of ViP, train the action recognition network C3D on HMDB51:

```bash
python train.py --cfg_file models/c3d/config_train.yaml
```
Additional examples can be found on our wiki.
New models and datasets can be added without needing to rewrite any training, evaluation, or data loading code.
To add a new model:
- Create a new folder `ViP/models/custom_model_name`
- Create a model class in `ViP/models/custom_model_name/custom_model_name.py` (a hypothetical skeleton is sketched after this list)
  - Complete the `__init__`, `forward`, and (optional) `__load_pretrained_weights` functions
- Add `PreprocessTrain` and `PreprocessEval` classes within `custom_model_name.py`
- Create `config_train.yaml` and `config_test.yaml` files for the new model
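The following is a minimal, hypothetical skeleton of `custom_model_name.py`, not ViP's actual interface: the layer choices, the `labels` kwarg, and the preprocessing-class signatures are illustrative assumptions, so mirror an existing model under `ViP/models` for the exact contract.

```python
# ViP/models/custom_model_name/custom_model_name.py -- hypothetical skeleton
import torch.nn as nn

class CustomModelName(nn.Module):
    def __init__(self, **kwargs):
        super().__init__()
        # Placeholder layers; a real model builds its architecture here
        self.conv = nn.Conv3d(3, 64, kernel_size=3, padding=1)
        self.fc = nn.Linear(64, kwargs.get('labels', 51))  # e.g. 51 classes for HMDB51

    def forward(self, x):
        # x: a batch of clips, e.g. shape (batch, channels, frames, height, width)
        feats = self.conv(x).mean(dim=[2, 3, 4])  # global average pool over T, H, W
        return self.fc(feats)

    def __load_pretrained_weights(self):
        # Optional: load weights (e.g. those downloaded by install.sh)
        pass

class PreprocessTrain:
    """Training-time clip transforms (e.g. random crop/flip, normalization)."""
    def __init__(self, **kwargs):
        pass

    def __call__(self, input_data):
        return input_data

class PreprocessEval:
    """Deterministic transforms for evaluation."""
    def __init__(self, **kwargs):
        pass

    def __call__(self, input_data):
        return input_data
```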
Examples of previously implemented models can be found in the `ViP/models` directory.
Additional information can be found on our wiki.
To add a new dataset:
- Convert annotation data to our JSON format
- Create a dataset class in `ViP/datasets/custom_dataset_name.py` (a hypothetical skeleton is sketched after this list)
  - Inherit `DetectionDataset` or `RecognitionDataset` from `ViP/abstract_datasets.py`
  - Complete the `__init__` and `__getitem__` functions
  - An example skeleton dataset can be found here
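As a rough illustration of the steps above, here is a minimal, hypothetical dataset skeleton; the import path and constructor signature are assumptions, so copy the structure of an existing file under `ViP/datasets` for the real interface.

```python
# ViP/datasets/custom_dataset_name.py -- hypothetical skeleton
# Assumes the abstract base classes live in ViP/abstract_datasets.py at the repo root.
from abstract_datasets import RecognitionDataset

class CustomDatasetName(RecognitionDataset):
    """Loads clips and labels from annotations stored in ViP's JSON format."""

    def __init__(self, *args, **kwargs):
        # The base class is expected to parse the JSON annotation files
        super().__init__(*args, **kwargs)

    def __getitem__(self, idx):
        # Return one sample, e.g. the preprocessed clip tensor and its annotations
        raise NotImplementedError('Fill in clip loading and label extraction here')
```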
Additional information can be found on our wiki.
A detailed FAQ can be found on our wiki.