Stars
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding
Learning and Verification of Task Structure in Instructional Videos
Code for the paper "TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency" (ECCV 2022)
Audio-conditioned video texture generation
This is an official PyTorch implementation of Learning To Recognize Procedural Activities with Distant Supervision. In this repository, we provide PyTorch code for training and testing as described…
[CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.
karttikeya / SlowFast
Forked from facebookresearch/SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
S3D Text-Video model trained on HowTo100M using MIL-NCE
Easy to use video deep features extractor
TheRockXu / pegasus-demo
Forked from google-research/pegasus
This is a working demo of the pegasus summarization model trained on cnn_dailymail
Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and others.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)
Unsupervised video summarization with deep reinforcement learning (AAAI'18)
A PyTorch implementation of the Transformer model from "Attention Is All You Need".
LaTeX code for making neural network diagrams
Simple project webpage template. Originally used in Colorful Image Colorization. ECCV, 2016.
This repository contains the source code for the paper First Order Motion Model for Image Animation
Script for converting the pretrained VGGish model provided with AudioSet from TensorFlow to PyTorch, along with a basic smoke test.
Instructional notebooks on music information retrieval.
PyTorch implementation of Super SloMo by Jiang et al.
Audio To Body Dynamics, CVPR 2018
PyTorch implementations of Generative Adversarial Networks.
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
TensorFlow implementation for audio neural style.
Deep Audio-Visual Embedding network (DAVEnet) implementation in PyTorch