Stars
The py version of toflow → https://github.com/anchen1011/toflow
Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation
A mix of GAN implementations including progressive growing
The state-of-the-art image restoration model without nonlinear activation functions.
A comprehensive paper list for Vision Transformer/Attention, including papers, code, and related websites
A reimplementation of Optical Flow Estimation using a Spatial Pyramid Network in PyTorch
🍀 PyTorch implementations of various attention mechanisms, MLPs, re-parameterization, and convolutions, helpful for understanding the papers. ⭐⭐⭐
A latent text-to-image diffusion model
[IJCV2024] Exploiting Diffusion Prior for Real-World Image Super-Resolution
Taming Transformers for High-Resolution Image Synthesis
The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
A family of diffusion models for text-to-audio generation.
Implementation of Generating Diverse High-Fidelity Images with VQ-VAE-2 in PyTorch
GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Structured state space sequence models
Implementation of Music Transformer with PyTorch (ICLR 2019)
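2018/2019 campus recruitment / spring recruitment / fall recruitment / Natural Language Processing (NLP) / Deep Learning / Machine Learning / C / C++ / Python interview notes. It also includes every question the creator has come across in machine learning and deep learning interview write-ups. Beyond the DL/ML material, other computer science knowledge relevant to algorithm positions is recorded as well, but questions for positions such as front-end, testing, Java, or Android are not included.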
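This project covers the knowledge points and code implementations frequently tested in Machine Learning, Deep Learning, and NLP interviews, which are also the theoretical fundamentals every algorithm engineer must know.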
Code and generated sounds for "Conditional Sound Generation Using Neural Discrete Time-Frequency Representation Learning", MLSP 2021
Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder