Stars
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
[ArXiv 25] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
A Collection of Variational Autoencoders (VAE) in PyTorch.
High-Resolution Image Synthesis with Latent Diffusion Models
A PyTorch library and evaluation platform for end-to-end compression research
[ACMMM 2021 Oral] Enhanced Invertible Encoding for Learned Image Compression
Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"
[TMM 2025] Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression
This is an open-source repository based on our paper, primarily applied in the field of remote sensing image compression.
Paper list: deep learning based video compression
A paper list on deep learning based image compression
A high-throughput and memory-efficient inference and serving engine for LLMs
A truly open version of gpt-oss which shows the entire pre-training from scratch
Learn the building blocks of how to build gpt-oss from scratch
Model and tool for computing audio feature representations based on VAE
Official code released by TangXu's group for remote sensing image classification
A suite of image and video neural tokenizers
Pusa: Thousands Timesteps Video Diffusion Model
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
An implementation of FastSpeech based on PyTorch.
Text-audio foundation model from Boson AI
Scenic: A Jax Library for Computer Vision Research and Beyond
Code Release for MViTv2 on Image Recognition.
[ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer"
Unofficial implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" (https://arxiv.org/abs/2103.14030)