Stars
FlashMLA: Efficient Multi-head Latent Attention Kernels
An extremely fast Python package and project manager, written in Rust.
Fully open reproduction of DeepSeek-R1
Janus-Series: Unified Multimodal Understanding and Generation Models
Train transformer language models with reinforcement learning.
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
Outfit Anyone: Ultra-high quality virtual try-on for Any Clothing and Any Person
SSSegmentation: An Open Source Supervised Semantic Segmentation Toolbox Based on PyTorch.
[ACM Multimedia 2023] Taming the Power of Diffusion Models for High-Quality Virtual Try-On with Appearance Flow.
Code for our ICCV'2023 paper "SHERF: Generalizable Human NeRF from a Single Image"
Foundational Models for State-of-the-Art Speech and Text Translation
An out-of-the-box human parsing representation extractor.
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
Official PyTorch code for the paper: "ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation"