- Zhejiang University, Harbin Institute of Technology
- Shanghai
- https://orcid.org/0009-0005-3732-3035
Stars
LongLive: Real-time Interactive Long Video Generation
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
Youtu-GraphRAG boosts cost efficiency, inference accuracy, and cross-domain adaptability, pushing the boundaries of performance in complex QA.
deepbeepmeep / Wan2GP
Forked from Wan-Video/Wan2.1
A fast AI Video Generator for the GPU Poor. Supports Wan 2.1/2.2, Qwen Image, Hunyuan Video, LTX Video and Flux.
Unlimited-length talking video generation that supports image-to-video and video-to-video generation
Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset
HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
[ArXiv 2025] A survey about controllable video generation: This repo is the official awesome list for "Controllable video generation: A survey"
The minimal opencv for Android, iOS, ARM Linux, Windows, Linux, MacOS, HarmonyOS, WebAssembly, watchOS, tvOS, visionOS
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
Stand-In is a lightweight, plug-and-play framework for identity-preserving video generation.
Official inference repo for FLUX.1 models
Wan: Open and Advanced Large-Scale Video Generative Models
Enjoy the magic of Diffusion models!
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
[NeurIPS 2025 D&B🔥] OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
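The Ding Zhen universe: a collection of "Yi Yan Ding Zhen" (Ding Zhen at a glance) images, now with more than two thousand pictures. The YYDZ (Yi Yan Ding Zhen / One Eye Ding Zhen) dataset.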
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
A PyTorch implementation of the paper "UNI-IQA: A Unified Approach for Mutual Promotion of Natural and Screen Content Image Quality Assessment"
SkyReels-A2: Compose anything in video diffusion transformers
DICE-Talk is a diffusion-based emotional talking head generation method that can generate vivid and diverse emotions for speaking portraits.
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Open-source and strong foundation image recognition models.
Wan: Open and Advanced Large-Scale Video Generative Models
[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment