- Xi'an Jiaotong University
- Xi’an, China
Stars
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
MotionGPT3: Human Motion as a Second Modality, a MoT-based framework for unified motion understanding and generation
Official repository for "MaskControl: Spatio-Temporal Control for Masked Motion Synthesis" ICCV 2025 (Oral & Award Candidate)
TL_Control: Trajectory and Language Control for Human Motion Synthesis
[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs
Model See Model Do: Speech-Driven Facial Animation with Style Control
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
Official implementation for the SIGGRAPH Asia 2024 paper SPARK: Self-supervised Personalized Real-time Monocular Face Capture
Memory-Guided Diffusion for Expressive Talking Video Generation
Official PyTorch Implementation of SMIRK: 3D Facial Expressions through Analysis-by-Neural-Synthesis (CVPR 2024)
Example code for the FLAME 3D head model. The code demonstrates how to sample 3D heads from the model, fit the model to 3D keypoints and 3D scans.
Mapping Mediapipe's 52 blendshapes to FLAME's expression coefficients and poses.
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis; ICLR 2024 Spotlight; Official code
[ACM MM 2025] Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis
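Clash nodes, free Clash nodes, free nodes, free proxies, Clash for circumventing network restrictions, Clash for bypassing the firewall, Clash subscription links, Clash for Windows, Clash tutorials, free public nodes, latest free Clash node subscription addresses, free Clash nodes updated daily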
[CVPR'25] Official repository for "Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics"
✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
[CVPR 2025] Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer
Official code for ICLR25 "TEASER: Token Enhanced Spatial Modeling for Expressions Reconstruction"
🔥🔥🔥 Set the world of 3D faces on fire with INFERNO 🔥🔥🔥
Summary of publicly available resources such as code, datasets, and scientific papers for the FLAME 3D head model
OpenVideo specializes in the domain of text-to-video generation, with the goal of providing high-quality and diverse video datasets to AI researchers globally.
[WACV 2024] LibreFace: An Open-Source Toolkit for Deep Facial Expression Analysis