- Peking University
- Beijing, China
- https://chenguolin.github.io
- @lin_chenguo
Starred repositories
The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
Official inference repo for FLUX.2 models
Video Content Customization Using First Frame
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction
HunyuanVideo-1.5: A leading lightweight video generation model
Kandinsky 5.0: A family of diffusion models for Video & Image generation
[SIGGRAPH ASIA 2025] Code for PartUV: Part-Based UV Unwrapping of 3D Meshes
The repository provides code for running inference with the SAM 3D Body Model (3DB), links for downloading the trained model checkpoints and datasets, and example notebooks that show how to use the…
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
Scaling Spatial Intelligence with Multimodal Foundation Models
Cambrian-S: Towards Spatial Supersensing in Video
[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation
An unofficial, simplified implementation of the SIGGRAPH 2025 best paper nominee CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image (work in progress)
ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation
A part-based 3D generation framework & the largest and most comprehensively annotated 3D part dataset.
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
Native Multimodal Models are World Learners
Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction (ICCV 2025)
"ViMax: Agentic Video Generation (Director, Screenwriter, Producer, and Video Generator All-in-One)"
L4P -- a feed-forward foundational model designed for multiple low-level 4D vision perception tasks.
[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention
A simple, unified multimodal models training engine. Lean, flexible, and built for hacking at scale.
[NeurIPS DB 2025] PartNeXt: A Next-Generation Dataset for Fine-Grained and Hierarchical 3D Part Understanding