zengyh1900's starred repositories

Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.

Python 2,154 236 Updated Jan 12, 2026

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Python 8,501 835 Updated Jan 8, 2026

📷 Camera-controlled text-to-video generation, now with intrinsics, distortion and orientation control!

Python 106 1 Updated Jan 9, 2026

[ICLR'24] GTA: A Geometry-Aware Attention Mechanism for Multi-view Transformers

Python 152 2 Updated Dec 27, 2025

The official implementation of InfiniteVGGT

Python 207 9 Updated Jan 12, 2026

A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…

1,074 27 Updated Jan 9, 2026

[ICLR'25] SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints

Python 673 21 Updated May 23, 2025

PyTorch code and models for VJEPA2 self-supervised learning from video.

Python 2,765 291 Updated Aug 28, 2025

A list of works on video generation towards world model

322 6 Updated Jan 5, 2026

Post-training with Tinker

Python 2,717 290 Updated Jan 12, 2026

Official code of Motus: A Unified Latent Action World Model

Python 553 9 Updated Jan 5, 2026

Train transformer language models with reinforcement learning.

Python 16,935 2,417 Updated Jan 12, 2026

Benchmarking Knowledge Transfer in Lifelong Robot Learning

Jupyter Notebook 1,360 271 Updated Mar 15, 2025

RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI

Python 2,081 212 Updated Jan 12, 2026

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Shell 55,728 4,044 Updated Jan 12, 2026

TurboDiffusion: 100–200× Acceleration for Video Diffusion Models

Python 3,169 213 Updated Jan 9, 2026

🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.

Python 278 46 Updated Nov 24, 2025

AGENTS.md — a simple, open format for guiding coding agents

TypeScript 14,918 1,042 Updated Dec 19, 2025

Python 118 5 Updated Dec 19, 2025

HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency

Python 943 68 Updated Jan 12, 2026

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 4,887 329 Updated Dec 21, 2025

A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs.

Python 187 19 Updated Jul 23, 2025

Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation

Python 241 13 Updated Dec 15, 2025

[ICML 2025] Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences

Python 29 4 Updated Jun 29, 2025

[CVPR 2025 Highlight] InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

Python 43 3 Updated Jun 29, 2025

Official Implementations for Paper - MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues

Python 108 8 Updated Dec 3, 2025

Python 8,904 537 Updated Jan 7, 2026

Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various r…

Python 303 15 Updated Mar 12, 2025

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 22,326 4,024 Updated Jan 13, 2026

LongLive: Real-time Interactive Long Video Generation

Python 960 71 Updated Jan 11, 2026