zengyh1900
Starred repositories

Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.

Python 2,116 234 Updated Jan 12, 2026

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Python 8,501 835 Updated Jan 8, 2026

📷 Camera-controlled text-to-video generation, now with intrinsics, distortion and orientation control!

Python 106 1 Updated Jan 9, 2026

[ICLR'24] GTA: A Geometry-Aware Attention Mechanism for Multi-view Transformers

Python 152 2 Updated Dec 27, 2025

The official implementation of InfiniteVGGT

Python 205 9 Updated Jan 12, 2026

A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…

1,071 27 Updated Jan 9, 2026

[ICLR'25] SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints

Python 673 21 Updated May 23, 2025

PyTorch code and models for VJEPA2 self-supervised learning from video.

Python 2,762 291 Updated Aug 28, 2025

A list of works on video generation towards world model

321 6 Updated Jan 5, 2026

Post-training with Tinker

Python 2,713 290 Updated Jan 8, 2026

Official code of Motus: A Unified Latent Action World Model

Python 553 9 Updated Jan 5, 2026

Train transformer language models with reinforcement learning.

Python 16,933 2,417 Updated Jan 12, 2026

Benchmarking Knowledge Transfer in Lifelong Robot Learning

Jupyter Notebook 1,359 271 Updated Mar 15, 2025

RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI

Python 2,080 212 Updated Jan 12, 2026

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Shell 55,616 4,030 Updated Jan 11, 2026

TurboDiffusion: 100–200× Acceleration for Video Diffusion Models

Python 3,168 213 Updated Jan 9, 2026

🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.

Python 278 46 Updated Nov 24, 2025

AGENTS.md — a simple, open format for guiding coding agents

TypeScript 14,900 1,041 Updated Dec 19, 2025

Python 118 5 Updated Dec 19, 2025

HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency

Python 940 67 Updated Jan 12, 2026

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 4,887 329 Updated Dec 21, 2025

A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs.

Python 187 19 Updated Jul 23, 2025

Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation

Python 240 13 Updated Dec 15, 2025

[ICML 2025] Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences

Python 29 4 Updated Jun 29, 2025

[CVPR 2025 Highlight] InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

Python 43 3 Updated Jun 29, 2025

Official Implementations for Paper - MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues

Python 108 8 Updated Dec 3, 2025

Python 8,891 535 Updated Jan 7, 2026

Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various r…

Python 303 15 Updated Mar 12, 2025

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 22,311 4,022 Updated Jan 12, 2026

LongLive: Real-time Interactive Long Video Generation

Python 958 70 Updated Jan 11, 2026