🏠
Working from home
  • Beijing China

Native Multimodal Models are World Learners

Python 1,204 42 Updated Nov 7, 2025
Python 10 Updated Apr 2, 2025

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 2,406 108 Updated Oct 31, 2025

Fast and memory-efficient exact attention

Python 198 67 Updated Oct 20, 2025

[ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation

Python 177 5 Updated May 21, 2025

Voyager is an interactive RGBD video generation model conditioned on camera input, and supports real-time 3D reconstruction.

Python 1,335 126 Updated Oct 22, 2025

Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)

Python 2,813 199 Updated Sep 12, 2025
Python 567 16 Updated Nov 10, 2025

PixiEditor is a Universal Editor for all your 2D needs

C# 6,809 267 Updated Nov 10, 2025

Generative Models by Stability AI

Python 26,591 2,978 Updated Nov 3, 2025

Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers"

Jupyter Notebook 154 4 Updated Oct 21, 2025

[ICCV2025] "Di[M]O: Distilling Masked Diffusion Models into One-step Generator", Yuanzhi Zhu, Xi Wang, Stéphane Lathuilière, Vicky Kalogeiton

Python 29 1 Updated Aug 14, 2025

This repo provides a working re-implementation of Latent Adversarial Diffusion Distillation by AMD

Python 118 6 Updated Jul 12, 2025

MoDM is a cache-aware, hybrid serving system that accelerates image generation by dynamically combining small and large diffusion models for efficient, high-quality output.

Python 3 Updated Aug 8, 2025

The open-source CapCut alternative

TypeScript 43,487 4,155 Updated Oct 24, 2025

Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation

Python 76 6 Updated Jul 13, 2025

[CVPR 2025 Highlight] VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step

Python 325 9 Updated Jul 4, 2025

[ICCV2025] LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching

Python 45 3 Updated Oct 24, 2025
Python 92 15 Updated Sep 26, 2025

Official Implementation of Diffusion Step Annealing (DiSA) in Autoregressive Image Generation

Jupyter Notebook 141 1 Updated May 27, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,605 2,235 Updated Feb 1, 2025

Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache).

Python 176 13 Updated Sep 8, 2025

Official PyTorch implementation for "Effective and Efficient Masked Image Generation Models"

Python 27 2 Updated Apr 8, 2025

The official repo of continuous speculative decoding

Python 30 1 Updated Mar 28, 2025

12 Lessons to Get Started Building AI Agents

Jupyter Notebook 44,405 15,055 Updated Nov 10, 2025

[NeurIPS 2024] ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis

Python 24 Updated Nov 28, 2024

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Python 640 42 Updated Oct 16, 2024

Official Implementation of the CrossMAE paper: Rethinking Patch Dependence for Masked Autoencoders

Python 123 7 Updated Apr 10, 2025
Python 39 2 Updated May 20, 2025

[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,768 76 Updated Oct 22, 2025