Shanghai AI Laboratory
Shanghai, China
bujiazi.github.io
Stars
ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation
RLG: Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance
This is a repository to collect training-free algorithms for visual generation and manipulation
Official repo for "IDArb: Intrinsic Decomposition for arbitrary number of input views and illuminations"
Official implementation of "DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training".
[CVPR 2025 Oral] Alias-free Latent Diffusion Models (official implementation)
[NeurIPS 2025 Spotlight] A Generalist Diffusion Model for Vision Perception
S2R-HDR: A Large-Scale Rendered Dataset for HDR Fusion
[ICML 2025, NeurIPS 2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention
[ICLR 2024] Official code for the paper 'Elucidating the Exposure Bias in Diffusion Models'
[ICML 2023] Official implementation of "Input Perturbation Reduces Exposure Bias in Diffusion Models"
[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
G2RPO: Granular GRPO for precise reward in flow models
An official implementation of "SPARK: Synergistic Policy And Reward Co-Evolving Framework"
An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning"
An official implementation of "SIM-CoT: Supervised Implicit Chain-of-Thought"
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Image composition toolbox: everything you want to know about image composition or object insertion
[ICML 2025] SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation
Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction
Official implementation of DiCache: Let Diffusion Model Determine Its Own Cache
Official repository of "Beyond Fixed: Training-Free Variable-Length Denoising for Diffusion Large Language Models"
Wan: Open and Advanced Large-Scale Video Generative Models