USTC
- China
Stars
DiffusionNFT: Online Diffusion Reinforcement with Forward Process
Native Multimodal Models are World Learners
Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model
Optimal Transport Aggregation for Visual Place Recognition
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
Code release for Ming-UniVision: Joint Image Understanding and Generation with a Continuous Unified Tokenizer
Official implementation of HPSv3: Towards Wide-Spectrum Human Preference Score (ICCV2025)
A SOTA open-source image editing model that aims to match the performance of closed-source models such as GPT-4o and Gemini 2 Flash.
Evaluation code for Ref-L4, a new REC benchmark in the LMM era
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
Unified layout planning and image generation, ICCV2025
Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning
[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Enjoy the magic of Diffusion models!
Official PyTorch implementation of One-Minute Video Generation with Test-Time Training
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
A demo for the Direct Ascent Synthesis: Hidden Generative Capabilities in Discriminative Models paper (https://arxiv.org/abs/2502.07753)
This is a repo to track the latest autoregressive visual generation papers.
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
[ICLR2025] A versatile image-to-image visual assistant, designed for image generation, manipulation, and translation based on free-form user instructions.
This is a list of papers on the topic of how machine learning methods (including AI/LLM) are leveraged for specific tasks in quantum physics scenarios. (ML/AI/LLM for quantum science)