Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,110 37 Updated Oct 4, 2025

song630 / Awesome-Image-Video-Diffusion-Post-Training

4 Updated Sep 2, 2025

Steven-Xiong / GroundingBooth

Python 6 Updated Jul 21, 2025

yunlong10 / MMPerspective

[NeurIPS 2025] A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness

12 Updated Nov 9, 2025

hanghuacs / MMIG-Bench

7 Updated Jun 20, 2025

ThreeSR / Awesome-Inference-Time-Scaling

Paper List of Inference/Test Time Scaling/Computing

Python 320 9 Updated Aug 28, 2025

yunlong10 / CAT-V

[AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal Prompting

Python 58 4 Updated Oct 30, 2025

hanghuacs / FineCaption

HTML 37 1 Updated Jun 20, 2025

yunlong10 / AVicuna

[AAAI 2025] Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding

Python 33 Updated Mar 21, 2025

eisneim / ltx_lora_training_i2v_t2v

LoRA training script for Lightricks LTX-Video

Python 66 4 Updated Feb 12, 2025

tdrussell / diffusion-pipe

A pipeline parallel training script for diffusion models.

Python 1,701 229 Updated Nov 7, 2025

NUS-HPC-AI-Lab / VideoSys

VideoSys: An easy and efficient system for video generation

Python 2,005 132 Updated Aug 27, 2025

chaofengc / IQA-PyTorch

👁️ 🖼️ 🔥 PyTorch Toolbox for Image Quality Assessment, including PSNR, SSIM, LPIPS, FID, NIQE, NRQM (Ma), MUSIQ, TOPIQ, NIMA, DBCNN, BRISQUE, PI and more...

Python 2,883 227 Updated Oct 20, 2025

Lightricks / LTX-Video

Official repository for LTX-Video

Python 8,747 805 Updated Oct 25, 2025

BestJunYu / Awesome-Physics-aware-Generation

Physical laws underpin all existence, and harnessing them for generative modeling opens boundless possibilities for advancing science and shaping the future!

234 5 Updated Apr 21, 2025

McGill-NLP / AURORA

Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation

Python 30 2 Updated Jun 30, 2025

google-deepmind / physics-IQ-benchmark

Benchmarking physical understanding in generative video models

Python 219 19 Updated Oct 28, 2025

IamCreateAI / LayerAnimate

[ICCV 2025] LayerAnimate: Layer-specific Control for Animation

Python 192 7 Updated Aug 22, 2025

Yizhi Song song630

Stars

facebookresearch / dinov3

baaivision / Emu3.5

XueZeyue / DanceGRPO

BestiVictory / VADB

VectorSpaceLab / EditScore

apple / pico-banana-400k

HowieHwong / Agentic-Guardian

MizzenAI / HPSv3

KwaiVGI / VideoAlign

QwenLM / Qwen3-VL

xie-lab-ml / awesome-alignment-of-diffusion-models

yifan123 / flow_grpo

zhaochen0110 / Awesome_Think_With_Images