[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…

Jupyter Notebook 8,495 547 Updated Nov 10, 2025

NVlabs / QLIP

[arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation

Jupyter Notebook 94 3 Updated Mar 1, 2025

IDEA-Research / X-Pose

[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"

Python 763 38 Updated Aug 16, 2024

ShivamDuggal4 / adaptive-length-tokenizer

Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth ?

Python 138 7 Updated Feb 11, 2025

facebookresearch / blt

Code for BLT research paper

Python 2,011 182 Updated Nov 3, 2025

lllyasviel / IC-Light

More relighting!

Python 8,300 520 Updated Feb 20, 2025

JiePKU / MoLE

An official pytorch implementation of "MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts"

Python 34 Updated Nov 21, 2024

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,027 2,665 Updated Aug 12, 2024

hutaiHang / ToMe

[NeurIPS 2024] Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis

Python 81 5 Updated Feb 3, 2025

nv-tlabs / LLaMA-Mesh

Unifying 3D Mesh Generation with Language Models

Python 1,125 71 Updated Mar 28, 2025

cvg / depthsplat

[CVPR'25] DepthSplat: Connecting Gaussian Splatting and Depth

Python 1,070 53 Updated Apr 27, 2025

TencentARC / MotionCtrl

Official Code for MotionCtrl [SIGGRAPH 2024]

Python 1,466 76 Updated Feb 19, 2025

hehao13 / CameraCtrl

Python 614 28 Updated May 24, 2024

geopavlakos / hamer

HaMeR: Reconstructing Hands in 3D with Transformers

Python 785 101 Updated Mar 22, 2025

microsoft / MeshGraphormer

Research code of ICCV 2021 paper "Mesh Graphormer"

Python 416 55 Updated Jul 6, 2023

jyLin8100 / GenSAM

Code for AAAI 2024 paper: Relax Image-Specific Prompt Requirement in SAM: A Single Generic Prompt for Segmenting Camouflaged Objects

Jupyter Notebook 159 6 Updated Mar 4, 2025

Nightmare-n / DepthAnyVideo

Depth Any Video with Scalable Synthetic Data (ICLR 2025)

Python 508 29 Updated Dec 4, 2024

facebookresearch / InterWild

Official PyTorch implementation of "Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild", CVPR 2023

Python 193 15 Updated Jul 10, 2024

wenquanlu / HandRefiner

[ACM MM 2024] Official Code for "HandRefiner: Refining Malformed Hands in Generated Images by Diffusion-based Conditional Inpainting"

Python 804 37 Updated Oct 31, 2024

facebookresearch / dino

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO

Python 7,322 1,010 Updated Jul 3, 2024

facebookresearch / dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.

Jupyter Notebook 11,936 1,123 Updated Aug 17, 2025

Qifan Fu fuqifan

Starred repositories
