unicat2

Fujiitree unicat2

9 followers · 50 following

06:24 (UTC -12:00)

Highlights

Lists (13)

Stars

LTH14 / JiT

PyTorch implementation of JiT https://arxiv.org/abs/2511.13720

Python 1,354 57 Updated Nov 18, 2025

facebookresearch / sam-3d-body

The repository provides code for running inference with the SAM 3D Body Model (3DB), links for downloading the trained model checkpoints and datasets, and example notebooks that show how to use the…

Python 1,838 127 Updated Nov 25, 2025

shiml20 / SVG

Official PyTorch Implementation of "Latent Diffusion Model Without Variational Autoencoder".

Python 347 12 Updated Nov 20, 2025

CVMI-Lab / VFMTok-RAR

(NeurIPS 2025, SOTA) Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation

Python 3 Updated Oct 14, 2025

PKU-ML / ClusterMIM

Official Code for ICLR 2024 Paper: On the Role of Discrete Tokenization in Visual Representation Learning

Python 7 Updated Sep 10, 2024

thu-nics / FrameFusion

[ICCV'25] The official code of paper "Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models"

Python 68 1 Updated Nov 24, 2025

BeingBeyond / Being-VL-0

From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities

12 1 Updated Jul 12, 2025

CVMI-Lab / VFMTok

(NeurIPS 2025) Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation

Python 53 Updated Oct 14, 2025

ThomasMrY / VCT

[NeurIPS 2022] code for "Visual Concepts Tokenization"

Python 23 Updated Oct 10, 2022

dsb-ifi / dHT

Differentiable Hierarchical Visual Tokenization

Python 27 Updated Nov 26, 2025

zhuangshaobin / WeTok

WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction

Python 57 1 Updated Sep 3, 2025

ApexGen-X / MergeVQ

[CVPR] MergeVQ: A Unified Framework for Visual Generation and Representation with Token Merging and Quantization

Python 46 3 Updated Jul 22, 2025

ZhengrongYue / UniFlow

Official Implementation of "UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation"

Jupyter Notebook 124 2 Updated Oct 17, 2025

showlab / FQGAN

FQGAN: Factorized Visual Tokenization and Generation

Python 55 3 Updated Mar 29, 2025

TencentARC / TokLIP

TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation

Python 231 5 Updated Aug 18, 2025

GitGyun / visual_token_matching

[ICLR'23 Oral] Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching

Python 254 13 Updated Oct 13, 2023

mohuangrui / ucasthesis

LaTeX Thesis Template for the University of Chinese Academy of Sciences

TeX 3,710 944 Updated Feb 29, 2024

FoundationVision / InfinityStar

[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation

Python 581 18 Updated Nov 25, 2025

tkarras / progressive_growing_of_gans

Progressive Growing of GANs for Improved Quality, Stability, and Variation

Python 6,168 1,092 Updated Feb 17, 2022

hustvl / LightningDiT

[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Python 1,297 42 Updated Jun 12, 2025

MCG-NJU / DDT

DDT: Decoupled Diffusion Transformer

Python 322 16 Updated Aug 22, 2025

apple / ml-4m

4M: Massively Multimodal Masked Modeling

Python 1,773 110 Updated Jun 2, 2025

guijiejie / Self-supervised-Learning

160 34 Updated Jan 22, 2023

didih02 / pca-dino

PCA-Dino and NCA-Dino are developments of Dino-ViT

Python 5 Updated Aug 18, 2025

hoovercj / vscode-power-mode

Your code is powerful, unleash it! The extension made popular by Code in the Dark has finally made its way to VS Code.

TypeScript 1,214 100 Updated Aug 1, 2024

baaivision / Emu3.5

Native Multimodal Models are World Learners

Python 1,282 45 Updated Nov 19, 2025

researchmm / VQD-SR

[ICCV'23] VQD-SR: Learning Data-Driven Vector-Quantized Degradation Model for Animation Video Super-Resolution

Python 46 4 Updated Jun 19, 2024

bytetriper / RAE

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,587 45 Updated Nov 15, 2025

zcablii / ViTP

Official implementation of "Visual Instruction Pretraining for Domain-Specific Foundation Models"

Python 115 1 Updated Nov 12, 2025

Hhhhhhao / continuous_tokenizer

Python 286 7 Updated May 29, 2025

Lists (13)

AIGC

AR

CS

CV

Diffusion

dMLLM

GU

MLLM

SE

Tool

Vision Foundation Model

visual tokenizers

VL
