I love coffee

Part-X-MLLM: Part-aware 3D Multimodal Large Language Model

79 2 Updated Nov 18, 2025

SAM 3D Objects

Python 3,696 254 Updated Nov 21, 2025

Depth Anything 3

Jupyter Notebook 2,800 200 Updated Nov 25, 2025

Educational implementation of the Discrete Flow Matching paper

Jupyter Notebook 126 7 Updated Aug 26, 2024

Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)

Python 1,933 138 Updated Oct 23, 2025

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO

Python 7,325 1,010 Updated Jul 3, 2024
Jupyter Notebook 395 20 Updated Nov 14, 2025

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 8,514 605 Updated Nov 20, 2025

StyleShot: A SnapShot on Any Style. A model that can transfer any style to any content, generating high-quality personalized stylized images without per-image fine-tuning!

Python 441 39 Updated Jun 30, 2025

Code of ICCV 2023 paper titled General Image-to-Image Translation with One-Shot Image Guidance

Python 176 11 Updated Aug 26, 2023

Official Code for ICCV 2025 paper — Beyond Isolated Words: Diffusion Brush for Handwritten Text-Line Generation

Python 102 7 Updated Nov 24, 2025

A collection of AWESOME things about information geometry

172 16 Updated Jul 4, 2024

An xray/v2ray client for iOS/macOS that supports vmess/vless/shadowsocks

Objective-C 16 21 Updated Oct 27, 2023

Official implementation of "UniLiP: Adapting CLIP for Unified Multimodal Understanding, Generation and Editing"

Python 105 2 Updated Nov 21, 2025
Python 247 6 Updated Oct 21, 2025

Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Python 345 12 Updated Nov 25, 2025

Official PyTorch Implementation of "Latent Diffusion Model Without Variational Autoencoder".

Python 346 12 Updated Nov 20, 2025
Python 30 4 Updated Nov 13, 2025

Contexts Optical Compression

Python 20,871 1,830 Updated Oct 25, 2025

Official Repo For "Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos"

Python 1,438 102 Updated Nov 20, 2025

On-device TTS model by Neuphonic

Python 4,073 412 Updated Nov 18, 2025

The best ChatGPT that $100 can buy.

Python 37,512 4,589 Updated Nov 17, 2025

Pure TypeScript media toolkit for reading, writing, and converting video and audio files, directly in the browser.

TypeScript 4,527 159 Updated Nov 24, 2025

[SIGGRAPH Asia 2025] OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion

Python 149 8 Updated Nov 6, 2025

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Python 398 34 Updated Sep 11, 2023

PyTorch code and models for V-JEPA self-supervised learning from video.

Python 3,287 326 Updated Feb 27, 2025

Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive arch…

Python 3,131 431 Updated May 8, 2024

Code for the paper "Conditional Representation Learning for Customized Tasks" (NeurIPS 2025 Spotlight)

Python 36 2 Updated Oct 11, 2025

[NeurIPS 2025] T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Python 417 24 Updated Sep 18, 2025

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 2,510 113 Updated Oct 31, 2025