lcs0215

1 follower · 2 following

Stars

geopavlakos / hamer

HaMeR: Reconstructing Hands in 3D with Transformers

Python 791 104 Updated Mar 22, 2025

TencentCloudADP / youtu-agent

A simple yet powerful agent framework that delivers with open-source models

Python 3,883 379 Updated Nov 28, 2025

Yuanshi9815 / OminiControl

[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer

Python 1,842 140 Updated Jul 3, 2025

redredsheep / PrismLayers

PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models

Jupyter Notebook 22 1 Updated Aug 11, 2025

graphic-design-ai / creatiposter

This repository open-sources CreatiPoster, an AI-driven graphic design generation system for multi-layer and editable compositions with strong visual appeal.

71 2 Updated Jun 14, 2025

ryugo417 / TKG-DM

Python 11 2 Updated Apr 9, 2025

microsoft / art-msra

[CVPR 2025] Official repo for ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

Jupyter Notebook 353 38 Updated Aug 6, 2025

1230young / bizgen

[CVPR 2025] Official inference code for the paper "BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation". Project page: https://bizgen-msra.github.io/

Python 294 40 Updated Apr 5, 2025

wileewang / TransPixeler

CVPR 2025

Python 904 70 Updated May 14, 2025

wyczzy / AIGI-Holmes

(ICCV 2025) Official implementation of AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models

Python 140 4 Updated Jul 22, 2025

OBI-Future / OBI-Bench

[ICLR'25] The first benchmark aiming to evaluate whether LMMs can assist oracle bone inscription processing tasks

Python 4 Updated Jun 9, 2025

fallenshock / FlowEdit

Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"

Python 872 40 Updated Nov 17, 2025

Westlake-AGI-Lab / FlowDirector

Official PyTorch implementation of the paper "FlowDirector: Training-Free Flow Steering for Precise Text-to-Video Editing"

Python 71 2 Updated Jun 23, 2025

zai-org / VisionReward

[AAAI 2026] VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Python 335 9 Updated Mar 26, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,379 471 Updated Oct 27, 2025

ByteDance-Seed / Seed1.5-VL

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,505 59 Updated Jun 14, 2025

HiDream-ai / himar

[ICML 2025] Official Implementation of Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots

Python 29 1 Updated May 28, 2025

SandAI-org / MAGI-1

MAGI-1: Autoregressive Video Generation at Scale

Python 3,562 215 Updated Jun 17, 2025

FoundationVision / LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,903 91 Updated Aug 15, 2024

joanrod / star-vector

StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and te…

Python 4,126 227 Updated Nov 7, 2025

FoundationVision / VAR

[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…

Jupyter Notebook 8,511 547 Updated Nov 10, 2025

VARGPT-family / VARGPT

VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model

Python 343 17 Updated Apr 17, 2025

ali-vilab / VGen

Official repo for VGen: a holistic video generation ecosystem built on diffusion models

Python 3,146 273 Updated Jan 10, 2025

ltzovo / DAMamba

Python 50 3 Updated Mar 17, 2025

ZYM-PKU / UDiffText

[ECCV 2024] Official repo for UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models

Python 233 18 Updated Feb 14, 2025

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 16,769 1,371 Updated Nov 28, 2025

EmbraceAGI / AIGC_Interview

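📚 AIGC job-hunting interview experiences, essential fundamentals, prompt engineering, ChatGPT, Stable Diffusion, Prompt, Embedding, Fine-tuning, and everything else you need to know for an AIGC job search~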

750 59 Updated Jun 26, 2024

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 11,328 1,011 Updated Nov 29, 2025

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 63,263 7,649 Updated Nov 27, 2025

guanhaisu / OBSD

Deciphering Oracle Bone Language with Diffusion Models (ACL 2024 Best Paper)

Python 218 11 Updated Sep 17, 2025
