HaMeR: Reconstructing Hands in 3D with Transformers

Python 791 104 Updated Mar 22, 2025

A simple yet powerful agent framework that delivers with open-source models

Python 3,883 379 Updated Nov 28, 2025

[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer

Python 1,842 140 Updated Jul 3, 2025

PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models

Jupyter Notebook 22 1 Updated Aug 11, 2025

This repository open-sources CreatiPoster, an AI-driven graphic design generation system for multi-layer and editable compositions with strong visual appeal.

71 2 Updated Jun 14, 2025
Python 11 2 Updated Apr 9, 2025

[CVPR 2025] Official repo for ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

Jupyter Notebook 353 38 Updated Aug 6, 2025

[CVPR 2025] Official inference code for the paper "BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation". Project page: https://bizgen-msra.github.io/

Python 294 40 Updated Apr 5, 2025

CVPR 2025

Python 904 70 Updated May 14, 2025

(ICCV 2025) This repository is the official implementation of AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models

Python 140 4 Updated Jul 22, 2025

[ICLR'25] The first benchmark aiming to evaluate whether LMMs can assist with oracle bone inscription processing tasks

Python 4 Updated Jun 9, 2025

Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"

Python 872 40 Updated Nov 17, 2025

Official PyTorch implementation of the paper "FlowDirector: Training-Free Flow Steering for Precise Text-to-Video Editing"

Python 71 2 Updated Jun 23, 2025

[AAAI 2026] VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Python 335 9 Updated Mar 26, 2025

Open-source unified multimodal model

Python 5,379 471 Updated Oct 27, 2025

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,505 59 Updated Jun 14, 2025

[ICML 2025] Official Implementation of Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots

Python 29 1 Updated May 28, 2025

MAGI-1: Autoregressive Video Generation at Scale

Python 3,562 215 Updated Jun 17, 2025

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,903 91 Updated Aug 15, 2024

StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and te…

Python 4,126 227 Updated Nov 7, 2025

[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…

Jupyter Notebook 8,511 547 Updated Nov 10, 2025

VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model

Python 343 17 Updated Apr 17, 2025

Official repo for VGen: a holistic video generation ecosystem built on diffusion models

Python 3,146 273 Updated Jan 10, 2025
Python 50 3 Updated Mar 17, 2025

[ECCV 2024] Official repo for UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models

Python 233 18 Updated Feb 14, 2025

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 16,769 1,371 Updated Nov 28, 2025

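📚 AIGC job-interview experiences, essential fundamentals, prompt engineering, ChatGPT, Stable Diffusion, Prompt, Embedding, fine-tuning, and everything else you need to know for an AIGC job search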

750 59 Updated Jun 26, 2024

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 11,328 1,011 Updated Nov 29, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 63,263 7,649 Updated Nov 27, 2025

Deciphering Oracle Bone Language with Diffusion Models (ACL 2024 Best Paper)

Python 218 11 Updated Sep 17, 2025