Stars
Neural network Quantum Monte Carlo
Adobe Illustrator plugin for assembling scientific figures: supports copy-pasting with relative positioning, batch setting of shape sizes, one-click automatic image arrangement, and one-click subfigure labels.
Dynamic 3D Foundation Model using Causal Transformer
3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
A curated list of awesome papers for reconstructing 4D spatial intelligence from video. (arXiv 2507.21045)
[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
This project provides a script to perform full model fine-tuning on FLUX.1 [dev]. It is adapted from the original DreamBooth training example in the `diffusers` library.
Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capab…
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
[ICCV 2025] SpatialTrackerV2: 3D Point Tracking Made Easy
Official inference repo for FLUX.1 models
Qwen3-Coder is the code version of Qwen3, the large language model series developed by Qwen team, Alibaba Cloud.
[NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents. RAG + Knowledge Graphs + Personali…
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
profintegra / raptor-rag
Forked from parthsarthi03/raptor. The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
Cosmos-Transfer1-DiffusionRenderer: High-quality video de-lighting and re-lighting based on Cosmos video diffusion framework
[CVPR'25 Oral] Official implementation for "DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models"
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
PyTorch implementation of the paper "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"