- Fudan University
- Shanghai, China
- UTC +08:00
- https://chrisding.me
Lists (20)
A-APR
B-Backend
B-Benchmark
C-SourceCode
F-FrameWork
F-Frontend
F-Fun
I-Interesting
L-LLM
O-Other
O-OtherReasearch
P-Paper-List
P-Paper-website
R-Read
R-RewardModel
S-School(fdu)
S-Sii-Lecs
S-Survey
T-Tools
W-World-Model
Stars
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
[Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
A Unified Visual Generator with Interleaved OmniModal Context
A set of examples based on verl for end-to-end RL training recipes.
From Word to World: Can Large Language Models be Implicit Text-based World Models?
HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency
PyTorch code and models for VJEPA2 self-supervised learning from video.
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
Official repository of DARE: dLLM Alignment and Reinforcement Executor
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems. Use when building, optimizing, or debugging agent systems that require e…
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
A unified evaluation toolkit and leaderboard for rigorously assessing the scientific intelligence of large language and vision–language models across the full research workflow.
Towards Scalable Pre-training of Visual Tokenizers for Generation
Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
🔥 JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization
Native and Compact Structured Latents for 3D Generation
MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence
The paper list of "Memory in the Age of AI Agents: A Survey"
📖 This is a repository for organizing papers, code, and other resources related to unified multimodal models.