Starred repositories
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
Tongyi Deep Research, the Leading Open-source Deep Research Agent
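🥢 Cook like Laoxiangji 🐔. The main part was completed in 2024; this is not an official Laoxiangji repository. The text comes from the "Laoxiangji Dish Traceability Report" and has been summarized, edited, and organized. CookLikeHOC.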
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
A benchmark for LLMs on complicated tasks in the terminal
MindVL: Towards Efficient and Effective Training of Multimodal Large Language Models on Ascend NPUs
A simple, elegant, and fast workflow to write resumes and CVs in Markdown.
A Fully Self-Hosted Solution for Full-Duplex Voice Interaction
Repo-level benchmark for real-world Code Agents: from repo understanding → env setup → incremental dev/bug-fixing → task delivery, with cost-aware α metric.
[ICCV 2025] Explore the Limits of Omni-modal Pretraining at Scale
ICML 2025 Papers: Dive into cutting-edge research from the premier machine learning conference. Stay current with breakthroughs in deep learning, generative AI, optimization, reinforcement learning…
🛠 Awesome Tools: a curated collection of high-efficiency, practical tools and software resources for programmers to boost office productivity.
Notes on knowledge and interview questions for large language model (LLM) algorithm (application) engineers.
Latest Advances on System-2 Reasoning
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
The first Large Audio Language Model that enables native in-depth thinking, which is trained on large-scale audio Chain-of-Thought data.
Visual R1: Transfer Reasoning Ability from R1 to Visual R1
Latest Advances on Reasoning of Multimodal Large Language Models (Multimodal R1 \ Visual R1) 🍓