- SJTU X-LANCE & BIGAI NLCo
- China
- https://danjuan-77.github.io/
-
SLAM-LLM-lora-exp Public
Forked from cwx-worst-one/SLAM-LLM
Beta version for SLAM-LLM
Python MIT License Updated Oct 27, 2025 -
UltraVoice100K Public
This is the official repository for the UltraVoice100K dataset, providing code and dataset samples.
-
URO-Bench Public
Forked from Ruiqi-Yan/URO-Bench
Towards Comprehensive Benchmark for End-to-End Spoken Dialogue Models
Shell MIT License Updated Aug 31, 2025 -
danjuan-77.github.io Public
Forked from RayeRen/acad-homepage.github.io
AcadHomepage: A Modern and Responsive Academic Personal Homepage
JavaScript MIT License Updated Aug 30, 2025 -
OpenS2S Public
Forked from CASIA-LM/OpenS2S
OpenS2S: Advancing Fully Open-Source End-to-End Empathetic Large Speech Language Model
Python Updated Aug 27, 2025 -
GLM-4-Voice Public
Forked from zai-org/GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
Python Apache License 2.0 Updated Aug 27, 2025 -
Kimi-Audio Public
Forked from MoonshotAI/Kimi-Audio
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
Python Updated Aug 12, 2025 -
MIO Public
Forked from MIO-Team/MIO
MIO: A Foundation Model on Multimodal Tokens
Python Updated Jul 31, 2025 -
Qwen2.5-Omni Public
Forked from QwenLM/Qwen2.5-Omni
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.
Jupyter Notebook Apache License 2.0 Updated Jul 22, 2025 -
F5-TTS Public
Forked from SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Python MIT License Updated Jun 18, 2025 -
EmoVoice Public
Forked from yanghaha0908/EmoVoice
Official code for "EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting"
Python Updated May 27, 2025 -
CosyVoice Public
Forked from FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Python Apache License 2.0 Updated May 20, 2025 -
InternLM-XComposer Public
Forked from InternLM/InternLM-XComposer
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
Python Apache License 2.0 Updated May 14, 2025 -
SALMONN Public
Forked from bytedance/SALMONN
SALMONN: Speech Audio Language Music Open Neural Network
Python Apache License 2.0 Updated May 14, 2025 -
MiniCPM-o Public
Forked from OpenBMB/MiniCPM-V
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Python Apache License 2.0 Updated May 14, 2025 -
NExT-GPT Public
Forked from NExT-GPT/NExT-GPT
Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model
Python BSD 3-Clause "New" or "Revised" License Updated May 14, 2025 -
VITA Public
Forked from VITA-MLLM/VITA
✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Python Other Updated May 14, 2025 -
Ola Public
Forked from Ola-Omni/Ola
Ola: Pushing the Frontiers of Omni-Modal Language Model
Python Apache License 2.0 Updated May 14, 2025 -
Awesome-Colorful-LLM Public
Forked from patrick-tssn/Awesome-Colorful-LLM
Recent advancements propelled by large language models (LLMs), encompassing an array of domains including Vision, Audio, Agent, Robotics, Fundamental Sciences such as Mathematics, and Ominous.
MIT License Updated Apr 28, 2025 -
async_cosyvoice Public
Forked from qi-hua/async_cosyvoice
Accelerating CosyVoice2 inference with vLLM
Jupyter Notebook Apache License 2.0 Updated Apr 26, 2025 -
mini-omni2 Public
Forked from gpt-omni/mini-omni2
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
Python MIT License Updated Apr 23, 2025 -
SpeechCraft Public
Forked from thuhcsi/SpeechCraft
The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.
Python Updated Apr 14, 2025 -
nn-zero-to-hero Public
Forked from karpathy/nn-zero-to-hero
Neural Networks: Zero to Hero - [My learning notes]
Jupyter Notebook MIT License Updated Nov 2, 2024 -
KV-Reuse-Not-KV-Evict Public
This repository contains the code for my experiments on inference acceleration using different methods based on the Phi3_mini model.
Python Updated Jul 19, 2024