- Northwestern Polytechnical University
- China
Pinned
-
Qwen3-Omni (Public, forked from QwenLM/Qwen3-Omni)
Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud. It understands text, audio, images, and video, and generates speech in real time.
Jupyter Notebook
-
Qwen2.5-Omni (Public, forked from QwenLM/Qwen2.5-Omni)
Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud. It understands text, audio, images, and video, and generates speech in real time.
Jupyter Notebook
-
SoulX-Podcast (Public, forked from Soul-AILab/SoulX-Podcast)
SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.
Python
-
OSUM (Public, forked from ASLP-lab/OSUM)
OSUM (Open Speech Understanding Model) is open-sourced by ASLP@NPU.
Python