Stars
Large Audio Language Model for Natural Voice Interactions - All-in-One Docker Image with 7 Processing Modes
Professional Antigravity Account Manager & Switcher. One-click seamless account switching for Antigravity Tools. Built with Tauri v2 + React (Rust).
Multilingual Voice Understanding Model
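The official Fun-ASR-Nano-2512 repository contains a lot of content and has many deployment pitfalls; this project provides a simplified deployment approach.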
Virtual whiteboard for sketching hand-drawn like diagrams
X-Talk is an open-source full-duplex cascaded spoken dialogue system framework enabling low-latency, interruptible, and human-like speech interaction with a lightweight, pure-Python, production-rea…
Utilizes ONNX Runtime for audio denoising.
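A real-time speech recognition service based on the FunASR SenseVoice model, with advanced features such as speaker identification, audio denoising, and ASR error correction.
Port of FunASR's SenseVoice model in C/C++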
Utilizes ONNX Runtime to transcribe audio into text.
Pseudo Streaming SenseVoice with Hotwords
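Many container images are hosted overseas, for example on gcr. Downloads from within China are slow and need acceleration. This project aims to provide a stable, reliable, and secure container image service that connects the whole world.
A high-performance speech recognition service based on Sherpa-ONNX, supporting real-time VAD (voice activity detection), multilingual speech recognition, and voiceprint recognition.
A simplified deployment approach for the Fun-CosyVoice3-0.5B-2512 speech synthesis service, with quick testing and deployment so that applications can call it.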
Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions.
Android Automation Tool Based on Vision-Language Models
Use ChatGPT on WeChat via Wechaty
This is a speaker-diarization-based speech recognition API implemented using FunASR.
An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone
Burp Suite HTTP traffic monitoring & management extension for security testers
A Next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…
An open source reinforcement learning framework for training, evaluating, and deploying robust trading agents.
The privacy-first, self-hosted CAPTCHA for the modern web.
GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning