Stars
🚀 Truly open-source AI avatar (digital human) toolkit for offline video generation and digital human cloning.
ACE-Step: A Step Towards Music Generation Foundation Model
Kortix – build, manage and train AI Agents. Fully Open Source.
Desk-Emoji is a truly open-source AI desktop robot featuring an emoji screen, a two-axis console, and LLM capabilities for voice chat.
Making a mini version of the BDX droid. https://discord.gg/UtJZsgfQGe
[NeurIPS2025] "AI-Researcher: Autonomous Scientific Innovation" -- A production-ready version: https://novix.science/chat
MaoTouHU / byte-langmanus
Forked from Darwin-lfl/langmanus. A community-driven AI automation framework that builds upon the incredible work of the open source community. Our goal is to combine language models with specialized tools for tasks like web search…
Implementation of "EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer" (ICCV 2025)
Create Epic Math and Physics Animations & Study Notes From Text and Images.
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Mobile-Agent: The Powerful GUI Agent Family
This is the repo for the LegalBench-RAG Paper: https://arxiv.org/abs/2408.10343.
Make Azure natural TTS voices accessible to any SAPI 5-compatible application.
Production-ready platform for agentic workflow development.
Backend service for xiaozhi-esp32 that helps you quickly build an ESP32 device control server.
RAG Web UI is an intelligent dialogue system based on RAG (Retrieval-Augmented Generation) technology.
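Bailing (百聆) is a GPT-4o-like voice chat robot built on ASR + LLM + TTS. It integrates strong large models such as DeepSeek R1, reaches latency as low as 800 ms, runs on low-spec machines such as Macs, and supports interruption.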
[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats. AI-based full-text bilingual translation of PDF documents that fully preserves the layout; supports Google/DeepL/Ollama/OpenAI and other services, and provides CLI/GUI/MCP/Docker/Zotero.
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
Instructions on how to use the Realtime API on Microcontrollers and Embedded Platforms
Simple, unified interface to multiple Generative AI providers