Starred repositories
Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions.
哔哩哔哩 (bilibili.com) 辅助工具,可以替换播放器、推送通知并进行一些快捷操作
🌈谷粒-Chrome插件英雄榜, 为优秀的Chrome插件写一本中文说明书, 让Chrome插件英雄们造福人类~ ChromePluginHeroes, Write a Chinese manual for the excellent Chrome plugin, let the Chrome plugin heroes benefit the human~ 公众号「0加1」同步更新
A high-throughput and memory-efficient inference and serving engine for LLMs
Fabric is an open-source framework for augmenting humans using AI. It provides a modular system for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
🦋 An Infographic Generation and Rendering Framework, bring words to life with AI!
Awesome curated collection of images and prompts generated by gemini-2.5-flash-image (aka Nano Banana) state-of-the-art image generation and editing model. Explore AI generated visuals created with…
Open-source platform to build and deploy AI agent workflows.
一键将 Markdown 和网页 AI 对话(ChatGPT/DeepSeek等)完美粘贴到 Word、WPS 和 Excel 的效率工具 | One-click paste Markdown and AI responses (ChatGPT/DeepSeek) into Word, WPS, and Excel perfectly.
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
Taming large-scale few-step training with self-adversarial flows! 👏🏻
The open-source CapCut alternative
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning
🍑 relsim: Relational Visual Similarity | pip install relsim 🌍
[NeurIPS 2025] Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
A framework for building, orchestrating and deploying AI agents and multi-agent workflows with support for Python and .NET.
一个基于nano banana pro🍌的原生AI PPT生成应用,迈向真正的"Vibe PPT"; 支持上传任意模板图片;上传任意素材&智能解析;一句话/大纲/页面描述自动生成PPT;口头修改指定区域、一键导出可编辑ppt - An AI-native PPT generator based on nano banana pro🍌
Implementation of "Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length"
"ViMax: Agentic Video Generation (Director, Screenwriter, Producer, and Video Generator All-in-One)"
AnyTalker: Scaling Multi-person Talking Video Generation with Interactivity Refinement
An Open Source implementation of Notebook LM with more flexibility and features