
Native Node.js addon for capturing per-app audio on macOS using ScreenCaptureKit. Real-time audio streaming with an event-based API.

TypeScript 1 1 Updated Nov 26, 2025

Digital Mind Extension

JavaScript 6,997 1,069 Updated Oct 26, 2025

A Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows

Python 161 8 Updated Nov 24, 2025

HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning

Python 927 137 Updated Oct 19, 2025
Python 391 62 Updated Jun 30, 2025

Official codebase for the paper "Presenting a Paper is an Art: Self-Improvement Aesthetic Agents for Academic Presentations"

Python 324 20 Updated Oct 14, 2025

Media over QUIC Transport Implementation

Go 66 12 Updated Nov 20, 2025

MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows

Python 108 10 Updated Sep 2, 2025

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.

Swift 973 116 Updated Nov 27, 2025

Krea Realtime 14B. An open-source realtime AI video model.

Python 396 22 Updated Nov 13, 2025

Unlimited-length talking video generation that supports image-to-video and video-to-video generation

Python 3,446 550 Updated Aug 25, 2025

Text to speech alignment using CTC forced alignment

Python 392 71 Updated Nov 26, 2025

Real-time messaging based on Socket.IO and Redis

TypeScript 1 Updated Nov 16, 2025

Realtime Audio SDK for the Web — audio capture, echo cancellation (AEC), voice activity detection (VAD), and real-time encoding (Opus/PCM).

TypeScript 118 6 Updated Nov 25, 2025

The official implementation of GTCRN, an ultra-lightweight SE model.

Python 496 80 Updated May 28, 2025

LoRA-based phoneme/prosody control for LLM-based TTS with no G2P - Lightweight adapter for editing and controlling the target language's phoneme-level pronunciation and prosody while preserving other lang…

Python 19 9 Updated Aug 14, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 12,081 1,367 Updated Nov 14, 2025

Mobile and Web client for Codex and Claude Code, with realtime voice, encryption and fully featured

TypeScript 4,203 319 Updated Oct 4, 2025
Python 8,273 580 Updated Nov 12, 2025

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,873 116 Updated Jan 21, 2024

A Unified Framework for Expressive Speech Synthesis with Voice Cloning

Python 383 33 Updated Aug 18, 2025

Text-audio foundation model from Boson AI

Python 7,668 568 Updated Sep 15, 2025

Built on the original iikira/BaiduPCS-Go, with added support for saving share links and instant-upload links to your own storage

Go 4,223 561 Updated Nov 8, 2025

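A comprehensive platform for Chinese text correction that brings together academic research, model training, model evaluation, and inference deployment, covering the two core tasks of spelling correction and grammatical error correction.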

Python 437 35 Updated Nov 26, 2025

The TypeScript AI agent framework. ⚡ Assistants, RAG, observability. Supports any LLM: GPT-4, Claude, Gemini, Llama.

TypeScript 18,508 1,312 Updated Nov 27, 2025

A codebase for data crawling and preprocessing for TTS and ASR systems training.

Python 19 5 Updated Nov 26, 2025

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 17,391 1,914 Updated Oct 21, 2025

A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

Python 16,985 1,202 Updated Nov 26, 2025

An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. Coze your way to AI Agent creation.

TypeScript 18,699 2,627 Updated Nov 20, 2025

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 300 19 Updated Aug 22, 2025