leeoxiang notedit

🎯

Focusing

Hiring audio/video R&D engineers

753 followers · 224 following

Achievements

x3 x2

Lists (1)

AI

51 repositories

Stars

MrLionware / screencapturekit-audio-capture

Native Node.js addon for capturing per-app audio on macOS using ScreenCaptureKit. Real-time audio streaming with event-based API

TypeScript 1 1 Updated Nov 26, 2025

pickle-com / glass

Digital Mind Extension

JavaScript 6,998 1,069 Updated Oct 26, 2025

ASLP-lab / MeanVC

A Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows

Python 162 8 Updated Nov 24, 2025

Phantom-video / HuMo

HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning

Python 934 137 Updated Oct 19, 2025

HumanAIGC / lite-avatar

Python 392 62 Updated Jun 30, 2025

eric-ai-lab / EvoPresent

Official codebase for the paper "Presenting a Paper is an Art: Self-Improvement Aesthetic Agents for Academic Presentations"

Python 324 20 Updated Oct 14, 2025

mengelbart / moqtransport

Media over QUIC Transport Implementation

Go 67 12 Updated Nov 20, 2025

xiquan-li / MeanAudio

MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows

Python 108 10 Updated Sep 2, 2025

FluidInference / FluidAudio

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.

Swift 979 116 Updated Nov 27, 2025

krea-ai / realtime-video

Krea Realtime 14B. An open-source realtime AI video model.

Python 398 22 Updated Nov 13, 2025

MeiGen-AI / InfiniteTalk

Unlimited-length talking video generation that supports image-to-video and video-to-video generation

Python 3,466 553 Updated Aug 25, 2025

MahmoudAshraf97 / ctc-forced-aligner

Text to speech alignment using CTC forced alignment

Python 392 72 Updated Nov 26, 2025

realtime-ai / realtime-mesage

Realtime messaging based on Socket.IO and Redis

TypeScript 1 Updated Nov 16, 2025

realtime-ai / realtime-audio-sdk

Realtime Audio SDK for the Web — audio capture, echo cancellation (AEC), voice activity detection (VAD), and real-time encoding (Opus/PCM).

TypeScript 118 6 Updated Nov 25, 2025

Xiaobin-Rong / gtcrn

The official implementation of GTCRN, an ultra-lightweight SE model.

Python 496 81 Updated May 28, 2025

shuheikatoinfo / UtterTune

LoRA-based phoneme/prosody control for LLM-based TTS with no G2P - Lightweight adapter for editing and controlling the target language's phoneme-level pronunciation and prosody while preserving other lang…

Python 19 9 Updated Aug 14, 2025

Wan-Video / Wan2.2

Wan: Open and Advanced Large-Scale Video Generative Models

Python 12,102 1,374 Updated Nov 14, 2025

slopus / happy

Mobile and Web client for Codex and Claude Code, with realtime voice, encryption and fully featured

TypeScript 4,209 320 Updated Oct 4, 2025

bytedance / UI-TARS

Python 8,278 580 Updated Nov 12, 2025

S-LoRA / S-LoRA

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,873 116 Updated Jan 21, 2024

AIDC-AI / Marco-Voice

A Unified Framework for Expressive Speech Synthesis with Voice Cloning

Python 383 33 Updated Aug 18, 2025

boson-ai / higgs-audio

Text-audio foundation model from Boson AI

Python 7,668 568 Updated Sep 15, 2025

qjfoidnh / BaiduPCS-Go

Built on the original iikira/BaiduPCS-Go, with added support for saving share links and rapid-upload links to your own drive

Go 4,225 561 Updated Nov 8, 2025

TW-NLP / ChineseErrorCorrector

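A comprehensive platform for Chinese text error correction that brings together academic research, model training, model evaluation, and inference deployment, covering two core areas: spelling correction and grammatical error correction.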

Python 437 35 Updated Nov 26, 2025

mastra-ai / mastra

The TypeScript AI agent framework. ⚡ Assistants, RAG, observability. Supports any LLM: GPT-4, Claude, Gemini, Llama.

TypeScript 18,511 1,313 Updated Nov 27, 2025

PoTaTo-Mika / Shore-Data-Engine

A codebase for data crawling and preprocessing for TTS and ASR systems training.

Python 19 5 Updated Nov 26, 2025

FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 17,407 1,917 Updated Oct 21, 2025

google / langextract

A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

Python 16,988 1,203 Updated Nov 27, 2025

coze-dev / coze-studio

An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. Coze your way to AI Agent creation.

TypeScript 18,711 2,626 Updated Nov 20, 2025

MatthewCYM / VoiceBench

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 300 19 Updated Aug 22, 2025