Highlights

  • Pro

Organizations

@AcademiaMeow

Starred repositories

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome

582 67 Updated Nov 13, 2024

LALM knowledge editing

Python 5 Updated Nov 1, 2025
JavaScript 2 Updated Oct 12, 2025

Game-Time: Evaluating Temporal Dynamics in Spoken Language Models

4 Updated Oct 7, 2025

HighRateMOS is the first non-intrusive MOS prediction model that explicitly models sampling rates, achieving first place in five out of eight metrics in AudioMOS Challenge 2025 Track3.

10 Updated Sep 15, 2025

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

2,082 253 Updated Jun 6, 2024

A fancy self-hosted monitoring tool

JavaScript 78,910 7,010 Updated Nov 25, 2025

A list of publicly available audio data that anyone can download for ASR or other speech activities

Shell 231 22 Updated Aug 6, 2021

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 84,609 9,580 Updated Nov 25, 2025
JavaScript 1 Updated Jul 22, 2025

PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models

873 71 Updated Nov 19, 2025

Code for DeSTA2.5-Audio

Python 122 7 Updated Aug 7, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 25,510 1,783 Updated Oct 13, 2025

SOTA Open Source TTS

Python 24,175 1,973 Updated Nov 6, 2025

Leaderboard and code for "Speech-IFEval", Interspeech 2025

Python 22 1 Updated May 27, 2025

Collection of works for evaluating (and analyzing) large audio-language models (LALMs)

40 Updated Aug 11, 2025

Official GitHub repository for paper "SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information" (Interspeech 2025)

Python 19 3 Updated Aug 14, 2025

⏩ Ship faster with Continuous AI. Open-source CLI that can be used in TUI mode as a coding agent or Headless mode to run background agents

TypeScript 29,994 3,817 Updated Nov 25, 2025
TypeScript 27,604 2,160 Updated Aug 7, 2025

A Benchmark for Evaluating Turn-Taking and Overlap Handling in Full-Duplex Spoken Dialogue Models

Python 104 4 Updated Sep 21, 2025

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 18,875 1,640 Updated Nov 19, 2025

A collaborative note-taking, wiki, and documentation platform that scales. Built with Django and React.

Python 14,855 463 Updated Nov 25, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 48,673 4,005 Updated Nov 25, 2025

Audio-FLAN

Jupyter Notebook 160 5 Updated Sep 23, 2025

Train transformer language models with reinforcement learning.

Python 16,423 2,312 Updated Nov 25, 2025

A privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Golang.

TypeScript 39,201 2,417 Updated Nov 25, 2025

The AI Code Editor

31,735 2,118 Updated Nov 19, 2025

Voice gender classifier using ECAPA-TDNN

Python 61 8 Updated Jan 24, 2025
Python 4,561 366 Updated Jun 12, 2025