seastar105

HAESUNG JEON seastar105

63 followers · 94 following

Lists (5)

Stars

AI-S2-Lab / GPT-Talker

[ACMMM'2024] Generative Expressive Conversational Speech Synthesis

42 2 Updated Oct 28, 2024

jordan-gibbs / hypercheap-voiceAI

The most cost-effective, highest performance AI voice agent possible today

Python 78 11 Updated Oct 31, 2025

OpenMOSS / MOSS-Speech

MOSS-Speech is a true speech-to-speech large language model without text guidance.

Python 103 5 Updated Oct 2, 2025

apple / dmel

Python 19 3 Updated May 13, 2025

KdaiP / DC-Speech-VAE

5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs

Python 45 9 Updated Nov 19, 2025

stepfun-ai / Step-Audio-R1

Python 252 16 Updated Nov 27, 2025

ASLP-lab / MeanVC

A Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows

Python 162 8 Updated Nov 24, 2025

idiap / OTTC

A Differentiable Alignment Framework for Sequence-to-Sequence Modeling via Optimal Transport

Python 11 2 Updated Nov 18, 2025

supertone-inc / supertonic

Lightning-fast, on-device TTS — running natively via ONNX.

JavaScript 1,207 95 Updated Nov 27, 2025

cornserve-ai / cornserve

Easy, Fast, and Scalable Multimodal AI

Python 73 5 Updated Nov 24, 2025

tencent-ailab / BridgeVocoder

The official repo of BridgeVoC, which explores using the Schrödinger Bridge framework for neural vocoding.

Python 188 36 Updated Nov 20, 2025

rom1504 / queue_as_dataset

A prototype implementation of the "dataset as a queue" pattern for processing web pages into interleaved image/text content.

Python 27 Updated Nov 16, 2025

auspicious3000 / ProsodyLM

ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models

Python 31 3 Updated Nov 18, 2025

nvidia-china-sae / mair-hub

Jupyter Notebook 65 17 Updated Nov 25, 2025

corticph / error-align

Text-to-text alignment algorithm for speech recognition error analysis.

Python 23 1 Updated Nov 24, 2025

videosdk-live / NAMO-Turn-Detector-v1

High-performance, semantic turn detection for conversational AI

Python 26 3 Updated Oct 1, 2025

kamperh / linearvc

Voice conversion with just linear regression.

Jupyter Notebook 31 3 Updated Sep 25, 2025

oliverguhr / deepmultilingualpunctuation

A Python package for deep multilingual punctuation prediction.

Python 151 34 Updated Aug 21, 2024

facebookresearch / omnilingual-asr

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,291 185 Updated Nov 19, 2025

hdaeun98 / emo100db

Emo100Songs: An Open Dataset of Improvised Songs with Emotion Data

6 1 Updated Nov 12, 2025

itsmekhoathekid / TASA

TASA & Speech Transformer implementation

Python 4 2 Updated Nov 8, 2025

DuyguA / TSD2025-CTC-LLM-based-Loss-Regularization

Python 4 2 Updated Aug 16, 2025

meituan-longcat / LongCat-Flash-Omni

This is the official repo for the paper "LongCat-Flash-Omni Technical Report"

Python 423 23 Updated Nov 25, 2025

Deep-unlearning / Llasa-GRPO

Python 15 Updated Nov 19, 2025

Andong-Li-speech / BridgeVoC

This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective".

Python 55 3 Updated Nov 5, 2025

Lab-MSP / NaturalVoices

Jupyter Notebook 26 6 Updated Oct 28, 2025

mathllm / VoiceAssistant-Eval

A rigorous framework for evaluating and guiding the development of next-generation AI assistants.

Python 17 1 Updated Oct 14, 2025

MoonshotAI / Kimi-Linear

1,212 54 Updated Nov 17, 2025

BUTSpeechFIT / DiCoW

Python 70 6 Updated Oct 9, 2025

Soul-AILab / SoulX-Podcast

SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.

Python 2,349 279 Updated Nov 27, 2025

Lists (5)

diffusion

drive-model

generation

tts-dataset

vocoder
