WhiteFu

speech synthesis & voice conversion & speech enhancement

47 followers · 442 following

Lists (1)

🔮 Future ideas

Starred repositories

eddycmu / demystify-long-cot

Python 327 18 Updated May 31, 2025

sarulab-speech / UTMOSv2

UTokyo-SaruLab MOS Prediction System

Python 265 27 Updated Oct 12, 2025

gyt1145028706 / XY-Tokenizer

This is the code for the paper "XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs". Demos, technical insights and experimental results are presented on

Python 82 5 Updated Sep 19, 2025

hemingkx / Awesome-Efficient-Reasoning

Paper list for Efficient Reasoning.

736 27 Updated Nov 20, 2025

MobileLLM / BudgetThinker

Python 6 Updated Aug 30, 2025

yuelinan / Awesome-Efficient-R1-style-LRMs

45 Updated Aug 14, 2025

ScienceOne-AI / AutoThink

AutoThink is a reinforcement learning framework designed to equip R1-style language models with adaptive reasoning capabilities. Instead of always thinking or never thinking, the model learns when …

Python 42 3 Updated Oct 14, 2025

staymylove / COT_Compresstion_via_Step_entropy

Python 17 Updated Aug 8, 2025

OpenMOSS / MOSS-TTSD

MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting zero-shot multi-speaker voice cloning, and long-form speech…

Python 1,039 91 Updated Nov 4, 2025

k2-fsa / ZipVoice

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 724 94 Updated Nov 12, 2025

sentient-agi / OpenDeepSearch

SOTA search-powered LLM

Python 3,726 343 Updated Apr 4, 2025

VITA-MLLM / Long-VITA

✨✨Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy

Python 306 29 Updated May 14, 2025

Eclipsess / Awesome-Efficient-Reasoning-LLMs

[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

699 34 Updated Oct 20, 2025

MoonshotAI / Kimi-Audio

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 4,368 317 Updated Jun 21, 2025

Sun-Haoyuan23 / Awesome-RL-based-Reasoning-MLLMs

This repository provides a valuable reference for researchers in the field of multimodality. Start your exploration of RL-based Reasoning MLLMs here!

1,279 58 Updated Nov 16, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,144 311 Updated Nov 27, 2025

yzhangchuck / awesome-llm-reasoning-long2short-papers

16 1 Updated Apr 11, 2025

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,820 301 Updated Jun 12, 2025

PeterGriffinJin / Search-R1

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 3,570 302 Updated Nov 13, 2025

om-ai-lab / VLM-R1

Solve Visual Understanding with Reinforced VLMs

Python 5,721 371 Updated Oct 21, 2025

bytedance / MegaTTS3

Python 6,035 465 Updated Aug 29, 2025

Osilly / Vision-R1

This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reas…

Python 730 19 Updated Sep 10, 2025

mtkresearch / TASTE-SpokenLM

A method that directly addresses the modality gap by aligning speech tokens with the corresponding text transcription during the tokenization stage.

Python 99 11 Updated Sep 3, 2025

TianxingChen / Embodied-AI-Guide

[Lumina Embodied AI] Embodied AI Technical Guide (Embodied-AI-Guide)

9,226 616 Updated Sep 22, 2025

Hongcheng-Gao / Awesome-Long2short-on-LRMs

Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains papers, codes, datasets, evaluations, and analyses.

255 9 Updated Aug 13, 2025