- Seoul, Republic of Korea
- UTC +09:00
- https://www.linkedin.com/in/kdrkdrkdr
- https://elnino.kr
Highlights
TTS
Foundational model for human-like, expressive TTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
'Grad-TTS' with Multilingual Cleaners
High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
A non-autoregressive, Transformer-based text-to-speech system supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, …