- National Tsing Hua University
- Taiwan
Starred repositories
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
A library for making PyTorch models streamable
A next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…
Modern and easy to use SQL client for MySQL, Postgres, SQLite, SQL Server, and more. Linux, MacOS, and Windows.
Flash-Muon: An Efficient Implementation of Muon Optimizer
Training and inference code for a reproduction of Google's speech restoration model Miipher-2. Trained models are also released.
SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
The main repository for the Sidekick project, a companion trade tool for Path of Exile and Path of Exile 2.
MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license.
Official code for "DiaMoE-TTS: A Unified IPA-based Dialect TTS Framework with Mixture-of-Experts and Parameter-Efficient Zero-Shot Adaptation"
torchax is a PyTorch frontend for JAX. It lets you author JAX programs using familiar PyTorch syntax. It also provides JAX-PyTorch interoperability, meaning one can mix JAX & Pytor…
Real-time streaming voice anonymization & voice conversion
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
OpenStock is an open-source alternative to expensive market platforms. Track real-time prices, set personalized alerts, and explore detailed company insights — built openly, for everyone, forever f…
FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
Reverse engineering tool for Linux games
Long-form streaming TTS system for multi-speaker dialogue generation
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution