Stars
A dataset of 222 digital musical scores aligned with 1068 performances (more than 92 hours) of Western classical piano music.
Generative models for conditional audio generation
Web video downloader, mainly supporting Bilibili, iQIYI, Tencent Video, MGTV, WeTV, and the iQIYI Taiwan site.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
1 minute of voice data can also be used to train a good TTS model! (few-shot voice cloning)
Guidance for courses in the Department of Computer Science and Technology, Tsinghua University
Plainchant Analyser tool for MEI Neumes (PAM)
Transformer: PyTorch Implementation of "Attention Is All You Need"
Deezer source separation library including pretrained models.
A list of tools, papers and code related to Fake Audio Detection.
A curated list of awesome articles, tutorials, libraries, webpages, etc.
An "awesome music theory" kinda wiki with books, resources and courses for studying everything about music and sound
Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats
✨ AsrTools: Smart Voice-to-Text Tool | Efficient Batch Processing | User-Friendly Interface | No GPU Required | Supports SRT/TXT Output | Turn your audio into accurate text in an instant!
Using large AI models to automatically provide commentary and edit videos with a single click.
OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
A massively spiffy yet delicately unobtrusive compression library.
An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices