Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team

Python 15,524 1,597 Updated May 18, 2025

wangxiongts / vllm

Python 17 13 Updated Dec 8, 2025

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 12,469 1,949 Updated Oct 20, 2025

QwenLM / Qwen3-Omni

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,191 196 Updated Oct 9, 2025

microsoft / UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Python 474 74 Updated Apr 5, 2024

qiuqiangkong / panns_inference

Python 254 39 Updated Mar 5, 2024

qiuqiangkong / audioset_tagging_cnn

Python 1,645 294 Updated Jul 25, 2024

MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi

Python 1,706 276 Updated Nov 15, 2025

pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 8,913 985 Updated Dec 13, 2025

gabrielmittag / NISQA

NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment

Python 900 147 Updated Dec 1, 2024

TEN-framework / ten-vad

Voice Activity Detector (VAD): low-latency, high-performance, and lightweight

C 1,856 146 Updated Dec 23, 2025

slhck / ffmpeg-normalize

Audio Normalization for Python/ffmpeg

HTML 1,460 125 Updated Dec 28, 2025

facebookresearch / audiobox-aesthetics

Unified automatic quality assessment for speech, music, and sound.

Python 653 48 Updated Jun 5, 2025

FeipengMa6 / VSC22-Submission

[CVPR 2023 Workshop] Code to reproduce the results of our solutions on both tracks of the Meta AI Video Similarity Challenge (CVPR 2023 Workshop). Our solutions took first place on both tracks, in…

Python 54 12 Updated May 30, 2023

v-iashin / Synchformer

Source code for "Synchformer: Efficient Synchronization from Sparse Cues" (ICASSP 2024)

Python 101 9 Updated Sep 15, 2025

Tencent-Hunyuan / HunyuanVideo-Foley

HunyuanVideo-Foley: Multimodal Diffusion with Representation Alignment for High-Fidelity Foley Audio Generation.

Python 1,293 98 Updated Sep 28, 2025

songkq

Stars

snakers4 / silero-vad

YZY-stack / Effort-AIGI-Detection

tmlr-group / ConV

stdstu12 / YUME

roy-ch / Dual-Data-Alignment

video-reality-test / video-reality-test

Ekko-zn / AIGCDetectBenchmark

JoeLeelyf / Skyra

Pi3AI / Ivy-Fake

mattwright324 / youtube-metadata

FunAudioLLM / Fun-ASR

SubtitleEdit / subtitleedit

zai-org / GLM-ASR

mifi / lossless-cut

Huanshere / VideoLingo