rezzsl

National Taiwan University PhD; Member @ Academia Sinica Bio-ASP Lab & NTU Speech Processing and Machine Learning Laboratory

44 followers · 108 following

National Taiwan University
Taiwan
https://scholar.google.com.tw/citations?user=w5F00dYAAAAJ&hl=zh-TW

Highlights

Pro

Lists (31)

attention

audio preprocess

Audio-Visual

18 repositories

AVSE3

Bias

Challenge

codec

Cuda

cv

Dataset

deepfake

ECG

energy_efficient_streamin_SE

Addition is all you need for energy-efficient streaming speech enhancement

GNN

knowledge distillation

leetcode

LLM

31 repositories

mamba

10 repositories

multichannel

Optimal Transport

paper reading

pytorch-study

11 repositories

speech assessment

13 repositories

Speech enhancement

23 repositories

Speech Separation

16 repositories

SSL

text&speech

TTS

urgent-Challenge2026

vocal burst

Experiment logging tools

Stars

Soul-AILab / SAC

Training, inference, and testing of the SAC speech codec model.

Python 18 2 Updated Oct 21, 2025

jingyaogong / minimind

🚀🚀 "Large model": train a small 26M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 30,961 3,563 Updated Oct 21, 2025

meituan-longcat / LongCat-Audio-Codec

LongCat Audio Tokenizer and Detokenizer

Python 166 10 Updated Oct 20, 2025

karpathy / nanochat

The best ChatGPT that $100 can buy.

Python 29,626 3,086 Updated Oct 21, 2025

meta-pytorch / torchcodec

PyTorch media decoding and encoding

Python 754 66 Updated Oct 21, 2025

ga642381 / Game-Time-Benchmark

Game-Time: Evaluating Temporal Dynamics in Spoken Language Models

4 Updated Oct 7, 2025

jackfrued / Python-100-Days

Python - From Novice to Master in 100 Days

Jupyter Notebook 173,545 54,745 Updated Mar 28, 2025

JusperLee / Dolphin

Python 147 20 Updated Oct 1, 2025

wenet-e2e / west

We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction

Python 64 6 Updated Oct 15, 2025

sungwon23 / BSRNN

Python 116 23 Updated Apr 24, 2023

firecrawl / firecrawl

🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data

TypeScript 64,044 5,090 Updated Oct 21, 2025

ddlBoJack / Omni-Captioner

Data Pipeline, Models, and Benchmark for Omni-Captioner.

Python 67 Updated Oct 17, 2025

QwenLM / Qwen3-Omni

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 2,704 145 Updated Oct 9, 2025

XiaomiMiMo / MiMo

MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining

Python 1,599 67 Updated Jun 5, 2025

Gar-b-age / CookLikeHOC

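🥢 Cook like Laoxiangji 🐔. The main part was completed in 2024; this is not an official Laoxiangji repository. The text comes from the "Laoxiangji Dish Traceability Report" and has been summarized, edited, and organized. CookLikeHOC.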

JavaScript 21,544 2,144 Updated Oct 17, 2025

chiyuanhsiao / ForgetSLM

Python 4 Updated Jul 11, 2025

Chaos96 / NTPP

Official code of ICML 2025 paper "NTPP: Generative Speech Language Modeling for Dual-Channel Spoken Dialogue via Next-Token-Pair Prediction"

Python 127 21 Updated Sep 14, 2025

jwasham / coding-interview-university

A complete computer science study plan to become a software engineer.

331,599 80,995 Updated Aug 28, 2025

rezzsl / HighRateMOS

HighRateMOS is the first non-intrusive MOS prediction model that explicitly models sampling rates, achieving first place in five out of eight metrics in AudioMOS Challenge 2025 Track3.

9 Updated Sep 15, 2025

alibaba / vstyle

Python 19 Updated Sep 15, 2025

urgent-challenge / urgent2026_challenge_track2

Official baseline for ICASSP 2026 URGENT Challenge Track 2 (Speech Quality Assessment)

Python 18 2 Updated Sep 20, 2025

NVlabs / GatedDeltaNet

[ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule

Python 329 15 Updated Sep 15, 2025

liu00222 / Open-Prompt-Injection

This repository provides a benchmark for prompt injection attacks and defenses

Python 306 43 Updated Oct 16, 2025

Phoenix8215 / torchstat2

Model analyzer in PyTorch

Python 90 12 Updated Aug 31, 2025

seongq / flowmse

(ICASSP 2025, official code) FlowSE: Flow Matching-based Speech Enhancement

Python 68 3 Updated Jul 23, 2025

tabahi / bournemouth-forced-aligner

Extract phoneme-level timestamps from speech audio.

Python 81 8 Updated Oct 17, 2025

DanielLin94144 / Full-Duplex-Bench

A Benchmark for Evaluating Turn-Taking and Overlap Handling in Full-Duplex Spoken Dialogue Models

Python 92 4 Updated Sep 21, 2025

alessandroragano / scoreq

SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)

Python 92 7 Updated Aug 1, 2025

isaacOnline / SpEAT

Official implementation of the paper "Pre-trained Speech Processing Models Contain Human-Like Biases that Propagate to Speech Emotion Recognition"

Python 6 Updated Feb 23, 2024

kyutai-labs / unmute

Make text LLMs listen and speak

Python 918 164 Updated Oct 16, 2025