Massachusetts Institute of Technology
- Cambridge, MA
- people.csail.mit.edu/hengjui
- @hjchang87
Stars
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
A benchmark for evaluating audio encoders on various audio tasks.
State-of-the-art pretrained music models for training, evaluation, inference
[ICLR 2025 Oral] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
MIT IAP short course: Matrix Calculus for Machine Learning and Beyond
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Foundational Models for State-of-the-Art Speech and Text Translation
Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation.
[CVPR2023] Blind Video Deflickering by Neural Filtering with a Flawed Atlas
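A Bob plugin for text translation, text polishing, and grammar correction based on the OpenAI API. Let's welcome a new era that no longer needs a Tower of Babel! Licensed under CC BY-NC-SA 4.0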
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
CUDA implementation of autoregressive linear attention, with all the latest research findings
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.
Matplotlib styles for scientific plotting
Examples and guides for using the OpenAI API
Layer-wise analysis of self-supervised pre-trained speech representations
A library to generate LaTeX expressions from Python code.
Vector (and Scalar) Quantization, in PyTorch
A curated list of audio-visual learning methods and datasets.
Robust Speech Recognition via Large-Scale Weak Supervision
Solutions to all questions of the book Introduction to the Theory of Computation, 3rd edition by Michael Sipser