vectominist
🎯 Focusing

Highlights

  • Pro

Organizations

@s3prl


DACVAE

Python 172 14 Updated Dec 22, 2025

Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.

Python 64 8 Updated Jul 19, 2025

This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".

Jupyter Notebook 155 16 Updated Aug 24, 2025

A benchmark for evaluating audio encoders on various audio tasks.

Python 37 7 Updated Dec 11, 2025

State-of-the-art pretrained music models for training, evaluation, inference

Python 149 14 Updated Oct 9, 2025

[ICLR 2025 Oral] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Python 922 59 Updated Jul 10, 2025

Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".

Python 926 63 Updated Oct 28, 2024

MIT IAP short course: Matrix Calculus for Machine Learning and Beyond

Jupyter Notebook 561 83 Updated Dec 15, 2025

DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning

Python 51 4 Updated Jan 18, 2024

Mamba SSM architecture

Python 16,806 1,547 Updated Dec 23, 2025

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 11,729 1,168 Updated Nov 14, 2024

FAIR Sequence Modeling Toolkit 2

Python 1,101 132 Updated Dec 22, 2025

Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation.

Python 6,280 982 Updated Apr 4, 2025

[CVPR2023] Blind Video Deflickering by Neural Filtering with a Flawed Atlas

Python 750 44 Updated May 21, 2025

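A Bob plugin for text translation, text polishing, and grammar correction, based on the OpenAI API. Let's welcome together a new era that needs no Tower of Babel! Licensed under CC BY-NC-SA 4.0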

TypeScript 5,658 264 Updated Dec 22, 2025

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Python 400 34 Updated Sep 11, 2023

CUDA implementation of autoregressive linear attention, with all the latest research findings

Python 46 3 Updated May 23, 2023

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

Python 1,120 98 Updated Nov 24, 2025

A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.

Python 2,198 207 Updated Sep 26, 2025

Matplotlib styles for scientific plotting

Python 8,465 783 Updated Nov 20, 2025

Examples and guides for using the OpenAI API

Jupyter Notebook 69,972 11,760 Updated Dec 22, 2025

Layer-wise analysis of self-supervised pre-trained speech representations

Python 121 21 Updated Oct 18, 2024

Foundation Architecture for (M)LLMs

Python 3,128 222 Updated Apr 11, 2024

A library to generate LaTeX expression from Python code.

Python 7,590 394 Updated Feb 13, 2025

Vector (and Scalar) Quantization, in PyTorch

Python 3,787 310 Updated Dec 16, 2025

A curated list of audio-visual learning methods and datasets.

279 20 Updated Dec 3, 2024

Robust Speech Recognition via Large-Scale Weak Supervision

Python 92,372 11,573 Updated Dec 15, 2025

Solutions to all questions of the book Introduction to the Theory of Computation, 3rd edition by Michael Sipser

1,729 282 Updated Dec 8, 2020