dunky11

Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.

Python 4,223 365 Updated Oct 19, 2025

On-device voice activity detection (VAD) powered by deep learning

Python 233 15 Updated Nov 19, 2025

A simple library and set of tools for parsing, modifying, and composing SRT files.

Python 526 49 Updated Mar 19, 2024

OCR & Document Extraction using vision models

TypeScript 11,969 820 Updated May 20, 2025

The official ElevenLabs MCP server

Python 1,071 183 Updated Nov 17, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,882 905 Updated Sep 30, 2025

Text to speech alignment using CTC forced alignment

Python 391 71 Updated Aug 13, 2025

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 11,327 1,135 Updated Nov 21, 2025

DUSTED: Spoken-Term Discovery using Discrete Speech Units

Jupyter Notebook 18 Updated Oct 2, 2024

Pretraining Efficiently on S2ORC!

Python 173 6 Updated Oct 23, 2024

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 9,116 826 Updated Nov 20, 2025

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 17,802 2,233 Updated Dec 25, 2024

Schedule-Free Optimization in PyTorch

Python 2,235 68 Updated May 21, 2025

♾️ A React hook that makes it easy to add infinite scroll to any component. It is simple to integrate and supports any direction.

TypeScript 107 7 Updated Nov 8, 2023

Codemod Stripe used to migrate 6.5m+ lines of code from Flow to TypeScript

TypeScript 691 74 Updated Apr 11, 2025

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…

864 82 Updated Jul 8, 2025

A modern replacement for Redis and Memcached

C++ 29,394 1,119 Updated Nov 25, 2025

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

Python 543 73 Updated Sep 25, 2024

Material UI: Comprehensive React component library that implements Google's Material Design. Free forever.

JavaScript 97,345 32,796 Updated Nov 25, 2025

Shared data types for building collaborative software

JavaScript 20,628 720 Updated Nov 25, 2025

Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech

Python 236 12 Updated Feb 29, 2024

Yjs binding for Slate

TypeScript 547 77 Updated Jun 20, 2024

Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/sp…

Python 1,770 102 Updated Aug 29, 2023

LLM inference in C/C++

C++ 90,392 13,823 Updated Nov 25, 2025
Jupyter Notebook 1,709 166 Updated Sep 27, 2024

Phoneme tokenizer and grapheme-to-phoneme model for 8k languages

Python 173 18 Updated Jun 9, 2023

The Modular Platform (includes MAX & Mojo)

Mojo 25,253 2,736 Updated Nov 24, 2025

🤖 Assemble, configure, and deploy autonomous AI Agents in your browser.

TypeScript 35,270 9,488 Updated Apr 29, 2025
Python 7,843 526 Updated Apr 14, 2024

A family of diffusion models for text-to-audio generation.

Python 1,214 106 Updated Jul 29, 2025