- Université Paris-Saclay
- Paris, France
- medium.com/@juanmc2005
- @juanmc2005
- in/juanmcoria
Stars
State-of-the-art TTS model under 25MB 😻
OctoTools: An agentic framework with extensible tools for complex reasoning
Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.
An Android application that lets you track your expenses
On-device Speech Recognition for Apple Silicon
Fast and memory-efficient exact attention
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 2024] and "LS-EEND: long-form streaming…
LLM Chain querying a scientific Zotero library, with citations
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
A high-throughput and memory-efficient inference and serving engine for LLMs
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
Cross-Platform, GPU Accelerated Whisper 🏎️
Implementation of Nougat Neural Optical Understanding for Academic Documents
A natural language interface for computers
Foundational Models for State-of-the-Art Speech and Text Translation
Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.
Faster Whisper transcription with CTranslate2
Port of OpenAI's Whisper model in C/C++
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
The Time Series Visualization Tool that you deserve.
Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.
A library for efficient similarity search and clustering of dense vectors.