Organizations

@AI-CLUB-IUH

Starred repositories

Vi_G2P or ViG2P: a G2P package for Vietnamese, based on vPhon and phonological knowledge, that converts raw text (graphemes) to IPA

Python 99 19 Updated Jun 21, 2024

A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction.

Python 107 8 Updated Sep 19, 2025

Best practices and guides on how to write distributed PyTorch training code

Python 519 48 Updated Oct 22, 2025

[ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels

C++ 42 4 Updated Mar 20, 2024

Foundation Architecture for (M)LLMs

Python 3,119 220 Updated Apr 11, 2024

AI-powered tool that transforms STEM concepts into narrated educational animations using Manim, LLMs, and multimodal AI

Python 68 25 Updated Oct 4, 2025

Suite of tools to discover new articles on the arXiv, filter them, and broadcast them as an RSS feed, for your own use or for others.

C++ 3 1 Updated Jul 12, 2018

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding a…

Python 2,397 446 Updated Mar 14, 2022

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 2,742 152 Updated Oct 9, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 61,051 10,778 Updated Oct 26, 2025

An Open-Source Asynchronous Coding Agent

TypeScript 4,962 658 Updated Oct 24, 2025

Tips and resources to prepare for Behavioral interviews.

7,205 1,397 Updated Aug 19, 2025

The python library for real-time communication

JavaScript 4,368 404 Updated Sep 19, 2025

Memory efficient transducer loss computation

CMake 69 12 Updated Jun 10, 2022

Python 29 2 Updated Jan 9, 2024

Hierarchical Reasoning Model Official Release

Python 11,567 1,681 Updated Sep 9, 2025

A python package to analyze and compare voices with deep learning

Python 3,130 466 Updated Oct 12, 2023

Mamba SSM architecture

Python 16,204 1,474 Updated Oct 10, 2025

[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Python 916 46 Updated Apr 30, 2025

Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis

255 18 Updated Jul 22, 2025

Python 378 61 Updated Sep 3, 2024

Python 6 Updated Jan 7, 2025

ViStreamASR - Real-Time Vietnamese Speech Recognition

Python 46 15 Updated Jul 12, 2025

Code for the paper "Instituto de Telecomunicações at IWSLT 2025: Aligning Small-Scale Speech and Language Models for Speech-to-Text Learning"

Python 2 Updated Sep 30, 2025

Updates ASR papers every day

Python 349 18 Updated Oct 26, 2025

Web interface for browsing, searching, and filtering recent arXiv submissions

Python 5,425 1,340 Updated Nov 27, 2021

PyTorch code and models for V-JEPA self-supervised learning from video.

Python 3,244 322 Updated Feb 27, 2025

Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)

Python 51,823 5,681 Updated Sep 10, 2025

Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.

Python 10,188 856 Updated Oct 12, 2025