View Achazwl's full-sized avatar

An End-to-End Infrastructure for Training and Evaluating Various LLM Agents

Python 284 19 Updated Jan 20, 2026
Python 30 Updated Jan 13, 2026

Open ABI and FFI for Machine Learning Systems

C++ 304 49 Updated Jan 18, 2026

BurstEngine is an efficient framework designed to train LLMs on long-sequence data.

Python 9 3 Updated Sep 25, 2025
Python 203 12 Updated Sep 25, 2025

Summary of the Specs of Commonly Used GPUs for Training and Inference of LLMs

72 13 Updated Aug 12, 2025

⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / verl / LLaMA Factory / ms-swift / U…

Python 3,450 184 Updated Jan 20, 2026

Source code for the paper "BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity".

Python 18 5 Updated Jan 10, 2026

A low-latency, billion-scale, and updatable graph-based vector store on SSD.

Jupyter Notebook 94 32 Updated Jan 4, 2026

CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge techniques in sparse architecture, speculative sampling and qua…

Cuda 219 20 Updated Jan 14, 2026

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 2,507 306 Updated Jan 19, 2026

Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design

Cuda 16 Updated May 29, 2025

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.

Python 1,513 221 Updated Dec 15, 2025

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 75,770 9,050 Updated Jan 19, 2026

Official Implementation of APB (ACL 2025 main Oral)

C++ 32 4 Updated Feb 22, 2025

[ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling

C++ 49 2 Updated Jul 15, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 6,108 802 Updated Jan 16, 2026

DeepEP: an efficient expert-parallel communication library

Cuda 8,903 1,069 Updated Jan 20, 2026
Python 13 Updated Oct 3, 2024

This repository contains a simple web application that allows users to generate QR codes for URLs.

CSS 7 4 Updated Apr 25, 2024

Vim for Visual Studio Code

TypeScript 82 11 Updated Jul 22, 2024

Sequence-level 1F1B schedule for LLMs.

Python 38 2 Updated Aug 26, 2025

[ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models.

Python 253 22 Updated Oct 30, 2024

Fourier Controller Networks (FCNet) for Real-Time Decision-Making in Embodied Learning, ICML 2024

Python 31 1 Updated Jan 2, 2025

Tile primitives for speedy kernels

Cuda 3,092 225 Updated Jan 17, 2026

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,594 3,293 Updated Jan 19, 2026

Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)

Python 113 9 Updated Mar 20, 2025

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance…

Python 3,095 609 Updated Jan 17, 2026

MiniCPM on Android platform.

Python 636 52 Updated Mar 19, 2025

MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving 3+ generation speedup on reasoning tasks

Jupyter Notebook 8,497 530 Updated Oct 8, 2025