rijuyuezhu
  • Nanjing University
  • Shanghai, China


RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing …

Cuda 951 217 Updated Nov 8, 2025

Contexts Optical Compression

Python 20,086 1,481 Updated Oct 25, 2025

Feature packed AUR helper

Rust 7,583 271 Updated Oct 17, 2025

Puzzles for learning Triton

Jupyter Notebook 2,106 172 Updated Nov 18, 2024

common in-memory tensor structure

C++ 1,098 155 Updated Oct 11, 2025

A throughput-oriented high-performance serving framework for LLMs

Jupyter Notebook 912 44 Updated Oct 29, 2025

Hands-On Practical MLIR Tutorial

C++ 649 94 Updated Oct 20, 2023

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,147 1,917 Updated Nov 1, 2025

High performance Transformer implementation in C++.

C++ 140 16 Updated Jan 18, 2025

Disaggregated serving system for Large Language Models (LLMs).

Jupyter Notebook 722 81 Updated Apr 6, 2025

Sample code for my CUDA programming book

Cuda 1,924 376 Updated Feb 15, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 39,748 6,884 Updated Nov 10, 2025

Development repository for the Triton language and compiler

MLIR 17,518 2,375 Updated Nov 10, 2025

Kimi K2 is the large language model series developed by Moonshot AI team

9,078 601 Updated Nov 7, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 3,880 303 Updated Nov 10, 2025

WaferLLM: Large Language Model Inference at Wafer Scale

Python 71 9 Updated Oct 31, 2025

An intuitive and low-overhead instrumentation tool for Python

Python 1,166 37 Updated Jul 8, 2025

Chinese translation of The AWK Programming Language (AWK 程序设计语言, awkbook), typeset in LaTeX

TeX 1,498 279 Updated Jul 31, 2022

A local-first, cross-platform note-taking app leveraging the Typst ecosystem. Designed to minimize distractions and enhance the retention of information.

Vue 43 1 Updated Nov 10, 2025

A next-generation C++ language server for modern C++, focused on high performance and deep code intelligence

C++ 958 57 Updated Nov 9, 2025

Package management made easy

Rust 5,626 373 Updated Nov 10, 2025

[ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.

Python 79 3 Updated Nov 2, 2025

Auto-clicker for Nanjing University undergraduate/graduate course evaluations

JavaScript 109 1 Updated Jun 18, 2025

Optimized primitives for collective multi-GPU communication

C++ 4,216 1,063 Updated Nov 8, 2025

Fast and memory-efficient exact attention

Python 20,433 2,125 Updated Nov 9, 2025

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 1,275 160 Updated Jan 4, 2025

A neovim plugin for rendering typst inline using the kitty unicode graphics protocol

Lua 51 3 Updated Aug 9, 2025

[ICLR 2025 Oral] PyTorch code for the paper "Open-World Reinforcement Learning over Long Short-Term Imagination"

Python 178 9 Updated Oct 16, 2025