unsloth-zoo Public
Forked from unslothai/unsloth-zoo
Utils for Unsloth
Python GNU Lesser General Public License v3.0 Updated Oct 27, 2025
unsloth Public
Forked from unslothai/unsloth
2-5X faster 80% less memory LLM finetuning
Python Apache License 2.0 Updated Oct 27, 2025
auto-merge-llm Public
Forked from Guinan-Su/auto-merge-llm
An official repository for GPTailor
Python Updated Oct 14, 2025
nbdistributed Public
Forked from muellerzr/nbdistributed
Seamless interface for using PyTorch distributed with Jupyter notebooks
Jupyter Notebook Apache License 2.0 Updated Sep 12, 2025
Next-Token-Failures Public
Forked from zaydzuhri/Next-Token-Failures
Fork for Token Order Prediction
Python Updated Aug 11, 2025
flash-dmattn Public
Forked from SmallDoges/flash-dmattn
Flash Dynamic Mask Attention
C++ BSD 3-Clause "New" or "Revised" License Updated Aug 5, 2025
extension-cpp Public
Forked from pytorch/extension-cpp
C++ extensions in PyTorch
Python Updated Aug 3, 2025
ASI-Arch Public
Forked from GAIR-NLP/ASI-Arch
AlphaGo Moment for Model Architecture Discovery.
Python Apache License 2.0 Updated Jul 25, 2025
Long-RL Public
Forked from NVlabs/Long-RL
Long-RL: Scaling RL to Long Sequences
Python Apache License 2.0 Updated Jul 25, 2025
nano-sparse-attention Public
Forked from PiotrNawrot/nano-sparse-attention
The simplest implementation of recent Sparse Attention patterns for efficient LLM inference.
Jupyter Notebook Updated Jul 17, 2025
nano-vllm Public
Forked from GeeeekExplorer/nano-vllm
Nano vLLM
Python MIT License Updated Jun 19, 2025
CPM.cu Public
Forked from OpenBMB/CPM.cu
CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge techniques in sparse architecture, speculative sampling and qua…
Cuda Apache License 2.0 Updated Jun 18, 2025
RadeonFlow_Kernels Public
Forked from RadeonFlow/RadeonFlow_Kernels
C++ MIT License Updated Jun 17, 2025
torchtitan Public
Forked from pytorch/torchtitan
A PyTorch native platform for training generative AI models
Python BSD 3-Clause "New" or "Revised" License Updated Jun 13, 2025
flash-linear-attention Public
Forked from fla-org/flash-linear-attention
FLA repo fork for Semi-Compressed Attention (SCAN)
Python MIT License Updated Jun 12, 2025
flame Public
Forked from fla-org/flame
🔥 A minimal training framework for scaling FLA models
Python MIT License Updated Jun 12, 2025
alpa Public
Forked from alpa-projects/alpa
Training and serving large-scale neural networks with auto parallelization.
Python Apache License 2.0 Updated Jun 11, 2025
lm-evaluation-harness Public
Forked from EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Python MIT License Updated Jun 10, 2025
pccl Public
Forked from PrimeIntellect-ai/pccl
PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP
C++ MIT License Updated May 21, 2025
fp4-all-the-way Public
Forked from Anonymous1252022/fp4-all-the-way
pytorch Public
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Python Other Updated May 18, 2025
noloco Public
Forked from gensyn-ai/noloco
Experimental repository for research implementation of NoLoCo.
Python Updated May 15, 2025
ModernBERT Public
Forked from AnswerDotAI/ModernBERT
Bringing BERT into modernity via both architecture changes and scaling
Jupyter Notebook Apache License 2.0 Updated May 15, 2025