Stars
Fully open data curation for reasoning models
Code to reproduce the experiments in the paper "Training on the Test Task Confounds Evaluation and Emergence."
Modeling, training, eval, and inference code for OLMo
PyTorch building blocks for the OLMo ecosystem
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Improving Alignment and Robustness with Circuit Breakers
Machine Learning Engineering Open Book
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
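As a rough illustration of what that repository implements, here is a minimal sketch of the core BPE training loop: repeatedly count adjacent symbol pairs and merge the most frequent one. Function names here are illustrative, not taken from the repository.

```python
from collections import Counter

def get_pair_counts(tokens):
    """Count adjacent symbol pairs in the token sequence."""
    return Counter(zip(tokens, tokens[1:]))

def merge_pair(tokens, pair, new_token):
    """Replace every occurrence of `pair` with `new_token`."""
    out, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def bpe_train(text, num_merges):
    """Learn up to `num_merges` merge rules from raw text,
    starting from individual characters."""
    tokens = list(text)
    merges = []
    for _ in range(num_merges):
        counts = get_pair_counts(tokens)
        if not counts:
            break
        pair = max(counts, key=counts.get)
        tokens = merge_pair(tokens, pair, pair[0] + pair[1])
        merges.append(pair)
    return tokens, merges
```

On the classic example string "aaabdaaabac", two merges first fuse "aa" and then "aaa"; real tokenizers add byte-level handling, special tokens, and pre-tokenization on top of this loop.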
Minimalistic large language model 3D-parallelism training
Friends don't let friends make certain types of data visualization - What are they and why are they bad.
A concise but complete full-attention transformer with a set of promising experimental features from various papers
The hub for EleutherAI's work on interpretability and learning dynamics
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Cramming the training of a (BERT-type) language model into limited compute.
Algorithmically create or extend categorical colour palettes
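A simple stand-in for that idea (not the repository's algorithm, which works in a perceptual colour space) is to space hues evenly around the HSV wheel:

```python
import colorsys

def categorical_palette(n, saturation=0.65, value=0.9):
    """Generate n distinct hex colours by spacing hues evenly in HSV.
    A naive sketch: real palette tools optimise for perceptual
    distance and colour-vision deficiency, not just hue spacing."""
    palette = []
    for i in range(n):
        r, g, b = colorsys.hsv_to_rgb(i / n, saturation, value)
        palette.append("#{:02x}{:02x}{:02x}".format(
            int(r * 255), int(g * 255), int(b * 255)))
    return palette
```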
An autoregressive character-level language model for making more things
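The simplest version of such a model is a bigram character model: count which character follows which, then sample autoregressively until an end marker. This is a hedged sketch in that spirit, not the repository's code.

```python
import random
from collections import defaultdict

def train_bigram(names):
    """Count character bigrams, with '.' as the start/end marker."""
    counts = defaultdict(lambda: defaultdict(int))
    for name in names:
        chars = ["."] + list(name) + ["."]
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts

def sample_name(counts, rng):
    """Sample one character at a time, conditioned on the previous
    character, until the end marker is drawn."""
    out, ch = [], "."
    while True:
        nexts = counts[ch]
        ch = rng.choices(list(nexts), weights=list(nexts.values()))[0]
        if ch == ".":
            break
        out.append(ch)
    return "".join(out)
```

Swapping the count table for a trained neural network (MLP, RNN, or transformer) over longer contexts gives the progression such repositories typically walk through.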
Figure sizes, font sizes, fonts, and more configurations at minimal overhead. Fix your journal papers, conference proceedings, and other scientific publications.
Code to run the TILT transfer learning experiments
NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings
A library for parameter-efficient and composable transfer learning for NLP with sparse fine-tunings.
An original implementation of "Noisy Channel Language Model Prompting for Few-Shot Text Classification"
This code accompanies the paper "Bayesian Framework for Information-Theoretic Probing" published in EMNLP 2021.
Train Dense Passage Retriever (DPR) with a single GPU
Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint