[NeurIPS '25] Code for Paper "IF-Guide: Influence Function-Guided Suppression of Harmful Training Data for Reducing LLM Toxicity"
`dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms.
Schedule-Free Optimization in PyTorch
LLM Attributor: Attribute an LLM's Generated Text to Training Data
Using Large Language Models for Hyperparameter Optimization
Code for the paper "The Journey, Not the Destination: How Data Guides Diffusion Models"
[NeurIPS D&B '25] The one-stop repository for large language model (LLM) unlearning. Supports TOFU, MUSE, WMDP, and many unlearning methods with easy feature extensibility.
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
Must-read Papers on Knowledge Editing for Large Language Models.
[ICML 2024] SIRFShampoo: Structured inverse- and root-free Shampoo in PyTorch (https://arxiv.org/abs/2402.03496)
Implementation of Influence Function approximations for differently sized ML models, using PyTorch
DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024)
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Influence Functions with (Eigenvalue-corrected) Kronecker-Factored Approximate Curvature
LLM finetuning in resource-constrained environments.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
Human annotated noisy labels for CIFAR-10 and CIFAR-100. The website of CIFAR-N is available at http://www.noisylabels.com/.
OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023)
Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ PyTorch Lightning and 🤗 Transformers. For access to our API, please email us at contact@unita…
AI Logging for Interpretability and Explainability🔬
A simple and efficient baseline for data attribution