Starred repositories
An implementation of SEAL: Safety-Enhanced Aligned LLM fine-tuning via bilevel data selection.
A framework for few-shot evaluation of language models.
[NeurIPS 2025] BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models
The ultimate LLM/AI application development framework in Golang.
A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.
Implements harmful/harmless refusal removal using pure HF Transformers
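A text deduplication algorithm that grew out of research on deduplicating news in recommender systems. It uses Yahoo's near-duplicates and shingling algorithm; the server is implemented in C and the client in Java, and they communicate through the Thrift framework. For extensibility, deduplication can run on the server, and the server also exposes computation interfaces so that clients can build their own extensions.
A tool that automatically collects trending news from major platforms (with a focus on AI topics), RSS feeds, and specific Twitter feeds, then processes, deduplicates, and summarizes the items and pushes trending digests through multiple channels. The project was written entirely by Cursor and Trae, working in relay.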
Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
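A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, covering base models, vertical-domain fine-tunes and applications, datasets, and tutorials.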
The simplest, fastest repository for training/finetuning medium-sized GPTs.
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Supercharge Your LLM Application Evaluations 🚀
Evaluation and Tracking for LLM Experiments and AI Agents
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
GLM-4 series: Open Multilingual Multimodal Chat LMs
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A survey on harmful fine-tuning attacks on large language models
An extremely fast Python package and project manager, written in Rust.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Multilingual Voice Understanding Model
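Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers.
This project is a tutorial on LLM application development for beginner developers. Read online: https://datawhalechina.github.io/llm-universe/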
chihebchebbi / Mastering-Machine-Learning-for-Penetration-Testing
Forked from PacktPublishing/Mastering-Machine-Learning-for-Penetration-Testing
Mastering Machine Learning for Penetration Testing, published by Packt