- Montreal
- gabrielhuang.github.io
Stars
The project code for the Overhearing Agents project (AI Agents Workshop @ COLM 2025).
Improved techniques for optimization-based jailbreaking on large language models (ICLR 2025)
Code for the paper "Defeating Prompt Injections by Design"
Flow Integrity Deterministic Enforcement System. Mechanisms for securing AI agents with information-flow control.
Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, exciting jailbreak methods on LLMs. It contains papers, codes, datasets, evaluations, and analyses.
A fast + lightweight implementation of the GCG algorithm in PyTorch
Universal and Transferable Attacks on Aligned Language Models
rotaryhammer / code-autodan
Forked from llm-attacks/llm-attacks
An unofficial implementation of the AutoDAN attack on LLMs (arXiv:2310.15140)
The fast, Pythonic way to build MCP servers and clients
DoomArena is a Framework for Testing AI Agents Against Evolving Security Threats
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
Autonomous Agents (LLMs) research papers. Updated Daily.
SafeArena is a benchmark for assessing the harmful capabilities of web agents
BrowserGym, a Gym environment for web task automation
Two conversational AI agents switching from English to sound-level protocol after confirming they are both AI agents
AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and reproducibility.
AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLMs
Papers and resources related to the security and privacy of LLMs
Python tool for converting files and office documents to Markdown.
TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle
Interactive Tables and Data Grids for JavaScript
Windows 95 in Electron. Runs on macOS, Linux, and Windows.
A curated list of trustworthy deep learning papers. Daily updating...
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
antimatter15 / alpaca.cpp
Forked from ggml-org/llama.cpp
Locally run an Instruction-Tuned Chat-Style LLM
Aligning pretrained language models with instruction data generated by themselves.
Build context-aware reasoning applications
A framework for the evaluation of autoregressive code generation language models.