- San Francisco, CA
- https://scholar.google.com/citations?user=gJkhFkgAAAAJ&hl=en
Stars
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!
GenAI Agent Framework, the Pydantic way
Define, Prompt and Test MCP-enabled Agents and Workflows
Daytona is a Secure and Elastic Infrastructure for Running AI-Generated Code
The AI framework that adds the engineering to prompt engineering (Python/TS/Ruby/Java/C#/Rust/Go compatible)
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
dstack is an open-source control plane for running development, training, and inference jobs on GPUs—across hyperscalers, neoclouds, or on-prem.
Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
AlphaBind code + model accompanying pre-print
A library for advanced large language model reasoning
Recipes to scale inference-time compute of open models
Official repository for the Boltz biomolecular interaction models
Best practices & guides on how to write distributed PyTorch training code
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
Enhancing Gene Set Overrepresentation Analysis with Large Language Models
Improved antibody structure-based design using inverse folding
Convenience Python APIs for antibody numbering using ANARCI
From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 🍓
Chat With All Kinds of AI Models Through a Common Interface
A Python module to repair invalid JSON from LLMs
Use the OpenAI Batch tool to make async batch requests to the OpenAI API.
A guidance language for controlling large language models.