🧑‍💻
struggle with paradox

Organizations

@HKUNLP @xlang-ai @OpenLemur

Memory_Driven_GUI_Agent

Python 5 1 Updated Nov 4, 2025

Claude Code Reverse Engineering Itself

13 2 Updated Aug 12, 2025

The official repo of VideoAgentTrek

Python 29 3 Updated Oct 24, 2025

Public repository for Skills

Python 15,983 1,393 Updated Oct 18, 2025

This is the official implementation for **"AUTOPR: LET'S AUTOMATE YOUR ACADEMIC PROMOTION!"**.

Python 80 4 Updated Oct 16, 2025

All-in-One Sandbox for AI Agents that combines Browser, Shell, File, MCP and VSCode Server in a single Docker container.

Python 1,278 112 Updated Nov 9, 2025

A virtual environment for developing and evaluating automated scientific discovery agents.

Python 189 13 Updated Mar 10, 2025

My learning notes/codes for ML SYS.

Python 4,092 250 Updated Nov 6, 2025

An Open-Source Large-Scale Reinforcement Learning Project for Search Agents

Python 489 30 Updated Oct 8, 2025

OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents

Python 16 1 Updated Aug 16, 2025

Official implementation of "SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience"

Python 208 19 Updated Aug 7, 2025

Renderer for the harmony response format to be used with gpt-oss

Rust 3,988 223 Updated Nov 5, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,136 1,917 Updated Nov 1, 2025

Qwen Code is a coding agent that lives in the digital world.

TypeScript 15,177 1,254 Updated Nov 9, 2025

VeriGUI: Verifiable Long-Chain GUI Dataset

Python 82 2 Updated Oct 23, 2025

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >70% on SWE-bench verified!

Python 2,009 225 Updated Nov 3, 2025

τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment

Python 397 74 Updated Nov 8, 2025

Kimi K2 is the large language model series developed by Moonshot AI team

8,757 585 Updated Nov 7, 2025

A toolkit for building computer use AI agents

Python 177 18 Updated Jun 26, 2025
Python 113 7 Updated Oct 3, 2025

[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Python 2,304 325 Updated Nov 7, 2025

Easy Data Preparation with latest LLMs-based Operators and Pipelines.

Python 1,452 101 Updated Nov 7, 2025

The true story of the bitterness and darkness behind the development of the Noah Pangu large model.

11,379 1,365 Updated Jul 9, 2025

OpenCUA: Open Foundations for Computer-Use Agents

Python 553 63 Updated Oct 12, 2025

OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents [NeurIPS 2025 Spotlight]

Jupyter Notebook 38 Updated Sep 18, 2025

A benchmark for LLMs on complicated tasks in the terminal

Python 1,041 378 Updated Nov 7, 2025

A library for generating difficulty-scalable, multi-tool, and verifiable agentic tasks with execution trajectories.

Python 167 18 Updated Jul 6, 2025