pooruss

Shihao Liang pooruss

Bytedance Seed. Love byte and love dancing.

Stars

agent-infra / sandbox

All-in-One Sandbox for AI Agents that combines Browser, Shell, File, MCP and VSCode Server in a single Docker container.

Python 1,304 119 Updated Nov 10, 2025

laude-institute / terminal-bench

A benchmark for LLMs on complicated tasks in the terminal

Python 1,045 379 Updated Nov 7, 2025

OSU-NLP-Group / Mind2Web-2

[NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge

Python 89 6 Updated Nov 1, 2025

ByteDance-Seed / Seed1.5-VL

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,487 58 Updated Jun 14, 2025

alexzhang13 / videogamebench

Benchmark environment for evaluating vision-language models (VLMs) on popular video games!

Python 310 33 Updated May 30, 2025

ByteDance-Seed / VeOmni

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 1,281 93 Updated Nov 8, 2025

lsdefine / simple_GRPO

A very simple GRPO implementation for reproducing R1-like LLM thinking.

Python 1,437 110 Updated Aug 5, 2025

fchollet / ARC-AGI

The Abstraction and Reasoning Corpus

JavaScript 4,618 697 Updated Apr 4, 2025

google-research / android_world

AndroidWorld is an environment and benchmark for autonomous agents

Python 500 105 Updated Oct 27, 2025

web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

Python 1,215 193 Updated Oct 3, 2025

web-arena-x / visualwebarena

VisualWebArena is a benchmark for multimodal agents.

Python 400 66 Updated Nov 9, 2024

shulin16 / MMInA

[ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents

Python 47 3 Updated Feb 27, 2025

bytedance / UI-TARS-desktop

The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

TypeScript 19,451 1,846 Updated Nov 10, 2025

bytedance / UI-TARS

Python 8,166 573 Updated Nov 5, 2025

RUCBM / GUICourse

GUICourse: From General Vision Language Models to Versatile GUI Agents

Python 133 7 Updated Jul 17, 2024

OSU-NLP-Group / UGround

[ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents

Python 284 12 Updated Jul 18, 2025

microsoft / WindowsAgentArena

Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.

Python 783 83 Updated Apr 30, 2025

openai / prm800k

800,000 step-level correctness labels on LLM solutions to MATH problems

Python 2,066 122 Updated Jun 1, 2023

xlang-ai / Spider2-V

[NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Jupyter Notebook 133 10 Updated Aug 26, 2024

pooruss / ML-Framework-for-Diverse-Applications-in-Trading-and-Finance

Final project of COMP 7409 Machine Learning in Trading and Finance – Group 7.

Python 5 2 Updated Nov 13, 2023

thunlp / UniMem

UniMem: Towards a Unified View of Long-Context Large Language Models (COLM 2024)

Python 9 1 Updated Aug 14, 2024

showlab / GUI-Narrator

Repository of GUI Action Narrator

JavaScript 11 Updated Apr 8, 2025

Dongping-Chen / GUI-World

(ICLR 2025) The Official Code Repository for GUI-World.

Python 67 3 Updated Dec 18, 2024

opendatalab / MinerU

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 48,461 4,001 Updated Nov 10, 2025

pooruss / VisAgent

Graduation Project HKUCS

Python 2 Updated Jul 17, 2024

aburns4 / MoTIF

Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments

Jupyter Notebook 60 3 Updated Aug 19, 2024

OpenGVLab / GUI-Odyssey

[ICCV 2025] GUIOdyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUIOdyssey consists of 8,834 episodes from 6 mobile devices, spanning 6 types of cross-app…

Python 131 8 Updated Aug 4, 2025

LLaVA-VL / LLaVA-NeXT

Python 4,377 418 Updated Sep 14, 2025

abi / screenshot-to-code

Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)

Python 71,149 8,817 Updated Oct 21, 2025

ollama / ollama

Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.

Go 155,687 13,584 Updated Nov 8, 2025
