si0wang

Xiyao Wang si0wang

13 followers · 1 following

University of Maryland, College Park
https://si0wang.github.io/

Stars

tyxiong23 / Multi-Crit

Python 13 1 Updated Dec 10, 2025

si0wang / ViCrit

Python 24 1 Updated Jun 18, 2025

morse-benchmark / morse-500

Jupyter Notebook 31 3 Updated Jan 6, 2026

AnsonZnl / RehabilitationGuide

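A rehabilitation guide for cervical spondylosis and lumbar disc herniation, offering programmers simple, reliable recovery guidance.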

Python 3,417 220 Updated Dec 25, 2023

si0wang / ThinkLite-VL

Python 105 6 Updated Jun 10, 2025

2U1 / Qwen-VL-Series-Finetune

An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.

Python 1,565 189 Updated Jan 10, 2026

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Python 8,763 848 Updated Jan 8, 2026

EvolvingLMMs-Lab / open-r1-multimodal

A fork to add multimodal model training to open-r1

Python 1,434 70 Updated Feb 8, 2025

hkust-nlp / simpleRL-reason

Simple RL training for reasoning

Python 3,824 283 Updated Dec 23, 2025

si0wang / VisVM

Python 46 5 Updated Dec 30, 2024

Julia-LiuJ / NLFT

The official implementation of Natural Language Fine-Tuning

Python 54 4 Updated Jan 7, 2025

HJYao00 / Mulberry

[NIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS

Python 1,232 112 Updated Sep 19, 2025

tianyi-lab / Cherry_LLM

[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models

Python 412 26 Updated Jun 25, 2025

JiuhaiChen / CVPR2025-Florence-VL

Python 245 10 Updated Dec 7, 2024

hiyouga / LlamaFactory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 65,387 7,946 Updated Jan 9, 2026

YuxiXie / MCTS-DPO

This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.

Jupyter Notebook 328 37 Updated Aug 6, 2024

umd-huang-lab / SIMA

Forked from si0wang/SIMA

Python 9 Updated Apr 30, 2025

shenmishajing / toy_project

Python 8 1 Updated Jul 1, 2024

umd-huang-lab / Mementos

Forked from si0wang/Mementos

Jupyter Notebook 32 Updated Feb 8, 2024

si0wang / COPlanner

Python 23 2 Updated Apr 2, 2024

si0wang / Mementos

Jupyter Notebook 7 1 Updated Feb 28, 2024

weipu-zhang / STORM

Python 121 21 Updated Nov 25, 2025

burchim / DreamerV3-PyTorch

PyTorch implementation of DreamerV3, Mastering Diverse Domains through World Models.

Python 10 2 Updated Feb 16, 2024

apache / singa

a distributed deep learning platform

C++ 3,584 1,267 Updated Jan 10, 2026

kngwyu / mujoco-maze

Simple maze environments using mujoco-py

Python 58 12 Updated Dec 27, 2023

NM512 / dreamerv3-torch

Implementation of Dreamer v3 in PyTorch.

Python 764 191 Updated Sep 27, 2024

ARISE-Initiative / robosuite

robosuite: A Modular Simulation Framework and Benchmark for Robot Learning

Python 2,142 638 Updated Jan 7, 2026

opendilab / awesome-model-based-RL

A curated list of awesome model based RL resources (continually updated)

1,271 73 Updated Dec 20, 2025

schroederdewitt / multiagent_mujoco

Benchmark for Continuous Multi-Agent Robotic Control, based on OpenAI's Mujoco Gym environments.

Python 364 36 Updated Mar 16, 2023
