- xAI
- Palo Alto, CA
- http://ronghanghu.com/
- https://orcid.org/0000-0002-5060-9485
- @RonghangHu
- in/ronghanghu
Stars
🚀 Efficient implementations of state-of-the-art linear attention models
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
verl: Volcano Engine Reinforcement Learning for LLMs
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Microsoft PowerToys is a collection of utilities that help you customize Windows and streamline everyday tasks
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
The benchmark for "Video Object Segmentation in Panoptic Wild Scenes".
[ECCV2022] MOTR: End-to-End Multiple-Object Tracking with TRansformer
[CVPR2023] MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors
A PyTorch implementation of Connected Components Labeling
[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
Monocular Depth Estimation Toolbox based on MMSegmentation.
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
Model parallel transformers in JAX and Haiku
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry lead…
2nd-place solution for the ICDAR 2021 Competition on Scientific Literature Parsing, Task B.
Making large AI models cheaper, faster and more accessible
JAX - A curated list of resources https://github.com/google/jax
ConvMAE: Masked Convolution Meets Masked Autoencoders
A paper list of some recent Transformer-based CV works.