Native Multimodal Models are World Learners

Python 1,206 42 Updated Nov 7, 2025

The best ChatGPT that $100 can buy.

Python 36,361 4,323 Updated Nov 5, 2025

RF-DETR is a real-time object detection and segmentation model architecture developed by Roboflow, SOTA on COCO and designed for fine-tuning.

Python 4,174 466 Updated Nov 5, 2025

Cosmos-Predict2.5, the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world in the form of video.

Python 375 28 Updated Nov 11, 2025

[CVPR 2025] Code for Segment Any Motion in Videos

Jupyter Notebook 427 34 Updated Jun 10, 2025
Python 124 4 Updated Oct 9, 2025

[ECCV2024 - Oral, Best Paper Award Candidate] SEA-RAFT: Simple, Efficient, Accurate RAFT for Optical Flow

Python 581 37 Updated Jun 29, 2025
Python 1,840 61 Updated Jun 28, 2024

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 6,924 595 Updated Jul 4, 2025

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 8,222 556 Updated Nov 3, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 11,588 1,295 Updated Oct 12, 2025

Code of π^3: Permutation-Equivariant Visual Geometry Learning

Python 1,351 66 Updated Sep 10, 2025

MAGI-1: Autoregressive Video Generation at Scale

Python 3,539 213 Updated Jun 17, 2025

A beautiful, simple, clean, and responsive Jekyll theme for academics

HTML 14,451 12,462 Updated Nov 7, 2025

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 82,140 9,189 Updated Nov 11, 2025

Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.

Python 660 89 Updated Oct 29, 2025

Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)

Python 2,813 198 Updated Sep 12, 2025

[ICLR 2025][arXiv:2406.07548] Image and Video Tokenization with Binary Spherical Quantization

Python 178 4 Updated Jun 12, 2024

Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers"

Python 158 5 Updated Jan 31, 2025

🦜🔗 The platform for reliable agents.

Python 119,384 19,664 Updated Nov 11, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 14,667 2,120 Updated Jul 17, 2025

Cosmos-Reason1 models understand the physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.

Python 786 65 Updated Nov 10, 2025

Cosmos-Predict1 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.

Jupyter Notebook 377 75 Updated Aug 20, 2025

Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments.

Python 724 98 Updated Oct 29, 2025

[NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization.

Python 316 8 Updated Jul 9, 2024

Enjoy the magic of Diffusion models!

Python 10,626 990 Updated Nov 10, 2025

Perceptual video quality assessment based on multi-method fusion.

Python 5,149 797 Updated Nov 10, 2025

This repo contains the code for 1D tokenizer and generator

Jupyter Notebook 1,073 56 Updated Mar 20, 2025

ElasticTok: Adaptive Tokenization for Image and Video

Python 82 Updated Nov 4, 2024