Python 19 Updated Aug 7, 2025

Codebase of 'From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model'

Python 32 Updated Oct 27, 2025

Contexts Optical Compression

Python 18,201 1,193 Updated Oct 25, 2025

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Python 617 38 Updated Oct 15, 2025

Hierarchical Reasoning Model Official Release

Python 11,584 1,687 Updated Sep 9, 2025

🔥🔥🔥 Latest Papers, Codes and Datasets on Video-LMM Post-Training

Python 139 7 Updated Oct 14, 2025

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 1,246 82 Updated Oct 28, 2025
Python 41 1 Updated Jun 4, 2025
Python 8 Updated Sep 14, 2025
Python 1,055 92 Updated Oct 22, 2025

Reinforcement Learning of Vision Language Models with Self Visual Perception Reward

Python 136 18 Updated Sep 23, 2025

[EMNLP 2025 Oral] Official codebase for Seeing More, Saying More: Lightweight Language Experts are Dynamic Video Token Compressors.

Python 11 Updated Sep 7, 2025

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 9,392 731 Updated Sep 22, 2025

Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capab…

JavaScript 7,582 1,461 Updated May 26, 2025

Streamlining Cartoon Production with Generative Post-Keyframing

Python 452 38 Updated Aug 20, 2025

AudioStory: Generating Long-Form Narrative Audio with Large Language Models

Jupyter Notebook 284 18 Updated Sep 21, 2025

A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/

JavaScript 3,994 801 Updated Sep 4, 2025

Environments for LLM Reinforcement Learning

Python 3,395 404 Updated Oct 26, 2025

Structured Video Comprehension of Real-World Shorts

Python 211 8 Updated Sep 21, 2025

video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions, which is developed by the Department of Electronic Engineering at Tsin…

Python 96 6 Updated Oct 21, 2025

[ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.

Python 50 6 Updated May 2, 2025

Cosmos-Reason1 models understand the physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.

Python 764 62 Updated Oct 1, 2025

Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos

Python 172 7 Updated Sep 4, 2025

Unified Vision-Language-Action Model

Python 214 12 Updated Oct 15, 2025

Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels with Hunyuan3D World Model

Python 2,344 194 Updated Oct 22, 2025

PyTorch implementation of MeanFlow on ImageNet and CIFAR10

Python 320 17 Updated Aug 23, 2025
9 Updated Aug 7, 2025

Prompts that impress you at first glance

GCC Machine Description 738 145 Updated Oct 5, 2025

Code release for the paper "Progress-Aware Video Frame Captioning" (CVPR 2025)

Python 18 1 Updated Jul 16, 2025