XingruiWang/README.md

Pinned

  1. Spatial457 Public

    [CVPR'25] A visual question answering (VQA) benchmark for 6D spatial reasoning.

    Python 15 2

  2. open-compass/VLMEvalKit Public

    Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks.

    Python 3.4k 550

  3. KeyVID Public

    Official code for the paper "KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation".

    Python 5

  4. XModBench Public

    XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models

    Python 3

  5. DynSuperCLEVR Public

    A video question answering dataset that focuses on the dynamic properties of objects (velocity, acceleration) and their collisions within 4D scenes.

    Python 18

  6. 3D-Aware-VQA Public

    Official code for the NeurIPS'23 paper "3D-Aware Visual Question Answering about Parts, Poses and Occlusions".

    Jupyter Notebook 19