Code for "EgoX: Egocentric Video Generation from a Single Exocentric Video"

Python 458 22 Updated Jan 2, 2026

This is a collection of recent papers on reasoning in video generation models.

91 2 Updated Jan 8, 2026

DuoLoRA implementation

Python 7 Updated Oct 18, 2025

pySLAM is a hybrid Python/C++ Visual SLAM pipeline supporting monocular, stereo, and RGB-D cameras. It provides a broad set of modern local and global feature extractors, multiple loop-closure stra…

Python 2,790 451 Updated Jan 8, 2026

Official PyTorch implementation of the paper "Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs"

Python 87 13 Updated Jun 6, 2025

MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

Python 22,592 1,707 Updated Sep 24, 2025

[CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".

Python 54 Updated May 25, 2025

[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding

Python 373 34 Updated May 8, 2024

Group-wise Temporal Logit Adjustment for TAS

Python 10 Updated Oct 24, 2024

A curated list for awesome discrete diffusion models resources.

524 19 Updated Sep 9, 2025

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 12,165 1,288 Updated Oct 11, 2025

Building simple diffusion models for image generation. More so for understanding and learning.

Python 8 2 Updated Mar 30, 2025

[WACV'25] Temporal Instructional Diagram Grounding in Unconstrained Videos

Python 5 Updated Dec 17, 2024

Video Annotation Tool

Vue 233 32 Updated Jun 18, 2024

[ICLR 2025] Video Action Differencing

Python 49 2 Updated Jul 3, 2025

A collection of my book notes on various subjects, mainly computer science

Java 2,930 760 Updated Mar 1, 2025

[ECCV2024] Gated Temporal Action Anticipation for Stochastic Long-Term Anticipation

Python 22 1 Updated May 29, 2025

Code and data release for the paper "Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment" (NeurIPS 2023)

Python 19 3 Updated Apr 5, 2024

React + Next.js template for research websites (for PhD students, researchers, etc)

TypeScript 217 90 Updated Jan 12, 2025

Visualizing the learned space-time attention using Attention Rollout

Jupyter Notebook 41 8 Updated Apr 1, 2022

MLX: An array framework for Apple silicon

C++ 23,403 1,447 Updated Jan 8, 2026

Jupyter Notebook 152 14 Updated Nov 10, 2024

Collection of AWESOME vision-language models for vision tasks

3,052 232 Updated Oct 14, 2025

A declarative drawing API in Python

Python 298 15 Updated Aug 28, 2024

It is my belief that you, the postgraduate students and job-seekers for whom the book is primarily meant will benefit from reading it; however, it is my hope that even the most experienced research…

4,785 327 Updated Aug 22, 2025

A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.

739 37 Updated Dec 1, 2025

A PyTorch implementation of NeRF (Neural Radiance Fields) that reproduces the results.

Python 6,001 1,132 Updated Jul 25, 2024

Code and models for the ICML 2024 paper "Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos"

Python 6 1 Updated May 18, 2024