
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Python 398 37 Updated May 8, 2025

An easy way to apply LoRA to CLIP. Implementation of the paper "Low-Rank Few-Shot Adaptation of Vision-Language Models" (CLIP-LoRA) [CVPRW 2024].

Python 265 28 Updated Jun 6, 2025

Official repository for "VideoPrism: A Foundational Visual Encoder for Video Understanding" (ICML 2024)

Python 321 27 Updated Oct 2, 2025

The official implementation of 3DDFA_V3 in CVPR2024 (Highlight).

Python 349 31 Updated Nov 10, 2024

OpenFace 3.0 – open-source toolkit for facial landmark detection, action unit detection, eye-gaze estimation, and emotion recognition.

Python 97 13 Updated Jun 10, 2025

Fully Open Framework for Democratized Multimodal Training

Python 610 43 Updated Nov 10, 2025

The official Meta Llama 3 GitHub site

Python 29,090 3,479 Updated Jan 26, 2025

Official PyTorch implementation of the paper: UniGaze: Towards Universal Gaze Estimation via Large-scale Pre-Training.

Python 31 4 Updated Nov 11, 2025

This repository is a curated collection of the most exciting and influential CVPR 2025 papers. 🔥 [Paper + Code + Demo]

Python 808 46 Updated Jun 16, 2025

Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders (CVPR 2025, Highlight)

Python 793 90 Updated Apr 19, 2025

🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.

MDX 66,309 6,912 Updated Nov 1, 2025
Python 10 1 Updated Oct 2, 2024

🔥🔥 First-ever hour-scale video understanding models

Python 571 37 Updated Jul 14, 2025
Python 644 52 Updated Nov 28, 2023

An open-source framework for training large multimodal models.

Python 4,041 316 Updated Aug 31, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.

Python 9,463 732 Updated Sep 22, 2025

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 2,101 132 Updated Aug 7, 2025

A curated list of awesome temporal action segmentation resources.

228 17 Updated Apr 4, 2024

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python 9,247 947 Updated Aug 12, 2024

🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.

2,897 131 Updated Nov 11, 2025

Code for our IJCV 2023 paper "CLIP-guided Prototype Modulating for Few-shot Action Recognition".

Python 74 11 Updated Mar 7, 2024

YACS -- Yet Another Configuration System

Python 1,326 90 Updated Apr 13, 2022

An arbitrary face-swapping framework on images and videos with one single trained model!

Python 5,083 1,005 Updated Aug 6, 2024

Industry leading face manipulation platform

Python 25,781 4,124 Updated Nov 12, 2025

[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

Python 943 48 Updated Oct 16, 2024

[CVPR2023] The official repo for OC-SORT: Observation-Centric SORT on video Multi-Object Tracking. OC-SORT is simple, online and robust to occlusion/non-linear motion.

Python 961 139 Updated Jul 18, 2023

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Python 3,310 268 Updated Jan 18, 2025

Frontier Multimodal Foundation Models for Image and Video Understanding

Jupyter Notebook 1,041 75 Updated Aug 14, 2025

Student Classroom Behavior dataset

381 36 Updated Sep 18, 2025

Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2

Jupyter Notebook 3,025 350 Updated Nov 11, 2025