Lists (21)
art
book
CBM
CV library
dataset
domain_generalization
efficient model & learning
feedback
Generative & CV
LLM
low-level vision tasks
neural symbolic
safety&interpretable ML
segmentation
test-time
useful softwares
softwares I use
video
Visualization
XAI
Repositories about XAI theories, algorithms and libraries. Mainly about post-hoc explanation methods.
Bioinformatics
Résumé templates
Stars
Repository of AAAI26 paper 'Flexible Concept Bottleneck Model'.
This repo contains documentation and code needed to use PACO dataset: data loaders and training and evaluation scripts for objects, parts, and attributes prediction models, query evaluation scripts…
Official implementation of MAIA, A Multimodal Automated Interpretability Agent
Mechanistic understanding and validation of large AI models with SemanticLens
LongLive: Real-time Interactive Long Video Generation
Layer-wise Relevance Propagation for Large Language Models and Vision Transformers [ICML 2024]
Probabilistic cell segmentation for in situ spatial transcriptomics
[NeurIPS'25 Spotlight] (DANCE) Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition official code repository
Implementation of the proposed LVMAE, from the paper, Extending Video Masked Autoencoders to 128 frames, in Pytorch
Official implementation of paper: Characterizing Dataset Bias via Disentangled Visual Concepts
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
[CVPR 2025] Official implementation of the paper "Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models" (by Benou and Riklin-Raviv): https://arxiv.org/…
✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"
Official repo for paper: "GRACE: Generative Representation Learning via Contrastive Policy Optimization"
This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".
zhiyunyao / pkuthss
Forked from CasperVector/pkuthss
LaTeX template for dissertations in Peking University
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
Training Sparse Autoencoders on Language Models
Universal Image Restoration Pre-training via Masked Degradation Classification
Merlin is a 3D VLM for computed tomography that leverages both structured electronic health records (EHR) and unstructured radiology reports for pretraining.
A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding
[Nature Machine Intelligence 2024] Code and evaluation repository for the paper
Documentation for iNaturalist Open Data