Molly MMMMMMolly

Stars

shijingya / SAFFNet

Self-Attention based on Fourier Frequency Domain Filter Network for Visual Question Answering

Python 2 Updated Feb 22, 2025

CJR7 / MultiAtt-RSSC

5 1 Updated Jul 14, 2024

MeimeiZhang-data / MQVQA

We proposed a Multiple-Step Question-Driven VQA (MQVQA) system to improve the reasoning and understanding ability in remote sensing VQA tasks in cases where questions focus on not only image scenes…

Python 9 1 Updated Dec 21, 2022

LANMNG / LQVG

Python 27 3 Updated Nov 27, 2025

littlesunnywgc / MoPeD

Modality Perception Learning based Determinative Factor Discovery model

Python 3 Updated Feb 27, 2024

Yangzhangcst / Mamba-in-CV

A paper list of some recent Mamba-based CV works.

426 22 Updated Nov 10, 2025

microsoft / BiomedCLIP_data_pipeline

BiomedCLIP data pipeline

Jupyter Notebook 93 11 Updated Jan 14, 2025

AIFengheshu / Plug-play-modules

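The most complete collection of plug-and-play modules on the web for 2025, shared for free! CVPR2025, AAAI2025, ICLR2025, TNNLS2025, arXiv2025... Covers all areas of artificial intelligence (machine learning, deep learning, etc.), applicable to image classification, object detection, instance segmentation, semantic segmentation, panoptic segmentation, pose recognition, medical image segmentation, video object segmentation, image matting, image editing, single-object tracking, multi-object tracking, person re-identification, RGBT, image denoising, deraining, dehazing, shadow removal, deblurring, super-resolution…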

Python 1,297 101 Updated May 24, 2025

Prasanna-raj-KJ-07 / Logical-representation-of-Sentences

Jupyter Notebook 1 Updated Feb 6, 2025

YeexiaoZheng / Multimodal-Sentiment-Analysis

Multimodal sentiment analysis: multiple fusion methods based on BERT+ResNet

Python 332 30 Updated Nov 20, 2022

Zhishe-Wang / CrossFuse

Python 18 Updated Mar 8, 2023

Jiaxin-Ye / DepMamba

[ICASSP 2025] Official PyTorch code for training and inference pipeline for DepMamba: Progressive Fusion Mamba for Multimodal Depression Detection

Python 82 8 Updated Mar 11, 2025

airsplay / lxmert

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".

Python 965 162 Updated Oct 22, 2022

RainyMoo / myvqa

The implementation of CLVIN, CAAN and MPCCT

Python 8 Updated Nov 27, 2024

jnhwkim / nips-mrn-vqa

Multimodal Residual Learning for Visual QA (NIPS 2016)

Lua 38 5 Updated Dec 27, 2016

Event-AHU / EFV_event_classification

[PRCV-2023, IEEE TMM-2025] Learning Bottleneck Transformer for Event Image-Voxel Feature Fusion based Classification

Python 12 1 Updated Jun 3, 2025

NMS05 / Multimodal-Fusion-with-Attention-Bottlenecks

Python 34 Updated Nov 22, 2024

XianhuiChen / Bottleneck-Attention-Based-Fusion-Network-for-Sleep-Apnea-Detection

Jupyter Notebook 27 3 Updated Aug 22, 2024

youngzhou97qz / CGMVQA

not yet

Python 7 5 Updated Dec 5, 2019

tezansahu / VQA-With-Multimodal-Transformers

Exploring multimodal fusion-type transformer models for visual question answering (on DAQUAR dataset)

Jupyter Notebook 37 14 Updated Jan 20, 2022

VirajBagal / MMBERT

MMBERT: Multimodal BERT Pretraining for Improved Medical VQA

Python 38 8 Updated Mar 22, 2021

HarisIqbal88 / PlotNeuralNet

LaTeX code for making neural network diagrams

TeX 24,144 3,025 Updated Aug 21, 2023

dldxzx / MM-IDTarget

MM-IDTarget: a novel deep learning framework for identifying targets using cross-attention based multimodal fusion strategy

Python 3 Updated Apr 6, 2025

YimianDai / open-aff

code and trained models for "Attentional Feature Fusion"

Python 802 102 Updated Jul 23, 2021

scut-cszcl / SFusion

Source code of SFusion

Python 28 2 Updated Mar 5, 2023

CrossmodalGroup / NAAF

Implementation of our CVPR2022 paper, Negative-Aware Attention Framework for Image-Text Matching.

Python 120 10 Updated Jun 19, 2023

Clound-Computing / 2021290227

Multimodal Fusion with Co-Attention Networks for Fake News Detection

Python 4 Updated Jul 9, 2024

plw-study / Reproduction_of_MCAN

This is the reproduction of MCAN from paper in ACL 2021: "Multimodal Fusion with Co-Attention Networks for Fake News Detection"

Python 49 4 Updated Dec 25, 2023

bowang-lab / DPM-MedImgEnhance

Pre-trained Diffusion Models for Plug-and-Play Medical Image Enhancement

Python 29 3 Updated Oct 3, 2023

MILVLG / mcan-vqa

Deep Modular Co-Attention Networks for Visual Question Answering

Python 456 89 Updated Dec 16, 2020