Stars
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
"Dive into LLMs" (动手学大模型): a series of hands-on programming tutorials
Project AirSim is Microsoft's evolution of AirSim, an advanced simulation platform for building, training, and testing autonomous systems in high-fidelity virtual environments
Deep learning for image processing, including classification, object detection, etc.
A latent text-to-image diffusion model
PyTorch code and models for V-JEPA self-supervised learning from video.
[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
Multimodal Large Models Are Effective Action Anticipators (IEEE TMM)🌳
The Simulation and Image Processing for Photonics and Acoustics (SIMPA) toolkit.
Revisiting Anchor Mechanisms for Temporal Action Localization (TIP 2020)
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
A comprehensive collection of KAN (Kolmogorov-Arnold Network)-related resources, including libraries, projects, tutorials, papers, and more, for researchers and developers in the Kolmogorov-Arnold N…
DeepSurv is a deep learning approach to survival analysis.
A list of papers/resources in Survival Analysis that I have read or would like to read.
Neuro-Fuzzy Random Vector Functional Link Neural Network for Classification and Regression Problems
[IEEE TIP 2021] COVID-CS Dataset and Code of JCS: An explainable COVID-19 diagnosis system by joint classification and segmentation