Stars
Train your AI self, amplify you, bridge the world
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
A simple screen parsing tool towards a pure vision-based GUI agent
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Integrate the DeepSeek API into popular software
A tiny KV store based on a skip list, written in C++: a lightweight key-value database 🔥🔥 🚀
An open-sourced end-to-end VLM-based GUI Agent
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.
code and data for "CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers"
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
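Chinese named entity recognition (with concrete implementations of several models: HMM, CRF, BiLSTM, BiLSTM+CRF)
An open-source quantitative finance course that helps people quickly learn quantitative finance and build the skills to do quantitative development in Python.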
A system that performs algorithmic trading
Collaborative lecture notes for Spring '19 NYU DL class
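Sharing some useful Dify DSL workflows, suitable for both personal use and learning.
Stock: fetch stock data, compute stock indicators and chip distribution, recognize stock patterns, comprehensive stock screening, stock-picking strategies, strategy validation and backtesting, and automated stock trading; supports PC and mobile devices.
Mental health large language model (LLM x Mental Health), Pre & Post-training & Dataset & Evaluation & Deploy & RAG, with InternLM / Qwen / Baichuan / DeepSeek / Mixtral / LLama / GLM series models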
Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.
[EMNLP 2024🔥] Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"