Stars
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models
FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, le…
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
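Illustrated guides to computer networking, operating systems, computer organization, and databases: 1,000 diagrams and 500,000 words that demystify obscure computer science fundamentals, so that no rote interview topic stays hard to understand! 🚀 Read online: https://xiaolincoding.com
A basic cache tool for Java (a hand-written, progressively built Redis-style cache in Java; high performance and highly extensible).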
LangChain4j is an open-source Java library that simplifies the integration of LLMs into Java applications through a unified API, providing access to popular LLMs and vector databases. It makes impl…
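Use ChatGPT to summarize arXiv papers. Accelerates the whole research workflow by using ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.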
The integration of HugeGraph with AI/LLM & GraphRAG
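🔥 Officially recommended 🔥 A project for university spring and autumn campus recruitment and new graduates. Built on SpringBoot3 + Java17 + SpringCloud Alibaba + Vue3 and related technologies, it implements a close imitation of the Railway 12306 user, ticket-grabbing, order, and payment services, and is aimed at helping students land jobs.
ChatGPT Java SDK. Supports the GPT-4o and GPT-5 APIs. Works out of the box. An unofficial Java SDK for seamless integration with ChatGPT's GPT-5 and GPT-4 APIs. Ready-to-use, simple setup, and efficient for building AI-powered app…
A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B). Open-sources all code for the full pipeline: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks, with a fine-tuning example for triple information extraction.
A hex editor for WeChat/QQ/TIM: an anti-recall patch for the PC versions of WeChat/QQ/TIM ("I've already seen it, so recalling it won't help").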
Question and Answer based on Anything.
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
Anserini is a Lucene toolkit for reproducible information retrieval research
JPype is a cross-language bridge that gives Python programs full access to Java class libraries.
Apache GeaFlow: A Streaming Graph Computing Engine.
Document Chatbot — multiple files. Powered by GPT / Embedding.
An intro to retrieval-augmented large language models
🦜🔗 The platform for reliable agents.
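MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, targeting the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche cultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripted lines, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.
Llama Chinese community: aggregates the latest Llama learning resources in real time and builds the best open-source ecosystem for Chinese Llama large models; fully open source and commercially usable.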
A professional cross-platform SSH/Sftp/Shell/Telnet/Tmux/Serial terminal.
This project implements the transfer of messages and files between a server and a client.
A website for learners of "Introduction to Algorithms"