Stars
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are commi…
ValueCell is a community-driven, multi-agent platform for financial applications.
"AI-Trader: Can AI Beat the Market?" Live Trading Bench: https://hkuds.github.io/AI-Trader/
Build and publish crates with pyo3, cffi and uniffi bindings as well as Rust binaries as Python packages
DLRover: An Automatic Distributed Deep Learning System
An extensible, state-of-the-art columnar file format. Formerly at @spiraldb, now an Incubation Stage project at LFAI&Data, part of the Linux Foundation.
AIGitCommit is a command-line tool that generates meaningful, semantic commit messages from your staged Git changes using AI.
Production-ready platform for agentic workflow development.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Alluxio, data orchestration for analytics and machine learning in the cloud
"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, OneDrive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
Apache DataFusion Comet Spark Accelerator
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
vsag is a vector indexing library used for similarity search.
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Extremely fast Query Engine for DataFrames, written in Rust
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Rust crate for Substrait: Cross-Language Serialization for Relational Algebra
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
A library to analyze PyTorch traces.
LlamaIndex is the leading framework for building LLM-powered agents over your data.
Chat with an LLM in your terminal, using it as a shell generator, storyteller, Linux terminal, and more.
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
JuiceFS is a distributed POSIX file system built on top of Redis and S3.