Stars
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
XSBench: The Monte Carlo Macroscopic Cross Section Lookup Benchmark
Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH)
A modern replacement for Redis and Memcached
Google's Engineering Practices documentation
Trinity RNA-Seq de novo transcriptome assembly
Scalable tumor phylogeny inference and validation from single-cell RNA or DNA data
Tumor Phylogeny Reconstruction via Integrative use of Single Cell and Bulk Sequencing Data
Single-cell analysis in Python. Scales to >100M cells.
Single cell current best practices tutorial case study for the paper: Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"
Official code repository for GATK versions 4 and up
Apache Superset is a Data Visualization and Data Exploration Platform
Numba extension for compiling Pandas data frames, Intel® Scalable Dataframe Compiler
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
An open-source C++ library developed and used at Facebook.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Avro2TF is designed to fill the gap of making users' training data ready to be consumed by deep learning training frameworks.
Facebook's branch of the Oracle MySQL database. This includes MyRocks.
splunk / s3-tests
Forked from ceph/s3-tests
Compatibility tests for S3 clones
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)