    Repositories list

    • openhouse

      Public
      Open Control Plane for Tables in Data Lakehouse
      Java
      623711021 Updated Nov 7, 2025
    • venice

      Public
      Venice, Derived Data Platform for Planet-Scale Workloads.
      Java
      1065711716 Updated Nov 7, 2025
    • Multi-hop declarative data pipelines
      Java
      1412210 Updated Nov 7, 2025
    • helix

      Public
      Mirror of Apache Helix
      Java
      241109 Updated Nov 7, 2025
    • Liger-Kernel

      Public
      Efficient Triton Kernels for LLM Training
      Python
      4265.8k7328 Updated Nov 7, 2025
    • cruise-control

      Public
      Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.
      Java
      6382.9k21034 Updated Nov 6, 2025
    • General Metadata Architecture
      Java
      601331319 Updated Nov 6, 2025
    • This is a read-only mirror of apache/gobblin
      Java
      4600 Updated Nov 6, 2025
    • rest.li

      Public
      Rest.li is a REST+JSON framework for building robust, scalable service architectures using dynamic discovery and simple asynchronous APIs.
      Java
      5572.5k5055 Updated Nov 4, 2025
    • ambry

      Public
      Distributed object store
      Java
      2831.8k13119 Updated Nov 4, 2025
    • iceberg

      Public
      A temporary home for LinkedIn's changes to Apache Iceberg (incubating)
      Java
      3363024 Updated Nov 3, 2025
    • brooklin

      Public
      An extensible distributed system for reliable nearline data streaming at scale
      Java
      1409491716 Updated Nov 3, 2025
    • ghc25-ds-workshop

      Public
      This repo is specifically for the Grace Hopper 2025 DS Workshop
      Jupyter Notebook
      3600 Updated Nov 1, 2025
    • linkedin.github.com

      Public
      Listing of all our public GitHub projects.
      JavaScript
      4764162 Updated Nov 1, 2025
    • transport

      Public
      A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.
      Java
      743022411 Updated Oct 30, 2025
    • zookeeper

      Public
      Mirror of Apache Hadoop ZooKeeper
      Java
      7.3k639 Updated Oct 30, 2025
    • avro-util

      Public
      Collection of utilities to allow writing java code that operates across a wide range of avro versions.
      Java
      67855713 Updated Oct 29, 2025
    • Repo for talent-solutions-java-sdk project
      Java
      1100 Updated Oct 27, 2025
    • fmchisel

      Public
      fmchisel: Efficient Compression and Training Algorithms for Foundation Models
      Python
      87100 Updated Oct 23, 2025
    • goavro

      Public
      Goavro is a library that encodes and decodes Avro data.
      Go
      2291k6121 Updated Oct 22, 2025
    • coral

      Public
      Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
      Java
      2008685932 Updated Oct 20, 2025
    • Burrow

      Public
      Kafka Consumer Lag Checking
      Go
      8123.9k22418 Updated Oct 3, 2025
    • Shake to send feedback for Android.
      Java
      55161105 Updated Sep 17, 2025
    • diderot

      Public
      A fast and flexible implementation of the xDS protocol
      Go
      31800 Updated Sep 17, 2025
    • forthic

      Public
      Python
      72800 Updated Sep 16, 2025
    • DuaLip

      Public
      DuaLip: Dual Decomposition based Linear Program Solver
      Scala
      106510 Updated Sep 8, 2025
    • A distributed Spark/Scala implementation of the isolation forest algorithm for unsupervised outlier detection, featuring support for scalable training and ONNX export for easy cross-platform inference.
      Scala
      5224931 Updated Aug 30, 2025
    • test

      Public archive
      Apache Pinot - A realtime distributed OLAP datastore
      Java
      1.4k000 Updated Aug 29, 2025
    • Repo for robustInfer
      Jupyter Notebook
      1000 Updated Aug 28, 2025
    • luminol

      Public
      Anomaly Detection and Correlation library
      Python
      2191.2k289 Updated Aug 22, 2025