- Mountain View
- https://www.linkedin.com/in/akshayrai09
Stars
An extensible distributed system for reliable nearline data streaming at scale
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.
📚 List of awesome university courses for learning Computer Science!
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Hadoop filesystem implementation for Aliyun OSS
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and bat…
Simple JVM Profiler Using StatsD and Other Metrics Backends
Sends stacktrace-level performance data from a JVM process to Riemann.
Chef cookbook to install Dr. Elephant for Hadoop.
Docker files for LinkedIn's Dr. Elephant (https://github.com/linkedin/dr-elephant)
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark