An open source ML system for the end-to-end data science lifecycle

Java 1,075 521 Updated Jan 1, 2026

Apache Spark - A unified analytics engine for large-scale data processing

Scala 42,569 28,987 Updated Dec 31, 2025

An efficient updatable key-value store for Apache Spark

Scala 254 77 Updated Mar 11, 2017

An Apache Spark-shell backend for IPython

Scala 105 29 Updated Jul 2, 2021

Code to accompany Advanced Analytics with Spark from O'Reilly Media

Scala 1,531 1,021 Updated Sep 25, 2024

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Java 6,596 2,833 Updated Dec 30, 2025

Interactive and Reactive Data Science using Scala and Spark.

JavaScript 3,154 652 Updated May 16, 2023

Example code from Learning Spark book

Java 3,896 2,419 Updated Jul 12, 2025

A collection of MapReduce tasks translated (from Pig, Hive, MapReduce streaming, Cascalog, etc.) into Scalding.

Ruby 24 4 Updated May 6, 2012

Original unmaintained version of the Lightbeam extension. See lightbeam-we for the new one which works in modern versions of Firefox.

JavaScript 586 149 Updated Mar 22, 2017

A private messenger for Windows, macOS, and Linux.

TypeScript 15,886 2,935 Updated Jan 1, 2026

The FourthParty web measurement platform.

JavaScript 44 19 Updated May 14, 2015

Private contact and calendar sync for Android.

Java 356 73 Updated Oct 20, 2015

A private messenger for iOS.

Swift 11,760 3,329 Updated Dec 18, 2025

Magpie contains a number of scripts for running Big Data software in HPC environments, including Hadoop and Spark. There is support for Lustre, Slurm, Moab, Torque, LSF, Flux, and more.

Shell 196 52 Updated Feb 13, 2025

The winning solution to the Higgs Boson Machine Learning Challenge.

Common Lisp 128 43 Updated Mar 23, 2015

Solution to the Higgs Boson Machine Learning Challenge on Kaggle

Python 32 15 Updated Sep 16, 2014

Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning

Java 1,785 404 Updated Aug 16, 2021

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ 27,801 8,826 Updated Dec 30, 2025

Storm-yarn enables Storm clusters to be deployed into machines managed by Hadoop YARN.

Java 418 158 Updated Jul 21, 2023

Getting Started With R

R 233 301 Updated Jan 19, 2014

A simple demonstration of sub-sequence sampling as used for anomaly detection with EKG signals

Java 102 30 Updated Oct 13, 2020

Seamless multi-primary syncing database with an intuitive HTTP/JSON API, designed for reliability

Erlang 6,765 1,061 Updated Dec 26, 2025

Twitter's Effective Scala Guide

HTML 2,242 625 Updated Apr 10, 2023

People. Places. Things. Graphs.

JavaScript 93 24 Updated Oct 2, 2014

A Scala API for Cascading

Scala 3,518 704 Updated May 28, 2023

A repository of information, examples and good practices around the Lambda Architecture

CSS 369 107 Updated Oct 26, 2017

The official home of the Presto distributed SQL query engine for big data

Java 16,610 5,514 Updated Jan 1, 2026

Abstract Algebra for Scala

Scala 2,303 351 Updated Nov 21, 2025

Lightning-fast cluster computing in Java, Scala and Python.

Scala 1,427 382 Updated Apr 8, 2014