- University of Southern California
- Los Angeles
- https://www.linkedin.com/in/amanmj/
Stars
Simple web service providing a word embedding model
Apache Spark - A unified analytics engine for large-scale data processing
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
A booklet on machine learning systems design with exercises. NOT the repo for the book "Designing Machine Learning Systems", which is `dmls-book`
A fast, robust Python library to check for offensive language in strings.
A universal Python library for detecting and filtering profanity
Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of …
Spark: The Definitive Guide's Code Repository
A board editor for the game Halma. Supports output monitoring/applying and game running.
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
A list of upcoming hackathons from around the world.
Code for: "And the bit goes down: Revisiting the quantization of neural networks"
Text and supporting code for Think Stats, 2nd Edition
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Kafka Connect connector to stream data in real time from Twitter.
Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination…
An application observability facade for the most popular observability tools. Think SLF4J, but for observability.
A Java library that implements application/problem+json
Qubole Sparklens tool for performance tuning Apache Spark
Apache Camel is an open source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data.