Databricks framework to validate Data Quality of pySpark DataFrames and Tables

Python 361 76 Updated Jan 9, 2026

SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]

Python 18,216 1,951 Updated Dec 29, 2025

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 15,232 1,288 Updated May 23, 2024

📚 Learn to write an embedded OS in Rust 🦀

Rust 14,523 860 Updated Feb 10, 2024

Open, Multi-modal Catalog for Data & AI

Java 3,254 560 Updated Jan 10, 2026

This is a Databricks Cybersecurity demo for building linked detection, investigation and response jobs in Databricks

Python 5 2 Updated Jan 10, 2024

Databricks CLI

Go 279 127 Updated Jan 10, 2026

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, P…

Python 439 88 Updated Dec 18, 2025

A Quality Spark DQ and transformation Library

Scala 5 5 Updated Jan 9, 2026

A curated list of resources about detecting threats and defending Kubernetes systems.

402 40 Updated Sep 2, 2023

An incremental parsing system for programming tools

Rust 23,299 2,341 Updated Jan 10, 2026

A native Rust library for Delta Lake, with bindings into Python

Rust 3,094 560 Updated Jan 10, 2026

Terraform provider for Azure Resource Manager

Go 4,882 4,920 Updated Jan 9, 2026

Databricks Terraform Provider

Go 566 482 Updated Jan 7, 2026

A cluster computing framework for processing large-scale geospatial data

Java 2,283 745 Updated Jan 9, 2026

GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.

Scala 1,474 441 Updated Jan 8, 2026

4mc - splittable lz4 and zstd in hadoop/spark/flink

C 109 37 Updated Apr 21, 2023

Send code blocks (Python, SQL, Scala, R) to a Databricks cluster

TypeScript 6 Updated Dec 30, 2022

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Java 6,597 2,832 Updated Jan 8, 2026

Concurrent Radix and Suffix Trees for Java

Java 515 84 Updated Jul 27, 2021

A List of Recommender Systems and Resources

4,796 707 Updated Dec 3, 2025

Bartosz Milewski's 'Category Theory for Programmers' unofficial PDF and LaTeX source

TeX 11,454 630 Updated Jan 8, 2026

Logback appender for Apache Kafka

Java 652 269 Updated May 27, 2022

Companion webpage to the book "Mathematics For Machine Learning"

Jupyter Notebook 14,929 2,692 Updated Mar 13, 2025

This note presents in a technical though hopefully pedagogical way the three most common forms of neural network architectures: Feedforward, Convolutional and Recurrent.

TeX 1,388 107 Updated Oct 9, 2019

Practical Gremlin - An Apache TinkerPop Tutorial

Ruby 854 259 Updated Dec 23, 2025

Dynamic Tensor Graph library in Clojure (think PyTorch, DynNet, etc.)

Clojure 289 18 Updated Jun 28, 2019

PlantUML sprites, macros, and other includes for AWS components.

Python 647 67 Updated May 7, 2019

Clojure transducers interface to Kafka Streams

Clojure 102 9 Updated Dec 15, 2017