alexott

Databricks framework to validate Data Quality of pySpark DataFrames

Python 325 65 Updated Oct 21, 2025

SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]

Python 17,621 1,850 Updated Oct 20, 2025

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 15,179 1,293 Updated May 23, 2024

📚 Learn to write an embedded OS in Rust 🦀

Rust 14,408 849 Updated Feb 10, 2024

Open, Multi-modal Catalog for Data & AI

Python 3,132 525 Updated Oct 21, 2025

Code examples and resources for DBRX, a large language model developed by Databricks

Python 2,571 244 Updated May 1, 2024

This is a Databricks Cybersecurity demo for building linked detection, investigation and response jobs in Databricks

Python 4 1 Updated Jan 10, 2024

Databricks CLI

Go 251 108 Updated Oct 21, 2025

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, P…

Python 429 82 Updated Oct 20, 2025

A Quality Spark DQ and transformation Library

Scala 5 5 Updated Sep 10, 2025

A curated list of resources about detecting threats and defending Kubernetes systems.

397 40 Updated Sep 2, 2023

An incremental parsing system for programming tools

Rust 22,426 2,140 Updated Oct 20, 2025

A native Rust library for Delta Lake, with bindings into Python

Rust 2,993 535 Updated Oct 21, 2025

Terraform provider for Azure Resource Manager

Go 4,835 4,883 Updated Oct 21, 2025

Databricks Terraform Provider

Go 544 462 Updated Oct 21, 2025

A cluster computing framework for processing large-scale geospatial data

Java 2,221 734 Updated Oct 20, 2025

GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.

Scala 1,465 439 Updated Oct 21, 2025

4mc - splittable lz4 and zstd in hadoop/spark/flink

C 109 38 Updated Apr 21, 2023

Send code blocks (Python, SQL, Scala, R) to a Databricks cluster

TypeScript 6 Updated Dec 30, 2022

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Java 6,572 2,835 Updated Oct 20, 2025

Concurrent Radix and Suffix Trees for Java

Java 515 84 Updated Jul 27, 2021

A List of Recommender Systems and Resources

4,764 707 Updated Feb 25, 2025

Bartosz Milewski's 'Category Theory for Programmers' unofficial PDF and LaTeX source

TeX 11,385 619 Updated Oct 21, 2025

Logback appender for Apache Kafka

Java 649 264 Updated May 27, 2022

Companion webpage to the book "Mathematics For Machine Learning"

Jupyter Notebook 14,656 2,641 Updated Mar 13, 2025

This note presents in a technical though hopefully pedagogical way the three most common forms of neural network architectures: Feedforward, Convolutional and Recurrent.

TeX 1,387 107 Updated Oct 9, 2019

Practical Gremlin - An Apache TinkerPop Tutorial

AsciiDoc 847 256 Updated Sep 17, 2025

Dynamic Tensor Graph library in Clojure (think PyTorch, DynNet, etc.)

Clojure 288 18 Updated Jun 28, 2019

PlantUML sprites, macros, and other includes for AWS components.

Python 645 67 Updated May 7, 2019