Stars
Analyze and manipulate your Anki flashcards using pandas dataframes!
A library to find and visualise the most interesting slices in multidimensional data
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…
Materials for the Hugging Face Diffusion Models Course
dstack is an open-source control plane for running development, training, and inference jobs on GPUs—across hyperscalers, neoclouds, or on-prem.
Unofficial implementation of QaNER: Prompting Question Answering Models for Few-shot Named Entity Recognition.
Fast and customizable framework for automatic ML model creation (AutoML)
A package containing a model built on the well-known "Titanic" dataset.
Add dynamically generated Kaggle Tier & Medals to your README.
Synthetic dataset of medical questions, created by the medical division of the Sber Artificial Intelligence laboratory
Pipeline for quickly building TF-IDF + LogReg text classification baselines.
REST API for sentence tokenization and embedding using Multilingual Universal Sentence Encoder.
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Implementation of Supervised Contrastive Learning with AMP, EMA, SWA, and many other tricks
CraftML is a RESTful web service for easy pipeline creation without code.
Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also …
The repository provides useful Python scripts for ML and data analysis
A GridMixup augmentation, inspired by GridMask and CutMix
Official implementation of 'FMix: Enhancing Mixed Sample Data Augmentation'
Tool for visualizing GitHub profiles
Pipeline for training NER models using PyTorch.
A comprehensive reference for all topics related to Natural Language Processing
Synthetic data generators for tabular and time-series data
Configuration classes enabling type-safe PyTorch configuration for Hydra apps