Home
Hollin Wilkins edited this page Feb 10, 2017 · 55 revisions
Please see our complete documentation at http://mleap-docs.combust.ml.
MLeap deploys Spark ML (and some MLlib) transformers and pipelines to production without a Spark Context.
MLeap also extends scikit-learn so that you can serialize and deploy scikit transformers, pipelines, and feature unions without any dependency on scikit-learn or its underlying libraries (NumPy, SciPy, C++ libraries). Scikit transformers and pipelines are serialized as their Spark equivalents, so you can load and deploy your scikit pipelines on Spark infrastructure with a few lines of code.
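
The idea of deploying a fitted transformer without its training library can be illustrated with a toy sketch. This is not MLeap's bundle format or API: the function names and the JSON layout below are invented for illustration, and only the Python standard library is used. The sketch fits a standard scaler, writes its parameters to a portable string, and then scores a row using nothing but that string.

```python
import json

def fit_standard_scaler(rows):
    """Toy 'training' step: per-column mean and population standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / n) ** 0.5
            for c, m in zip(cols, means)]
    # Hypothetical layout: an operation name plus its fitted parameters.
    return {"op": "standard_scaler", "mean": means, "std": stds}

def serialize(model):
    """Write the fitted parameters to a portable JSON string."""
    return json.dumps(model)

def load_and_transform(blob, row):
    """'Serving' step: needs only the serialized parameters, not the trainer."""
    model = json.loads(blob)
    if model["op"] != "standard_scaler":
        raise ValueError("unsupported op: " + model["op"])
    return [(v - m) / s if s else 0.0
            for v, m, s in zip(row, model["mean"], model["std"])]

blob = serialize(fit_standard_scaler([[1.0, 10.0], [3.0, 30.0]]))
print(load_and_transform(blob, [3.0, 10.0]))  # [1.0, -1.0]
```

The serving function depends only on the serialized parameters, which is the property that lets a runtime score data without the framework that trained the model.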
- Serializing a Spark ML Pipeline and Serving with MLeap
- Setting up a Spark 2.0 notebook with MLeap and Toree
- Setting up PySpark 2.0 notebook with MLeap and Toree
- Setting up Scikit-Learn with MLeap
- ML Pipelines with AirBnb data - Scala
- ML Pipelines with AirBnb data - PySpark
- ML Pipelines with Lending Club data - Scala

Feature transformers:

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow |
|---|---|---|---|---|
| Binarizer | x | x | x | |
| BucketedRandomProjectionLSH | x | x | | |
| Bucketizer | x | x | | |
| ChiSqSelector | x | x | | |
| CountVectorizer | x | x | | |
| DCT | x | x | | |
| ElementwiseProduct | x | x | x | |
| HashingTermFrequency | x | x | x | |
| IDF | x | x | | |
| Imputer | x | x | x | |
| Interaction | x | x | x | |
| MaxAbsScaler | x | x | | |
| MinHashLSH | x | x | | |
| MinMaxScaler | x | x | x | |
| Ngram | x | x | | |
| Normalizer | x | x | | |
| OneHotEncoder | x | x | | |
| PCA | x | x | x | |
| QuantileDiscretizer | x | x | | |
| PolynomialExpansion | x | x | x | |
| ReverseStringIndexer | x | x | x | |
| StandardScaler | x | x | x | |
| StopWordsRemover | x | x | | |
| StringIndexer | x | x | x | |
| Tokenizer | x | x | x | |
| VectorAssembler | x | x | x | |
| VectorIndexer | x | x | | |
| VectorSlicer | x | x | | |
| WordToVector | x | x | | |

Classification:

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow |
|---|---|---|---|---|
| DecisionTreeClassifier | x | x | x | |
| GradientBoostedTreeClassifier | x | x | | |
| LogisticRegression | x | x | x | |
| LogisticRegressionCv | x | x | x | |
| NaiveBayesClassifier | x | x | | |
| OneVsRest | x | x | | |
| RandomForestClassifier | x | x | x | |
| SupportVectorMachines | x | x | x | |
| MultiLayerPerceptron | x | x | | |

Regression:

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow |
|---|---|---|---|---|
| AFTSurvivalRegression | x | x | | |
| DecisionTreeRegression | x | x | x | |
| GeneralizedLinearRegression | x | x | | |
| GradientBoostedTreeRegression | x | x | | |
| IsotonicRegression | x | x | | |
| LinearRegression | x | x | x | |
| RandomForestRegression | x | x | x | |

Clustering:

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow |
|---|---|---|---|---|
| BisectingKMeans | x | x | | |
| GaussianMixtureModel | x | x | | |
| KMeans | x | x | | |
| LDA | x | | | |

Math operations:

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow | Description |
|---|---|---|---|---|---|
| MathUnary | x | x | x | | Simple set of unary mathematical operations |
| MathBinary | x | x | x | | Simple set of binary mathematical operations |

Recommendation:

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow |
|---|---|---|---|---|
| ALS | x | | | |
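
The tables above can be read as a support matrix: a stage is only a candidate for export if its row has an `x` in the target column. As a minimal sketch of how one might check a pipeline against that matrix before attempting to serialize it, the snippet below hand-copies a few rows into a dictionary. The helper name `unsupported_stages` and the dictionary are hypothetical and not part of MLeap; extend the dictionary with the remaining rows as needed.

```python
# Hand-copied subset of the support tables above.
# Each transformer maps to the set of columns marked "x" in its row.
SUPPORT = {
    "Binarizer": {"Spark", "MLeap", "Scikit-Learn"},
    "Bucketizer": {"Spark", "MLeap"},
    "StringIndexer": {"Spark", "MLeap", "Scikit-Learn"},
    "VectorAssembler": {"Spark", "MLeap", "Scikit-Learn"},
    "LogisticRegression": {"Spark", "MLeap", "Scikit-Learn"},
    "KMeans": {"Spark", "MLeap"},
    "LDA": {"Spark"},
    "ALS": {"Spark"},
}

def unsupported_stages(stage_names, framework="MLeap"):
    """Return the stage names not marked as supported for `framework`.

    Names missing from SUPPORT are treated as unsupported.
    """
    return [name for name in stage_names
            if framework not in SUPPORT.get(name, set())]

# Hypothetical Spark pipeline, described by its stage names only.
spark_pipeline = ["StringIndexer", "VectorAssembler", "LDA"]
print(unsupported_stages(spark_pipeline))  # ['LDA']
```

Running the same check with `framework="Scikit-Learn"` reports stages such as `Bucketizer`, which the tables mark for Spark and MLeap only.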