Project carried out as a team with Freddy Pupier (https://github.com/pups-enterprise).
The aim of this project is to introduce you to the basic concepts behind linear classification, using Harry Potter's Sorting Hat as the theme. For this project, you will have to build a one-vs-all classifier using logistic regression to sort Hogwarts students into houses.
pip install -r requirements.txt
- Reproduces the pandas .describe() method
py describe.py dataset_train.csv
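As a rough illustration of what reproducing .describe() involves, here is a minimal sketch for a single numeric column. The function names are hypothetical and not taken from describe.py; the sketch assumes missing values are skipped, the standard deviation uses the n - 1 denominator, and percentiles are linearly interpolated, which matches pandas' defaults.

```python
import math


def percentile(sorted_values, p):
    """Linearly interpolated percentile (p in [0, 1]) of a pre-sorted list."""
    k = (len(sorted_values) - 1) * p
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)


def describe_column(values):
    """Summary statistics for one numeric column, skipping None and NaN."""
    data = sorted(v for v in values if v is not None and not math.isnan(v))
    n = len(data)
    if n == 0:
        raise ValueError("column has no numeric values")
    mean = sum(data) / n
    # Sample standard deviation (n - 1 denominator); undefined for one value.
    if n > 1:
        std = math.sqrt(sum((v - mean) ** 2 for v in data) / (n - 1))
    else:
        std = float("nan")
    return {
        "count": n,
        "mean": mean,
        "std": std,
        "min": data[0],
        "25%": percentile(data, 0.25),
        "50%": percentile(data, 0.50),
        "75%": percentile(data, 0.75),
        "max": data[-1],
    }
```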
- Plots a histogram to show the most homogeneous course
py histogram.py dataset_train.csv
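One possible way to compare score distributions between houses is to count scores per house over bin edges shared by all houses; a course would then look homogeneous when the per-house counts are similar. The sketch below is an assumption about the approach, not the code in histogram.py, and the function name and input layout (a dict mapping house name to a list of scores) are hypothetical.

```python
def shared_bin_counts(scores_by_house, bins=10):
    """Count scores per bin for each house, using bin edges shared by all houses."""
    all_scores = [s for scores in scores_by_house.values() for s in scores]
    low, high = min(all_scores), max(all_scores)
    # Fall back to a width of 1.0 when every score is identical.
    width = (high - low) / bins or 1.0
    counts = {}
    for house, scores in scores_by_house.items():
        house_counts = [0] * bins
        for s in scores:
            # Clamp so that the maximum score lands in the last bin.
            index = min(int((s - low) / width), bins - 1)
            house_counts[index] += 1
        counts[house] = house_counts
    return counts
```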
- Plots the two most similar features; use -c to plot the Pearson correlation coefficient heatmap
py scatter_plot.py dataset_train.csv
or py scatter_plot.py dataset_train.csv -c
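For reference, each cell of a Pearson correlation heatmap holds the coefficient for one pair of features. A minimal sketch of that computation for two equal-length lists follows; the function name is hypothetical, and the sketch assumes missing values have already been removed and that neither feature is constant.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances share the same 1/n factor, so it cancels out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Two features with a coefficient close to 1 or -1 carry nearly the same information, which is what the scatter plot of the two similar features makes visible.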
- Plots the pair plot matrix of all features
py pair_plot.py dataset_train.csv
- Trains the model. Use -e to evaluate the model, i.e. split the dataset 80% / 20% into training and testing parts and verify the accuracy of the predictions. Use -c to compare our model with the scikit-learn one, using the accuracy score between the two sets of predictions.
py train.py dataset_train.csv -e
or py train.py dataset_train.csv -c
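The training and evaluation steps can be sketched as follows. This is a minimal pure-Python illustration of one-vs-all logistic regression with batch gradient descent, not the code in train.py: all function names, the learning rate, the epoch count, and the seed are assumptions, and real course scores would likely need to be standardized and have missing values handled first.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_binary(X, y, lr=0.1, epochs=1000):
    """Fit one binary logistic regression; returns weights with the bias first."""
    n, d = len(X), len(X[0])
    theta = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for row, target in zip(X, y):
            features = [1.0] + list(row)  # prepend the bias term
            error = sigmoid(sum(t * f for t, f in zip(theta, features))) - target
            for j, f in enumerate(features):
                grad[j] += error * f
        theta = [t - lr * g / n for t, g in zip(theta, grad)]
    return theta


def train_one_vs_all(X, labels, **kwargs):
    """Train one 'this house vs. the rest' classifier per distinct label."""
    return {
        house: train_binary(X, [1.0 if l == house else 0.0 for l in labels], **kwargs)
        for house in sorted(set(labels))
    }


def predict_one(models, row):
    """Pick the house whose classifier gives the highest probability."""
    features = [1.0] + list(row)
    return max(
        models,
        key=lambda h: sigmoid(sum(t * f for t, f in zip(models[h], features))),
    )


def split_rows(rows, train_ratio=0.8, seed=42):
    """Shuffle and split rows into training and testing parts (80% / 20%)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the expected labels."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)
```

Under these assumptions, an evaluation run would train on the 80% part, call predict_one on each row of the 20% part, and report accuracy on those held-out rows.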
- Makes the predictions; use -c to compare them with the scikit-learn model's predictions (only if -c was used in train.py)
py predict.py dataset_test.csv -c
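Comparing two models' predictions with an accuracy score amounts to measuring how often they agree, treating one list as the reference. A minimal sketch, assuming both prediction lists are in the same student order; the function name and the sample lists are hypothetical, not output from predict.py.

```python
def agreement(predictions_a, predictions_b):
    """Fraction of students for whom both models predict the same house."""
    if len(predictions_a) != len(predictions_b):
        raise ValueError("prediction lists must have the same length")
    matches = sum(a == b for a, b in zip(predictions_a, predictions_b))
    return matches / len(predictions_a)


# Hypothetical predictions from the two models for four students.
ours = ["Gryffindor", "Slytherin", "Ravenclaw", "Hufflepuff"]
reference = ["Gryffindor", "Slytherin", "Ravenclaw", "Gryffindor"]
score = agreement(ours, reference)
```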