Nqsir/DSLR

DSLR

DataScience X Logistic Regression

Project carried out as a team with Freddy Pupier (https://github.com/pups-enterprise).

The aim of this project is to introduce you to the basic concepts behind linear classification, using Harry Potter's Sorting Hat as the theme. For this project, you will have to create a one-vs-all classifier using logistic regression to sort Hogwarts students into houses.

Installation

pip install -r requirements.txt

Usage

  • Reproduces the pandas .describe() method

py describe.py dataset_train.csv
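
As an illustration of what reproducing .describe() involves, here is a minimal sketch for a single numeric column. It is not the code in describe.py: the names `percentile` and `describe_column` are hypothetical, and it assumes missing values arrive as None or NaN, that at least two valid values remain, and that quartiles use linear interpolation (the pandas default).

```python
import math


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 1]) of a pre-sorted list."""
    position = (len(sorted_values) - 1) * q
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    low_value = sorted_values[lower]
    return low_value + (sorted_values[upper] - low_value) * fraction


def describe_column(values):
    """Summary statistics for one column, ignoring None and NaN entries.

    Assumption: at least two valid values remain, otherwise the sample
    standard deviation below divides by zero.
    """
    data = sorted(v for v in values if v is not None and not math.isnan(v))
    count = len(data)
    mean = sum(data) / count
    # Sample standard deviation: divide by (count - 1), like pandas.
    variance = sum((v - mean) ** 2 for v in data) / (count - 1)
    return {
        "count": count,
        "mean": mean,
        "std": math.sqrt(variance),
        "min": data[0],
        "25%": percentile(data, 0.25),
        "50%": percentile(data, 0.50),
        "75%": percentile(data, 0.75),
        "max": data[-1],
    }
```

Calling `describe_column` once per numeric course column and printing the results side by side would give a table shaped like the pandas output.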

  • Plots a histogram to show the most homogeneous course

py histogram.py dataset_train.csv

  • Plots the two most similar features; use -c to plot the Pearson correlation coefficient heatmap

py scatter_plot.py dataset_train.csv or py scatter_plot.py dataset_train.csv -c
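
The Pearson correlation coefficient behind the heatmap can be sketched as follows. This is an illustrative version rather than the repository's implementation: the function name `pearson` is hypothetical, and it assumes two equal-length numeric columns with no missing values and non-zero variance.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns.

    Assumption: neither column is constant, otherwise the denominator
    is zero and the coefficient is undefined.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations (numerator) and sums of squared deviations.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)
```

Two features whose coefficient is close to 1 or -1 carry nearly the same information, which is what makes them look similar on a scatter plot.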

  • Plots the pair plot matrix of all features

py pair_plot.py dataset_train.csv

  • Trains the model. Use -e to evaluate it: the dataset is split 80% / 20% into training and test sets, and the accuracy of the predictions is checked on the test part. Use -c to compare our model with the scikit-learn one, using the accuracy score between the two sets of predictions.

py train.py dataset_train.csv -e or py train.py dataset_train.csv -c
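
The training and -e evaluation steps can be sketched in plain Python as below. This is a simplified illustration under stated assumptions, not the code in train.py: all function names, the learning rate, the epoch count and the shuffle seed are hypothetical, and the features are assumed to be numeric, roughly scaled and free of missing values.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(weights, row):
    """Probability given by one binary classifier; weights[0] is the bias."""
    features = [1.0] + list(row)
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def train_binary(rows, targets, learning_rate=0.1, epochs=500):
    """Batch gradient descent on the logistic loss for 0/1 targets."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for row, target in zip(rows, targets):
            error = score(weights, row) - target
            for j, x in enumerate([1.0] + list(row)):
                gradient[j] += error * x
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, gradient)]
    return weights


def train_one_vs_all(rows, labels, **kwargs):
    """One binary classifier per house: that house against all the others."""
    return {
        house: train_binary(
            rows, [1.0 if label == house else 0.0 for label in labels], **kwargs
        )
        for house in sorted(set(labels))
    }


def predict(models, row):
    """Pick the house whose classifier gives the highest probability."""
    return max(models, key=lambda house: score(models[house], row))


def split_80_20(rows, labels, seed=0):
    """Shuffle, then keep 80% for training and 20% for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * 0.8)
    train, test = indices[:cut], indices[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def accuracy(models, rows, labels):
    """Fraction of rows whose predicted house matches the true label."""
    hits = sum(predict(models, row) == label for row, label in zip(rows, labels))
    return hits / len(rows)
```

Training on the 80% part and calling `accuracy` on the held-out 20% gives an estimate of how well the model generalises to students it has not seen.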

  • Makes the predictions; use -c to compare them with the scikit-learn model's predictions (only if -c was used with train.py)

py predict.py dataset_test.csv -c
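
The comparison made by -c boils down to the share of students for which both models predict the same house. A minimal sketch, assuming the two prediction lists are aligned row by row (the function name `agreement_score` is hypothetical and not taken from predict.py):

```python
def agreement_score(predictions_a, predictions_b):
    """Fraction of positions where two aligned prediction lists agree."""
    if len(predictions_a) != len(predictions_b):
        raise ValueError("prediction lists must have the same length")
    if not predictions_a:
        raise ValueError("prediction lists must not be empty")
    matches = sum(a == b for a, b in zip(predictions_a, predictions_b))
    return matches / len(predictions_a)
```

This is the same quantity scikit-learn's `accuracy_score` returns when one list is passed as the reference labels and the other as the predictions.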
