Unsupervised Recalibration

This is example code for a post-processing method to improve already trained ML models.

Unsupervised recalibration:

- does not need access to the inner workings of the ML model, since it does not retrain it.
- does not need access to ground truth, since it infers the most likely scenario from the imperfect ML model.
- does not need the ground truth distribution in the field to match that in the lab.

But unsupervised recalibration:

- does need access to the ML model's evaluation summaries from training time (or, alternatively, a small set of training data with ground truth on which to observe the ML model).
- does need the ML classifier to be unbiased.

When applied globally, unsupervised recalibration improves the calibration of a classifier. When applied locally (i.e., separately for each relevant new subpopulation that the ML model is not biased towards), it also improves the classifier's refinement.
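One way to make the idea concrete is sketched below. This is an illustration under assumptions, not necessarily the article's exact formulation: it assumes that the lab evaluation yields a confusion matrix $C_{ij} = P(\hat y = j \mid y = i)$ that still holds in the field, and that only the class prior $\pi$ changes. The symbols $C$, $\pi$ and $n_j$ (the number of field samples predicted as class $j$) are notation introduced here.

```latex
\begin{align*}
P_{\text{field}}(\hat y = j) &= \sum_i \pi_i \, C_{ij}, \\
\hat\pi &= \arg\max_{\pi} \sum_j n_j \log \Bigl( \sum_i \pi_i \, C_{ij} \Bigr)
  \quad \text{subject to } \pi_i \ge 0,\ \sum_i \pi_i = 1, \\
P_{\text{field}}(y = i \mid \hat y = j) &= \frac{\hat\pi_i \, C_{ij}}{\sum_k \hat\pi_k \, C_{kj}}.
\end{align*}
```

Under these assumptions, the second line picks the field prior that makes the observed predictions most likely, and the third line gives the recalibrated probability of each true class given a predicted label. Computing $\hat\pi$ separately per subpopulation would correspond to the local variant.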

Usage

The code is meant to be run in an interactive R session.

It depends on the tidyverse and rjson packages being installed and relies on the setup performed in .Rprofile. When opening the code as an RStudio project, that setup is performed automatically. When using a different R setup, you need to run source(".Rprofile") from the repository root.
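As a rough sketch of that setup outside RStudio, the following checks for the two packages and then sources the profile. The variable names are made up for illustration, and the working directory is assumed to be the repository root.

```r
# Packages named in this README; the check itself is only a suggestion.
required <- c("tidyverse", "rjson")
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_packages <- required[!installed]

if (length(missing_packages) > 0) {
  message("Please install first: ", paste(missing_packages, collapse = ", "))
}

# Outside RStudio, run the project setup by hand. The file name must be
# quoted, and the working directory is assumed to be the repository root.
if (file.exists(".Rprofile")) {
  source(".Rprofile")
}
```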

Minimal example

An artificial and minimal example is contained in the file minimalExample.R. It makes up some data, trains a basic classifier, puts it into a new context, and recalibrates it without needing access to the ground truth in the new context.
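The following toy script illustrates the same kind of workflow in base R. It is not the content of minimalExample.R: the data, the names (simulate, predict_label, confusion and so on), and the EM iteration used to maximise the likelihood are all assumptions chosen for illustration. It also assumes that the classifier's error rates per true class are the same in the lab and in the field.

```r
set.seed(1)

# Made-up data: one feature whose mean depends on the class (0 or 1).
simulate <- function(n, prevalence) {
  y <- rbinom(n, size = 1, prob = prevalence)
  data.frame(y = y, x = rnorm(n, mean = 2 * y))
}

lab   <- simulate(5000, prevalence = 0.5)  # balanced training context
field <- simulate(5000, prevalence = 0.9)  # new context; labels unused below

# A basic classifier, trained in the lab only.
model <- glm(y ~ x, family = binomial(), data = lab)
predict_label <- function(d) {
  as.integer(predict(model, newdata = d, type = "response") > 0.5)
}

# Lab evaluation summary: confusion[i, j] = P(predicted j | true i).
lab_pred <- factor(predict_label(lab), levels = 0:1)
confusion <- unclass(
  prop.table(table(true = lab$y, predicted = lab_pred), margin = 1)
)

# Field counts of predicted labels: the only field information used.
field_counts <- as.vector(table(factor(predict_label(field), levels = 0:1)))

# EM for the prior that makes the observed predicted counts most likely.
prior <- c(0.5, 0.5)
for (iteration in 1:500) {
  joint <- prior * confusion                         # row i scaled by prior[i]
  posterior <- sweep(joint, 2, colSums(joint), "/")  # P(true i | predicted j)
  prior <- as.vector(posterior %*% field_counts) / sum(field_counts)
}

estimated_prevalence <- prior[2]

# recalibrated[i, j] estimates P(true class i | predicted class j) in the field.
recalibrated <- sweep(prior * confusion, 2, colSums(prior * confusion), "/")
```

In this sketch, estimated_prevalence can be compared with mean(field$y), which the script never uses, and with the raw share of positive predictions, sum(field_counts[2]) / sum(field_counts).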

Main example

A proper example using the iNaturalist dataset is available in the file mainExample.R. This relies on access to the classifier Wolfram ImageIdentify Net V1. The performance of this classifier on size-reduced pictures from the iNaturalist dataset is evaluated and improved in an unsupervised way.

Three steps are necessary in preparation:

1. Download and unzip the iNaturalist datasets (1, 2, 3) into the folder ./data/iNaturalist.
2. From R, run read_iNaturalist_data() %>% prepare_iNaturalist_data_for_processing_by_mathematica.
3. Apply the Wolfram classifier script to some resolutions. For example, from a Linux shell with Wolfram Mathematica installed, use the command (./butterflies-beetles-no-resize.wls beetles/ ; ./butterflies-beetles-no-resize.wls butterflies/ ; ./butterflies-beetles.wls beetles/ 30 ; ./butterflies-beetles.wls beetles/ 40 ; ./butterflies-beetles.wls beetles/ 50 ; ./butterflies-beetles.wls beetles/ 75 ; ./butterflies-beetles.wls butterflies/ 30 ; ./butterflies-beetles.wls butterflies/ 40 ; ./butterflies-beetles.wls butterflies/ 50 ; ./butterflies-beetles.wls butterflies/ 75 ; ./butterflies-beetles.wls butterflies/ 100 ; ./butterflies-beetles.wls butterflies/ 200 ; )&.
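Since the command in the last step is long, it may be easier to generate the individual calls than to type them. The sketch below only builds and prints the command strings. The variable names are made up here, while the script names, folders, and resolutions are copied from the command above.

```r
# Resolutions per folder, copied from the command above.
resolutions <- list(
  "beetles/"     = c(30, 40, 50, 75),
  "butterflies/" = c(30, 40, 50, 75, 100, 200)
)

# One run per folder without resizing...
commands <- sprintf("./butterflies-beetles-no-resize.wls %s", names(resolutions))

# ...followed by one run per folder and resolution.
for (folder in names(resolutions)) {
  commands <- c(
    commands,
    sprintf("./butterflies-beetles.wls %s %d", folder, resolutions[[folder]])
  )
}

# Print the commands for inspection; nothing is executed here.
writeLines(commands)
```

Each string could then be passed to system(), assuming a Linux shell with Wolfram Mathematica installed as described above.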

Comparison with other quantification methods

Unsupervised recalibration can be treated as another method in the field of quantification. A comparison with three standard methods (Classify and Count, Adjusted Classify and Count, Expectation Maximization) is presented in the quantification/ directory, in the manner described in Karpov, Porshnev, Rudakov. For an overview of quantification methods, see Karpov et al.; Saerens, Latinne, Decaestecker; and Tasche.
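For readers new to quantification, the two simplest baselines can be sketched in a few lines of R. This is not the code in quantification/. The error rates and the predictions below are hypothetical numbers chosen for illustration, and the example is restricted to a binary classifier.

```r
# Hypothetical lab summary for a binary classifier.
true_positive_rate  <- 0.85  # P(predicted positive | truly positive)
false_positive_rate <- 0.10  # P(predicted positive | truly negative)

# Hypothetical predicted labels in the field (1 = positive).
field_predictions <- c(rep(1, 700), rep(0, 300))

# Classify and Count: the share of positive predictions.
classify_and_count <- mean(field_predictions)

# Adjusted Classify and Count: correct that share using the lab error rates,
# then clip the result to the valid range [0, 1].
adjusted <- (classify_and_count - false_positive_rate) /
  (true_positive_rate - false_positive_rate)
adjusted_classify_and_count <- min(max(adjusted, 0), 1)
```

With these made-up numbers, Classify and Count reports a prevalence of 0.7, while the adjusted estimate is 0.8, because the correction accounts for positives the classifier is expected to miss.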

Contributions

This code was published for reproducibility of the results described in the article Unsupervised Recalibration.

We are grateful to anyone who finds bugs or problems and alerts us to them so that we can correct them. However, we are not looking to implement new features.
