36 commits
5a343f1
moved paradox vignette to new folder chapter 10_1
awinterstetter Sep 5, 2025
f101004
renaming folder names
awinterstetter Sep 5, 2025
820140e
chapter 14 renamed
awinterstetter Sep 5, 2025
33db51a
created paradox_vignette.qmd
awinterstetter Sep 5, 2025
b835da0
_quarto.yml updated
awinterstetter Sep 7, 2025
eb34137
removed authors line
awinterstetter Sep 7, 2025
eb48d9c
renaming paradox chapter + reordering of headlines level
awinterstetter Sep 8, 2025
809af7d
chapters 10 & 11 reordered; new chapter 11 renamed; "tihs"-type fixed
awinterstetter Sep 10, 2025
fda5172
typo fixed HERE
awinterstetter Sep 10, 2025
86b4e4e
removed ParamHelpers paragraph
awinterstetter Sep 10, 2025
63ac7aa
## Defining a Tuning Space - removed Creating ParamSets and Transform…
awinterstetter Sep 11, 2025
2369881
replacing $extra_trafo by .extra_trafo
awinterstetter Sep 11, 2025
eda2fb9
corrected typo regarding $set_values()
awinterstetter Sep 11, 2025
d0736bf
reordering of one sentence so text flows better
awinterstetter Sep 12, 2025
7419df8
moving text one line down so header has more space
awinterstetter Sep 12, 2025
32fd683
provided better context in section Factor Level Transformation
awinterstetter Sep 12, 2025
09bdf2e
implemented chatGPT recommendation 13/17/21 and 30 to 38
awinterstetter Sep 13, 2025
91beb88
comment out "C-classification"
awinterstetter Sep 15, 2025
7826844
changed () in forward/backward-references
awinterstetter Sep 16, 2025
b0619df
Merge branch 'main' into mlr3book_paradox_vignette
mb706 Oct 13, 2025
d9180ba
Merge branch 'main' into mlr3book_paradox_vignette
be-marc Nov 26, 2025
185d03a
...
be-marc Nov 26, 2025
da39370
changes in chapter overview and authors
awinterstetter Nov 27, 2025
93cadc3
changed r chapter from paradox vignette to proper name
awinterstetter Nov 27, 2025
1708039
removed miesmuschel, because package is not known (check with marc)
awinterstetter Nov 27, 2025
3a66f13
changed cloumn width
awinterstetter Nov 27, 2025
c110f8a
style guide changes
awinterstetter Nov 28, 2025
01159a9
adhere chapter 11 to style guide
awinterstetter Dec 3, 2025
c18247b
fixed `r rpart` error
awinterstetter Dec 3, 2025
1e20c49
fixed `rpart.control` error because rpart is not loaded
awinterstetter Dec 3, 2025
835dfa1
removed ref for as.data.table
awinterstetter Dec 3, 2025
4ba433b
more fixes regarding wrong referencing of data.table
awinterstetter Dec 3, 2025
01819b7
fix `r lhs`
awinterstetter Dec 3, 2025
69dcd70
fixes after seeing rendered chapter
awinterstetter Dec 4, 2025
a9b4118
fix regarding ref for Design
awinterstetter Dec 4, 2025
685a35d
more fixes after seeing rendered version
awinterstetter Dec 5, 2025
adhere chapter 11 to style guide
awinterstetter committed Dec 3, 2025
commit 01159a909cd8e5c0a2b5e3a042abe3e0ee1a52c3
@@ -1,6 +1,6 @@
---
aliases:
- "/paradox_vignette.html"
- "/advanced_hyperparameter_specification_using_paradox.html"
---

# Advanced Hyperparameter Specification using paradox {#sec-paradox}
@@ -26,7 +26,7 @@ The tools provided by `paradox` therefore relate to:
* Parameter sampling: Generating parameter values that lie in the parameter space for systematic exploration of program behavior depending on these parameters

`paradox` is, by nature, an auxiliary package that derives its usefulness from other packages that make use of it.
It is heavily utilized in other packages of the `mlr3` ecosystem such as `r mlr3`, `r mlr3pipelines`, `r mlr3tuning` and r miesmuschel.
It is heavily utilized in other packages of the `mlr3` ecosystem such as `r mlr3`, `r mlr3pipelines`, `r mlr3tuning` and miesmuschel.

## Reference Based Objects

@@ -55,7 +55,7 @@ print(ps3) # is a "clone" of the old ps1 with 'a' == 1
### `Domain` Representing Single Parameters

Parameter spaces are made up of individual parameters, which usually can take a single atomic value.
Consider, for example, trying to configure the `rpart` package's `rpart.control` object.
Consider, for example, trying to configure the `r rpart` package's `r ref("rpart.control")` object.
It has various components (`minsplit`, `cp`, ...) that all take a single value.

These components are represented by `r ref("Domain")` objects, which can be created using the sugar functions in @tbl-paradox-define.
@@ -93,7 +93,7 @@ Every parameter can have:
* **special_vals** - A list of values that are accepted even if they do not conform to the type.
* **tags** - Tags that can be used to organize parameters.
* **trafo** - A transformation function that is applied to the parameter value after it has been sampled.
It is for example used through the `Design$transpose()` function after a `Design` was created by `generate_design_random()` or similar functions.
It is for example used through the `Design$transpose()` function after a `Design` was created by `r ref("generate_design_random()")` or similar functions.

The numeric (`p_int()` and `p_dbl()`) parameters furthermore allow for specification of a **lower** and **upper** bound.
Meanwhile, the `p_fct()` parameter must be given a vector of **levels** that define the possible states its parameter can take.
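
The bounds and levels described above can be illustrated with a minimal sketch.
The parameter set `tree_space` and its parameter names are hypothetical and only loosely modelled on tree hyperparameters; they are not taken from any learner.

```r
library(paradox)

# Hypothetical parameter set: two bounded numeric parameters and one factor.
tree_space = ps(
  minsplit = p_int(lower = 1, upper = 50),
  cp = p_dbl(lower = 0, upper = 1),
  criterion = p_fct(levels = c("gini", "information"))
)

# Bounds are reported as named vectors, levels as a named list.
tree_space$lower
tree_space$upper
tree_space$levels$criterion
```

The factor parameter has no numeric bounds, so only its `levels` entry carries information about its feasible values.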
@@ -115,11 +115,6 @@ param_set$levels$parD
param_set$class
```

It is also possible to get all information of a parameter set as `data.table` by calling `as.data.table()`.

```{r}
as.data.table(param_set)
```

#### Type / Range Checking

@@ -141,7 +136,7 @@ It is typically created by calling `ps()`, but can also be initialized using the
The main difference is that `ps()` takes named arguments, whereas `ParamSet$new()` takes a named list.
The latter makes it easier to construct a parameter set programmatically, but is slightly more verbose.

`ParamSet`s can be combined using `c()` or `ps_union` (the latter of which takes a list), and they have a `$subset()` method that allows for subsetting.
`ParamSet`s can be combined using `c()` or `r ref("ps_union()")` (the latter of which takes a list), and they have a `$subset()` method that allows for subsetting.
All of these functions return a new, cloned parameter set-object, and do not modify the original parameter set.

```{r}
@@ -154,8 +149,8 @@ ps_all$subset(c("x", "z"))

The `ParamSet`s of the individual parameters can be accessed through the `$subspaces()` function, which returns a named list of single-parameter `ParamSet`s.

It is possible to get the `ParamSet` as a `data.table` using `as.data.table()`.
This makes it easy to subset parameters on certain conditions and aggregate information about them, using the variety of methods provided by `data.table`.
It is possible to get the `ParamSet` as a `r ref("data.table")` using `r ref("as.data.table()")`.
This makes it easy to subset parameters on certain conditions and aggregate information about them, using the variety of methods provided by `r data.table`.

```{r}
as.data.table(ps_all)
@@ -194,12 +189,12 @@ It is often the case that certain parameters are irrelevant or should not be giv
An example would be a parameter that switches a certain algorithm feature (for example regularization) on or off, combined with another parameter that controls the behavior of that feature (e.g. a regularization parameter).
The second parameter would be said to *depend* on the first parameter having the value `TRUE`.

A dependency can be added using the `$add_dep` method, which takes both the ids of the "depender" and "dependee" parameters as well as a `Condition` object.
A dependency can be added using the `$add_dep` method, which takes both the ids of the "depender" and "dependee" parameters as well as a `r ref("Condition")` object.
The `Condition` object represents the check to be performed on the "dependee".
Currently it can be created using `CondEqual()` and `CondAnyOf()`.
Currently it can be created using `r ref("CondEqual()")` and `r ref("CondAnyOf()")`.
Multiple dependencies can be added, and parameters that depend on others can again be depended on, as long as no cyclic dependencies are introduced.

The consequence of dependencies are twofold:
The consequences of dependencies are twofold:
For one, the `$check()`, `$test()`, and `$assert()` functions will reject any value supplied for a parameter if its dependency is not satisfied, when the `check_strict` argument is given as `TRUE`. This differs from simply omitting the parameter, which is always allowed.
Furthermore, when sampling or creating grid designs from a `ParamSet`, the dependencies will be respected.
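
As a minimal sketch of the dependency mechanism, consider a hypothetical parameter set in which `lambda` is only meaningful when `regularize` is `TRUE`.
The parameter names are illustrative assumptions; the calls follow the `$add_dep()` and `CondEqual()` usage described above.

```r
library(paradox)

# Hypothetical parameters: a switch and a parameter that depends on it.
p = ps(
  regularize = p_lgl(),
  lambda = p_dbl(lower = 0, upper = 1)
)

# "lambda" (the depender) may only be set if "regularize" (the dependee) is TRUE.
p$add_dep("lambda", on = "regularize", cond = CondEqual(TRUE))

# Dependency satisfied: the configuration is accepted.
ok = p$test(list(regularize = TRUE, lambda = 0.5), check_strict = TRUE)

# Dependency violated: the configuration is rejected under strict checking.
violated = p$test(list(regularize = FALSE, lambda = 0.5), check_strict = TRUE)

# Omitting the dependent parameter is always allowed.
omitted = p$test(list(regularize = FALSE), check_strict = TRUE)
```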

@@ -274,11 +269,11 @@ Furthermore, for most samplers `r ref("p_int()")` and `r ref("p_dbl()")` must ha
### Parameter Designs

Functions that sample the parameter space fundamentally return an object of the `r ref("Design")` class.
These objects contain the sampled data as a `data.table` under the `$data` field, and also offer conversion to a list of parameter-values using the `$transpose()` function.
These objects contain the sampled data as a `r ref("data.table")` under the `$data` field, and also offer conversion to a list of parameter-values using the `$transpose()` function.
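
A short sketch of the two access paths, using a small hypothetical parameter set `p_demo`:

```r
library(paradox)

# Hypothetical two-parameter space used only for illustration.
p_demo = ps(
  A = p_dbl(lower = 0, upper = 1),
  B = p_fct(levels = c("a", "b"))
)

design = generate_design_random(p_demo, 3)

# Tabular view: one row per sampled configuration.
design$data

# List view: one named list of parameter values per configuration.
configs = design$transpose()
configs[[1]]
```

The list form is usually the more convenient one when each configuration is passed on to another object, for example as hyperparameter values.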

### Grid Design

The `generate_design_grid()` function is used to create grid designs that contain all combinations of parameter values: All possible values for `r ref("p_lgl()")` and `r ref("p_fct()")`, and values with a given resolution for `p_int()` and `p_dbl()`.
The `r ref("generate_design_grid()")` function is used to create grid designs that contain all combinations of parameter values: All possible values for `r ref("p_lgl()")` and `r ref("p_fct()")`, and values with a given resolution for `p_int()` and `p_dbl()`.
The resolution can be given for all numeric parameters, or for specific named parameters through the `param_resolutions` parameter.

```{r}
@@ -294,10 +289,10 @@ generate_design_grid(ps_small, param_resolutions = c(A = 3, B = 2))
### Random Sampling

`paradox` offers different methods for random sampling, which vary in the degree to which they can be configured.
The easiest way to get a uniformly random sample of parameters is `generate_design_random()`.
It is also possible to create latin hypercube sampled parameter values using `generate_design_lhs()`, which utilizes the `lhs` package.
The easiest way to get a uniformly random sample of parameters is `r ref("generate_design_random()")`.
It is also possible to create latin hypercube sampled parameter values using `r ref("generate_design_lhs()")`, which utilizes the `r lhs` package.
LHS-sampling creates low-discrepancy sampled values that cover the parameter space more evenly than purely random values.
`generate_design_sobol()` can be used to sample using the Sobol sequence.
`r ref("generate_design_sobol()")` can be used to sample using the Sobol sequence.

```{r}
pvrand = generate_design_random(ps_small, 500)
@@ -319,7 +314,7 @@ It may sometimes be desirable to configure parameter sampling in more detail.
`paradox` uses the `r ref("Sampler")` abstract base class for sampling, which has many different sub-classes that can be parameterized and combined to control the sampling process.
It is even possible to create further sub-classes of the `Sampler` class (or of any of *its* subclasses) for even more possibilities.

Every `Sampler` object has a `sample()` function, which takes one argument, the number of instances to sample, and returns a `Design` object.
Every `Sampler` object has a `r ref("sample()")` function, which takes one argument, the number of instances to sample, and returns a `Design` object.
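
As a minimal sketch, a one-dimensional uniform sampler can be constructed on a single-parameter `ParamSet` and asked for a few samples.
The parameter name `x` is an arbitrary choice for illustration.

```r
library(paradox)

# A 1D sampler operates on a parameter set with exactly one parameter.
sampler = Sampler1DUnif$new(ps(x = p_dbl(lower = 0, upper = 1)))

# $sample() takes the number of configurations and returns a Design.
design = sampler$sample(5)
class(design)
design$data
```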

#### 1D-Samplers

@@ -375,13 +370,13 @@ sampJ$sample(5)

#### SamplerUnif

The `Sampler` used in `generate_design_random()` is the `SamplerUnif` sampler, which corresponds to a `HierarchicalSampler` of `Sampler1DUnif` for all parameters with dependency-aware behavior identical to `generate_design_random()`.
The `Sampler` used in `generate_design_random()` is the `r ref("SamplerUnif")` sampler, which corresponds to a `HierarchicalSampler` of `Sampler1DUnif` for all parameters with dependency-aware behavior identical to `generate_design_random()`.

## Parameter Transformation

While the different `r ref("Sampler")`s allow for a wide specification of parameter distributions, there are cases where the simplest way of getting a desired distribution is to sample parameters from a simple distribution (such as the uniform distribution) and then transform them.
This can be done by constructing a `r ref("Domain")` with a `trafo` argument, or assigning a function to the `$extra_trafo` field of a `r ref("ParamSet")`.
The latter can also be done by passing an `.extra_trafo` argument to the `ps()` shorthand constructor.
The latter can also be done by passing an `.extra_trafo` argument to the `r ref("ps()")` shorthand constructor.

A `trafo` function in a `Domain` is called with a single parameter, the value to be transformed.
It can only operate on the dimension of a single parameter.
@@ -394,7 +389,7 @@ The `$extra_trafo` function is called with two parameters:

The `$extra_trafo` function must return a list of transformed parameter values.
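
A minimal sketch of such a function, assuming the two arguments are named `x` (the list of parameter values) and `param_set` as in the `paradox` documentation.
The parameters `a` and `b` and the derived value `total` are hypothetical.

```r
library(paradox)

# Hypothetical space with a set-level transformation that adds a derived value.
p_sum = ps(
  a = p_dbl(lower = 0, upper = 1),
  b = p_dbl(lower = 0, upper = 1),
  .extra_trafo = function(x, param_set) {
    # x is a named list of parameter values; return a (possibly modified) list.
    x$total = x$a + x$b
    x
  }
)

design = generate_design_grid(p_sum, resolution = 2)

# The extra_trafo is applied to each configuration during $transpose().
configs = design$transpose()
configs[[1]]
```

Because the function sees all values at once, it can combine several parameters, which a per-parameter `trafo` cannot do.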

The transformation is performed when calling the `$transpose` function of the `r ref("Design")` object returned by a `Sampler` with the `trafo` ParamSet to `TRUE` (the default).
The transformation is performed when calling the `$transpose` function of the `r ref("Design")` object returned by a `Sampler` with the `trafo` parameter set to `TRUE` (the default).
The following, for example, creates a parameter that is exponentially distributed:

```{r}
@@ -431,14 +426,14 @@ However, when transforming parameters independently the `trafo` way is more reco

Usually the design created with one `ParamSet` is then used to configure other objects that themselves have a parameter set which defines the values they take.
The parameter sets which can be used for random sampling, however, are restricted in some ways:
They must have finite bounds, and they may not contain "untyped" (`ParamUty`) parameters.
They must have finite bounds, and they may not contain "untyped" (`r ref("p_uty")`) parameters.
`$trafo` provides the glue for these situations.
There is relatively little constraint on the trafo function's return value, so it is possible to return values that have different bounds or even types than the original `ParamSet`.
It is even possible to remove some parameters and add new ones.

Suppose, for example, that a certain method requires a *function* as a parameter, say a function that summarizes its data in a certain way.
The user can pass functions like `median()` or `mean()`, but could also pass quantiles or something completely different.
The user can pass functions like `r ref("median()")` or `r ref("mean()")`, but could also pass quantiles or something completely different.
This method would probably use the following `ParamSet`:

```{r}
@@ -469,14 +464,14 @@ xvals = design$transpose()
print(xvals[[1]])
```

We can now check that it fits the requirements set by `methodPS`, and that `fun` it is in fact a function:
We can now check that it fits the requirements set by `methodPS`, and that `fun` is in fact a function:

```{r}
methodPS$check(xvals[[1]])
xvals[[1]]$fun(1:10)
```

`p_fct()` has a shortcut for this kind of transformation, where a `character` is transformed into a specific set of (typically non-scalar) values.
`r ref("p_fct()")` has a shortcut for this kind of transformation, where a `character` is transformed into a specific set of (typically non-scalar) values.
When its `levels` argument is given as a named `list` (or named non-`character` vector), it constructs a `Domain` that does the trafo automatically.
A way to perform the above would therefore be:
```{r}
@@ -492,7 +487,7 @@ The user wants to give a function that selects a certain quantile, where the qua
In that case the `$transpose` function could generate a function in a different way.

For interpretability, the parameter should be called "`quantile`" before transformation, and the "`fun`" parameter is generated on the fly.
We therefore use an `extra_trafo` here, given as a function to the `ps()` call.
We therefore use an `extra_trafo` here, given as a function to the `r ref("ps()")` call.

```{r}
samplingPS2 = ps(quantile = p_dbl(0, 1),
@@ -562,11 +557,11 @@ typeof(search_space$params$cost$levels)

Be aware that this results in an "unordered" hyperparameter, however.
Tuning algorithms that make use of ordering information of parameters, like genetic algorithms or model-based optimization, will perform worse when this is done.
For these algorithms, it may make more sense to define a `p_dbl` or `p_int` with a more fitting trafo.
For these algorithms, it may make more sense to define a `r ref("p_dbl")` or `r ref("p_int")` with a more fitting trafo.
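
One possible sketch of such an ordered alternative searches `cost` on a base-10 logarithmic scale.
The bounds of -3 and 3 are arbitrary assumptions chosen for illustration.

```r
library(paradox)

# The tuner sees an ordered numeric parameter in [-3, 3];
# the trafo maps it to the actual cost value 10^x.
search_space = ps(
  cost = p_dbl(lower = -3, upper = 3, trafo = function(x) 10^x)
)

design = generate_design_grid(search_space, resolution = 3)

# Transformed values as they would be passed on to the learner.
costs = vapply(design$transpose(), function(xs) xs$cost, numeric(1))
costs
```

Unlike a factor with named levels, this keeps the ordering of the values visible to the tuning algorithm.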

An example is the `class.weights` parameter of the Support Vector Machine (SVM), which takes a named vector of class weights with one entry per target class.
If only a few candidate vectors are to be tried, `class.weights` can be implemented as follows.
Note that the `levels` argument of `p_fct` must be named if there is no easy way for `as.character()` to create names:
Note that the `levels` argument of `p_fct` must be named if there is no easy way for `r ref("as.character()")` to create names:

```{r}
search_space = ps(
@@ -584,7 +579,7 @@ generate_design_grid(search_space)$transpose()

When running an optimization, it is important to inform the tuning algorithm about what hyperparameters are valid.
Here the names, types, and valid ranges of each hyperparameter are important.
All this information is communicated with objects of the class {`r ref("ParamSet")`}, which is defined in `paradox`.
All this information is communicated with objects of the class `r ref("ParamSet")`, which is defined in `r paradox`.

Note that `ParamSet` objects exist in two contexts.
First, parameter set-objects are used to define the space of valid parameter settings for a learner (and other objects).
@@ -593,16 +588,16 @@ We are mainly interested in the latter.
For example, consider the `minsplit` parameter of `lrn("classif.rpart")`.
The `ParamSet` associated with the learner has a lower but *no* upper bound.
However, for tuning the value, a lower *and* upper bound must be given because tuning search spaces need to be bounded.
For `Learner` or `PipeOp` objects, typically "unbounded" parameter sets are used.
For `r ref("Learner")` or `r ref("PipeOp")` objects, typically "unbounded" parameter sets are used.
Here, however, we will mainly focus on creating "bounded" parameter sets that can be used for tuning.
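
The difference between the two contexts can be sketched as follows.
The tuning bounds of 2 and 64 are arbitrary assumptions, and the sketch requires the `mlr3` package in addition to `paradox`.

```r
library(mlr3)
library(paradox)

learner = lrn("classif.rpart")

# The learner's own parameter set: minsplit has a lower bound but no finite upper bound.
learner$param_set$lower[["minsplit"]]
learner$param_set$upper[["minsplit"]]

# A bounded search space for tuning the same hyperparameter.
search_space = ps(minsplit = p_int(lower = 2, upper = 64))
search_space$upper[["minsplit"]]
```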

How search spaces can be created using `ps` has been outlined in @sec-tune-ps.
How search spaces can be created using `r ref("ps()")` has been outlined in @sec-tune-ps.

### Creating Tuning ParamSets from other ParamSets {#sec-paradox-creation-tuning-paramset-from-learner}

Having to define a tuning `ParamSet` for a `Learner` that already has parameter set information may seem unnecessarily tedious, and there is indeed a way to create tuning `ParamSet`s from a `Learner`'s parameter set, making use of as much information as already available.

This is done by setting values of a `Learner`'s `ParamSet` to so-called `TuneToken`s, constructed with a `to_tune` call.
This is done by setting values of a `Learner`'s `ParamSet` to so-called `TuneToken`s, constructed with a `r ref("to_tune()")` call.
This can be done in the same way that other hyperparameters are set to specific values.
It can be understood as the hyperparameters being tagged for later tuning.
The resulting `ParamSet` used for tuning can be retrieved using the `$search_space()` method.
@@ -622,7 +617,7 @@ rbindlist(generate_design_grid(

It is possible to omit `lower` here, because it can be inferred from the lower bound of the `degree` parameter itself.
For parameters that are already bounded, it is possible to give no bounds at all, because their ranges can be taken from the parameter itself.
An example is the logical `shrinking` hyperparameter:
An example is the logical `$shrinking` hyperparameter:
```{r, eval = FALSE}
learner$param_set$values$shrinking = to_tune()

@@ -633,11 +628,11 @@ rbindlist(generate_design_grid(
)
```

`"to_tune"` can also be constructed with a `Domain` object, i.e. something constructed with a `p_***` call.
`"to_tune()"` can also be constructed with a `Domain` object, i.e. something constructed with a `p_***` call.
This way it is possible to tune continuous parameters with discrete values, or to give trafos or dependencies.
One could, for example, tune the `cost` as above on three given special values, and introduce a dependency of `shrinking` on it.
One could, for example, tune the `$cost` as above on three given special values, and introduce a dependency of `$shrinking` on it.
Notice that `to_tune(<levels>)` is a short form of `to_tune(p_fct(<levels>))`.
When introducing the dependency, we need to use the `cost` value from *before* the implicit trafo, which is the name or `as.character()` of the respective value, here `"val2"`!
When introducing the dependency, we need to use the `cost` value from *before* the implicit trafo, which is the name or `r ref("as.character()")` of the respective value, here `"val2"`!

```{r, eval = FALSE}
learner$param_set$values$cost = to_tune(c(val1 = 0.3, val2 = 0.7))
Expand All @@ -648,9 +643,9 @@ print(learner$param_set$search_space())
rbindlist(generate_design_grid(learner$param_set$search_space(), 3)$transpose(), fill = TRUE)
```

The `search_space()` picks up dependencies from the underlying `ParamSet` automatically.
The `$search_space()` picks up dependencies from the underlying `ParamSet` automatically.
So if the `kernel` is tuned, then `degree` automatically gets the dependency on it, without us having to specify that.
(Here we reset `cost` and `shrinking` to `NULL` for the sake of clarity of the generated output.)
(Here we reset `$cost` and `$shrinking` to `NULL` for the sake of clarity of the generated output.)

```{r, eval = FALSE}
learner$param_set$values$cost = NULL
@@ -664,8 +659,8 @@ rbindlist(generate_design_grid(learner$param_set$search_space(), 3)$transpose(),

It is even possible to define whole `ParamSet`s that get tuned over for a single parameter.
This may be especially useful for vector hyperparameters that should be searched along multiple dimensions.
This `ParamSet` must, however, have an `.extra_trafo` that returns a list with a single element, because it corresponds to a single hyperparameter that is being tuned.
Suppose the `class.weights` hyperparameter should be tuned along two dimensions:
This parameter set must, however, have an `.extra_trafo` that returns a list with a single element, because it corresponds to a single hyperparameter that is being tuned.
Suppose the `$class.weights` hyperparameter should be tuned along two dimensions:

```{r, eval = FALSE}
learner$param_set$values$class.weights = to_tune(