Open
36 commits
5a343f1
moved paradox vignette to new folder chapter 10_1
awinterstetter Sep 5, 2025
f101004
renaming folder names
awinterstetter Sep 5, 2025
820140e
chapter 14 renamed
awinterstetter Sep 5, 2025
33db51a
created paradox_vignette.qmd
awinterstetter Sep 5, 2025
b835da0
_quarto.yml updated
awinterstetter Sep 7, 2025
eb34137
removed authors line
awinterstetter Sep 7, 2025
eb48d9c
renaming paradox chapter + reordering of headlines level
awinterstetter Sep 8, 2025
809af7d
chapters 10 & 11 reordered; new chapter 11 renamed; "tihs"-type fixed
awinterstetter Sep 10, 2025
fda5172
typo fixed HERE
awinterstetter Sep 10, 2025
86b4e4e
removed ParamHelpers paragraph
awinterstetter Sep 10, 2025
63ac7aa
## Defining a Tuning Space - removed Creating ParamSets and Transform…
awinterstetter Sep 11, 2025
2369881
replacing $extra_trafo by .extra_trafo
awinterstetter Sep 11, 2025
eda2fb9
corrected typo regarding $set_values()
awinterstetter Sep 11, 2025
d0736bf
reordering of one sentence so text flows better
awinterstetter Sep 12, 2025
7419df8
moving text one line down so header has more space
awinterstetter Sep 12, 2025
32fd683
provided better context in section Factor Level Transformation
awinterstetter Sep 12, 2025
09bdf2e
implemented chatGPT recommendation 13/17/21 and 30 to 38
awinterstetter Sep 13, 2025
91beb88
comment out "C-classification"
awinterstetter Sep 15, 2025
7826844
changed () in forward/backward-references
awinterstetter Sep 16, 2025
b0619df
Merge branch 'main' into mlr3book_paradox_vignette
mb706 Oct 13, 2025
d9180ba
Merge branch 'main' into mlr3book_paradox_vignette
be-marc Nov 26, 2025
185d03a
...
be-marc Nov 26, 2025
da39370
changes in chapter overview and authors
awinterstetter Nov 27, 2025
93cadc3
changed r chapter from paradox vignette to proper name
awinterstetter Nov 27, 2025
1708039
removed miesmuschel, because package is not known (check with marc)
awinterstetter Nov 27, 2025
3a66f13
changed cloumn width
awinterstetter Nov 27, 2025
c110f8a
style guide changes
awinterstetter Nov 28, 2025
01159a9
adhere chapter 11 to style guide
awinterstetter Dec 3, 2025
c18247b
fixed `r rpart` error
awinterstetter Dec 3, 2025
1e20c49
fixed `rpart.control` error because rpart is not loaded
awinterstetter Dec 3, 2025
835dfa1
removed ref for as.data.table
awinterstetter Dec 3, 2025
4ba433b
more fixes regarding wrong referencing of data.table
awinterstetter Dec 3, 2025
01819b7
fix `r lhs`
awinterstetter Dec 3, 2025
69dcd70
fixes after seeing rendered chapter
awinterstetter Dec 4, 2025
a9b4118
fix regarding ref for Design
awinterstetter Dec 4, 2025
685a35d
more fixes after seeing rendered version
awinterstetter Dec 5, 2025
renaming paradox chapter + reordering of headlines level
awinterstetter committed Sep 8, 2025
commit eb48d9cb4c9c43291c37ccb518493833127d6472
2 changes: 1 addition & 1 deletion book/_quarto.yml
@@ -39,7 +39,7 @@ book:
- chapters/chapter9/preprocessing.qmd
- part: "Advanced Topics"
chapters:
- - chapters/chapter10/paradox_vignette.qmd
+ - chapters/chapter10/parameters_(using_paradox).qmd
- chapters/chapter11/advanced_technical_aspects_of_mlr3.qmd
- chapters/chapter12/large-scale_benchmarking.qmd
- chapters/chapter13/model_interpretation.qmd
@@ -3,14 +3,12 @@ aliases:
- "/paradox_vignette.html"
---

-# Paradox Vignette
+# Parameters (using paradox)

{{< include ../../common/_setup.qmd >}}

`r chapter = "Paradox Vignette"`

-## Parameters (using paradox)

The `paradox` package offers a language for the description of *parameter spaces*, as well as tools for useful operations on these parameter spaces.
A parameter space is often useful when describing:

@@ -26,7 +24,7 @@ The tools provided by `paradox` therefore relate to:
`paradox` is, by nature, an auxiliary package that derives its usefulness from other packages that make use of it.
It is heavily utilized in other [mlr-org](https://github.com/mlr-org) packages such as `mlr3`, `mlr3pipelines`, `mlr3tuning` and `miesmuschel`.

-### Reference Based Objects
+## Reference Based Objects

`paradox` is the spiritual successor to the `ParamHelpers` package and was written from scratch.
The most important consequence of this is that some objects created in `paradox` are "reference-based", unlike most other objects in R.
@@ -52,9 +50,9 @@ print(ps2) # contains the same reference as ps1, so also changed
print(ps3) # is a "clone" of the old ps1 with 'a' == 1
```

-### Defining a Parameter Space
+## Defining a Parameter Space

-#### `Domain` Representing Single Parameters
+### `Domain` Representing Single Parameters

Parameter spaces are made up of individual parameters, which usually can take a single atomic value.
Consider, for example, trying to configure the `rpart` package's `rpart.control` object.
@@ -119,7 +117,7 @@ It is also possible to get all information of a `ParamSet` as `data.table` by ca
as.data.table(param_set)
```

-##### Type / Range Checking
+#### Type / Range Checking

The `ParamSet` object offers the possibility to check whether a value satisfies its condition, i.e. is of the right type, and also falls within the range of allowed values, using the `$test()`, `$check()`, and `$assert()` functions.
Their argument must be a named list with values that are checked against the respective parameters, and it is possible to check only a subset of parameters.
@@ -132,7 +130,7 @@ param_set$test(list(parA = "FALSE"))
param_set$check(list(parA = "FALSE"))
```

-#### Parameter Sets
+### Parameter Sets

The ordered collection of parameters is handled in a `ParamSet`.
It is typically created by calling `ps()`, but can also be initialized using the `ParamSet$new()` function.
@@ -160,7 +158,7 @@ as.data.table(ps_all)
```


-##### Values in a `ParamSet`
+#### Values in a `ParamSet`

Although a `ParamSet` fundamentally represents a value space, it also has a field `$values` that can contain a point within that space.
This is useful because many things that define a parameter space need similar operations (like parameter checking) that can be simplified.
@@ -179,7 +177,7 @@ The parameter constraints are automatically checked:
ps1$values$x = 1.5
```

-##### Dependencies
+#### Dependencies

It is often the case that certain parameters are irrelevant or should not be given depending on values of other parameters.
An example would be a parameter that switches a certain algorithm feature (for example regularization) on or off, combined with another parameter that controls the behavior of that feature (e.g. a regularization parameter).
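
The regularization scenario described above could look roughly like the following sketch. The parameter names `regularize` and `lambda` are hypothetical and chosen only for illustration; the dependency is declared through the `depends` argument of `p_dbl()`.

```r
library(paradox)

# Hypothetical parameters: `regularize` switches a feature on or off,
# `lambda` is only meaningful while that feature is switched on.
p = ps(
  regularize = p_lgl(),
  lambda = p_dbl(lower = 0, upper = 1, depends = regularize == TRUE)
)

# One row: `lambda` depends on `regularize`
p$deps

# `lambda` is given although `regularize` is FALSE: the dependency is violated
violated = p$test(list(regularize = FALSE, lambda = 0.5))

# `lambda` is given while `regularize` is TRUE: the dependency is satisfied
satisfied = p$test(list(regularize = TRUE, lambda = 0.5))
```

Declaring the dependency inside `ps()` keeps the condition next to the parameter it restricts, which tends to be easier to read than adding it afterwards.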
@@ -229,7 +227,7 @@ Therefore it is advised to be cautious.
p$deps
```

-#### Vector Parameters
+### Vector Parameters

Unlike in the old `ParamHelpers` package, there are no more vectorial parameters in `paradox`.
Instead, it is now possible to create multiple copies of a single parameter using the `ps_replicate` function.
@@ -253,7 +251,7 @@ ps2d$tags
ps2d$get_values(tags = "param_x")
```

-### Parameter Sampling
+## Parameter Sampling

It is often useful to have a list of possible parameter values that can be systematically iterated through, for example to find parameter values for which an algorithm performs particularly well (tuning).
`paradox` offers a variety of functions that allow creating evenly-spaced parameter values in a "grid" design as well as random sampling.
@@ -262,12 +260,12 @@ In the latter case, it is possible to influence the sampling distribution in mor
A point to always keep in mind while sampling is that only numerical and factorial parameters that are bounded can be sampled from, i.e. not `ParamUty`.
Furthermore, for most samplers `p_int()` and `p_dbl()` must have finite lower and upper bounds.

-#### Parameter Designs
+### Parameter Designs

Functions that sample the parameter space fundamentally return an object of the `Design` class.
These objects contain the sampled data as a `data.table` under the `$data` field, and also offer conversion to a list of parameter-values using the **`$transpose()`** function.
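
As a minimal sketch of the two access paths, assuming a small search space whose parameter names (`cp`, `minsplit`) and bounds are borrowed from `rpart.control` purely for illustration:

```r
library(paradox)

# Illustrative bounded search space; names and ranges are assumptions.
search_space = ps(
  cp = p_dbl(lower = 0.001, upper = 0.1),
  minsplit = p_int(lower = 1, upper = 10)
)

design = generate_design_random(search_space, n = 3)

# data.table with one row per sampled configuration
design$data

# list with one named list of parameter values per configuration
xs = design$transpose()
xs[[1]]
```

The `$data` form is convenient for inspection, while the `$transpose()` form matches the named-list input that `$test()`, `$check()`, and `$assert()` expect.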

-#### Grid Design
+### Grid Design

The `generate_design_grid()` function is used to create grid designs that contain all combinations of parameter values: All possible values for `ParamLgl` and `ParamFct`, and values with a given resolution for `ParamInt` and `ParamDbl`.
The resolution can be given for all numeric parameters, or for specific named parameters through the `param_resolutions` parameter.
@@ -282,7 +280,7 @@ print(design)
generate_design_grid(ps_small, param_resolutions = c(A = 3, B = 2))
```

-#### Random Sampling
+### Random Sampling

`paradox` offers different methods for random sampling, which vary in the degree to which they can be configured.
The easiest way to get a uniformly random sample of parameters is `generate_design_random()`.
@@ -304,15 +302,15 @@ plot(pvlhs$data, main = "'lhs' design", xlim = c(0, 1), ylim=c(0, 1))
plot(pvsobol$data, main = "'sobol' design", xlim = c(0, 1), ylim=c(0, 1))
```

-#### Generalized Sampling: The `Sampler` Class
+### Generalized Sampling: The `Sampler` Class

It may sometimes be desirable to configure parameter sampling in more detail.
`paradox` uses the `Sampler` abstract base class for sampling, which has many different sub-classes that can be parameterized and combined to control the sampling process.
It is even possible to create further sub-classes of the `Sampler` class (or of any of *its* subclasses) for even more possibilities.

Every `Sampler` object has a `sample()` function, which takes one argument, the number of instances to sample, and returns a `Design` object.

-##### 1D-Samplers
+#### 1D-Samplers

There is a variety of samplers that sample values for a single parameter.
These are `Sampler1DUnif` (uniform sampling), `Sampler1DCateg` (sampling for categorical parameters), `Sampler1DNormal` (normally distributed sampling, truncated at parameter bounds), and `Sampler1DRfun` (arbitrary 1D sampling, given a random-function).
@@ -323,7 +321,7 @@ sampA = Sampler1DCateg$new(ps(x = p_fct(letters)))
sampA$sample(5)
```

-##### Hierarchical Sampler
+#### Hierarchical Sampler

The `SamplerHierarchical` sampler is an auxiliary sampler that combines many 1D-Samplers to get a combined distribution.
Its name "hierarchical" implies that it is able to respect parameter dependencies.
@@ -350,7 +348,7 @@ head(sampled$data)
table(sampled$data[, c("A", "B")], useNA = "ifany")
```

-##### Joint Sampler
+#### Joint Sampler

Another way of combining samplers is the `SamplerJointIndep`.
`SamplerJointIndep` also makes it possible to combine `Sampler`s that are not 1D.
@@ -364,11 +362,11 @@ sampJ = SamplerJointIndep$new(
sampJ$sample(5)
```

-##### SamplerUnif
+#### SamplerUnif

The `Sampler` used in `generate_design_random()` is the `SamplerUnif` sampler, which corresponds to a `SamplerHierarchical` of `Sampler1DUnif` for all parameters.
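
A short sketch of using `SamplerUnif` directly, assuming an illustrative two-parameter space (the names `x` and `y` are made up for this example):

```r
library(paradox)

# Illustrative bounded parameter set
param_set = ps(
  x = p_dbl(lower = 0, upper = 1),
  y = p_int(lower = 1, upper = 5)
)

# Uniform sampling over all parameters of the set
sampler = SamplerUnif$new(param_set)
design = sampler$sample(4)

design$data
```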

-### Parameter Transformation
+## Parameter Transformation

While the different `Sampler`s allow for a wide specification of parameter distributions, there are cases where the simplest way of getting a desired distribution is to sample parameters from a simple distribution (such as the uniform distribution) and then transform them.
This can be done by constructing a `Domain` with a `trafo` argument, or assigning a function to the `$extra_trafo` field of a `ParamSet`.
@@ -414,7 +412,7 @@ However, the `trafo` way is more recommended when transforming parameters indepe
`$extra_trafo` is more useful when transforming parameters that interact in some way, or when new parameters should be generated.
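
The difference between the two approaches can be sketched as follows. The parameter names are hypothetical, and the extra transformation is supplied here through the `.extra_trafo` argument of `ps()`, which is one way to set it at construction time.

```r
library(paradox)

# Independent transformation: `x` is sampled on [-3, 3] and used as 10^x.
ps_trafo = ps(
  x = p_dbl(lower = -3, upper = 3, trafo = function(x) 10^x)
)

# Interacting parameters: a new value `ratio` is derived from `a` and `b`.
ps_extra = ps(
  a = p_dbl(lower = 1, upper = 2),
  b = p_dbl(lower = 1, upper = 2),
  .extra_trafo = function(x, param_set) {
    x$ratio = x$a / x$b
    x
  }
)

# Transformations are applied when the design is transposed
res_trafo = generate_design_grid(ps_trafo, resolution = 3)$transpose()
res_extra = generate_design_grid(ps_extra, resolution = 2)$transpose()

res_trafo
res_extra[[1]]
```

The per-parameter `trafo` only ever sees its own value, so anything that needs several parameters at once, or that adds a new entry to the result, has to go through the extra transformation.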


-#### Transformation between Types
+### Transformation between Types

Usually the design created with one `ParamSet` is then used to configure other objects that themselves have a `ParamSet` which defines the values they take.
The `ParamSet`s which can be used for random sampling, however, are restricted in some ways:
@@ -506,7 +504,7 @@ methodPS$check(xvals[[1]])
xvals[[1]]$fun(1:10)
```

-### Defining a Tuning Space
+## Defining a Tuning Space

When running an optimization, it is important to inform the tuning algorithm about what hyperparameters are valid.
Here the names, types, and valid ranges of each hyperparameter are important.
@@ -522,7 +520,7 @@ However, for tuning the value, a lower *and* upper bound must be given because t
For `Learner` or `PipeOp` objects, typically "unbounded" `ParamSet`s are used.
Here, however, we will mainly focus on creating "bounded" `ParamSet`s that can be used for tuning.

-#### Creating `ParamSet`s
+### Creating `ParamSet`s

An empty `ParamSet` -- not yet very useful -- can be constructed using just the `ps()` call:

@@ -578,7 +576,7 @@ Preferred:
search_space = ps(cost = p_dbl(0.1, 10), kernel = p_fct(c("polynomial", "radial")))
```

-#### Transformations (`trafo`)
+### Transformations (`trafo`)

We can use the `paradox` function `generate_design_grid` to look at the values that would be evaluated by grid search.
(We are using `rbindlist()` here because the result of `$transpose()` is a list that is harder to read.
@@ -642,7 +640,7 @@ generate_design_grid(search_space, 3)$transpose()

(We are omitting `rbindlist()` in this example because it breaks the vector valued return elements.)

-### Automatic Factor Level Transformation
+## Automatic Factor Level Transformation

A common use-case is the necessity to specify a list of values that should all be tried (or sampled from).
It may be the case that a hyperparameter accepts function objects as values and a certain list of functions should be tried.
@@ -699,7 +697,7 @@ search_space = ps(
generate_design_grid(search_space)$transpose()
```

-#### Parameter Dependencies (`depends`)
+### Parameter Dependencies (`depends`)

Some parameters are only relevant when another parameter has a certain value, or one of several values.
The [Support Vector Machine](https://machinelearningmastery.com/cost-sensitive-svm-for-imbalanced-classification/) (SVM), for example, has the `degree` parameter that is only valid when `kernel` is `"polynomial"`.
@@ -716,7 +714,7 @@ search_space = ps(
rbindlist(generate_design_grid(search_space, 3)$transpose(), fill = TRUE)
```

-#### Creating Tuning ParamSets from other ParamSets
+### Creating Tuning ParamSets from other ParamSets

Having to define a tuning `ParamSet` for a `Learner` that already has parameter set information may seem unnecessarily tedious, and there is indeed a way to create tuning `ParamSet`s from a `Learner`'s `ParamSet`, making use of as much information as already available.
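
One way this can look in practice is sketched below, under the assumption that `mlr3` is installed and that the mechanism meant here is the `to_tune()` token together with `$search_space()`. The learner and the bounds for `cp` are chosen only for illustration.

```r
library(paradox)
library(mlr3)

learner = lrn("classif.rpart")

# Mark `cp` for tuning; the bounds are illustrative, and the type and
# name are taken from the learner's own ParamSet.
learner$param_set$set_values(cp = to_tune(0.001, 0.1))

# Build a bounded tuning ParamSet from the tokens set above
search_space = learner$param_set$search_space()
search_space
```

Only the bounds have to be supplied by hand; everything else is reused from the parameter information the `Learner` already carries.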
