Ficx (d2l-ai#2390)
* Ficx

* Fixes
mseeger authored Dec 10, 2022
1 parent f9ed5b7 commit 70a207b
Showing 4 changed files with 8 additions and 19 deletions.
21 changes: 5 additions & 16 deletions chapter_hyperparameter-optimization/hyperopt-api.md
@@ -263,22 +263,11 @@ algorithms, and potential pitfall one needs to be aware.
## Exercises

1. The goal of this exercise is to implement the objective function for a slightly more challenging HPO problem, and to run more realistic experiments. We will use the two hidden layer MLP `DropoutMLP` implemented in :numref:`sec_dropout`.
-1. Code up the objective function, which should depend on all hyperparameters of the model and `batch_size`. Use `max_epochs=50`. GPUs do not help here, so `num_gpus=0`. Hint: Start with `hpo_objective_lenet`.
-2. Choose a sensible search space, where `num_hiddens_1`, `num_hiddens_2`are integers in $[8, 1024]$, and dropout values lie in $[0, 0.95], while`batch_size` lies in $[16, 384]$. Provide code for `config_space`, using sensible distributions from `scipy.stats`.
-3. Run random search on this example with `number_of_trials=20` and plot the results. Make sure to first evaluate the default configuration of :numref:`sec_dropout`, which is `initial_config = {'num_hiddens_1':256, 'num_hiddens_2':256, 'dropout_1':0.5, 'dropout_2':0.5, 'lr':0.1, 'batch_size: 256}`.
-2. In this exercise, you will implement a searcher (subclass of `HPOSearcher`)
-which aims to improve upon random search, by depending on past data. It
-depends on parameters `probab_local`, `num_init_random`. Its
-`sample_configuration` method works as follows. For the first `num_init_random`
-calls, do the same as `RandomSearcher.sample_configuration`. Otherwise, with
-probability `1 - probab_local`, do the same as
-`RandomSearcher.sample_configuration`. Otherwise, pick the configuration
-which attained the smallest validation error so far, select one of its
-hyperparameters at random, and sample its value randomly like in
-RandomSearcher.sample_configuration`, but leave all other values the
-same. Return this configuration, which is identical to the best
-configuration so far, except in this one hyperparameter.
-1. Code up this new `LocalSearcher`. Hint: Your searcher requires `config_space` as argument at construction. Feel free to use a member of type `RandomSearcher`.
+1. Code up the objective function, which should depend on all hyperparameters of the model and `batch_size`. Use `max_epochs=50`. GPUs do not help here, so `num_gpus=0`. Hint: Modify `hpo_objective_lenet`.
+2. Choose a sensible search space, where `num_hiddens_1`, `num_hiddens_2` are integers in $[8, 1024]$, and dropout values lie in $[0, 0.95]$, while `batch_size` lies in $[16, 384]$. Provide code for `config_space`, using sensible distributions from `scipy.stats`.
+3. Run random search on this example with `number_of_trials=20` and plot the results. Make sure to first evaluate the default configuration of :numref:`sec_dropout`, which is `initial_config = {'num_hiddens_1': 256, 'num_hiddens_2': 256, 'dropout_1': 0.5, 'dropout_2': 0.5, 'lr': 0.1, 'batch_size': 256}`.
+2. In this exercise, you will implement a new searcher (subclass of `HPOSearcher`) which makes decisions based on past data. It depends on parameters `probab_local`, `num_init_random`. Its `sample_configuration` method works as follows. For the first `num_init_random` calls, do the same as `RandomSearcher.sample_configuration`. Otherwise, with probability `1 - probab_local`, do the same as `RandomSearcher.sample_configuration`. Otherwise, pick the configuration which attained the smallest validation error so far, select one of its hyperparameters at random, and sample its value randomly like in `RandomSearcher.sample_configuration`, but leave all other values the same. Return this configuration, which is identical to the best configuration so far, except in this one hyperparameter.
+1. Code up this new `LocalSearcher`. Hint: Your searcher requires `config_space` as argument at construction. Feel free to use a member of type `RandomSearcher`. You will also have to implement the `update` method.
2. Re-run the experiment from the previous exercise, but using your new searcher instead of `RandomSearcher`. Experiment with different values for `probab_local`, `num_init_random`. However, note that a proper comparison between different HPO methods requires repeating experiments several times, and ideally considering a number of benchmark tasks.
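For exercise 1.2 above, one possible `config_space` is sketched below. This is not the book's reference solution: the `lr` range is an assumption, and the only `scipy.stats` facts used are that `randint(low, high)` samples integers in `[low, high)` and `uniform(loc, scale)` samples floats in `[loc, loc + scale]`.

```python
# A possible config_space for exercise 1.2 -- a sketch, not a reference solution.
from scipy import stats

config_space = {
    "num_hiddens_1": stats.randint(8, 1025),   # integers in [8, 1024]
    "num_hiddens_2": stats.randint(8, 1025),
    "dropout_1": stats.uniform(0, 0.95),       # floats in [0, 0.95]
    "dropout_2": stats.uniform(0, 0.95),
    "lr": stats.loguniform(1e-2, 1.0),         # log scale; range is an assumption
    "batch_size": stats.randint(16, 385),      # integers in [16, 384]
}

# Default configuration of :numref:`sec_dropout`, to be evaluated first (exercise 1.3).
initial_config = {
    "num_hiddens_1": 256, "num_hiddens_2": 256,
    "dropout_1": 0.5, "dropout_2": 0.5,
    "lr": 0.1, "batch_size": 256,
}
```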


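A minimal sketch of the `LocalSearcher` from exercise 2 above, under the following assumptions about this section's API: `HPOSearcher` and `RandomSearcher` are defined earlier in the section, `sample_configuration()` returns a config dict, `update(config, error, additional_info=None)` reports the observed validation error, and the values of `config_space` are `scipy.stats` distributions sampled via `.rvs()`. Names and exact signatures may differ from the book's code.

```python
import random

class LocalSearcher(HPOSearcher):
    """Sketch of the LocalSearcher described in exercise 2 (not a reference solution)."""
    def __init__(self, config_space, probab_local=0.8, num_init_random=5):
        self.config_space = config_space
        self.probab_local = probab_local
        self.num_init_random = num_init_random
        # Assumed: RandomSearcher from this section handles the purely random draws.
        self.random_searcher = RandomSearcher(config_space)
        self.num_calls = 0
        self.best_config = None
        self.best_error = float("inf")

    def sample_configuration(self) -> dict:
        self.num_calls += 1
        # Random draw for the first num_init_random calls, with probability
        # 1 - probab_local afterwards, or while no error has been reported yet.
        if (self.num_calls <= self.num_init_random
                or self.best_config is None
                or random.random() >= self.probab_local):
            return self.random_searcher.sample_configuration()
        # Local move: copy the best configuration so far and resample exactly
        # one of its hyperparameters from the search space.
        config = dict(self.best_config)
        name = random.choice(list(self.config_space.keys()))
        config[name] = self.config_space[name].rvs()
        return config

    def update(self, config: dict, error: float, additional_info=None):
        # Track the incumbent: the configuration with the smallest error so far.
        if error < self.best_error:
            self.best_error = error
            self.best_config = dict(config)
```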
2 changes: 1 addition & 1 deletion chapter_hyperparameter-optimization/hyperopt-intro.md
@@ -272,7 +272,7 @@ depends on a small subset of the hyperparameters.
4. Apart from the sheer amount of compute and storage required, what other issues would gradient-based hyperparameter optimization run into? Hint: Re-read about vanishing and exploding gradients in :numref:`sec_numerical_stability`.
5. *Advanced*: Read :cite:`maclaurin-icml15` for an elegant (yet still somewhat unpractical) approach to gradient-based HPO.
3. Grid search is another HPO baseline, where we define an equi-spaced grid for each hyperparameter, then iterate over the (combinatorial) Cartesian product in order to suggest configurations.
-1. We stated above that random search can be much more efficient than grid search for HPO on a sizable number of hyperparameters, if the criterion most strongly depends on a small subset of the hyperparameters. Why is this? Hint: Read :cite:`bergstra-nips11`.
+1. We stated above that random search can be much more efficient than grid search for HPO on a sizable number of hyperparameters, if the criterion most strongly depends on a small subset of the hyperparameters. Why is this? Hint: Read :cite:`bergstra-nips11`.
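For intuition on the question above, a toy illustration (not from the book): grid search enumerates the Cartesian product of per-hyperparameter grids, so with `k` values per hyperparameter and `d` hyperparameters it needs `k**d` evaluations, yet along any single hyperparameter it only ever probes `k` distinct values; `n` random configurations instead yield `n` distinct values of the one hyperparameter that actually matters.

```python
# Toy illustration: the grid size is the Cartesian product, but each individual
# hyperparameter is only probed at k distinct values.
from itertools import product

lr_grid = [0.001, 0.01, 0.1]           # k = 3
batch_size_grid = [32, 128, 512]       # k = 3
weight_decay_grid = [0.0, 1e-4, 1e-2]  # k = 3

grid = [
    {"lr": lr, "batch_size": bs, "weight_decay": wd}
    for lr, bs, wd in product(lr_grid, batch_size_grid, weight_decay_grid)
]
print(len(grid))                     # 3**3 = 27 configurations to evaluate
print(len({c["lr"] for c in grid}))  # only 3 distinct learning rates among them
```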


:begin_tab:`pytorch`
2 changes: 1 addition & 1 deletion chapter_hyperparameter-optimization/rs-async.md
@@ -160,7 +160,7 @@ tuner = Tuner(
scheduler=scheduler,
stop_criterion=stop_criterion,
n_workers=n_workers,
-print_update_interval=max_wallclock_time // 2,
+print_update_interval=int(max_wallclock_time * 0.6),
)
```

2 changes: 1 addition & 1 deletion chapter_hyperparameter-optimization/sh-async.md
@@ -176,7 +176,7 @@ tuner = Tuner(
scheduler=scheduler,
stop_criterion=stop_criterion,
n_workers=n_workers,
-print_update_interval=max_wallclock_time // 2,
+print_update_interval=int(max_wallclock_time * 0.6),
)
tuner.run()
```