Ficx (d2l-ai#2390)
* Ficx

* Fixes
mseeger authored Dec 10, 2022
1 parent f9ed5b7 commit 70a207b
Showing 4 changed files with 8 additions and 19 deletions.
21 changes: 5 additions & 16 deletions chapter_hyperparameter-optimization/hyperopt-api.md
@@ -263,22 +263,11 @@ algorithms, and potential pitfall one needs to be aware.
## Exercises

1. The goal of this exercise is to implement the objective function for a slightly more challenging HPO problem, and to run more realistic experiments. We will use the two hidden layer MLP `DropoutMLP` implemented in :numref:`sec_dropout`.
-1. Code up the objective function, which should depend on all hyperparameters of the model and `batch_size`. Use `max_epochs=50`. GPUs do not help here, so `num_gpus=0`. Hint: Start with `hpo_objective_lenet`.
-2. Choose a sensible search space, where `num_hiddens_1`, `num_hiddens_2`are integers in $[8, 1024]$, and dropout values lie in $[0, 0.95], while`batch_size` lies in $[16, 384]$. Provide code for `config_space`, using sensible distributions from `scipy.stats`.
-3. Run random search on this example with `number_of_trials=20` and plot the results. Make sure to first evaluate the default configuration of :numref:`sec_dropout`, which is `initial_config = {'num_hiddens_1':256, 'num_hiddens_2':256, 'dropout_1':0.5, 'dropout_2':0.5, 'lr':0.1, 'batch_size: 256}`.
-2. In this exercise, you will implement a searcher (subclass of `HPOSearcher`)
-which aims to improve upon random search, by depending on past data. It
-depends on parameters `probab_local`, `num_init_random`. Its
-`sample_configuration` method works as follows. For the first `num_init_random`
-calls, do the same as `RandomSearcher.sample_configuration`. Otherwise, with
-probability `1 - probab_local`, do the same as
-`RandomSearcher.sample_configuration`. Otherwise, pick the configuration
-which attained the smallest validation error so far, select one of its
-hyperparameters at random, and sample its value randomly like in
-RandomSearcher.sample_configuration`, but leave all other values the
-same. Return this configuration, which is identical to the best
-configuration so far, except in this one hyperparameter.
-1. Code up this new `LocalSearcher`. Hint: Your searcher requires `config_space` as argument at construction. Feel free to use a member of type `RandomSearcher`.
+1. Code up the objective function, which should depend on all hyperparameters of the model and `batch_size`. Use `max_epochs=50`. GPUs do not help here, so `num_gpus=0`. Hint: Modify `hpo_objective_lenet`.
+2. Choose a sensible search space, where `num_hiddens_1`, `num_hiddens_2` are integers in $[8, 1024]$, and dropout values lie in $[0, 0.95]$, while `batch_size` lies in $[16, 384]$. Provide code for `config_space`, using sensible distributions from `scipy.stats`.
+3. Run random search on this example with `number_of_trials=20` and plot the results. Make sure to first evaluate the default configuration of :numref:`sec_dropout`, which is `initial_config = {'num_hiddens_1': 256, 'num_hiddens_2': 256, 'dropout_1': 0.5, 'dropout_2': 0.5, 'lr': 0.1, 'batch_size': 256}`.
+2. In this exercise, you will implement a new searcher (subclass of `HPOSearcher`) which makes decisions based on past data. It depends on parameters `probab_local`, `num_init_random`. Its `sample_configuration` method works as follows. For the first `num_init_random` calls, do the same as `RandomSearcher.sample_configuration`. Otherwise, with probability `1 - probab_local`, do the same as `RandomSearcher.sample_configuration`. Otherwise, pick the configuration which attained the smallest validation error so far, select one of its hyperparameters at random, and sample its value randomly like in `RandomSearcher.sample_configuration`, but leave all other values the same. Return this configuration, which is identical to the best configuration so far, except in this one hyperparameter.
+1. Code up this new `LocalSearcher`. Hint: Your searcher requires `config_space` as argument at construction. Feel free to use a member of type `RandomSearcher`. You will also have to implement the `update` method.
2. Re-run the experiment from the previous exercise, but using your new searcher instead of `RandomSearcher`. Experiment with different values for `probab_local`, `num_init_random`. However, note that a proper comparison between different HPO methods requires repeating experiments several times, and ideally considering a number of benchmark tasks.
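For exercise 1.2 above, one possible `config_space` is sketched below. This is not the book's reference solution: the `lr` range is an assumption, and the only `scipy.stats` facts used are that `randint(low, high)` samples integers in `[low, high)` and `uniform(loc, scale)` samples floats in `[loc, loc + scale]`.

```python
# A possible config_space for exercise 1.2 -- a sketch, not a reference solution.
from scipy import stats

config_space = {
    "num_hiddens_1": stats.randint(8, 1025),   # integers in [8, 1024]
    "num_hiddens_2": stats.randint(8, 1025),
    "dropout_1": stats.uniform(0, 0.95),       # floats in [0, 0.95]
    "dropout_2": stats.uniform(0, 0.95),
    "lr": stats.loguniform(1e-2, 1.0),         # log scale; range is an assumption
    "batch_size": stats.randint(16, 385),      # integers in [16, 384]
}

# Default configuration of :numref:`sec_dropout`, to be evaluated first (exercise 1.3).
initial_config = {
    "num_hiddens_1": 256, "num_hiddens_2": 256,
    "dropout_1": 0.5, "dropout_2": 0.5,
    "lr": 0.1, "batch_size": 256,
}
```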


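A minimal sketch of the `LocalSearcher` from exercise 2 above, under the following assumptions about this section's API: `HPOSearcher` and `RandomSearcher` are defined earlier in the section, `sample_configuration()` returns a config dict, `update(config, error, additional_info=None)` reports the observed validation error, and the values of `config_space` are `scipy.stats` distributions sampled via `.rvs()`. Names and exact signatures may differ from the book's code.

```python
import random

class LocalSearcher(HPOSearcher):
    """Sketch of the LocalSearcher described in exercise 2 (not a reference solution)."""
    def __init__(self, config_space, probab_local=0.8, num_init_random=5):
        self.config_space = config_space
        self.probab_local = probab_local
        self.num_init_random = num_init_random
        # Assumed: RandomSearcher from this section handles the purely random draws.
        self.random_searcher = RandomSearcher(config_space)
        self.num_calls = 0
        self.best_config = None
        self.best_error = float("inf")

    def sample_configuration(self) -> dict:
        self.num_calls += 1
        # Random draw for the first num_init_random calls, with probability
        # 1 - probab_local afterwards, or while no error has been reported yet.
        if (self.num_calls <= self.num_init_random
                or self.best_config is None
                or random.random() >= self.probab_local):
            return self.random_searcher.sample_configuration()
        # Local move: copy the best configuration so far and resample exactly
        # one of its hyperparameters from the search space.
        config = dict(self.best_config)
        name = random.choice(list(self.config_space.keys()))
        config[name] = self.config_space[name].rvs()
        return config

    def update(self, config: dict, error: float, additional_info=None):
        # Track the incumbent: the configuration with the smallest error so far.
        if error < self.best_error:
            self.best_error = error
            self.best_config = dict(config)
```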
2 changes: 1 addition & 1 deletion chapter_hyperparameter-optimization/hyperopt-intro.md
@@ -272,7 +272,7 @@ depends on a small subset of the hyperparameters.
4. Apart from the sheer amount of compute and storage required, what other issues would gradient-based hyperparameter optimization run into? Hint: Re-read about vanishing and exploding gradients in :numref:`sec_numerical_stability`.
5. *Advanced*: Read :cite:`maclaurin-icml15` for an elegant (yet still somewhat unpractical) approach to gradient-based HPO.
3. Grid search is another HPO baseline, where we define an equi-spaced grid for each hyperparameter, then iterate over the (combinatorial) Cartesian product in order to suggest configurations.
-1. We stated above that random search can be much more efficient than grid search for HPO on a sizable number of hyperparameters, if the criterion most strongly depends on a small subset of the hyperparameters. Why is this? Hint: Read :cite:`bergstra-nips11`.
+1. We stated above that random search can be much more efficient than grid search for HPO on a sizable number of hyperparameters, if the criterion most strongly depends on a small subset of the hyperparameters. Why is this? Hint: Read :cite:`bergstra-nips11`.
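For intuition on the question above, a toy illustration (not from the book): grid search enumerates the Cartesian product of per-hyperparameter grids, so with `k` values per hyperparameter and `d` hyperparameters it needs `k**d` evaluations, yet along any single hyperparameter it only ever probes `k` distinct values; `n` random configurations instead yield `n` distinct values of the one hyperparameter that actually matters.

```python
# Toy illustration: the grid size is the Cartesian product, but each individual
# hyperparameter is only probed at k distinct values.
from itertools import product

lr_grid = [0.001, 0.01, 0.1]           # k = 3
batch_size_grid = [32, 128, 512]       # k = 3
weight_decay_grid = [0.0, 1e-4, 1e-2]  # k = 3

grid = [
    {"lr": lr, "batch_size": bs, "weight_decay": wd}
    for lr, bs, wd in product(lr_grid, batch_size_grid, weight_decay_grid)
]
print(len(grid))                     # 3**3 = 27 configurations to evaluate
print(len({c["lr"] for c in grid}))  # only 3 distinct learning rates among them
```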


:begin_tab:`pytorch`
2 changes: 1 addition & 1 deletion chapter_hyperparameter-optimization/rs-async.md
@@ -160,7 +160,7 @@ tuner = Tuner(
scheduler=scheduler,
stop_criterion=stop_criterion,
n_workers=n_workers,
-print_update_interval=max_wallclock_time // 2,
+print_update_interval=int(max_wallclock_time * 0.6),
)
```

2 changes: 1 addition & 1 deletion chapter_hyperparameter-optimization/sh-async.md
@@ -176,7 +176,7 @@ tuner = Tuner(
scheduler=scheduler,
stop_criterion=stop_criterion,
n_workers=n_workers,
-print_update_interval=max_wallclock_time // 2,
+print_update_interval=int(max_wallclock_time * 0.6),
)
tuner.run()
```