
Nf test tune #166

Merged
merged 25 commits into main from nf-test_tune on Jul 25, 2024
Conversation

alessiovignoli
Contributor

Changes made:

  1. Setting the seed. The seed can be None, but if given it must be an integer specified in the tune config. That seed initializes the python, numpy and torch seeds in the TuneWrapper class init. Numbers are then drawn randomly (always the same ones when the seed is set) and those values become the seeds for the trials inside tune. So each tune trial/experiment has its own seed, set in a reproducible manner because it depends on the overall user-given seed (see the sketch after this list).
  2. The seed also determines the weight initialization for each trial's model (1 model per trial). Models are always initialized to the same parameters if a seed is given.
  3. Added a debug mode to explore weight initialization, seeds and the raw output predictions on the validation set for the best model. These files are written only if debug_mode is activated; they are placed in the debug dir under the TuneRun results subdir and are used for understanding reproducibility across identical runs of tune.
  4. A handle_tune nf-test has been created, and dnatofloat on CPU is set as a proxy for reproducibility. More work is needed here in the future.
  5. Tune configs can now specify a run_params key that handles RunConfig auxiliary information, like the stop criteria needed by the FIFOScheduler now present in the dnatofloat CPU tune config.

TODO for the future: decide which flavor of the nf-test tune should be run as a GitHub action to test reproducibility.
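A minimal sketch of the per-trial seeding scheme described in point 1 above; the class and attribute names here (num_trials, trial_seeds) are illustrative assumptions, not the actual TuneWrapper implementation:

```python
import random
import numpy as np
import torch

class TuneWrapperSeedSketch:
    """Illustrative only: seeds python, numpy and torch from one user-given seed,
    then derives a reproducible per-trial seed for every tune trial."""

    def __init__(self, seed: int | None = None, num_trials: int = 10):
        if seed is not None:
            # The user-given seed from the tune config initializes all three RNGs.
            random.seed(seed)
            np.random.seed(seed)
            torch.manual_seed(seed)
        # Draw one integer per trial; with a fixed global seed these draws are
        # always the same, so each trial gets its own deterministic seed.
        self.trial_seeds = [random.randint(0, 2**32 - 1) for _ in range(num_trials)]
```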

…e, weights are now set in a reproducible and deterministic manner
@alessiovignoli alessiovignoli linked an issue Jul 24, 2024 that may be closed by this pull request
elif user_tune_config["tune"]["scheduler"]["name"] == "FIFOScheduler":
user_tune_config["tune"]["run_params"]["stop"]["training_iteration"] = 1

# TODO: future scheduler-specific info will go here as well; maybe find a cleaner way.
Contributor

Most likely this needs to become a stimulus class in the same way experiment is
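As a rough sketch of that idea (the names below are illustrative and do not exist in the repo), the scheduler-specific branches could be replaced by a small registry mapping scheduler names to the run_params they require:

```python
# Hypothetical registry: scheduler name -> run_params it requires.
SCHEDULER_RUN_PARAMS = {
    "FIFOScheduler": {"stop": {"training_iteration": 1}},
    # future schedulers and their specific settings would go here
}

def apply_scheduler_run_params(user_tune_config: dict) -> dict:
    """Merge scheduler-specific run_params into the user tune config."""
    name = user_tune_config["tune"]["scheduler"]["name"]
    defaults = SCHEDULER_RUN_PARAMS.get(name, {})
    user_tune_config["tune"].setdefault("run_params", {}).update(defaults)
    return user_tune_config
```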

results.save_best_model(output)
results.save_best_config(best_config_path)
results.save_best_metrics_dataframe(best_metrics_path)
results.save_best_optimizer(best_optimizer_path)

# debug section. predict the validation data using the best model.
Contributor

Why does the debug behavior load the best model and test it on the validation data? Shouldn't this be reserved for the analysis module? How is this helping us debug tuning?

@@ -179,12 +210,34 @@ def setup(self, config: dict, training: object, validation: object) -> None:
self.training = DataLoader(training, batch_size=self.batch_size, shuffle=True) # TODO need to check the reproducibility of this shuffling
self.validation = DataLoader(validation, batch_size=self.batch_size, shuffle=True)

# debug section, first create a dedicated directory for each worker inside Ray_results/<tune_model_run_specific_dir> location
Contributor

In the future, I believe this (saving the seed and the initial model) should be done regardless of whether debug is on; it would be a "robustness mode" toggled "on" by default!

Contributor

Or a "reproducibility" mode, as I believe the formula "model + initial state + seed + training code + training data" is our "deep learning" container

@mathysgrapotte mathysgrapotte merged commit 7e57cbe into main Jul 25, 2024
4 checks passed
@mathysgrapotte mathysgrapotte deleted the nf-test_tune branch July 25, 2024 10:48

Successfully merging this pull request may close these issues.

[tests] Add global pipeline tests (nf-tests)