Impact of PRNG with multiple modules #140
andrewyates
started this conversation in
Notes
Replies: 1 comment
-
Update: replacing |
We currently seed the PRNG during pipeline initialization (i.e., outside any specific module), which lets modules use the NumPy, PyTorch, and Python random libraries without specifying a seed themselves. However, because all modules share that PRNG state, the number of draws performed by each module can affect the draws performed by any module that runs later in the pipeline.
For example, say we have two modules that use the PRNG, X and Y, with X running before Y. If the number of draws X performs is constant, Y will always receive a PRNG in the same state. If the number of X's draws changes, however, the state of the PRNG when Y uses it will have changed.
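The X/Y example can be reproduced with a minimal sketch. It uses only Python's standard `random` module; `run_x`, `run_y`, and `run_pipeline` are hypothetical stand-ins for illustration, not Capreolus code.

```python
import random


def run_x(num_draws):
    """Hypothetical module X: consumes num_draws values from the global PRNG."""
    return [random.random() for _ in range(num_draws)]


def run_y():
    """Hypothetical module Y: draws one value from the same global PRNG."""
    return random.random()


def run_pipeline(seed, x_draws):
    # Seed once at pipeline initialization, outside any module.
    random.seed(seed)
    run_x(x_draws)
    # Y sees whatever PRNG state X left behind.
    return run_y()


same_a = run_pipeline(seed=42, x_draws=3)
same_b = run_pipeline(seed=42, x_draws=3)
shifted = run_pipeline(seed=42, x_draws=4)

print(same_a == same_b)   # X's draw count is constant, so Y's draw matches
print(same_a == shifted)  # X drew one more value, so Y's draw differs
```

With the same seed, Y's result is identical whenever X performs the same number of draws, and it changes as soon as X draws one extra value.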
This should not cause issues as long as (1) the number of draws X performs is a function of its config, so that the same config always creates the same output, and (2) Y operates on X's output. Capreolus experiments are functional in that the config fully describes them, so the first condition is satisfied.
The second condition is satisfied in the current pipeline, but we should keep this issue in mind if the pipeline changes. The danger is that X and Y may be independent modules (i.e., Y does not use X's output), yet changing X's config effectively changes Y's seed by altering the state of the PRNG before Y uses it. This is unintuitive because we expect Y's behavior to depend only on its own config, the pipeline config, and its inputs (which are determined by the configs of its input modules, and do not include X in this case).
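If the pipeline ever does gain independent modules, one possible way to remove the coupling is to give each module its own generator derived from the pipeline seed. This is not something the post proposes, only a rough sketch under assumptions: `make_module_rng` and the module names are hypothetical, and the approach only helps code that draws from the generator it is handed rather than from the global NumPy, PyTorch, or Python random functions.

```python
import random


def make_module_rng(pipeline_seed, module_name):
    """Hypothetical helper: a private generator per module.

    Seeding random.Random with a string is deterministic across runs,
    so the same (seed, name) pair always yields the same stream.
    """
    return random.Random(f"{pipeline_seed}:{module_name}")


def run_x(rng, num_draws):
    """Hypothetical module X: draws only from its own generator."""
    return [rng.random() for _ in range(num_draws)]


def run_y(rng):
    """Hypothetical module Y: independent of X's output and X's draws."""
    return rng.random()


def run_pipeline(pipeline_seed, x_draws):
    run_x(make_module_rng(pipeline_seed, "X"), x_draws)
    return run_y(make_module_rng(pipeline_seed, "Y"))


# Changing X's config (its number of draws) no longer shifts Y's draw.
print(run_pipeline(42, x_draws=3) == run_pipeline(42, x_draws=10))
```

The trade-off is that modules can no longer call the global random functions freely, which is the convenience that seeding at pipeline initialization was meant to provide.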