This repository has been archived by the owner on Nov 1, 2024. It is now read-only.

Track all of our RNG offsets to avoid collisions #65

Open
suchenzang opened this issue May 8, 2022 · 0 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@suchenzang
Contributor

We have RNG seed offsets sprinkled through the codebase:

(base) √ metaseq % ag seed --py | grep +
cpu_tests/test_streaming_token_block_dataset.py:78:        shadow_rng = np.random.default_rng(2273 + seed)
cpu_tests/test_streaming_token_block_dataset.py:124:        shadow_rng = np.random.default_rng(2273 + seed)
metaseq/tasks/language_modeling.py:217:            with data_utils.numpy_seed(self.args.seed + epoch):
metaseq/tasks/streaming_language_modeling.py:316:            seed=1284 + self.args.seed,
metaseq/trainer.py:1052:        seed = self.cfg.common.seed + self.get_num_updates()
metaseq/data/streaming_token_block_dataset.py:96:            rng = np.random.default_rng(2273 + self.seed)
metaseq/data/iterators.py:524:                batches = shuffle_batches(list(batches), self.seed + epoch)
metaseq/data/iterators.py:532:                batches = shuffle_batches(batches, self.seed + epoch + self.shard_id)
metaseq/data/iterators.py:535:                batches = shuffle_batches(list(self.frozen_batches), self.seed + epoch)
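
To make the concern concrete, here is a minimal sketch of how additive seed derivation can collide. The function `shard_shuffle_seed` is a hypothetical stand-in that only mirrors the `self.seed + epoch + self.shard_id` pattern from the listing; it is not code from the repository.

```python
# Hypothetical stand-in mirroring the `seed + epoch + shard_id` pattern
# shown in the listing above; not the repository's actual code.
def shard_shuffle_seed(seed: int, epoch: int, shard_id: int) -> int:
    return seed + epoch + shard_id


# Two different (epoch, shard_id) pairs map to the same derived seed.
a = shard_shuffle_seed(seed=1, epoch=2, shard_id=0)
b = shard_shuffle_seed(seed=1, epoch=1, shard_id=1)
print(a == b)  # True

# Fixed offsets can also meet: 2273 + s1 equals 1284 + s2
# whenever s2 - s1 == 2273 - 1284 == 989.
print(2273 + 0 == 1284 + 989)  # True
```

Whether any of these coincidences matters in practice depends on whether the two call sites are assumed to produce independent streams.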

It would be good to track these offsets to avoid collisions, especially in places where we're assuming no collision or coupling between seeds derived via offsets.
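
One possible way to track the offsets is a small central registry that refuses duplicate constants. This is only a sketch under stated assumptions: `SeedOffsetRegistry`, `SEED_OFFSETS`, and the registered names are all hypothetical and do not exist in the codebase. Only the constants 2273 and 1284 come from the listing above.

```python
from typing import Dict


class SeedOffsetRegistry:
    """Hypothetical central table of named, constant RNG seed offsets."""

    def __init__(self) -> None:
        self._offsets: Dict[str, int] = {}

    def register(self, name: str, offset: int) -> int:
        # Refuse to register the same name twice.
        if name in self._offsets:
            raise ValueError(f"offset name {name!r} is already registered")
        # Refuse an offset that another call site already uses.
        for other, existing in self._offsets.items():
            if existing == offset:
                raise ValueError(
                    f"offset {offset} for {name!r} collides with {other!r}"
                )
        self._offsets[name] = offset
        return offset

    def seed_for(self, name: str, base_seed: int) -> int:
        # Derive the seed for a registered call site from the base seed.
        return base_seed + self._offsets[name]


# Hypothetical names; the constants are the ones found by the search above.
SEED_OFFSETS = SeedOffsetRegistry()
SEED_OFFSETS.register("streaming_token_block_dataset", 2273)
SEED_OFFSETS.register("streaming_language_modeling", 1284)

print(SEED_OFFSETS.seed_for("streaming_token_block_dataset", base_seed=1))  # 2274
```

A registry like this only catches duplicate constant offsets. Dynamic terms such as `epoch`, `shard_id`, or the update count can still produce equal sums, so those call sites would need a separate check or a non-additive way of deriving seeds.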
