Commit

Merge pull request princeton-nlp#121 from voidism/main
fixed the accidentally masking problem for input_ids with --do_mlm
Tianyu Gao authored Nov 24, 2021
2 parents 2adbdb3 + a0867f4 commit 121443e
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions train.py
@@ -500,6 +500,7 @@ def mask_tokens(
         """
         Prepare masked tokens inputs/labels for masked language modeling: 80% MASK, 10% random, 10% original.
         """
+        inputs = inputs.clone()
         labels = inputs.clone()
         # We sample a few tokens in each sequence for MLM training (with probability `self.mlm_probability`)
         probability_matrix = torch.full(labels.shape, self.mlm_probability)
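
The effect of the added clone can be illustrated with a small, framework-free sketch. It is not the code in train.py: plain Python lists stand in for tensors, `list(inputs)` plays the role of `inputs.clone()`, and `MASK_ID`, `VOCAB_SIZE`, `IGNORE_INDEX` and the `copy_inputs` flag are hypothetical names introduced only for this illustration. Assuming the 80% MASK, 10% random, 10% original scheme from the docstring, it shows that masking without a copy overwrites the caller's token ids in place.

```python
import random

# Hypothetical constants for illustration only; they are not taken from train.py.
MASK_ID = 103        # stand-in for the tokenizer's [MASK] token id
VOCAB_SIZE = 1000    # stand-in vocabulary size for random replacement
IGNORE_INDEX = -100  # label value for positions that are not predicted


def mask_tokens(inputs, mlm_probability=0.15, rng=None, copy_inputs=True):
    """List-based sketch of 80% MASK / 10% random / 10% original masking.

    `copy_inputs=True` mirrors the `inputs = inputs.clone()` line added by
    the commit; `copy_inputs=False` mimics the behaviour before the fix.
    """
    rng = rng or random.Random()
    if copy_inputs:
        inputs = list(inputs)  # work on a private copy, like inputs.clone()
    labels = list(inputs)      # labels always start as a copy of the inputs

    for i in range(len(inputs)):
        if rng.random() >= mlm_probability:
            labels[i] = IGNORE_INDEX  # position not selected for prediction
            continue
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK_ID                    # 80%: replace with [MASK]
        elif roll < 0.9:
            inputs[i] = rng.randrange(VOCAB_SIZE)  # 10%: replace with a random token
        # remaining 10%: keep the original token

    return inputs, labels


if __name__ == "__main__":
    original = list(range(200, 220))

    # With the copy, the caller's list is left untouched.
    masked, _ = mask_tokens(original, 0.5, random.Random(0), copy_inputs=True)
    print("caller's list unchanged:", original == list(range(200, 220)))

    # Without the copy, the returned list is the caller's own list,
    # so every replacement also lands in `original`.
    masked, _ = mask_tokens(original, 0.5, random.Random(0), copy_inputs=False)
    print("returned list aliases the caller's list:", masked is original)
```

When the same token ids are reused elsewhere in a batch (the commit title names `input_ids` under `--do_mlm`), the aliasing in the second call is the kind of accidental masking the one-line fix avoids.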
