
Commit

Rephrasing comment for clarity
MalikMAlna committed Apr 7, 2023
1 parent 0689c2e commit 43ddc3e
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion data.py
@@ -31,7 +31,7 @@ def tokenize_inputs(config, tokenizer, examples):

 # add target tokens, remove bos
 input_ids[i, newline_plus_inputs: newline_plus_inputs + len(target_tokens)] = target_tokens
-# add eos token, enforce stopping if we don't truncate
+# add eos token; ensure generation stops if inputs aren't truncated
 # we don't want long code to stop generating if truncated during training
 if newline_plus_inputs + len(target_tokens) < max_length:
     input_ids[i, newline_plus_inputs + len(target_tokens)] = tokenizer.eos_token_id
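
The behavior the reworded comment describes can be sketched in isolation. This is a minimal illustration on plain Python lists rather than the tensors used in data.py. `place_target`, `PAD_ID`, and `EOS_ID` are hypothetical names introduced here, and the truncation step is an assumption about what happens before this point in `tokenize_inputs`.

```python
PAD_ID = 0  # hypothetical padding id, for illustration only
EOS_ID = 2  # hypothetical EOS id, stands in for tokenizer.eos_token_id


def place_target(row, start, target_tokens, eos_token_id):
    """Copy target_tokens into row at start; add EOS only if a slot is left.

    Assumes 0 <= start <= len(row). The row length plays the role of
    max_length in the original code.
    """
    max_length = len(row)
    # Assumption: the target is cut to fit the fixed-length row.
    target_tokens = target_tokens[: max_length - start]
    end = start + len(target_tokens)
    row[start:end] = target_tokens
    # Same condition as the diff: EOS is written only when the target
    # did not fill the row, so a truncated example carries no stop signal.
    if end < max_length:
        row[end] = eos_token_id
    return row


# Target fits: EOS follows the last target token.
short = place_target([PAD_ID] * 8, 3, [11, 12, 13], EOS_ID)

# Target is truncated at the row boundary: no EOS is written.
full = place_target([PAD_ID] * 8, 3, [11, 12, 13, 14, 15, 16, 17], EOS_ID)
```

In the first call the row ends with the target, the EOS id, and one padding slot. In the second the row is filled to the end with target tokens and contains no EOS, which appears to be the intent of the "we don't want long code to stop generating" comment.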
