Update README.md (huggingface#8815)
The tokenizer call that produces input_ids in Example 2 currently encodes text_1. I think it should encode text_2 instead.
mdermentzi authored Nov 27, 2020
1 parent f8eda59 commit e3ef62b
Showing 1 changed file with 1 addition and 1 deletion.
model_cards/nlpaueb/bert-base-greek-uncased-v1/README.md (1 addition, 1 deletion)
```diff
@@ -91,7 +91,7 @@ print(tokenizer_greek.convert_ids_to_tokens(outputs[0, 5].max(0)[1].item()))
 # ================ EXAMPLE 2 ================
 text_2 = 'Είναι ένας [MASK] άνθρωπος.'
 # EN: 'He is a [MASK] person.'
-input_ids = tokenizer_greek.encode(text_1)
+input_ids = tokenizer_greek.encode(text_2)
 print(tokenizer_greek.convert_ids_to_tokens(input_ids))
 # ['[CLS]', 'ειναι', 'ενας', '[MASK]', 'ανθρωπος', '.', '[SEP]']
 outputs = lm_model_greek(torch.tensor([input_ids]))[0]
```
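
For context, below is a minimal, self-contained sketch of the corrected Example 2. Only the text_2 / encode / print lines come from the diff above; the imports, the model-loading lines (including the AutoModelForMaskedLM class and the model name 'nlpaueb/bert-base-greek-uncased-v1'), and the choice of position 3 for the [MASK] token are assumptions based on typical transformers usage, not part of this commit.

```python
# Sketch of the fixed EXAMPLE 2; setup lines are assumed, not from this diff.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = 'nlpaueb/bert-base-greek-uncased-v1'
tokenizer_greek = AutoTokenizer.from_pretrained(model_name)
lm_model_greek = AutoModelForMaskedLM.from_pretrained(model_name)

text_2 = 'Είναι ένας [MASK] άνθρωπος.'  # EN: 'He is a [MASK] person.'

# With the fix, input_ids now come from text_2 instead of text_1.
input_ids = tokenizer_greek.encode(text_2)
print(tokenizer_greek.convert_ids_to_tokens(input_ids))
# Expected: ['[CLS]', 'ειναι', 'ενας', '[MASK]', 'ανθρωπος', '.', '[SEP]']

# The [MASK] token sits at index 3 of the encoded sequence above (assumed).
outputs = lm_model_greek(torch.tensor([input_ids]))[0]
predicted_id = outputs[0, 3].max(0)[1].item()
print(tokenizer_greek.convert_ids_to_tokens(predicted_id))
```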
