Commit

documentation
BennoKrojer committed Dec 5, 2020
1 parent 3cd367e commit 0de3433
Showing 1 changed file with 15 additions and 1 deletion.
16 changes: 15 additions & 1 deletion reproducing.txt
@@ -21,4 +21,18 @@ Eval_Accuracy_intersected = 0.2556610664718773

Dec 5, 9:00:
new code is equal to old code. let's use new code.
vocab_size=6506, data=neg_sampled. question: does it reach the reported acc? or did we simply not increase the vocab size back then?


INFLUENCE OF VOCAB_SIZE:
relevant code pieces:

Embeddings:
modeling_bert.py 615
modeling_bert.py 149
-> shouldn't be a problem, right? because the unused embeddings don't get any gradient updates
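
A minimal sketch of why rows that are never looked up receive no gradient from the lookup itself. It is plain Python, not the code in modeling_bert.py; the toy sizes and the helper name `embedding_grad` are made up for illustration. It covers only the lookup path: weight decay, or an output layer that shares weights with the embedding table, could still change those rows.

```python
# Hypothetical toy sizes; the real table would have 30522 rows.
VOCAB_ROWS = 10
DIM = 3

def embedding_grad(token_ids, upstream_grads, vocab_rows=VOCAB_ROWS, dim=DIM):
    """Gradient of the loss w.r.t. an embedding table for a plain lookup.

    The backward pass of a lookup is a scatter-add: each looked-up row
    accumulates the upstream gradient of every position that used it.
    """
    grad = [[0.0] * dim for _ in range(vocab_rows)]
    for tok, g in zip(token_ids, upstream_grads):
        for j in range(dim):
            grad[tok][j] += g[j]
    return grad

# Token 1 appears twice, token 4 once; all other ids never appear.
batch_ids = [1, 4, 1]
upstream = [[0.5, -1.0, 0.0], [1.0, 1.0, 1.0], [0.5, 0.0, 2.0]]

grad = embedding_grad(batch_ids, upstream)
untouched_rows = [i for i, row in enumerate(grad) if all(v == 0.0 for v in row)]
print("rows with zero gradient:", untouched_rows)
```

Only rows 1 and 4 end up with a non-zero gradient here; every id that never occurs in the batch keeps a zero row.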

modeling_bert.py 465
-> this might be a problem
it predicts probs for 30522 labels, e.g. [0.1, 0.0001, ...]
but about 24,000 of them (30522 - 6506 = 24016) never occur as a target, so it quickly has to learn to ignore those
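
A rough back-of-the-envelope sketch of how much this could matter, using the two sizes from the notes. It assumes all logits start out equal, which is a simplification and not true of a pretrained model; the function names are made up for illustration.

```python
import math

FULL_VOCAB = 30522   # size of the output layer (from the notes)
USED_VOCAB = 6506    # vocab_size actually used (from the notes)

def unused_mass_uniform(full=FULL_VOCAB, used=USED_VOCAB):
    """Probability mass on never-seen labels when all logits are equal."""
    return (full - used) / full

def uniform_loss(n_labels):
    """Cross-entropy of a uniform prediction over n_labels classes."""
    return math.log(n_labels)

def unused_mass_with_shift(shift, full=FULL_VOCAB, used=USED_VOCAB):
    """Mass on unused labels after their logits are lowered by `shift`."""
    unused = full - used
    weight = unused * math.exp(-shift)
    return weight / (used + weight)

print("mass wasted at uniform init:", unused_mass_uniform())
print("extra loss vs. a 6506-way softmax:",
      uniform_loss(FULL_VOCAB) - uniform_loss(USED_VOCAB))
print("mass wasted after a logit shift of 10:", unused_mass_with_shift(10.0))
```

Under this assumption roughly 79% of the probability mass starts out on labels that never occur, but pushing their logits down by a constant (e.g. via the output bias) removes almost all of it, which fits the guess that the model can learn to ignore them quickly.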
