[Enhancement] Allow Bert Encoder to specify hidden dim for the fc layers #104
base: main
Conversation
@byshiue Hey, I added some fixes, could you please take a look again? Thanks!
Have you compiled the code to verify its correctness?
Really sorry about the inconvenience. I can't run the TensorFlow unit tests due to some constraints. I tried running
I think the original code works normally.
Also, I still cannot compile this code successfully with TensorFlow.
@842974287 I think we should try to compile and screen the TF code to make sure it works for TF, for example, to avoid fixes like 55c6c69.
This was the second enhancement proposed in #98.
Add support for a configurable hidden dimension size in BERT's fc layers. Currently, in the feed-forward part of the BERT encoder, the hidden dim of the first fc layer is hardcoded to 4 * head_dim * head_size. This PR adds a field to pass in the hidden dimension size instead.
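To illustrate the idea (not the repository's actual code, which is implemented in C++/CUDA with TF/PyTorch wrappers), here is a minimal TensorFlow sketch. The function name `bert_ffn` and the parameter `intermediate_size` are hypothetical; the point is that the first fc layer's width defaults to the conventional 4x expansion but can be overridden by the caller.

```python
# Minimal sketch, assuming hypothetical names; not FasterTransformer's real API.
import tensorflow as tf

def bert_ffn(hidden_states, head_num, size_per_head, intermediate_size=None):
    """Feed-forward block of a BERT encoder layer.

    hidden_states:     [batch, seq_len, head_num * size_per_head]
    intermediate_size: hidden dim of the first fc layer; defaults to the
                       conventional 4x expansion when not given.
    """
    hidden_dim = head_num * size_per_head
    if intermediate_size is None:
        intermediate_size = 4 * hidden_dim  # previously hardcoded behaviour

    x = tf.keras.layers.Dense(intermediate_size, activation="gelu")(hidden_states)
    x = tf.keras.layers.Dense(hidden_dim)(x)  # project back to the model dim
    return x

# Example: 12 heads of size 64 (hidden dim 768) with a 2048-wide FFN
# instead of the default 3072.
out = bert_ffn(tf.zeros([2, 128, 768]), head_num=12, size_per_head=64,
               intermediate_size=2048)
print(out.shape)  # (2, 128, 768)
```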