Refactor dtype handling in export_llama #9430
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9430
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 2 Unrelated Failures as of commit 62f1e9d with merge base a828307.
NEW FAILURE - The following job has failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D71515138
Force-pushed from 21b73fb to 21941b6.
Summary: Pull Request resolved: pytorch#9430 Differential Revision: D71515138
Force-pushed from 21941b6 to fa12dfa.
Force-pushed from fa12dfa to 73809d0.
Summary: No more converting from fp32 -> checkpoint dtype (fp16 or lower) -> back to dtype override (fp32), where we were losing precision on buffers. Also cleans up the dtype handling: it now occurs only outside of model.py, whose responsibility should just be loading the model. Differential Revision: D71515138
Force-pushed from 73809d0 to 6be43cf.
Summary: While it might seem intuitive for the model's dtype to match the checkpoint's dtype, this isn't possible for all backends, since some support only certain dtypes; we therefore need to be explicit about the model's dtype. There is no longer an intermediate conversion into the checkpoint dtype, which could cause precision loss in situations like these: fp32 -> checkpoint dtype (fp16 or lower) -> back to dtype override (fp32), losing precision on buffers that are instantiated in fp32 and downcast to fp16. Differential Revision: D71515138
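The precision loss from that round trip is easy to reproduce in plain PyTorch. A minimal sketch (the buffer values are hypothetical illustrations, not taken from export_llama):

```python
import torch

# A buffer instantiated in fp32 (values chosen for illustration only).
buf = torch.tensor([0.1234567, 1e-5, 3.1415927], dtype=torch.float32)

# The old flow: fp32 -> checkpoint dtype (fp16) -> back to the fp32 override.
round_tripped = buf.to(torch.float16).to(torch.float32)

# The fp16 hop discards mantissa bits, so the buffer no longer matches.
print(torch.equal(buf, round_tripped))    # False
print((buf - round_tripped).abs().max())  # nonzero error
```

Converting back to fp32 cannot recover the bits dropped by the fp16 cast, which is why the refactor keeps buffers in the explicitly requested dtype rather than routing them through the checkpoint dtype.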
Force-pushed from 6be43cf to e112ad8.
Reviewed By: kimishpatel Differential Revision: D71515138
Force-pushed from e112ad8 to 62f1e9d.
Differential Revision: D71515138 Pull Request resolved: pytorch#9430