Refactor dtype handling in export_llama #9430
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9430
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 2 Unrelated Failures as of commit 62f1e9d with merge base a828307.
NEW FAILURE - The following job has failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D71515138
Force-pushed from 21b73fb to 21941b6.
Summary: Pull Request resolved: pytorch#9430 Differential Revision: D71515138
Force-pushed from 21941b6 to fa12dfa.
Force-pushed from fa12dfa to 73809d0.
Summary: No more converting from fp32 -> checkpoint dtype (fp16 or lower) -> back to dtype override (fp32), where we were losing precision on buffers. Also cleans up the dtype handling: it now occurs only outside of model.py, whose responsibility should just be loading the model. Differential Revision: D71515138
Force-pushed from 73809d0 to 6be43cf.
Summary: While it might seem intuitive for the model's dtype to match the checkpoint's dtype, this isn't possible for all backends, since some support only certain dtypes; we therefore need to be explicit about the model's dtype. There is no longer an intermediate conversion into the checkpoint dtype, which could cause precision loss in situations like these: fp32 -> checkpoint dtype (fp16 or lower) -> back to dtype override (fp32), losing precision on buffers that are instantiated in fp32 and downcast to fp16. Differential Revision: D71515138
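The precision loss from that round trip is easy to reproduce in plain PyTorch. A minimal sketch (the buffer values are hypothetical illustrations, not taken from export_llama):

```python
import torch

# A buffer instantiated in fp32 (values chosen for illustration only).
buf = torch.tensor([0.1234567, 1e-5, 3.1415927], dtype=torch.float32)

# The old flow: fp32 -> checkpoint dtype (fp16) -> back to the fp32 override.
round_tripped = buf.to(torch.float16).to(torch.float32)

# The fp16 hop discards mantissa bits, so the buffer no longer matches.
print(torch.equal(buf, round_tripped))    # False
print((buf - round_tripped).abs().max())  # nonzero error
```

Converting back to fp32 cannot recover the bits dropped by the fp16 cast, which is why the refactor keeps buffers in the explicitly requested dtype rather than routing them through the checkpoint dtype.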
Force-pushed from 6be43cf to e112ad8.
Reviewed By: kimishpatel Differential Revision: D71515138
Force-pushed from e112ad8 to 62f1e9d.
Differential Revision: D71515138 Pull Request resolved: pytorch#9430