Using Hugging Face model card name in export_llama #8872
Labels
module: llm
Issues related to LLM examples and apps, and to the extensions/llm/ code
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🚀 The feature, motivation and pitch
Currently, users need to manually download the Hugging Face safetensors, convert them to the llama_transformer format, and load the checkpoint and config for export and inference.
It would be great to directly download and cache the converted checkpoints (so they don't have to be converted again) and run inference, similar to what mlx does.
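A minimal sketch of the requested flow, assuming a hypothetical `convert_to_llama_transformer` helper (the real conversion would pull the safetensors from the Hugging Face Hub); the cache check makes the conversion run only on first use:

```python
import hashlib
from pathlib import Path

# Hypothetical cache location; the real tool would pick its own directory.
CACHE_ROOT = Path.home() / ".cache" / "export_llama"


def convert_to_llama_transformer(model_id: str, out_dir: Path) -> None:
    """Placeholder for downloading safetensors and converting them to
    the llama_transformer checkpoint format."""
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "checkpoint.pt").write_bytes(b"converted weights for " + model_id.encode())


def get_converted_checkpoint(model_id: str) -> Path:
    """Resolve a Hugging Face model card name (e.g. "meta-llama/Llama-3.2-1B")
    to a locally cached, already-converted checkpoint."""
    key = hashlib.sha256(model_id.encode()).hexdigest()[:16]
    checkpoint = CACHE_ROOT / key / "checkpoint.pt"
    if not checkpoint.exists():  # cache miss: download + convert exactly once
        convert_to_llama_transformer(model_id, checkpoint.parent)
    return checkpoint
```

With this shape, `export_llama` could accept the model card name directly and resolve it through the cache, so repeated exports skip the download and conversion steps.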
Alternatives
No response
Additional context
No response
RFC (Optional)
No response
cc @mergennachin @cccclai @helunwencser @jackzhxng