Using Hugging Face model card name in export_llama #8872

Closed
iseeyuan opened this issue Mar 2, 2025 · 0 comments · Fixed by #9538
Labels
module: llm (Issues related to LLM examples and apps, and to the extensions/llm/ code)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

iseeyuan commented Mar 2, 2025

🚀 The feature, motivation and pitch

Currently, users need to manually download Hugging Face safetensors, convert them to the llama_transformer format, and then load the converted checkpoint and config for export and inference.
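
As a concrete illustration of the first step only, here is a minimal sketch of the manual download today, assuming huggingface_hub is installed; the model card name is illustrative.

from huggingface_hub import snapshot_download

# Fetch the safetensors checkpoint and config from the Hugging Face Hub.
local_dir = snapshot_download("meta-llama/Llama-3.2-1B")  # illustrative model card

# From here the user still has to convert the safetensors to the
# llama_transformer format and point export_llama at the result.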

It would be great to download the converted checkpoints directly, cache them (so they do not have to be fetched again), and run inference, similar to what mlx does:

from mlx_lm import load, generate

# load() accepts a Hugging Face model card name and downloads the weights
# into the local cache on first use.
model, tokenizer = load("mlx-community/dolphin3.0-llama3.2-3B-4Bit")

prompt = "hello"

# Wrap the prompt in the model's chat template when one is available.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
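
Here the user only supplies the model card name: load() resolves it against the Hugging Face Hub, downloads the pre-converted weights on first use, and serves later runs from the local cache. An export_llama equivalent would accept a model card name the same way and handle the download, conversion, and caching internally.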

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

cc @mergennachin @cccclai @helunwencser @jackzhxng

jackzhxng added a commit that referenced this issue Mar 25, 2025
### Summary
If no checkpoint is specified during `export_llama`, download the
checkpoint from Hugging Face if it is an OSS model.

Closes #8872

### Test plan
Manual export
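
A minimal sketch of the behavior described in the summary, assuming the huggingface_hub API; the helper name and the model-card mapping are illustrative, not the actual implementation in #9538.

from typing import Optional

from huggingface_hub import snapshot_download

# Illustrative mapping from export_llama model names to Hugging Face
# model cards (hypothetical entries).
OSS_MODEL_CARDS = {
    "qwen2_5": "Qwen/Qwen2.5-1.5B",
}

def resolve_checkpoint(model_name: str, checkpoint: Optional[str]) -> str:
    """Return a local checkpoint path, downloading from the Hub if none is given."""
    if checkpoint is not None:
        return checkpoint
    repo_id = OSS_MODEL_CARDS[model_name]
    # snapshot_download caches under ~/.cache/huggingface, so repeated
    # exports reuse the downloaded files instead of fetching them again.
    return snapshot_download(repo_id)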