Add support for Gemma 3 (text) #1229

Merged: 2 commits merged into main from new-model on Mar 25, 2025
Conversation

@xenova (Collaborator) commented Mar 13, 2025:

Currently only the 1B model has been converted, but I'll make conversions for the rest soon!

Example usage:

import { pipeline } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "onnx-community/gemma-3-1b-it-ONNX",
  { dtype: "q4" },
);

// Define the list of messages
const messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Write me a poem about Machine Learning." },
];

// Generate a response
const output = await generator(messages, { max_new_tokens: 512, do_sample: false });
console.log(output[0].generated_text.at(-1).content);
Example output:
Okay, here's a poem about Machine Learning, aiming to capture its essence and potential:

**The Learning Algorithm**

A silent hum, a coded grace,
A network grows, a steady pace.
Machine Learning, swift and bright,
Unraveling data, day and night.

It learns from patterns, subtle, deep,
Where errors hide, secrets sleep.
With data flowing, vast and wide,
It builds a model, side by side.

Predicting trends, forecasting near,
Recognizing faces, calming fear.
Classifying images, text so true,
Discovering insights, shining new.

From spam filters, swift and keen,
To recommending what you’ve seen,
It learns your habits, day by day,
Improving swiftly, come what may.

But caution whispers, a gentle plea,
“Control the bias, let it be free.”
For ethics guide, a crucial art,
To use this power, play a vital part.

So let the learning algorithm flow,
Expanding knowledge, watch it grow.
A powerful tool, a future bright,
Machine Learning, shining light.

---

Would you like me to tweak this poem in any way? For example, would you like me to:

*   Focus on a specific application (e.g., image recognition)?
*   Adjust the tone (e.g., more optimistic, more cautionary)?

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@TimPietrusky mentioned this pull request on Mar 16, 2025
@Glavin001 commented:

Anything we can do to help this PR get merged? 🙂

@maciejwolski commented:

I am waiting for this too

@aribornstein commented:

Me as well

@xenova (Collaborator, Author) commented Mar 24, 2025:

The model works well in Node.js, but in some browsers we were running into issues due to the large embedding layer, so we're working on optimizations to make it run well on WebGPU.

If anyone would like to build and test this PR locally, that would help a ton!
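
If it helps, here is a minimal sketch of what an in-browser test might look like. It assumes the same model id as the example above and uses the standard device/dtype pipeline options; the specific device and dtype values below are just examples to exercise, not required settings.

import { pipeline } from "@huggingface/transformers";

// Load the Gemma 3 1B export on WebGPU (switch to "wasm" to compare backends)
const generator = await pipeline(
  "text-generation",
  "onnx-community/gemma-3-1b-it-ONNX",
  {
    device: "webgpu", // or "wasm"
    dtype: "q4",      // a smaller dtype reduces memory pressure from the large embedding layer
  },
);

// Generate a short response to confirm the model loads and runs
const output = await generator(
  [{ role: "user", content: "Hello!" }],
  { max_new_tokens: 32 },
);
console.log(output[0].generated_text.at(-1).content);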

@dariuszbasiak commented:

I'm trying to run it in Chrome on macOS but I'm getting an error: ERROR 3304823240.
This number seems to be the Content-Length of a buffer that fails to be created.

@xenova (Collaborator, Author) commented Mar 25, 2025:

cc @guschmue. Maybe model builder will help fix that? Any updates on that?

@xenova changed the title from "Add support for Gemma 3" to "Add support for Gemma 3 (text)" on Mar 25, 2025
@xenova (Collaborator, Author) commented Mar 25, 2025:

Since the model works correctly in Node.js (and the only remaining issue is browser support due to the large embedding size), I'll merge this PR and update the weights once microsoft/onnxruntime-genai#1329 is ready.

Usage should not change once a newer export is created.

@xenova merged commit 06a84b5 into main on Mar 25, 2025 (3 of 4 checks passed)
@xenova deleted the new-model branch on Mar 25, 2025 at 21:37
@dariuszbasiak commented:

In case it's helpful for anyone: I ran q8 Gemma 3 in the browser. It's slow, but it worked ;)

@xenova (Collaborator, Author) commented Mar 26, 2025:

> In case it's helpful for anyone: I ran q8 Gemma 3 in the browser. It's slow, but it worked ;)

Great to hear! Is that on WebGPU or WASM? Does q4/q4f16 work for you?

@dariuszbasiak commented:

I ran q8 a few times with the example code, but after the third run it started generating random text. I tried other dtypes and only this one worked. It was really slow, though: it took ~4 minutes to initialize and generate something.
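
For anyone comparing, here is a rough sketch (not part of this PR) that tries each of the dtypes mentioned in this thread and logs which ones load and generate. The prompt and token count are arbitrary placeholders.

import { pipeline } from "@huggingface/transformers";

// Try the dtypes discussed in this thread one at a time
for (const dtype of ["q4", "q4f16", "q8"]) {
  try {
    const generator = await pipeline(
      "text-generation",
      "onnx-community/gemma-3-1b-it-ONNX",
      { dtype },
    );
    const output = await generator(
      [{ role: "user", content: "Say hi." }],
      { max_new_tokens: 8 },
    );
    console.log(dtype, "OK:", output[0].generated_text.at(-1).content);
  } catch (err) {
    console.error(dtype, "failed:", err);
  }
  // In a longer-running test you would likely free the previous pipeline before loading the next
}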

@guschmue (Contributor) commented:

> cc @guschmue. Maybe model builder will help fix that? Any updates on that?

The Model Builder folks are working on it, starting with gemma-3-1b-it. With their latest changes it mostly works for me; there is one more issue that needs to be fixed, IMO.
