
Support for FLAN-T5 #106

Open
jihan-yin opened this issue Nov 21, 2022 · 3 comments

@jihan-yin

I saw that T5 isn't in the list of supported Hugging Face Transformers models. Are there plans, or an ETA, for when the T5 family will be added? FLAN-T5 is a very strong LLM for zero-/few-shot instruction prompting. I am currently building out a hacky implementation for hosting with DeepSpeed-Inference, but having it natively supported in DeepSpeed-MII would be ideal.

@mrwyattii (Contributor) commented Nov 21, 2022

We do support the T5 family with DeepSpeed-Inference with a custom injection policy (see this DeepSpeed unit test). However, we have not yet brought this support into MII. It's on our radar to add this in the future. We are also open to outside contributions if you would like to submit a PR!
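For readers who land here before MII support exists, the workaround described above can be sketched roughly as follows. This is a hedged sketch, not the official recipe: the injected layer names mirror the pattern used in the DeepSpeed T5 unit test mentioned above, but check that test for the exact policy, and note that `mp_size`/`dtype` values here are illustrative.

```python
"""Sketch: running a T5-family model through DeepSpeed-Inference
with a custom injection policy (the approach described above),
rather than through MII. Assumes `deepspeed` and `transformers`
are installed; layer names below are assumptions based on the
Hugging Face T5 implementation."""

# Output-projection linear layers inside each T5Block that the
# injection policy hands to DeepSpeed for kernel injection /
# tensor slicing (assumed names, per the HF T5 module layout).
T5_INJECTION_LAYERS = (
    "SelfAttention.o",      # self-attention output projection
    "EncDecAttention.o",    # cross-attention output projection
    "DenseReluDense.wo",    # feed-forward output projection
)


def build_t5_injection_policy():
    """Map the T5 transformer block class to its output-projection
    layers, in the {module_class: layer_names} form that
    deepspeed.init_inference expects."""
    from transformers.models.t5.modeling_t5 import T5Block
    return {T5Block: T5_INJECTION_LAYERS}


def load_deepspeed_t5(model_name="t5-small"):
    """Wrap a T5 model with the DeepSpeed-Inference engine using
    the custom injection policy. Imports are deferred so the
    policy above can be inspected without GPU dependencies."""
    import torch
    import deepspeed
    from transformers import T5ForConditionalGeneration

    model = T5ForConditionalGeneration.from_pretrained(model_name)
    return deepspeed.init_inference(
        model,
        mp_size=1,              # single-GPU; raise for tensor parallelism
        dtype=torch.float,      # illustrative; fp16 is common in practice
        injection_policy=build_t5_injection_policy(),
    )
```

The returned engine exposes the wrapped module, so the usual `generate()` calls work on `engine.module` as they would on the plain Hugging Face model.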

@jeffra (Contributor) commented Nov 21, 2022

Also keep an eye on this PR, it’s currently a work in progress for better T5 support: microsoft/DeepSpeed#2451

@mhillebrand

Assuming that PR gets merged, would it also support LongT5?
