
Training OOM #27

Open
rexainn opened this issue Aug 15, 2024 · 5 comments


rexainn commented Aug 15, 2024

Hi, I want to know which GPU you used for training.
I am using a V100, but it keeps reporting out of memory. I have already turned off 'convert_models_to_fp32'.

Author

rexainn commented Aug 15, 2024

I noticed that in your paper you mention running the experiments on a single 3090.
Could the OOM be because I am training on my own dataset, which contains 7 tasks?

Owner

zwx8981 commented Aug 16, 2024

Maybe; try using a smaller batch size.
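(For context, a minimal sketch of where batch size usually enters a PyTorch training script; the dataset below is a dummy placeholder, not this repo's loader:)

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-in for the training set (placeholder shapes, not the repo's data).
train_dataset = TensorDataset(torch.randn(64, 3, 224, 224), torch.randn(64))

# Lowering batch_size shrinks the activations kept alive for backprop,
# which is usually the dominant memory cost during training.
train_loader = DataLoader(train_dataset, batch_size=4, shuffle=True,
                          num_workers=4, pin_memory=True)
```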

Owner

zwx8981 commented Aug 16, 2024

You may also try setting opt = 1 at line 127, which freezes the weights of the text encoder. Empirically, this does not affect the final performance very much, but it can significantly reduce the memory cost.
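(For reference, a minimal sketch of what freezing CLIP's text encoder typically looks like with the OpenAI clip package; the repo's opt flag presumably wires up something equivalent, and the checkpoint name and learning rate here are placeholders, not the project's settings:)

```python
import torch
import clip  # OpenAI CLIP package, which the project builds on

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)  # checkpoint name is a placeholder

# Freeze the text branch: transformer, token/positional embeddings,
# final LayerNorm, and the text projection. With requires_grad off, no
# gradients or optimizer state are kept for these weights, which is
# where most of the memory saving comes from.
for p in model.transformer.parameters():
    p.requires_grad_(False)
for p in model.token_embedding.parameters():
    p.requires_grad_(False)
for p in model.ln_final.parameters():
    p.requires_grad_(False)
model.positional_embedding.requires_grad_(False)
model.text_projection.requires_grad_(False)

# Optimize only the parameters that are still trainable (e.g. the visual branch).
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=5e-6)  # lr is illustrative only
```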

Author

rexainn commented Aug 16, 2024

> Maybe; try using a smaller batch size.

Setting batch_size = 1 still causes OOM, which is quite strange...

Author

rexainn commented Aug 16, 2024

> You may also try setting opt = 1 at line 127, which freezes the weights of the text encoder. Empirically, this does not affect the final performance very much, but it can significantly reduce the memory cost.

This works, thanks! Meanwhile, I will still try to find a way to train without freezing the text encoder.
