
require large memory #69

Open
aijianiula0601 opened this issue Jun 2, 2020 · 3 comments

@aijianiula0601 commented Jun 2, 2020

Thanks for your work!

My environment:
V100, 8 GPUs, 32 GB of memory per GPU.

I train with 4 GPUs using multiproc, and it takes up almost all 32 GB on each GPU. The batch_size is set to 32, not 4*32=128. Does it really need that much memory? Why not switch to DataParallel? Thank you!
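For context, here is a minimal sketch (assumptions, not mellotron's actual code) of why the same `batch_size` value loads each GPU differently under multi-process training versus `nn.DataParallel`; the numbers are the ones from this thread.

```python
# Illustrative sketch only -- not code from the mellotron repo.
batch_size = 32   # value from hparams; in the multi-process setup it applies per process/GPU
n_gpus = 4

# Multi-process (DistributedDataParallel-style) training: every process builds
# its own DataLoader with `batch_size` samples, so each GPU holds a full batch
# of 32 and the effective global batch is n_gpus * batch_size.
effective_global_batch = n_gpus * batch_size       # 128

# nn.DataParallel: a single process builds one DataLoader with `batch_size`
# samples and scatters them across the GPUs, so each GPU only sees
# batch_size / n_gpus samples per step.
per_gpu_batch_dataparallel = batch_size // n_gpus  # 8

print(effective_global_batch, per_gpu_batch_dataparallel)
```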

@CookiePPP commented Jun 2, 2020

@aijianiula0601
VRAM (GPGPUs) or System Memory (RAM)?


https://github.com/NVIDIA/mellotron/blob/master/distributed.py#L51

The multiproc is just a slightly modified version of DataParallel; memory usage should be the same as (or very close to) PyTorch's built-in version.
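For readers following the link above, here is a minimal sketch (an assumption, not the code in the linked distributed.py, which is more involved) of the gradient-averaging all-reduce pattern that a DistributedDataParallel-style wrapper performs after each backward pass.

```python
import torch
import torch.distributed as dist

# Sketch only. Assumes dist.init_process_group(...) has already been called
# in each training process.
def allreduce_gradients(model: torch.nn.Module, world_size: int) -> None:
    """Average gradients across all processes so every model replica stays in sync."""
    for param in model.parameters():
        if param.grad is not None:
            # Sum the gradient across processes, then divide to get the mean.
            dist.all_reduce(param.grad.data, op=dist.ReduceOp.SUM)
            param.grad.data /= world_size
```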

@aijianiula0601 (Author) commented:

> @aijianiula0601
> VRAM (GPGPUs) or System Memory (RAM)?
>
> https://github.com/NVIDIA/mellotron/blob/master/distributed.py#L51
>
> The multiproc is just a slightly modified version of DataParallel; memory usage should be the same as (or very close to) PyTorch's built-in version.

I changed it to DataParallel. The batch size can be set to 128, but training is slow.
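A minimal sketch (an assumption, not mellotron's train.py) of that nn.DataParallel swap: DataParallel runs in a single process and scatters each batch of 128 across the GPUs, and the per-step model replication plus scatter/gather is a likely reason it trains slower than the multi-process setup.

```python
import torch
import torch.nn as nn

# Sketch only -- a placeholder model stands in for the Tacotron2/Mellotron model.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(80, 80).to(device)

if device.type == "cuda" and torch.cuda.device_count() > 1:
    # With 4 GPUs, a global batch of 128 becomes ~32 samples per GPU per step.
    model = nn.DataParallel(model)

batch = torch.randn(128, 80, device=device)  # global batch of 128
output = model(batch)
```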

@rafaelvalle (Contributor) commented:

With our implementation, try decreasing the batch size.
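A minimal sketch of that suggestion, assuming a tacotron2-style hparams.py with a create_hparams(hparams_string) helper and a batch_size field (check the repo's hparams.py for the exact names):

```python
# Sketch only -- `create_hparams` and `batch_size` are assumed to match the
# tacotron2-style hparams.py this repo is based on.
from hparams import create_hparams

hparams = create_hparams("batch_size=16")  # halve the per-GPU batch to reduce activation memory
print(hparams.batch_size)
```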
