
require large memory #69

Open
aijianiula0601 opened this issue Jun 2, 2020 · 3 comments

@aijianiula0601 commented Jun 2, 2020

Thanks for your work!

My environment:
V100, 8 GPUs, 32 GB of memory per GPU.

I train with 4 GPUs using multiproc, and it takes up almost all 32 GB on each GPU. The batch_size is set to 32, not 4*32=128. Does it really need that much memory? Why not switch to DataParallel? Thank you!
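For context, here is a minimal sketch (assumptions, not mellotron's actual code) of why the same `batch_size` value loads each GPU differently under multi-process training versus `nn.DataParallel`; the numbers are the ones from this thread.

```python
# Illustrative sketch only -- not code from the mellotron repo.
batch_size = 32   # value from hparams; in the multi-process setup it applies per process/GPU
n_gpus = 4

# Multi-process (DistributedDataParallel-style) training: every process builds
# its own DataLoader with `batch_size` samples, so each GPU holds a full batch
# of 32 and the effective global batch is n_gpus * batch_size.
effective_global_batch = n_gpus * batch_size       # 128

# nn.DataParallel: a single process builds one DataLoader with `batch_size`
# samples and scatters them across the GPUs, so each GPU only sees
# batch_size / n_gpus samples per step.
per_gpu_batch_dataparallel = batch_size // n_gpus  # 8

print(effective_global_batch, per_gpu_batch_dataparallel)
```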

@CookiePPP commented Jun 2, 2020

@aijianiula0601
VRAM (GPGPUs) or System Memory (RAM)?


https://github.com/NVIDIA/mellotron/blob/master/distributed.py#L51

The multiproc is just a slightly modified version of DataParallel; memory usage should be the same as (or very close to) PyTorch's built-in version.
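For readers following the link above, here is a minimal sketch (an assumption, not the code in the linked distributed.py, which is more involved) of the gradient-averaging all-reduce pattern that a DistributedDataParallel-style wrapper performs after each backward pass.

```python
import torch
import torch.distributed as dist

# Sketch only. Assumes dist.init_process_group(...) has already been called
# in each training process.
def allreduce_gradients(model: torch.nn.Module, world_size: int) -> None:
    """Average gradients across all processes so every model replica stays in sync."""
    for param in model.parameters():
        if param.grad is not None:
            # Sum the gradient across processes, then divide to get the mean.
            dist.all_reduce(param.grad.data, op=dist.ReduceOp.SUM)
            param.grad.data /= world_size
```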

@aijianiula0601 (Author) commented:

> @aijianiula0601
> VRAM (GPGPUs) or System Memory (RAM)?
>
> https://github.com/NVIDIA/mellotron/blob/master/distributed.py#L51
>
> The multiproc is just a slightly modified version of DataParallel; memory usage should be the same as (or very close to) PyTorch's built-in version.

I changed it to DataParallel. The batch size can be set to 128, but training is slow.
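A minimal sketch (an assumption, not mellotron's train.py) of that nn.DataParallel swap: DataParallel runs in a single process and scatters each batch of 128 across the GPUs, and the per-step model replication plus scatter/gather is a likely reason it trains slower than the multi-process setup.

```python
import torch
import torch.nn as nn

# Sketch only -- a placeholder model stands in for the Tacotron2/Mellotron model.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(80, 80).to(device)

if device.type == "cuda" and torch.cuda.device_count() > 1:
    # With 4 GPUs, a global batch of 128 becomes ~32 samples per GPU per step.
    model = nn.DataParallel(model)

batch = torch.randn(128, 80, device=device)  # global batch of 128
output = model(batch)
```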

@rafaelvalle (Contributor) commented:

With our implementation, try decreasing the batch size.
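A minimal sketch of that suggestion, assuming a tacotron2-style hparams.py with a create_hparams(hparams_string) helper and a batch_size field (check the repo's hparams.py for the exact names):

```python
# Sketch only -- `create_hparams` and `batch_size` are assumed to match the
# tacotron2-style hparams.py this repo is based on.
from hparams import create_hparams

hparams = create_hparams("batch_size=16")  # halve the per-GPU batch to reduce activation memory
print(hparams.batch_size)
```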
