
GPU requirements for training and inference #1

Closed
treya-lin opened this issue Dec 18, 2023 · 1 comment

treya-lin commented Dec 18, 2023

Hi, it would be very helpful if GPU-related info could be added to the documentation, so that we know whether we have enough VRAM for training or inference. Thanks!

xuyaoxun (Collaborator) commented

We've trained and run inference on a 40G V100, using 16-mixed precision for training and float32 for inference. During training, LLaMA was not loaded with float32 parameters, which allowed us to reach a batch size of 16. For inference, we process one audio clip at a time; running the entire 600-sentence test set took about 10 minutes on eight 40G V100s. If you're working with a different GPU, you may want to experiment with other batch sizes and inference strategies.
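For reference, here is a minimal sketch of what that setup might look like, assuming PyTorch Lightning and Hugging Face transformers; the checkpoint path, module names, and batch size argument are placeholders rather than this repo's actual training script:

```python
import torch
import lightning as L
from transformers import LlamaForCausalLM

# Load LLaMA in half precision instead of float32 so a batch size of 16
# fits on a single 40G V100 (checkpoint path is a placeholder).
llama = LlamaForCausalLM.from_pretrained(
    "path/to/llama-checkpoint",
    torch_dtype=torch.float16,
)

# Train with 16-bit mixed precision on one GPU.
trainer = L.Trainer(
    accelerator="gpu",
    devices=1,
    precision="16-mixed",
)
# trainer.fit(lit_module, datamodule=dm)  # hypothetical LightningModule / DataModule

# For inference, the comment above suggests keeping weights in float32 and
# feeding one audio clip at a time rather than batching.
```

Adjusting `precision`, the model dtype, and the batch size is the usual way to trade VRAM for speed on smaller GPUs.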
