Hi, it would be very helpful if GPU-related info could be added to the documentation, so that we know whether we have enough VRAM for training or inference. Thanks!
We trained and ran inference on a 40GB V100, using 16-mixed precision for training and float32 for inference. During training, LLaMA was not loaded in float32, which let us reach a batch size of 16. For inference, we process one audio clip at a time; running the full test set of 600 sentences took about 10 minutes on eight 40GB V100s. If you're working with a different GPU, you may want to experiment with the batch size and alternative inference strategies.
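For reference, a minimal sketch of this precision setup, assuming a PyTorch Lightning training loop (the actual training script, the `LitModule` class, the checkpoint path, and the data loaders here are placeholders, not the repo's real code):

```python
import torch
import pytorch_lightning as pl

# --- Training: 16-mixed precision on a 40GB V100, batch size 16 ---
trainer = pl.Trainer(
    accelerator="gpu",
    devices=1,
    precision="16-mixed",  # fp16 compute with fp32 master weights (AMP)
    max_epochs=10,
)
# trainer.fit(LitModule(), train_dataloaders=train_loader)  # loader built with batch_size=16

# --- Inference: float32, one audio clip at a time ---
# model = LitModule.load_from_checkpoint("path/to/checkpoint.ckpt").float().eval().cuda()
# with torch.no_grad():
#     for audio in test_set:                    # e.g. 600 test sentences, processed individually
#         output = model(audio.unsqueeze(0).cuda())
```

Processing samples one at a time keeps peak inference memory low at the cost of throughput; on a GPU with more headroom you could batch the test set instead.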