OLLA (Optimizing the Lifetime and Location of Arrays) makes it possible to train larger deep neural networks on existing hardware. OLLA optimizes the order in which the operators of a neural network are executed to minimize peak memory usage. Furthermore, OLLA eliminates memory fragmentation to ensure that no memory is wasted.
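To illustrate why operator order matters for peak memory, here is a minimal sketch (not OLLA's actual algorithm or API; the ops, sizes, and simulator are invented for the example). Each operator allocates its output, and a tensor is freed once its last consumer in the schedule has run. Running two independent branches one at a time needs less memory than interleaving them:

```python
# Hypothetical toy graph: two branches (p, q) each produce a large
# temporary, reduce it, and feed a final op. Sizes are arbitrary units.
# Each entry: op name -> (inputs consumed, output tensor, output size).
OPS = {
    "p1": ([],                 "p_tmp", 8),
    "p2": (["p_tmp"],          "p_out", 1),
    "q1": ([],                 "q_tmp", 8),
    "q2": (["q_tmp"],          "q_out", 1),
    "f":  (["p_out", "q_out"], "out",   1),
}

def peak_memory(schedule):
    """Simulate a schedule: allocate each op's output, and free a
    tensor right after its last consumer in the schedule has run."""
    last_use = {}
    for step, op in enumerate(schedule):
        for t in OPS[op][0]:
            last_use[t] = step
    live, cur, peak = {}, 0, 0
    for step, op in enumerate(schedule):
        _, out, size = OPS[op]
        live[out] = size
        cur += size
        peak = max(peak, cur)
        for t, last in list(last_use.items()):
            if last == step:
                cur -= live.pop(t)
    return peak

print(peak_memory(["p1", "q1", "p2", "q2", "f"]))  # interleaved branches -> 17
print(peak_memory(["p1", "p2", "q1", "q2", "f"]))  # one branch at a time -> 10
```

Both schedules compute the same result, but the second keeps only one large temporary alive at a time. OLLA searches for such memory-minimizing orderings over the whole network.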
Our approach is described in detail in the OLLA arXiv paper.
The source code will be available soon.
If you use OLLA, please cite our paper:
@article{steiner2022olla,
  title={OLLA: Optimizing the Lifetime and Location of Arrays to Reduce the Memory Usage of Neural Networks},
  author={Steiner, Benoit and Elhoushi, Mostafa and Kahn, Jacob and Hegarty, James},
  doi={10.48550/arXiv.2210.12924},
  year={2022},
}