SpikeGPT is a lightweight generative language model with pure binary, event-driven spiking activation units. The arXiv preprint of SpikeGPT is available at arXiv:2302.13939.
If you are interested in SpikeGPT, feel free to join our Discord using this link!
This repo is inspired by RWKV-LM.

If you find yourself struggling with environment configuration, consider using the Docker image for SpikeGPT available on GitHub.
## Training

- Download the enwik8 dataset (a download sketch follows this list).
- Run `train.py`.
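If it helps, here is a minimal sketch of the download step. It assumes `train.py` reads the extracted `enwik8` file from the repo root; that assumption, and everything in the snippet, is ours rather than from this README. The URL is the standard Hutter Prize mirror:

```python
# Minimal sketch: fetch and unpack enwik8 into the current directory.
# Assumes train.py reads the raw "enwik8" file from here; adjust the
# path if your setup keeps the data elsewhere.
import os
import urllib.request
import zipfile

URL = "http://mattmahoney.net/dc/enwik8.zip"  # standard enwik8 mirror

if not os.path.exists("enwik8"):
    urllib.request.urlretrieve(URL, "enwik8.zip")
    with zipfile.ZipFile("enwik8.zip") as zf:
        zf.extract("enwik8")  # the archive contains a single member named "enwik8"
    os.remove("enwik8.zip")
```

After the file is in place, `python train.py` starts training with the settings defined in the script.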
## Inference

You can choose to run inference with either your own customized model or with our pre-trained model. Our pre-trained model is available here. It was trained on 5B tokens from OpenWebText.
- Download our pre-trained model and put it in the root directory of this repo.
- Modify the `context` variable in `run.py` to your custom prompt (see the sketch after this list).
- Run `run.py`.
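As a rough illustration of the prompt edit (the variable name `context` comes from the step above; the example string is a placeholder of ours):

```python
# Near the top of run.py: set the prompt that generation continues from.
# Replace the placeholder string with your own text.
context = "In a shocking finding, scientists discovered"
```

Then launch generation with `python run.py`.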
## Fine-Tune with NLU Tasks
- Run the corresponding script in the `NLU` folder.
- Change the path on line 17 of that script to your model path (see the sketch after this list).
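The edit might look like the following; the variable name and checkpoint filename are placeholders of ours, since the README does not specify them:

```python
# Around line 17 of the NLU script: point at your model checkpoint.
# "SpikeGPT-216M.pth" is a placeholder filename; use the file you
# actually downloaded or trained.
MODEL_PATH = "./SpikeGPT-216M.pth"
```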
## Citation

If you find SpikeGPT useful in your work, please cite the following source:
    @article{zhu2023spikegpt,
        title   = {SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks},
        author  = {Zhu, Rui-Jie and Zhao, Qihang and Eshraghian, Jason K.},
        journal = {arXiv preprint arXiv:2302.13939},
        year    = {2023}
    }