This is the official implementation of the paper "Gluformer: Transformer-Based Personalized Glucose Forecasting with Uncertainty Quantification" (link).
The code is organized as follows:

- `cache`: the `visualize_*` scripts contain code for reproducing the plots from the paper.
- `gludata/data`: folder containing the data pre-processing tools; `data_loader.py` provides the PyTorch `Dataset` implementation for the data (see the usage sketch after this list).
- `gluformer`: provides our model implementation.
- `trials`: contains outputs from the experiments on the real / synthetic data sets; `trials.txt` provides the commands to run the experiments.
- `utils`: contains common tools for model training / evaluation.
- `experiment.ipynb`: provides an example, on the synthetic data, of how to train and evaluate our model.
- `model_*`: scripts for model training and evaluation, respectively, that can be run from the command line.
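As a quick orientation for the data-loading piece mentioned above, here is a hypothetical usage sketch: `data_loader.py` provides a PyTorch `Dataset`, so it would typically be consumed through a `DataLoader` as shown below. The class name `CGMDataset` and its constructor arguments are illustrative assumptions, not the repository's actual API; consult `gludata/data_loader.py` for the real interface.

```python
# Hypothetical usage sketch: the class name and constructor arguments below are
# assumptions for illustration; see gludata/data_loader.py for the actual API.
from torch.utils.data import DataLoader
from gludata.data_loader import CGMDataset  # hypothetical class name

dataset = CGMDataset(data_path="gludata/data", split="train")  # hypothetical arguments
loader = DataLoader(dataset, batch_size=32, shuffle=True)

for batch in loader:
    # Each batch holds past glucose windows and the corresponding forecast targets.
    pass
```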
Additionally, we provide `environment.yaml`, which gives a snapshot of our `conda` environment for reproducibility.
The repository provides self-contained code to reproduce the results from the paper on the glucose and synthetic data sets. Below, we outline some further instructions.
We recommend using `conda` to create a virtual environment for this project. Once you have pulled the code and installed `conda` on your system, run the following command from the root (repository) folder: `conda env create -f environment.yml`. Once the necessary packages are installed, activate the environment by running `conda activate gluformer`.
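In short, the setup from the repository root boils down to:

```
conda env create -f environment.yml
conda activate gluformer
```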
We suggest starting your exploration of our model by running it on the synthetic data. The code for this is provided in the `experiment.ipynb` notebook. The code is largely self-contained, as it gives the implementation of both the training and the evaluation loops. Additionally, the notebook contains the data-generating function for the synthetic data. All of this is done to expose the user to the inner workings of the model and to ease potential extensions to other data sets.
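For a quick orientation before opening the notebook, here is a minimal sketch (not the notebook's exact code) of how such a synthetic experiment is typically set up: generate toy traces and slice them into past-window / forecast-horizon pairs for a PyTorch `DataLoader`. The function and class names, window lengths, and form of the synthetic signal below are illustrative assumptions.

```python
# Illustrative sketch only: a toy data generator and a windowed Dataset of the
# kind a forecasting experiment needs. Names and defaults are assumptions.
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


def generate_synthetic_series(n_series=32, length=288, seed=0):
    """Toy sinusoid-plus-noise traces standing in for CGM curves."""
    rng = np.random.default_rng(seed)
    t = np.arange(length)
    series = [
        np.sin(2 * np.pi * t / 48.0 + rng.uniform(0, 2 * np.pi))
        + 0.1 * rng.standard_normal(length)
        for _ in range(n_series)
    ]
    return np.stack(series).astype("float32")


class WindowDataset(Dataset):
    """Slices each series into (past window, forecast horizon) training pairs."""

    def __init__(self, series, input_len=96, pred_len=12):
        self.samples = []
        for s in series:
            for start in range(len(s) - input_len - pred_len + 1):
                x = s[start : start + input_len]
                y = s[start + input_len : start + input_len + pred_len]
                self.samples.append((x, y))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        x, y = self.samples[idx]
        # Shape per sample: (sequence length, 1 feature).
        return torch.from_numpy(x).unsqueeze(-1), torch.from_numpy(y).unsqueeze(-1)


loader = DataLoader(WindowDataset(generate_synthetic_series()), batch_size=32, shuffle=True)
```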
For the results obtained in the paper, we use a publicly available CGM data set provided here. We recommend downloading the data set directly from the authors' repository and using our script `gludata/data/split.py` to split the data. You can visualize and process the data at the same time using the `gludata/data/view.ipynb` notebook. If you have trouble downloading or pre-processing the data, do not hesitate to raise an issue.
(Update March 20, 2023) For simplicity, we provide the data in our repository. The full data set is `UM_data.pkl`; it corresponds to merging `processed_cgm_data_train.pkl`, `processed_cgm_data_validation.pkl`, and `processed_cgm_data_test.pkl` from the source repository (a minimal loading sketch follows the license terms below). The data comes under the CC BY-NC-SA license, which implies that:
- You will not attempt re-identification.
- You will contact the University of Michigan if identifiers are detected.
- You will not redistribute or resell the data.
- Data ownership remains with the University of Michigan.
- These requirements survive changes in the ownership of the entity.
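As referenced above, a minimal loading sketch for the provided pickle file follows; the exact location of `UM_data.pkl` inside the repository is an assumption, so adjust the path to match your checkout.

```python
# Minimal sketch for loading the provided data set with the standard library.
# The path below is an assumption; point it at wherever UM_data.pkl lives.
import pickle

with open("gludata/data/UM_data.pkl", "rb") as f:
    data = pickle.load(f)

print(type(data))  # inspect the top-level structure before further processing
```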
We provide the `model_train.py` and `model_eval.py` scripts, which can be run from the command line and implement the training and evaluation loops, respectively. Both scripts are expected to be run from the root (repository) folder with all dependencies (specified in the `environment.yaml` file) installed. For an example of the parameters each script takes, see the `trials.txt` file in the `trials/` folder.
@inproceedings{sergazinov2022gluformer,
title={Gluformer: Transformer-Based Personalized Glucose Forecasting with Uncertainty Quantification},
author={Renat Sergazinov and Mohammadreza Armandpour and Irina Gaynanova},
booktitle={{arXiv}},
year={2022},
}