- Paper link: https://arxiv.org/abs/2210.13339

This is the official Labor sampling example for reproducing the results of the original paper with the GraphSAGE GNN model. The model can be replaced by any other model that works with NeighborSampler.
pip install requests lightning==2.0.6 ogb
Train with mini-batch sampling on the GPU for node classification on "ogbn-products":
python3 train_lightning.py --dataset=ogbn-products
Test Accuracy: 0.797
Any integer passed as the --importance-sampling=i argument runs the corresponding LABOR-i variant. --importance-sampling=-1 runs the LABOR-* variant.
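For readers unfamiliar with the variants, the following is a minimal sketch of the idea behind the simplest one (LABOR-0) as described in the paper: each candidate vertex draws a single uniform random number that is shared by all seeds, and a seed with degree d keeps a neighbor when that number is at most fanout / d. The function name `labor0_sample` and the dictionary-based graph representation are hypothetical and chosen for illustration; this is not the code that train_lightning.py or DGL runs, and it does not cover the importance-sampling iterations of LABOR-i.

```python
import random


def labor0_sample(neighbors, fanout, rng):
    """Sample neighbors for each seed using one shared random variate per vertex.

    neighbors: dict mapping each seed vertex to a list of its neighbor ids.
    fanout:    target expected number of sampled neighbors per seed.
    rng:       a random.Random instance.
    """
    # One uniform variate per candidate vertex, shared by every seed that sees it.
    variates = {}
    for nbrs in neighbors.values():
        for t in nbrs:
            if t not in variates:
                variates[t] = rng.random()

    sampled = {}
    for s, nbrs in neighbors.items():
        # Each neighbor of s is kept with probability min(1, fanout / degree).
        threshold = min(1.0, fanout / len(nbrs)) if nbrs else 0.0
        sampled[s] = [t for t in nbrs if variates[t] <= threshold]
    return sampled
```

Because the variates are shared, a vertex kept by a high-degree seed is also kept by every lower-degree seed adjacent to it, so the seeds tend to overlap in the vertices they sample.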
The vertex limit argument is used when a vertex sampling budget is needed. It adjusts the batch size at the end of every epoch so that the average number of sampled vertices converges to the provided vertex limit. It can be used to replicate the vertex sampling budget experiments in the Labor paper.
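The per-epoch adjustment described above could look roughly like the sketch below. The proportional update rule, the names `adjust_batch_size` and `simulate`, and the stand-in function for measured statistics are all assumptions made for illustration; the actual logic in train_lightning.py may differ.

```python
def adjust_batch_size(batch_size, avg_sampled_vertices, vertex_limit):
    """Rescale the batch size so the sampled-vertex average moves toward the limit."""
    if avg_sampled_vertices <= 0:
        return batch_size  # nothing was measured; keep the current value
    scaled = batch_size * vertex_limit / avg_sampled_vertices
    return max(1, round(scaled))


def simulate(vertex_limit, batch_size, epochs, sampled_fn):
    """Apply the update once per epoch.

    sampled_fn is a hypothetical stand-in for the measured average number of
    sampled vertices at a given batch size.
    """
    for _ in range(epochs):
        avg = sampled_fn(batch_size)
        batch_size = adjust_batch_size(batch_size, avg, vertex_limit)
    return batch_size
```

The number of sampled vertices usually grows sublinearly with the batch size, so a proportional update like this one needs several epochs to settle rather than reaching the budget in a single step.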
During training runs, statistics on the number of sampled vertices and edges and on cache miss rates are reported. TensorBoard can be used to view their plots during or after training:
tensorboard --logdir tb_logs
python3 train_lightning.py --dataset=ogbn-products --use-uva --cache-size=500000
python3 train_lightning.py --dataset=ogbn-products --use-uva --cache-size=500000 --batch-dependency=64
python3 train_lightning.py --dataset=ogbn-products --layer-dependency