Layer-Neighbor Sampling -- Defusing Neighborhood Explosion in GNNs

  • Paper link: https://arxiv.org/abs/2210.13339

This is the official Labor sampling example. It reproduces the results of the original paper using the GraphSAGE GNN model. GraphSAGE can be replaced with any other model that works with NeighborSampler.

Requirements

pip install requests lightning==2.0.6 ogb

How to run

Minibatch training for node classification

Train with mini-batch sampling on the GPU for node classification on "ogbn-products":

python3 train_lightning.py --dataset=ogbn-products

Results:

Test Accuracy: 0.797

Passing a non-negative integer i as --importance-sampling=i runs the corresponding LABOR-i variant. Passing --importance-sampling=-1 runs the LABOR-* variant.
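For intuition about what LABOR-0 (--importance-sampling=0) does, here is a minimal pure-Python sketch of the idea as we understand it from the paper: every candidate vertex draws one uniform random variate that is shared by all seed vertices, and a seed keeps a neighbor when that variate falls below fanout / degree. The function name `labor0_sample` and the dictionary-based graph are illustrative assumptions and are not taken from train_lightning.py. The importance-sampling iterations of LABOR-i and LABOR-* are not shown.

```python
import random


def labor0_sample(in_neighbors, seeds, fanout, rng):
    """Illustrative LABOR-0 style sampling (hypothetical helper, not the DGL API).

    in_neighbors maps each seed vertex to the list of its in-neighbors.
    Returns a dict mapping each seed to the neighbors that were kept.
    """
    variates = {}  # one uniform variate per candidate vertex, shared by all seeds
    sampled = {}
    for s in seeds:
        nbrs = in_neighbors.get(s, [])
        # Keep probability is fanout / degree, capped at 1 for low-degree seeds.
        threshold = min(1.0, fanout / len(nbrs)) if nbrs else 0.0
        kept = []
        for t in nbrs:
            if t not in variates:
                variates[t] = rng.random()
            if variates[t] <= threshold:
                kept.append(t)
        sampled[s] = kept
    return sampled


if __name__ == "__main__":
    graph = {0: list(range(1, 11)), 1: [2, 3, 4, 5]}
    print(labor0_sample(graph, seeds=[0, 1], fanout=2, rng=random.Random(0)))
```

Because the variates are shared, a vertex kept by a high-degree seed is also kept by every lower-degree seed adjacent to it, so the sampled neighborhoods overlap and fewer distinct vertices are needed. Each seed keeps `fanout` neighbors in expectation when its degree exceeds `fanout`; the exact count varies from draw to draw.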

The --vertex-limit argument sets a vertex sampling budget. It adjusts the batch size at the end of every epoch so that the average number of sampled vertices converges to the given limit. This can be used to replicate the vertex sampling budget experiments in the Labor paper.
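The end-of-epoch batch size adjustment can be pictured with the sketch below. It assumes a simple proportional update, which may differ from the rule actually used in train_lightning.py; the function name `adjust_batch_size` and its parameters are hypothetical.

```python
def adjust_batch_size(batch_size, avg_sampled_vertices, vertex_limit, min_batch_size=1):
    """Scale the batch size so the average sampled vertex count moves toward vertex_limit.

    Assumed proportional rule for illustration only; not taken from the example script.
    """
    if avg_sampled_vertices <= 0:
        # No statistics collected yet, so leave the batch size unchanged.
        return batch_size
    scaled = batch_size * vertex_limit / avg_sampled_vertices
    return max(min_batch_size, int(round(scaled)))


if __name__ == "__main__":
    # An epoch averaged 200,000 sampled vertices against a budget of 100,000.
    print(adjust_batch_size(1000, 200_000, 100_000))
```

The number of sampled vertices generally does not grow linearly with the batch size, so a single update will not hit the budget exactly. Repeating the update after every epoch is what lets the average approach the limit.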

During training, statistics such as the number of sampled vertices and edges and the cache miss rates are reported. They can be plotted with TensorBoard during or after training:

tensorboard --logdir tb_logs

Utilize a GPU feature cache for UVA training

python3 train_lightning.py --dataset=ogbn-products --use-uva --cache-size=500000

Reduce GPU feature cache miss rate for UVA training

python3 train_lightning.py --dataset=ogbn-products --use-uva --cache-size=500000 --batch-dependency=64

Force all layers to share the same neighborhood for shared vertices

python3 train_lightning.py --dataset=ogbn-products --layer-dependency
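One way to picture what sharing a neighborhood across layers means is the sketch below. It assumes the sampler decides which neighbors to keep by comparing a per-vertex random variate against fanout / degree, and that layer dependency amounts to reusing the same variates in every layer. The helpers `variate` and `sample_layers` are hypothetical and are not part of the example script or of DGL.

```python
import random


def variate(vertex, seed):
    # Deterministic uniform variate in [0, 1) for a (seed, vertex) pair.
    return random.Random(seed * 1_000_003 + vertex).random()


def sample_layers(in_neighbors, seeds, fanouts, base_seed, layer_dependency):
    """Sample one edge list per layer; returns a list of [(src, dst), ...] lists."""
    frontier = list(seeds)
    layers = []
    for depth, fanout in enumerate(fanouts):
        # With layer dependency, every layer reuses the same variates.
        seed = base_seed if layer_dependency else base_seed + depth + 1
        edges = []
        for s in frontier:
            nbrs = in_neighbors.get(s, [])
            threshold = min(1.0, fanout / len(nbrs)) if nbrs else 0.0
            edges.extend((t, s) for t in nbrs if variate(t, seed) <= threshold)
        layers.append(edges)
        # The next layer needs the current vertices plus their sampled neighbors.
        frontier = sorted(set(frontier) | {t for t, _ in edges})
    return layers


if __name__ == "__main__":
    graph = {v: [u for u in range(20) if u != v] for v in range(20)}
    first, second = sample_layers(graph, [0, 1, 2], [3, 3], base_seed=7, layer_dependency=True)
    print(sorted(t for t, s in first if s == 0))
    print(sorted(t for t, s in second if s == 0))
```

With `layer_dependency=True` and equal fanouts, a vertex that appears in several layers keeps the same sampled neighbors in each of them, so deeper layers introduce fewer new vertices. With `layer_dependency=False`, each layer draws fresh variates and the neighborhoods of a shared vertex will usually differ.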