forked from wouterkool/attention-learn-to-route
Commit f563e33 (1 parent: c9d6fbd): Updated code with major refactoring and included additional problems in ICLR 2019 paper.
Showing 111 changed files with 8,314 additions and 440 deletions.

.gitignore
@@ -1,4 +1,11 @@
data/
outputs/
logs/
results/
__pycache__/
.idea
*/.ipynb_checkpoints
*.tc.*
*.tc_backward.*
*.log
*.bak

README.md
@@ -1,54 +1,113 @@
# Attention, Learn to Solve Routing Problems!

Attention-based model for learning to solve the Travelling Salesman Problem (TSP), the Vehicle Routing Problem (VRP), the Orienteering Problem (OP) and the (Stochastic) Prize Collecting TSP (PCTSP). Training uses REINFORCE with a greedy rollout baseline.

## Paper
For more details, please see our paper [Attention, Learn to Solve Routing Problems!](https://openreview.net/forum?id=ByxBFsRqYm), which has been accepted at [ICLR 2019](https://iclr.cc/Conferences/2019). If this code is useful for your work, please cite our paper:

```
@inproceedings{
    kool2018attention,
    title={Attention, Learn to Solve Routing Problems!},
    author={Wouter Kool and Herke van Hoof and Max Welling},
    booktitle={International Conference on Learning Representations},
    year={2019},
    url={https://openreview.net/forum?id=ByxBFsRqYm},
}
```

## Dependencies

* Python>=3.5
* NumPy
* SciPy
* [PyTorch](http://pytorch.org/)=0.4
* tqdm
* [tensorboard_logger](https://github.com/TeamHG-Memex/tensorboard_logger)
* Matplotlib (optional, only for plotting)

## Quick start

For training TSP instances with 20 nodes and using rollout as REINFORCE baseline:
```bash
python run.py --graph_size 20 --baseline rollout --run_name 'tsp20_rollout'
```

## Usage

### Generating data

Training data is generated on the fly. To generate validation and test data (same as used in the paper) for all problems:
```bash
python generate_data.py --problem all --name validation --seed 4321
python generate_data.py --problem all --name test --seed 1234
```
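
The generated files are written to the `data` directory; for example, the TSP validation set above is saved as `data/tsp/tsp20_validation_seed4321.pkl`, which is the file used with `--val_dataset` in the training command below.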

### Training

For training TSP instances with 20 nodes, using rollout as the REINFORCE baseline and the generated validation set:
```bash
python run.py --graph_size 20 --baseline rollout --run_name 'tsp20_rollout' --val_dataset data/tsp/tsp20_validation_seed4321.pkl
```

#### Multiple GPUs
By default, training will happen *on all available GPUs*. To disable CUDA entirely, add the flag `--no_cuda`.
Set the environment variable `CUDA_VISIBLE_DEVICES` to only use specific GPUs:
```bash
CUDA_VISIBLE_DEVICES=2,3 python run.py
```
Note that using multiple GPUs has limited efficiency for small problem sizes (up to 50 nodes).

#### Warm start
You can initialize a run using a pretrained model with the `--load_path` option:
```bash
python run.py --graph_size 100 --load_path pretrained/tsp_100/epoch-99.pt
```

The `--load_path` option can also be used to load an earlier run, in which case the optimizer state will also be loaded:
```bash
python run.py --graph_size 20 --load_path 'outputs/tsp_20/tsp20_rollout_{datetime}/epoch-0.pt'
```

The `--resume` option can be used instead of `--load_path` to resume a run: in addition to the model it loads the baseline state, sets the current epoch/step counters and restores the random number generator state.
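
For example, a minimal sketch of resuming the run above (the `{datetime}` placeholder stands for the run's actual timestamp):
```bash
python run.py --graph_size 20 --resume 'outputs/tsp_20/tsp20_rollout_{datetime}/epoch-0.pt'
```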

### Evaluation
To evaluate a model, you can add the `--eval_only` flag to `run.py`, or use `eval.py`, which will additionally measure timing and save the results:
```bash
python eval.py data/tsp/tsp20_test_seed1234.pkl --model pretrained/tsp_20 --decode_strategy greedy
```
If no epoch is specified, the last one found in the model folder will be used by default.

#### Sampling
To report the best of 1280 sampled solutions, use:
```bash
python eval.py data/tsp/tsp20_test_seed1234.pkl --model pretrained/tsp_20 --decode_strategy sample --width 1280 --eval_batch_size 1
```
Beam search (not in the paper) was also added recently and can be used with `--decode_strategy bs --width {beam_size}`.
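
For example, an illustrative command (the beam size of 128 is an arbitrary choice):
```bash
python eval.py data/tsp/tsp20_test_seed1234.pkl --model pretrained/tsp_20 --decode_strategy bs --width 128
```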

#### To run baselines
Baselines for different problems are within the corresponding folders and can be run (on multiple datasets at once) as follows:
```bash
python -m problems.tsp.tsp_baseline farthest_insertion data/tsp/tsp20_test_seed1234.pkl data/tsp/tsp50_test_seed1234.pkl data/tsp/tsp100_test_seed1234.pkl
```
To run baselines, you need to install [Compass](https://github.com/bcamath-ds/compass) by running the `install_compass.sh` script from within the `problems/op` directory and [Concorde](http://www.math.uwaterloo.ca/tsp/concorde.html) using the `install_concorde.sh` script from within `problems/tsp`. [LKH3](http://akira.ruc.dk/~keld/research/LKH-3/) should be automatically downloaded and installed when required. To use [Gurobi](http://www.gurobi.com), obtain a ([free academic](http://www.gurobi.com/registration/academic-license-reg)) license and follow the [installation instructions](https://www.gurobi.com/documentation/8.1/quickstart_windows/installing_the_anaconda_py.html).

### Other options and help
```bash
python run.py -h
python eval.py -h
```

### Example CVRP solution
See `plot_vrp.ipynb` for an example of loading a pretrained model and plotting the result for Capacitated VRP with 100 nodes.

![CVRP100](images/cvrp_0.png)

## Acknowledgements
Thanks to [pemami4911/neural-combinatorial-rl-pytorch](https://github.com/pemami4911/neural-combinatorial-rl-pytorch) for getting me started with the code for the Pointer Network.

This repository includes adaptations of the following repositories as baselines:
* https://github.com/MichelDeudon/encode-attend-navigate
* https://github.com/mc-ride/orienteering
* https://github.com/jordanamecler/PCTSP
* https://github.com/rafael2reis/salesman

environment.yml
@@ -0,0 +1,13 @@
# run: conda env create --file environment.yml
name: attention_tsp
channels:
  - pytorch
dependencies:
  - python>=3.6
  - anaconda
  - tqdm
  - pytorch
  - torchvision
  - cuda91
  - pip:
    - tensorboard_logger
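
To create and activate the environment (a standard conda workflow; the environment name `attention_tsp` comes from the file above):
```bash
conda env create --file environment.yml
conda activate attention_tsp
```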

eval.py
@@ -0,0 +1,216 @@
import math
import torch
import os
import argparse
import numpy as np
import itertools
from tqdm import tqdm
from utils import load_model, move_to
from utils.data_utils import save_dataset
from torch.utils.data import DataLoader
import time
from datetime import timedelta
from utils.functions import parse_softmax_temperature

mp = torch.multiprocessing.get_context('spawn')


def get_best(sequences, cost, ids=None, batch_size=None):
    """
    Ids contains [0, 0, 0, 1, 1, 2, ..., n, n, n] if 3 solutions found for 0th instance, 2 for 1st, etc
    :param sequences: array of candidate solution sequences
    :param cost: array with the cost of each candidate solution
    :param ids: instance id for each candidate solution
    :return: list with n sequences and list with n costs of solutions
    """
    if ids is None:
        idx = cost.argmin()
        return sequences[idx:idx+1, ...], cost[idx:idx+1, ...]

    splits = np.hstack([0, np.where(ids[:-1] != ids[1:])[0] + 1])
    mincosts = np.minimum.reduceat(cost, splits)

    group_lengths = np.diff(np.hstack([splits, len(ids)]))
    all_argmin = np.flatnonzero(np.repeat(mincosts, group_lengths) == cost)
    result = np.full(len(group_lengths) if batch_size is None else batch_size, -1, dtype=int)

    result[ids[all_argmin[::-1]]] = all_argmin[::-1]

    return [sequences[i] if i >= 0 else None for i in result], [cost[i] if i >= 0 else math.inf for i in result]
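
# Worked example: for ids = [0, 0, 0, 1, 1] and cost = [5., 3., 4., 2., 6.],
# splits = [0, 3] (one group start per instance) and mincosts = [3., 2.];
# the per-instance minima sit at indices [1, 3], so get_best returns
# [sequences[1], sequences[3]] with costs [3., 2.].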


def eval_dataset_mp(args):
    (dataset_path, width, softmax_temp, opts, i, num_processes) = args

    model, _ = load_model(opts.model)
    val_size = opts.val_size // num_processes
    dataset = model.problem.make_dataset(filename=dataset_path, num_samples=val_size, offset=opts.offset + val_size * i)
    device = torch.device("cuda:{}".format(i))

    return _eval_dataset(model, dataset, width, softmax_temp, opts, device)


def eval_dataset(dataset_path, width, softmax_temp, opts):
    # Even with multiprocessing, we load the model here since it contains the name where to write results
    model, _ = load_model(opts.model)
    use_cuda = torch.cuda.is_available() and not opts.no_cuda
    if opts.multiprocessing:
        assert use_cuda, "Can only do multiprocessing with cuda"
        num_processes = torch.cuda.device_count()
        assert opts.val_size % num_processes == 0

        with mp.Pool(num_processes) as pool:
            results = list(itertools.chain.from_iterable(pool.map(
                eval_dataset_mp,
                [(dataset_path, width, softmax_temp, opts, i, num_processes) for i in range(num_processes)]
            )))

    else:
        device = torch.device("cuda:0" if use_cuda else "cpu")
        dataset = model.problem.make_dataset(filename=dataset_path, num_samples=opts.val_size, offset=opts.offset)
        results = _eval_dataset(model, dataset, width, softmax_temp, opts, device)

    # This is the parallelism; durations are reported as if we did not use multiprocessing (e.g. on 1 GPU)
    parallelism = opts.eval_batch_size

    costs, tours, durations = zip(*results)  # Not really costs since they should be negative

    print("Average cost: {} +- {}".format(np.mean(costs), 2 * np.std(costs) / np.sqrt(len(costs))))
    print("Average serial duration: {} +- {}".format(
        np.mean(durations), 2 * np.std(durations) / np.sqrt(len(durations))))
    print("Average parallel duration: {}".format(np.mean(durations) / parallelism))
    print("Calculated total duration: {}".format(timedelta(seconds=int(np.sum(durations) / parallelism))))

    dataset_basename, ext = os.path.splitext(os.path.split(dataset_path)[-1])
    model_name = "_".join(os.path.normpath(os.path.splitext(opts.model)[0]).split(os.sep)[-2:])
    if opts.o is None:
        results_dir = os.path.join(opts.results_dir, model.problem.NAME, dataset_basename)
        os.makedirs(results_dir, exist_ok=True)

        out_file = os.path.join(results_dir, "{}-{}-{}{}-t{}-{}-{}{}".format(
            dataset_basename, model_name,
            opts.decode_strategy,
            width if opts.decode_strategy != 'greedy' else '',
            softmax_temp, opts.offset, opts.offset + len(costs), ext
        ))
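        # Illustrative example (values depend on the actual options): evaluating
        # --model pretrained/tsp_20 with greedy decoding on data/tsp/tsp20_test_seed1234.pkl
        # (offset 0, 10000 instances, softmax temperature 1) would write to
        # results/tsp/tsp20_test_seed1234/tsp20_test_seed1234-pretrained_tsp_20-greedy-t1-0-10000.pkl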
    else:
        out_file = opts.o

    assert opts.f or not os.path.isfile(
        out_file), "File already exists! Try running with -f option to overwrite."

    save_dataset((results, parallelism), out_file)

    return costs, tours, durations


def _eval_dataset(model, dataset, width, softmax_temp, opts, device):

    model.to(device)
    model.eval()

    model.set_decode_type(
        "greedy" if opts.decode_strategy in ('bs', 'greedy') else "sampling",
        temp=softmax_temp)

    dataloader = DataLoader(dataset, batch_size=opts.eval_batch_size)

    results = []
    for batch in tqdm(dataloader, disable=opts.no_progress_bar):
        batch = move_to(batch, device)

        start = time.time()
        with torch.no_grad():
            if opts.decode_strategy in ('sample', 'greedy'):
                if opts.decode_strategy == 'greedy':
                    assert width == 0, "Do not set width when using greedy"
                    assert opts.eval_batch_size <= opts.max_calc_batch_size, \
                        "eval_batch_size should be smaller than calc batch size"
                    batch_rep = 1
                    iter_rep = 1
                elif width * opts.eval_batch_size > opts.max_calc_batch_size:
                    assert opts.eval_batch_size == 1
                    assert width % opts.max_calc_batch_size == 0
                    batch_rep = opts.max_calc_batch_size
                    iter_rep = width // opts.max_calc_batch_size
                else:
                    batch_rep = width
                    iter_rep = 1
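                # For sampling, width solutions per instance are drawn in total:
                # all at once (batch_rep = width) when they fit within
                # max_calc_batch_size, otherwise (requiring eval_batch_size == 1)
                # in iter_rep chunks of max_calc_batch_size samples each.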
                assert batch_rep > 0
                # This returns results with shape (batch_size, iter_rep)
                sequences, costs = model.sample_many(batch, batch_rep=batch_rep, iter_rep=iter_rep)
                batch_size = len(costs)
                ids = torch.arange(batch_size, dtype=torch.int64, device=costs.device)
            else:
                assert opts.decode_strategy == 'bs'

                cum_log_p, sequences, costs, ids, batch_size = model.beam_search(
                    batch, beam_size=width,
                    compress_mask=opts.compress_mask,
                    max_calc_batch_size=opts.max_calc_batch_size
                )

        if sequences is None:
            sequences = [None] * batch_size
            costs = [math.inf] * batch_size
        else:
            sequences, costs = get_best(
                sequences.cpu().numpy(), costs.cpu().numpy(),
                ids.cpu().numpy() if ids is not None else None,
                batch_size
            )
        duration = time.time() - start
        for seq, cost in zip(sequences, costs):
            if model.problem.NAME == "tsp":
                seq = seq.tolist()  # No need to trim as all are same length
            elif model.problem.NAME in ("cvrp", "sdvrp"):
                seq = np.trim_zeros(seq).tolist() + [0]  # Add depot
            elif model.problem.NAME in ("op", "pctsp"):
                seq = np.trim_zeros(seq)  # We have the convention to exclude the depot
            else:
                assert False, "Unknown problem: {}".format(model.problem.NAME)
            # Note VRP only
            results.append((cost, seq, duration))

    return results


if __name__ == "__main__":

    parser = argparse.ArgumentParser()
    parser.add_argument("datasets", nargs='+', help="Filename of the dataset(s) to evaluate")
    parser.add_argument("-f", action='store_true', help="Set true to overwrite")
    parser.add_argument("-o", default=None, help="Name of the results file to write")
    parser.add_argument('--val_size', type=int, default=10000,
                        help='Number of instances used for reporting validation performance')
    parser.add_argument('--offset', type=int, default=0,
                        help='Offset where to start in dataset (default 0)')
    parser.add_argument('--eval_batch_size', type=int, default=1024,
                        help="Batch size to use during (baseline) evaluation")
    # parser.add_argument('--decode_type', type=str, default='greedy',
    #                     help='Decode type, greedy or sampling')
    parser.add_argument('--width', type=int, nargs='+',
                        help='Sizes of beam to use for beam search (or number of samples for sampling), '
                             '0 to disable (default), -1 for infinite')
    parser.add_argument('--decode_strategy', type=str,
                        help='Beam search (bs), Sampling (sample) or Greedy (greedy)')
    parser.add_argument('--softmax_temperature', type=parse_softmax_temperature, default=1,
                        help="Softmax temperature (sampling or bs)")
    parser.add_argument('--model', type=str)
    parser.add_argument('--no_cuda', action='store_true', help='Disable CUDA')
    parser.add_argument('--no_progress_bar', action='store_true', help='Disable progress bar')
    parser.add_argument('--compress_mask', action='store_true', help='Compress mask into long')
    parser.add_argument('--max_calc_batch_size', type=int, default=10000, help='Size for subbatches')
    parser.add_argument('--results_dir', default='results', help="Name of results directory")
    parser.add_argument('--multiprocessing', action='store_true',
                        help='Use multiprocessing to parallelize over multiple GPUs')

    opts = parser.parse_args()

    assert opts.o is None or (len(opts.datasets) == 1 and len(opts.width) <= 1), \
        "Cannot specify result filename with more than one dataset or more than one width"

    widths = opts.width if opts.width is not None else [0]

    for width in widths:
        for dataset_path in opts.datasets:
            eval_dataset(dataset_path, width, opts.softmax_temperature, opts)