We introduce CycleFormer, a novel transformer model tailored specifically to the Traveling Salesman Problem (TSP). Unlike the fixed vocabularies of natural language processing, for which conventional transformers were designed, TSP operates on a dynamic and effectively unlimited node set. To address this, we integrate several TSP-specific elements into CycleFormer to harness the full potential of the transformer architecture.
CycleFormer strengthens the encoder-decoder interaction by connecting the encoder output directly to the decoder's linear output layer, thereby facilitating direct information flow. We also introduce positional encodings suited to TSP's two-dimensional, cyclic structure: encoder tokens carry a positional encoding derived from their spatial coordinates, while decoder tokens use a circular positional encoding that captures the cyclic nature of a tour.
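To make the circular positional encoding concrete, below is a minimal PyTorch sketch of one way such an encoding can be constructed. The particular formula (mapping tour position t of an N-stop tour to the angle 2*pi*t/N and using integer-harmonic sine/cosine channels) is our illustrative assumption, not necessarily the exact formula used in CycleFormer.

import math
import torch

def circular_positional_encoding(n_positions: int, d_model: int) -> torch.Tensor:
    # Assumes an even d_model (e.g., 128 as in the training command below).
    # Map tour position t to an angle; one full revolution over the tour,
    # so position n_positions wraps back onto position 0.
    t = torch.arange(n_positions, dtype=torch.float32).unsqueeze(1)  # (N, 1)
    theta = 2 * math.pi * t / n_positions
    # Integer harmonics keep every channel periodic with period N.
    k = torch.arange(1, d_model // 2 + 1, dtype=torch.float32)       # (d_model/2,)
    pe = torch.empty(n_positions, d_model)
    pe[:, 0::2] = torch.sin(theta * k)  # broadcasts to (N, d_model/2)
    pe[:, 1::2] = torch.cos(theta * k)
    return pe

Under any such construction, the first and last positions of a tour are embedded as neighbors, mirroring the fact that a TSP tour returns to its starting node.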
Experimental results demonstrate that CycleFormer outperforms state-of-the-art transformer models across problem sizes from TSP-50 to TSP-500. Notably, on TSP-500, CycleFormer reduces the optimality gap of the previous state of the art from 3.09% to 1.10%.
Results on large-scale Random TSP.
(Results marked with an asterisk (*) are sourced directly from Sun and Yang \cite{Sun2023}. DACT \cite{Ma2021}, which incorporates extra heuristics such as 2-opt, was excluded from the comparison.)
- CUDA 12.1
- Python 3.10.12
- Torch 2.2.0+cu121
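If you are not using Docker, one way to install a matching PyTorch build is via PyTorch's CUDA 12.1 wheel index (this command is our suggestion, not taken from the repository):

pip install torch==2.2.0 --index-url https://download.pytorch.org/whl/cu121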
You can download the pretrained checkpoints from the following Dropbox link: Pretrained Checkpoints
You can download the Train/Test datasets (TSP-50, 100, 500, 1000, 10000) from the following Dropbox link: Train/Test datasets
docker build -t cycleformer-image .
docker run --ipc=host --gpus all -it --rm --name cycleformer_container cycleformer-image

Training
python3 train.py \
--seed 1 \
--train_data_path ./tsp{node_size}_train_concorde.txt \
--val_data_path ./tsp{node_size}_test_concorde.txt \
--node_size {node_size} \
--train_batch_size 80 \
--val_batch_size 1280 \
--test_batch_size 1280 \
--G 1 \
--resume_checkpoint None \
--gpus 3 \
--max_epochs 100 \
--enc_num_layers 6 \
--dec_num_layers 6 \
--d_model 128 \
--d_ff 512 \
--h 8 \
--dropout 0.1 \
--smoothing 0.1 \
--lr 0.5 \
--betas 0.9 0.98 \
--eps 1e-9 \
--factor 1.0 \
--warmup 400 \
--encoder_pe 2D \
--decoder_pe "circular PE" \
--decoder_lut memory \
--comparison_matrix memory
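Several flags mirror the design described above: --encoder_pe 2D selects the coordinate-based positional encoding for encoder tokens, and --decoder_pe "circular PE" the cyclic encoding for decoder tokens. Judging by their names, --decoder_lut memory and --comparison_matrix memory tie the decoder's embedding lookup and output comparison to the encoder output (the "memory"), i.e., the direct encoder-decoder connection.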
Greedy decoding
python inference.py \
--seed 1 \
--test_data_path ./tsp{node_size}_test_concorde.txt \
--test_batch_size 1 \
--G 1 \
--checkpoint_path ./Cycleformer_{node_size}.pth \
--gpus 0
Multi-start decoding
python inference.py \
--seed 1 \
--test_data_path ./tsp{node_size}_test_concorde.txt \
--test_batch_size 1 \
--G {node_size} \
--checkpoint_path ./Cycleformer_{node_size}.pth \
--gpus 0
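For reference, multi-start decoding differs from greedy decoding only in --G: with --G set to the node count, one tour is decoded per starting node and the shortest is kept. Below is a hypothetical sketch of that selection step; rollout stands in for the model's greedy decoder, and all names here are illustrative rather than the repository's actual API.

from typing import Callable
import torch

def tour_length(coords: torch.Tensor, tour: torch.Tensor) -> float:
    # coords: (N, 2) city coordinates; tour: (N,) permutation of node indices.
    ordered = coords[tour]
    diffs = ordered - ordered.roll(-1, dims=0)  # roll(-1) adds the edge closing the cycle
    return diffs.norm(dim=-1).sum().item()

def multi_start_decode(coords: torch.Tensor,
                       rollout: Callable[[torch.Tensor, int], torch.Tensor]) -> torch.Tensor:
    # One greedy rollout per possible start node (G = node_size); keep the best tour.
    return min((rollout(coords, s) for s in range(coords.size(0))),
               key=lambda t: tour_length(coords, t))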