Combinatorial optimization has found applications in numerous fields, from aerospace to transportation planning and economics. The goal is to find an optimal solution among a finite set of possibilities. The well-known challenge one faces with combinatorial optimization is the state-space explosion problem: the number of possibilities grows exponentially with the problem size, which makes solving intractable for large problems.
In recent years, Deep Reinforcement Learning (DRL) has shown its promise for designing good heuristics dedicated to solving NP-hard combinatorial optimization problems. However, current approaches have two shortcomings: (1) they mainly focus on the standard travelling salesman problem and cannot easily be extended to other problems, and (2) they only provide an approximate solution with no systematic way to improve it or to prove optimality.
In another context, Constraint Programming (CP) is a generic tool for solving combinatorial optimization problems. Based on a complete search procedure, it will always find the optimal solution, provided a large enough execution time is allowed. A critical design choice, which makes CP non-trivial to use in practice, is the branching decision that directs how the search space is explored. In this work, we propose a general and hybrid approach, based on DRL and CP, for solving combinatorial optimization problems. The core of our approach is a Dynamic Programming (DP) formulation that acts as a bridge between the two techniques.
In the related paper, we show experimentally that our solver is efficient on two challenging problems: the Travelling Salesman Problem with Time Windows (TSPTW) and the 4-moments Portfolio Optimization Problem, which accounts for the mean, deviation, skewness, and kurtosis of the assets. The results show that the framework we introduce outperforms the stand-alone RL and CP solutions, while being competitive with industrial solvers.
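As an illustration of such a DP formulation, here is a minimal sketch for the TSPTW; the names and simplifications (travel time as the only cost) are ours and do not necessarily match the model used in this repository.

```python
# Minimal sketch of a DP formulation of the TSPTW (illustrative only).
# A state records the current city, the elapsed time, and the set of
# customers still to visit; a transition moves to one feasible customer.

from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class State:
    current: int                # city currently visited
    time: float                 # time elapsed so far
    must_visit: FrozenSet[int]  # customers not yet visited

def transitions(state, travel, tw):
    """Yield (next_state, cost) pairs for every feasible next customer.
    travel[i][j] is the travel time between cities i and j, and
    tw[j] = (earliest_j, latest_j) is the time window of city j."""
    for j in state.must_visit:
        arrival = max(state.time + travel[state.current][j], tw[j][0])
        if arrival <= tw[j][1]:  # the time window must be respected
            yield State(j, arrival, state.must_visit - {j}), travel[state.current][j]
```

The same recursion serves both sides: in the RL environment, each transition is an action whose reward is the negated cost, while in the CP model each stage becomes a decision variable whose domain is restricted to the feasible transitions.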
Please be aware that this project is still at a research stage.
This repository contains the implementation of the paper (xxx). For each problem that we have considered, you can find:
- A DP model serving as a basis for the RL environment and the CP model.
- The RL environment and the CP model (a minimal environment sketch is given after this list).
- An RL training algorithm based on Deep Q-Learning (DQN).
- An RL training algorithm based on Proximal Policy Optimization (PPO).
- The trained models, together with the hyperparameters used.
- Two CP solving algorithms leveraging the learned models: Iterative Limited Discrepancy Search (I-LDS) and Restart-Based Search (RBS).
- Random instance generators for training the models and evaluating the solver.
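To make the link between the DP model and the RL environment concrete, below is a minimal, gym-style sketch of such a wrapper; the class and method names are illustrative and do not necessarily match the code in src/problem/tsptw/environment/.

```python
# Illustrative wrapper only: the DP model provides the initial state, the
# feasible transitions, and the stage costs, and the RL agent interacts
# with it through a reset/step interface.

class DPEnvironment:
    def __init__(self, dp_model, instance):
        self.dp = dp_model
        self.instance = instance
        self.state = None

    def reset(self):
        self.state = self.dp.initial_state(self.instance)
        return self.state

    def step(self, action):
        # The DP transition returns the next state and the stage cost; the
        # reward is the negated cost, so maximizing the return minimizes
        # the objective.
        next_state, cost = self.dp.apply(self.state, action)
        self.state = next_state
        done = self.dp.is_terminal(next_state)
        return next_state, -cost, done
```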
.
├── conda_env.yml # configuration file for the conda environment
├── main_training_x_y.py # main file for training a model for the problem y using algorithm x
├── run_training_x_y.sh # script for running the training; this is where you set the parameters
├── trained_models/ # directory where the models that you train will be saved
├── selected_models/ # models that we used for our experiments
└── src/
    ├── architecture/ # implementation of the NN used
    ├── util/ # utility code (e.g., the replay memory; see the sketch after this tree)
    └── problem/ # problems that we have implemented
        └── tsptw/
            ├── environment/ # the instance generator and the DP model, which also acts as the RL environment
            ├── training/ # PPO and DQN training algorithms
            ├── solving/ # CP model and solving algorithm
            └── ...
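As an example of the utility code mentioned above, a replay memory for DQN typically looks like the following; this is a generic sketch, not the exact implementation in src/util/.

```python
# Generic replay buffer for DQN-style training (illustrative only).

import random
from collections import deque

class ReplayMemory:
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one transition; the oldest ones are evicted automatically.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the correlation between consecutive steps.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```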
git clone https://github.com/qcappart/dp-solver.git
conda env create -f conda_env.yml
The CP solver (Gecode) has to be installed separately; please refer to the setup instructions available on its official website.
A makefile is available at the root of the repository. First, modify it by adding your Python path. Then, compile the project as follows:
make [problem] # e.g. make tsptw
This will create the executable solver_tsptw.
./run_training_ppo_tsptw.sh # for PPO
./run_training_dqn_tsptw.sh # for DQN
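For reference, a single DQN update of the kind these scripts launch looks roughly as follows; this is a generic sketch with placeholder tensors, not the exact training loop of this repository.

```python
# Generic DQN update step (illustrative only; the state encoding, network
# architecture, and hyperparameters used in this repository differ).

import torch
import torch.nn.functional as F

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    states, actions, rewards, next_states, dones = batch  # batched tensors
    # Q-values of the actions actually taken, under the online network.
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrapped one-step target from the frozen target network.
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + gamma * (1.0 - dones) * next_q
    loss = F.smooth_l1_loss(q_values, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```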
# For TSPTW
./solver_tsptw --model=rl-ilds-dqn --time=60000 --size=20 --grid_size=100 --max_tw_size=100 --max_tw_gap=10 --d_l=5000 --cache=1 --seed=1 # Solve with ILDS-DQN
./solver_tsptw --model=rl-bab-dqn --time=60000 --size=20 --grid_size=100 --max_tw_size=100 --max_tw_gap=10 --d_l=5000 --cache=1 --seed=1 # Solve with BaB-DQN
./solver_tsptw --model=rl-rbs-ppo --time=60000 --size=20 --grid_size=100 --max_tw_size=100 --max_tw_gap=10 --cache=1 --luby=1 --temperature=1 --seed=1 # Solve with RBS-PPO
./solver_tsptw --model=nearest --time=60000 --size=20 --grid_size=100 --max_tw_size=100 --max_tw_gap=10 --d_l=5000 --seed=1 # Solve with a nearest neighbour heuristic (no learning)
# For Portfolio
./solver_portfolio --model=rl-ilds-dqn --time=60000 --size=50 --capacity_ratio=0.5 --lambda_1=1 --lambda_2=5 --lambda_3=5 --lambda_4=5 --discrete_coeffs=0 --cache=1 --seed=1
For learning-based methods, the model selected by default is the one located in the corresponding selected-models/ directory. For instance:
selected-models/ppo/tsptw/n-city-20/grid-100-tw-10-100/
- The code, with the exception of the CP model, is implemented in Python 3.7.
- The CP model is implemented in C++ and is solved using Gecode. The reason for this design choice is that no CP solver in Python met our requirements.
- The neural network architecture has been implemented in PyTorch together with DGL.
- The interface between the C++ and Python code is done with Pybind11 (a Python-side sketch is given below).
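As an illustration of this interface, the C++ CP search can embed the Python interpreter via Pybind11 and query the trained network through a small Python entry point. The sketch below shows only the Python side; the class name, method names, and checkpoint format are assumptions, not the repository's actual API.

```python
# Hypothetical Python-side entry point that the C++ branching heuristic could
# call through Pybind11. All names and the checkpoint format are assumptions.

import torch

class BranchingOracle:
    def __init__(self, checkpoint_path):
        # Load the trained network once; it is then reused for every call
        # made from the C++ side during the search.
        self.model = torch.load(checkpoint_path, map_location="cpu")
        self.model.eval()

    def score_actions(self, state_features):
        """Return one score per candidate value; the CP search can branch
        on the highest-scoring value first."""
        with torch.no_grad():
            x = torch.tensor(state_features, dtype=torch.float32)
            return self.model(x).tolist()
```

Keeping the network evaluation on the Python side avoids re-implementing the architecture in C++, at the price of a per-call overhead, which caching predictions (see the --cache option) is presumably meant to reduce.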
At the moment, only the TSPTW problem is present in this repository. However, we also have the TSP, the 0/1 Knapsack Problem, and a nonlinear portfolio optimization problem available, but the code has not been cleaned up yet. If there is demand for these problems, I will clean the code and add them to this repository. Feel free to open an issue for that.
Please use this reference:
TODO
This work is under the MIT license (https://choosealicense.com/licenses/mit/). It is a short and simple, very permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.