- python (the scripts were tested with versions 2.7 and 3.2)
- GCC 4.9+
- cmake
- Boost version 1.53+
- Eigen 3.2+ library
- lp_solve library
NOTE: This was implemented with AIToolbox as of mid-2016. It is possible that more recent versions have a different structure, so import paths may need to be changed.
Clone the AIToolbox repository, then build and test the installation with the following commands:
cd AIToolbox_root
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j
ctest -V
Generate synthetic POMDP parameters to highlight the impact of using multiple environments. The model comprises as many environments as there are possible recommendations. The i-th environment corresponds to users choosing item i with a high probability (p = 0.8) and a uniform preference over the other recommendations. The reward is 1 if the recommendation matches the user's choice, and 0 otherwise.
cd Data/
./prepare_synth.py -n [1] -k [2] -a [3] -t [4] -o [5] --norm --help
- [1] Number of items. Defaults to 3.
- [2] History length. Must be strictly greater than 1. Defaults to 2.
- [3] Positive scaling parameter for correct recommendation. Must be greater than 1. Defaults to 1.1.
- [4] Number of test sessions to generate following the generated distribution. Defaults to 2000.
- [5] Path to the output directory. Defaults to ../Code/Models.
- [--norm] If present, normalize the output transition probabilities.
- [--zip] If present, transitions are stored in an archive. Recommended for large state spaces.
- [--help] Displays help about the script.
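The synthetic preference model described above can be sketched in a few lines of Python. This is only an illustration of the stated distribution (probability 0.8 for the favourite item, the remainder spread uniformly) and of the 0/1 reward. The function names `synthetic_preferences` and `reward` are hypothetical, and the sketch ignores the history length and the scaling parameter that prepare_synth.py also handles.

```python
def synthetic_preferences(n_items, p_favourite=0.8):
    """Return one choice distribution per environment.

    Environment i picks item i with probability p_favourite and
    spreads the remaining mass uniformly over the other items.
    (Hypothetical helper, not taken from prepare_synth.py.)
    """
    if n_items < 2:
        raise ValueError("need at least two items")
    p_other = (1.0 - p_favourite) / (n_items - 1)
    return [
        [p_favourite if item == env else p_other for item in range(n_items)]
        for env in range(n_items)
    ]


def reward(recommended, chosen):
    """1 if the recommendation matches the user's choice, 0 otherwise."""
    return 1.0 if recommended == chosen else 0.0


if __name__ == "__main__":
    for env, distribution in enumerate(synthetic_preferences(3)):
        print(env, distribution)
```

Each row sums to 1, so it can be used directly as the choice distribution of one environment.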
Estimate POMDP model parameters and generate test sequences from the Foodmart dataset (the .csv dataset files are included under Data/).
cd Data/
./prepare_foodmart.py -p [1] -k [2] -u [3] -a [4] -t [5] -d [6] -o [7] -D [8] --norm --help
- [1] Item discretization level. Must be between 0 (1561 fine-grained products) and 4 (3 high-level categories). Defaults to 4.
- [2] History length. Must be strictly greater than 1. Defaults to 2.
- [3] Number of profiles to generate. Defaults to 5.
- [4] Positive scaling parameter for correct recommendation. Must be greater than 1. Defaults to 1.1.
- [5] Number of test sequences to generate. Defaults to 2000.
- [6] Path to the Foodmart dataset. Defaults to Data/Foodmart.gz.
- [7] Path to the output directory. Defaults to ../Code/Models.
- [8] Number of sequences to isolate to estimate each environment's transition probabilities.
- [--norm] If present, output transition probabilities are normalized.
- [--zip] If present, transitions are stored in an archive. Recommended for large state spaces.
- [--help] Displays help about the script.
Generate POMDP parameters for a typical maze/path-finding problem with multiple environments.
cd Data/
./prepare_maze.py -i [1] -n [2] -s [3] -t [4] -w [5] -g [6] -e [7] -wf [8] -o [9] --rdf --help
- [1] If given, load the maze structure from a file (see toy examples in the Mazes subdirectory). If not, the mazes are generated randomly with the following parameters.
- [2] Maze width and height.
- [3] Number of initial states in each maze. Defaults to 1.
- [4] Number of trap states in each maze (non-rewarding absorbing states). Defaults to 0.
- [5] Number of obstacles in each maze. Defaults to 0.
- [6] Number of goal states in each maze. Defaults to 1.
- [7] Number of mazes (environments) to generate. Defaults to 1.
- [8] Failure rate (equivalent to falling in a trap state) when going forward in the direction of an obstacle. Defaults to 0.05.
- [9] Path to the output directory. Defaults to ../Code/Models.
- [--norm] If present, normalize the output transition probabilities.
- [--rdf] If present, the failure rates (probability of staying put instead of realizing the intended action) for each environment are sampled uniformly over [0, 0.5).
- [--help] Displays help about the script.
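As an illustration of the --rdf behaviour described above, the following minimal sketch draws one failure rate per environment uniformly over the half-open interval [0, 0.5). The function name `sample_failure_rates` and its `seed` parameter are assumptions made for this example, and it is not the code used in prepare_maze.py.

```python
import random


def sample_failure_rates(n_envs, upper=0.5, seed=None):
    """Draw one failure rate per environment, uniformly in [0, upper).

    random.random() returns a float in [0.0, 1.0), so scaling it by
    `upper` keeps the upper bound excluded.
    (Hypothetical helper, not taken from prepare_maze.py.)
    """
    rng = random.Random(seed)
    return [rng.random() * upper for _ in range(n_envs)]


if __name__ == "__main__":
    # One rate per maze, e.g. for 60 environments.
    for env, rate in enumerate(sample_failure_rates(60, seed=0)):
        print(env, round(rate, 3))
```

Passing a fixed seed makes the sampled rates reproducible between runs.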
The following variables can be configured at the beginning of the run.sh script (e.g. if some libraries are installed locally and not globally):
- path to the AIToolbox installation directory.
- EIGEN: path to the Eigen library installation directory.
- LPSOLVE: path to the lp_solve library installation directory.
- GCC: path to the g++ binary.
- STDLIB: path to the stdlib matching the given gcc compiler.
cd Code/
./run.sh -m [1] -d [2] -n [3] -k [4] -u [5] -g [6] -s [7] -h [8] -e [9] -x [10] -b [11] -c -p -v
- [1] Model to use. Defaults to mdp. Available options are
  - mdp. MDP model obtained by a weighted average of all the environments' transition probabilities and solved by value iteration. The solver can be configured with
    - [7] Number of iterations. Defaults to 1000.
  - pbvi. Point-based value iteration optimized for the MEMDP structure, with options
    - [8] Horizon parameter. Must be greater than 1. Defaults to 2.
    - [11] Belief size. Defaults to 500.
  - pomcp, pomcpex, pamcpex, pamcp. Monte-Carlo solvers. pamcp and pamcpex implement the past-aware graph initialization. pomcpex and pamcpex implement the exact belief computation. pomcp is the vanilla POMCP with MEMDP-optimized sampling (POMCP*).
    - [7] Number of simulation steps. Defaults to 1000.
    - [8] Horizon parameter. Must be greater than 1. Defaults to 2.
    - [10] Exploration parameter. Defaults to 10000 (high exploration).
    - [11] Number of particles for the belief approximation. Defaults to 500.
- [2] Dataset to use. Defaults to rd. Available options are
  - fm (Foodmart recommendations) with the following options
    - [3] Product discretization level. Defaults to 4.
    - [4] History length. Must be strictly greater than 1. Defaults to 2.
    - [5] User discretization level. Defaults to 0.
  - mz (maze solving problem) with the following options
    - [3] Base name for the directory containing the corresponding MEMDP model parameters.
  - rd (synthetic data recommendations) with the following options
    - [3] Number of actions. Defaults to 4.
    - [4] History length. Must be strictly greater than 1. Defaults to 2.
- [6] Discount parameter. Must be strictly between 0 and 1. Defaults to 0.95.
- [9] Convergence criterion. Defaults to 0.01.
- [-c] If present, recompile the code before running. This should be used whenever the dataset parameters change, as the number of items, environments, etc. are determined at compilation time.
- [-p] If present, normalize the transition probabilities and use Kahan summation for more precision when handling small probabilities. Use this option if AIToolbox throws an "Input transition table does not contain valid probabilities" error.
- [-v] If present, enables verbose output. In verbose mode, evaluation results per environment are displayed, and the std::cerr stream is enabled during evaluation.
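Two of the ideas above, the weighted average of the environments' transition probabilities (mdp model) and normalization with Kahan summation (-p option), can be sketched together in Python. This is a minimal illustration under assumptions: the nested-list layout `env_transitions[e][s][a][s2]` and the function names are hypothetical, and the actual implementation lives in the C++ code built by run.sh.

```python
def kahan_sum(values):
    """Compensated (Kahan) summation of an iterable of floats."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for value in values:
        y = value - compensation
        t = total + y
        compensation = (t - total) - y
        total = t
    return total


def normalize(row):
    """Rescale a row of non-negative values so that it sums to 1."""
    row_sum = kahan_sum(row)
    if row_sum <= 0.0:
        raise ValueError("row has no probability mass")
    return [p / row_sum for p in row]


def average_transitions(env_transitions, weights):
    """Weighted average of per-environment transition tables.

    env_transitions[e][s][a][s2] is the probability of moving from state s
    to state s2 under action a in environment e (assumed layout).
    """
    n_states = len(env_transitions[0])
    n_actions = len(env_transitions[0][0])
    averaged = []
    for s in range(n_states):
        per_action = []
        for a in range(n_actions):
            n_next = len(env_transitions[0][s][a])
            row = [
                kahan_sum(w * env[s][a][s2]
                          for env, w in zip(env_transitions, weights))
                for s2 in range(n_next)
            ]
            per_action.append(normalize(row))
        averaged.append(per_action)
    return averaged
```

Because each averaged row is renormalized, the weights do not need to sum to 1 in this sketch.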
- if needed, generate the data (already available on the repository)
cd Data/
python prepare_maze.py --norm --zip -n 5 -s 1 -g 1 -w 0 -t 0 -e 60 --rdf
- run the code (assuming the output directory is the default)
cd ../Code/
./run.sh -m pbvi -d mz -n gen_5x5_101_60 -h 20 -b 100 -c
./run.sh -m pamcp -d mz -n gen_5x5_101_60 -h 10 -c
- if needed, generate the data (already available on the repository)
cd Data/
python prepare_synth.py --norm --zip -n 10 -k 2
- run the code (assuming the output directory is the default)
cd ../Code/
./run.sh -m mdp -d rd -n 10 -k 2 -c
./run.sh -m pamcp -d rd -n 10 -k 2 -c
./run.sh -m pomcpex -d rd -n 10 -k 2 -c
- if needed, generate the data (already available on the repository)
cd Data/
python prepare_foodmart.py --norm --zip -u 5 -p 3 -k 2
- run the code (assuming the output directory is the default)
cd ../Code/
./run.sh -m mdp -d fm -n 3 -k 2 -u 5 -c
./run.sh -m pamcp -d fm -n 3 -k 2 -u 5 -c
./run.sh -m pbvi -d fm -n 3 -k 2 -u 5 -c
- When using the --zip option for data generation, it might be necessary to run the script with python3 due to an issue with the gzip library in Python < 3.