- python (the scripts were tested with versions 2.7 and 3.2)
- GCC 4.9+
- cmake [`cmake` package]
- Boost version 1.53+ [`libboost-dev` package]
- Eigen 3.2+ library
- lp_solve library [`lp-solve` package]
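On Debian/Ubuntu systems these dependencies can usually be installed through the package manager; the package names below (in particular for the Eigen and lp_solve development files) are assumptions and may differ on your distribution:

# package names are assumptions and may vary by distribution
sudo apt-get install build-essential cmake libboost-dev libeigen3-dev lp-solve liblpsolve55-dev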
Clone the AIToolbox repository, then build and test the installation with the following commands.
NOTE: This was implemented against AIToolbox as of mid-2016. It is possible that more recent versions have a different structure and that import paths may need to be changed.
cd AIToolbox_root
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j
ctest -V
Generate synthetic POMDP parameters to highlight the impact of using multiple environments. The model comprises as many environments as there are possible recommendations. The i-th environment corresponds to users choosing item i with high probability (p=0.8) and a uniform preference over the other recommendations (e.g., with 3 items, the remaining 0.2 probability mass is split evenly over the other two items). The reward is 0 if the recommendation does not match the user's choice, and 1 otherwise.
cd Data/
./prepare_synth.py -n [1] -k [2] -a [3] -t [4] -o [5] --norm --zip --help
- [1] Number of items. Defaults to 3.
- [2] History length. Must be strictly greater than 1. Defaults to 2.
- [3] Positive scaling parameter for a correct recommendation. Must be greater than 1. Defaults to 1.1.
- [4] Number of test sessions to generate following the generated distribution. Defaults to 2000.
- [5] Path to the output directory. Defaults to `../Code/Models`.
- [--norm] If present, normalize the output transition probabilities.
- [--zip] If present, transitions are stored in an archive. Recommended for large state spaces.
- [--help] Displays help about the script.
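For example, the following call (parameter values chosen for illustration) builds a synthetic model with 10 items and history length 2, normalizes and archives the transitions, and generates 2000 test sessions in the default output directory:

# illustrative values; see the parameter list above
./prepare_synth.py -n 10 -k 2 -a 1.1 -t 2000 --norm --zip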
Estimate POMDP model parameters and generate test sequences from the Foodmart dataset (the .csv dataset files are included in the `Data/` directory).
cd Data/
./prepare_foodmart.py -p [1] -k [2] -u [3] -a [4] -t [5] -d [6] -o [7] -D [8] --norm --zip --help
- [1] Items discretization level. Must be between 0 (1561 fine-grained products) and 4 (3 high-level categories). Defaults to 4.
- [2] History length. Must be strictly greater than 1. Defaults to 2.
- [3] Number of profiles to generate. Defaults to 5.
- [4] Positive scaling parameter for a correct recommendation. Must be greater than 1. Defaults to 1.1.
- [5] Number of test sequences to generate. Defaults to 2000.
- [6] Path to the Foodmart dataset. Defaults to `Data/Foodmart.gz`.
- [7] Path to the output directory. Defaults to `../Code/Models`.
- [8] Number of sequences to isolate to estimate each environment's transition probabilities.
- [--norm] If present, the output transition probabilities are normalized.
- [--zip] If present, transitions are stored in an archive. Recommended for large state spaces.
- [--help] Displays help about the script.
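For instance, the following call (values chosen for illustration) builds a model at product discretization level 3 with 5 user profiles and history length 2, reading the bundled dataset and writing to the default output directory:

# illustrative values; -d and -o are left at their defaults
./prepare_foodmart.py -p 3 -k 2 -u 5 -a 1.1 -t 2000 --norm --zip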
Generate POMDP parameters for a typical maze/path-finding problem with multiple environments.
cd Data/
./prepare_maze.py -i [1] -n [2] -s [3] -t [4] -w [5] -g [6] -e [7] -wf [8] -o [9] --norm --rdf --help
- [1] If given, load the maze structure from a file (see toy examples in the `Mazes` subdirectory). If not given, the mazes are generated randomly with the following parameters.
- [2] Maze width and height.
- [3] Number of initial states in each maze. Defaults to 1.
- [4] Number of trap states in each maze (non-rewarding absorbing states). Defaults to 0.
- [5] Number of obstacles in each maze. Defaults to 0.
- [6] Number of goal states in each maze. Defaults to 1.
- [7] Number of mazes (environments) to generate. Defaults to 1.
- [8] Failure rate (equivalent to falling into a trap state) when moving forward in the direction of an obstacle. Defaults to 0.05.
- [9] Path to the output directory. Defaults to `../Code/Models`.
- [--norm] If present, normalize the output transition probabilities.
- [--rdf] If present, the failure rates (probability of staying put instead of realizing the intended action) for each environment are sampled uniformly over [0, 0.5).
- [--help] Displays help about the script.
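For example, the following call (values chosen for illustration) generates 5 random 10x10 mazes, each with one initial state, one trap state, three obstacles, two goal states and a 5% forward failure rate:

# illustrative values; see the parameter list above
./prepare_maze.py -n 10 -s 1 -t 1 -w 3 -g 2 -e 5 -wf 0.05 --norm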
The following variables can be configured at the beginning of the `run.sh` script (e.g. if some libraries are installed locally rather than globally):
- `AIROOT`: path to the AIToolbox installation directory.
- `EIGEN`: path to the Eigen library installation directory.
- `LPSOLVE`: path to the lpsolve library installation directory.
- `GCC`: path to the g++ binary.
- `STDLIB`: path to the stdlib matching the given gcc compiler.
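As a sketch, assuming these variables are plain shell assignments at the top of `run.sh`, a local setup could look like the following (all paths below are placeholders to adapt to your installation):

# placeholder paths; adjust to where the libraries actually live
AIROOT=$HOME/local/AIToolbox
EIGEN=$HOME/local/include/eigen3
LPSOLVE=$HOME/local/lp_solve
GCC=/usr/bin/g++-4.9
STDLIB=/usr/lib/gcc/x86_64-linux-gnu/4.9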
cd Code/
./run.sh -m [1] -d [2] -n [3] -k [4] -u [5] -g [6] -s [7] -h [8] -e [9] -x [10] -b [11] -c -p -v
- [1] Model to use. Defaults to mdp. Available options are:
  - mdp: MDP model obtained by a weighted average of all the environments' transition probabilities and solved by value iteration. The solver can be configured with:
    - [7] Number of iterations. Defaults to 1000.
  - pbvi: point-based value iteration optimized for the MEMDP structure, with options:
    - [8] Horizon parameter. Must be greater than 1. Defaults to 2.
    - [11] Belief size. Defaults to 500.
  - pomcp, pomcpex, pamcpex, pamcp: Monte-Carlo solvers. pamcp and pamcpex implement the past-aware graph initialization; pomcpex and pamcpex implement the exact belief computation; pomcp is the vanilla POMCP with MEMDP-optimized sampling (POMCP*). Options:
    - [7] Number of simulation steps. Defaults to 1000.
    - [8] Horizon parameter. Must be greater than 1. Defaults to 2.
    - [10] Exploration parameter. Defaults to 10000 (high exploration).
    - [11] Number of particles for the belief approximation. Defaults to 500.
- [2] Dataset to use. Defaults to rd. Available options are:
  - fm (Foodmart recommendations), with the following options:
    - [3] Product discretization level. Defaults to 4.
    - [4] History length. Must be strictly greater than 1. Defaults to 2.
    - [5] User discretization level. Defaults to 0.
  - mz (maze solving problem), with the following option:
    - [3] Base name of the directory containing the corresponding MEMDP model parameters.
  - rd (synthetic data recommendations), with the following options:
    - [3] Number of actions. Defaults to 4.
    - [4] History length. Must be strictly greater than 1. Defaults to 2.
- [6] Discount parameter. Must be strictly between 0 and 1. Defaults to 0.95.
- [9] Convergence criterion. Defaults to 0.01.
- [-c] If present, recompile the code before running. (Note: this should be used whenever switching to a dataset with different parameters, as the number of items, environments, etc. is determined at compilation time.)
- [-p] If present, normalize the transitions and use Kahan summation for more precision when handling small probabilities. Use this option if AIToolbox throws an "Input transition table does not contain valid probabilities" error.
- [-v] If present, enable verbose output. In verbose mode, evaluation results per environment are displayed, and the std::cerr stream is enabled during evaluation.
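As an illustration of the remaining flags (values chosen arbitrarily), the following runs the exact-belief POMCP solver on the synthetic dataset with a horizon of 10, a discount of 0.9, a lower exploration constant, 500 belief particles, recompilation, normalized transitions and verbose output:

# arbitrary illustrative values; see the flag descriptions above
./run.sh -m pomcpex -d rd -n 4 -k 2 -h 10 -g 0.9 -x 1000 -b 500 -c -p -v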
- if needed, generate the data (already available in the repository)
cd Data/
python prepare_maze.py --norm --zip -n 5 -s 1 -g 1 -w 0 -t 0 -e 60 --rdf
- run the code (assuming the output directory is the default `ROOT/Code/Models/`)
cd ../Code/
./run.sh -m pbvi -d mz -n gen_5x5_101_60 -h 20 -b 100 -c
./run.sh -m pamcp -d mz -n gen_5x5_101_60 -h 10 -c
- if needed, generate the data (already available in the repository)
cd Data/
python prepare_synth.py --norm --zip -n 10 -k 2
- run the code (assuming the output directory is the default `ROOT/Code/Models/`)
cd ../Code/
./run.sh -m mdp -d rd -n 10 -k 2 -c
./run.sh -m pamcp -d rd -n 10 -k 2 -c
./run.sh -m pomcpex -d rd -n 10 -k 2 -c
- if needed, generate the data (already available in the repository)
cd Data/
python prepare_foodmart.py --norm --zip -u 5 -p 3 -k 2
- run the code (assuming the output directory is the default `ROOT/Code/Models/`)
cd ../Code/
./run.sh -m mdp -d fm -n 3 -k 2 -u 5 -c
./run.sh -m pamcp -d fm -n 3 -k 2 -u 5 -c
./run.sh -m pbvi -d fm -n 3 -k 2 -u 5 -c
- When using the `--zip` option for data generation, it might be necessary to run the script with `python3` due to an issue with the gzip library in python < 3.
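For example, to regenerate the archived Foodmart data under Python 3 (from the `Data/` directory):

python3 prepare_foodmart.py --norm --zip -u 5 -p 3 -k 2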