For people who are not comfortable with R.
Feature-based landscape analysis of continuous and constrained optimization problems is now available in Python as well. This package provides a Python interface to the R package flacco by Pascal Kerschke (up to pflacco version v0.4.0) and now also offers a native Python implementation with additional features, such as:
- Features for exploiting black-box optimization problem structure.
- Ruggedness, funnels and gradients in fitness landscapes and the effect on PSO performance.
- Global characterization of the CEC 2005 fitness landscapes using fitness-distance analysis.
- Analysing and characterising optimization problems using length scale.
- Fitness Landscape Analysis Metrics based on Sobol Indices and Fitness- and State-Distributions.
- Local optima networks for continuous fitness landscapes.
The following is the description of the original flacco package:
flacco is a collection of features for Explorative Landscape Analysis (ELA) of single-objective, continuous (Black-Box-)Optimization Problems. It allows the user to quantify characteristics of an (unknown) optimization problem's landscape.
Features, which used to be spread over different packages and platforms (R, Matlab, Python, etc.), are now combined within this single package. Amongst others, this package contains feature sets, such as ELA, Information Content, Dispersion, (General) Cell Mapping or Barrier Trees.
Furthermore, the package provides a unified interface for all features -- using a so-called feature object and (if required) control arguments. In total, the current release (1.7) consists of 17 different feature sets, which sum up to approximately 300 features.
In addition to the features themselves, this package also provides visualizations, e.g. of the cell mappings, barrier trees or information content.
The calculation procedure and further background information of ELA features is given in Comprehensive Feature-Based Landscape Analysis of Continuous and Constrained Optimization Problems Using the R-Package flacco.
- For some (very few) features, the values for the same sample differ between pflacco and flacco. This is a known occurrence and can be traced back to the underlying methods used to calculate the features. For example, ela_meta relies on linear models. R constructs its linear models via a QR decomposition (qr), whereas LinearRegression() in scikit-learn uses the conventional OLS method. For a large enough sample, there is no statistical difference (see the sketch after this list). Nevertheless, to keep the results consistent across programming languages, this issue will be addressed in a future version.
- What is the difference between the 0.* and 1.* versions of pflacco? The 0.* versions of pflacco provided a simple interface to the programming language R and calculated all landscape features using the R-package flacco. While this was convenient for me as a developer, the downside was that the performance of pflacco was horrendous. Hence, the >=1.* releases of pflacco offer a native Python implementation of almost all features of the R-package flacco, which speeds up the feature calculation by an order of magnitude.
- Is it possible to calculate landscape features for CEC or Nevergrad? Generally speaking, this is definitely possible. However, to the best of my knowledge, Nevergrad does not offer a dedicated API to query single-objective functions, and the CEC benchmarks are mostly written in C or Matlab. Some CEC benchmarks have an unofficial Python wrapper (which is not kept up to date), such as CEC2017. These require additional compilation steps before any of the functions can be run.
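To illustrate the ela_meta remark from the first point above, the following standalone sketch (independent of pflacco; all variable names are illustrative) fits the same data once with numpy's least-squares solver and once with scikit-learn's OLS. For a reasonably large sample, the estimated coefficients are practically identical:
import numpy as np
from sklearn.linear_model import LinearRegression
rng = np.random.default_rng(42)
X_demo = rng.uniform(-5, 5, size=(1000, 2))
y_demo = 3.0 * X_demo[:, 0] - 1.5 * X_demo[:, 1] + rng.normal(scale=0.1, size=1000)
# Least-squares fit via numpy's LAPACK-based solver (comparable to the QR-based lm() in R)
design = np.column_stack([np.ones(len(X_demo)), X_demo])
beta, *_ = np.linalg.lstsq(design, y_demo, rcond=None)
# Ordinary least squares via scikit-learn
model = LinearRegression().fit(X_demo, y_demo)
print(beta[1:], model.coef_)  # coefficients agree up to numerical precision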
For a stable (and tested) outcome, pflacco requires Python >= 3.8.
Easy as it usually is in Python:
python -m pip install pflacco
from pflacco.sampling import create_initial_sample
from pflacco.classical_ela_features import calculate_ela_distribution
from pflacco.misc_features import calculate_fitness_distance_correlation
from pflacco.local_optima_network_features import compute_local_optima_network, calculate_lon_features
# Arbitrary objective function
def objective_function(x):
    return x[0]**2 - x[1]**2
dim = 2
# Create initial sample using Latin hypercube sampling
X = create_initial_sample(dim, sample_type = 'lhs')
# Calculate the objective values of the initial sample using an arbitrary objective function (here y = x1^2 - x2^2)
y = X.apply(lambda x: objective_function(x), axis = 1)
# Compute an exemplary feature set from the conventional ELA features of the R-package flacco
ela_distr = calculate_ela_distribution(X, y)
print(ela_distr)
# Compute an exemplary feature set from the novel features which are not part of the R-package flacco yet.
fdc = calculate_fitness_distance_correlation(X, y)
print(fdc)
# Compute a Local Optima Network (LON). From this network, LON features can be calculated.
nodes, edges = compute_local_optima_network(f=objective_function, dim=dim, lower_bound=0, upper_bound=1)
lon = calculate_lon_features(nodes, edges)
print(lon)
It is also possible to use objective functions provided by other packages such as COCO and ioh.
Note that these packages do not always accept pandas DataFrames as input. Hence, it is sometimes necessary to transform the initial sample X into a numpy array, as sketched below.
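A minimal sketch of such a conversion (here, problem stands for any external objective function that expects a numpy array rather than a pandas Series):
import numpy as np
# Convert each row of the pandas DataFrame sample to a numpy array before evaluation
y = X.apply(lambda x: problem(np.asarray(x, dtype=float)), axis=1)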
In order for the following code snippet to work, you have to install coco first (which is not possible via pip/conda). This code snippet calculates the specified landscape features for the well-known single-objective, noiseless Black-Box Optimization Benchmark (BBOB). The set of optimization problems comprises all 24 functions in dimensions 2 and 3 for the first five instances.
import cocoex
import pandas as pd
from pflacco.classical_ela_features import *
from pflacco.sampling import create_initial_sample

features = []
# Get all 24 single-objective noiseless BBOB functions in dimensions 2 and 3 for the first five instances.
suite = cocoex.Suite("bbob", "instances:1-5", "function_indices:1-24 dimensions:2,3")
for problem in suite:
    dim = problem.dimension
    fid = problem.id_function
    iid = problem.id_instance

    # Create sample
    X = create_initial_sample(dim, lower_bound = -5, upper_bound = 5)
    y = X.apply(lambda x: problem(x), axis = 1)

    # Calculate ELA features
    ela_meta = calculate_ela_meta(X, y)
    ela_distr = calculate_ela_distribution(X, y)
    nbc = calculate_nbc(X, y)
    disp = calculate_dispersion(X, y)
    ic = calculate_information_content(X, y, seed = 100)

    # Store results in pandas dataframe
    data = pd.DataFrame({**ic, **ela_meta, **ela_distr, **nbc, **disp, **{'fid': fid}, **{'dim': dim}, **{'iid': iid}}, index = [0])
    features.append(data)

features = pd.concat(features).reset_index(drop = True)
Similar to the example above, this code snippet calculates the specified landscape features for the well-known single-objective noiseless Black-Box Optimization Benchmark (BBOB).
The set of optimization problems comprises all 24 functions in dimensions 2 and 3 for the first five instances.
In contrast to coco, ioh can be installed via pip/conda and offers additional benchmark problems. See the respective documentation for more details.
import pandas as pd
from pflacco.classical_ela_features import *
from pflacco.sampling import create_initial_sample
from ioh import get_problem, ProblemType

features = []
# Get all 24 single-objective noiseless BBOB functions in dimensions 2 and 3 for the first five instances.
for fid in range(1, 25):
    for dim in [2, 3]:
        for iid in range(1, 6):
            # Get optimization problem
            problem = get_problem(fid, iid, dim, problem_type = ProblemType.BBOB)

            # Create sample
            X = create_initial_sample(dim, lower_bound = -5, upper_bound = 5)
            y = X.apply(lambda x: problem(x), axis = 1)

            # Calculate ELA features
            ela_meta = calculate_ela_meta(X, y)
            ela_distr = calculate_ela_distribution(X, y)
            ela_level = calculate_ela_level(X, y)
            nbc = calculate_nbc(X, y)
            disp = calculate_dispersion(X, y)
            ic = calculate_information_content(X, y, seed = 100)

            # Store results in pandas dataframe
            data = pd.DataFrame({**ic, **ela_meta, **ela_distr, **ela_level, **nbc, **disp, **{'fid': fid}, **{'dim': dim}, **{'iid': iid}}, index = [0])
            features.append(data)

features = pd.concat(features).reset_index(drop = True)
Comprehensive documentation can be found here.
I welcome and appreciate every comment and contribution. Feel free to open an issue here on GitHub or contact me at [email protected].