This repository proposes the inclusion of VCNet (Variational Counter Net) and CROCO (Cost-efficient RObust COunterfactuals) in the CARLA framework.
The detailed research papers for VCNet and CROCO are available at the following links: VCNet, CROCO.
Both methods are implemented in the CARLA framework so that they can be compared with other counterfactual methods.
CARLA is a Python library to benchmark counterfactual explanation and recourse models. It comes out of the box with commonly used datasets and various machine learning models. It is designed with extensibility in mind: you can easily include your own counterfactual methods, new machine learning models, or other datasets. Find extensive documentation here! The arXiv paper can be found here.
Name | Source |
---|---|
Adult | Source |
COMPAS | Source |
Give Me Some Credit | Source |
HELOC | Source |
Model | Description | Tensorflow | Pytorch | Sklearn | XGBoost |
---|---|---|---|---|---|
ANN | Artificial Neural Network with 2 hidden layers and ReLU activation function. | X | X | ||
LR | Linear Model with no hidden layer and no activation function. | X | X | ||
RandomForest | Tree Ensemble Model. | X | |||
XGBoost | Gradient boosting. | X |
The ML framework a counterfactual method currently works with depends on its underlying implementation. It is planned to make all recourse methods available for all ML frameworks. The latest state can be found here:
Recourse Method | Paper | Tensorflow | Pytorch | Sklearn | XGBoost |
---|---|---|---|---|---|
Actionable Recourse (AR) | Source | X | X | ||
Causal Recourse | Source | X | X | ||
CCHVAE | Source | X | |||
Contrastive Explanations Method (CEM) | Source | X | |||
Counterfactual Latent Uncertainty Explanations (CLUE) | Source | X | |||
CRUDS | Source | X | |||
Diverse Counterfactual Explanations (DiCE) | Source | X | X | ||
Feasible and Actionable Counterfactual Explanations (FACE) | Source | X | X | ||
FeatureTweak | Source | X | X | ||
FOCUS | Source | X | X | ||
Growing Spheres (GS) | Source | X | X | ||
Revise | Source | X | |||
Wachter | Source | X | |||
VCNet | Source | X | |||
CROCO | Source | X |
python3.7
pip
Run in the root folder (where the setup.py file is):
pip install -e .
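After the editable install, a quick import check can confirm that the package is visible to the interpreter. This is a minimal sketch: it assumes the install exposes a top-level package named `carla` (the name used by the imports in this README), and `is_importable` is a hypothetical helper, not part of CARLA.

```python
import importlib.util


def is_importable(package_name: str) -> bool:
    """Return True if `package_name` can be found on the current Python path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "carla" is the package name used by the imports in this README.
    print("carla importable:", is_importable("carla"))
```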
Two versions of VCNet can be selected:
- The original version from the article (immutable=False)
- A specific version that handles immutable features (immutable=True)
from carla.self_explaining_model import VCNet
ml_model = VCNet(data_catalog,hyperparams,immutable_features,immutable=False)
The hyperparameter values for each dataset are provided in HYPERPARAMETERS_original and HYPERPARAMETERS_immutable for the original and the immutable-aware versions, respectively.
The model weights are provided in MODELS_original and MODELS_immutable, respectively.
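The pairing between the `immutable` flag and these folders can be sketched as a small lookup. The helper `vcnet_resource_dirs` is hypothetical (it is not part of this repository), and the assumption that the four folders sit directly under one root directory should be checked against your checkout.

```python
from pathlib import Path


def vcnet_resource_dirs(immutable: bool, root: str = ".") -> dict:
    """Map the `immutable` flag to the matching hyperparameter and weight folders."""
    # Folder names come from this README; their location under `root` is an assumption.
    suffix = "immutable" if immutable else "original"
    base = Path(root)
    return {
        "hyperparameters": base / f"HYPERPARAMETERS_{suffix}",
        "models": base / f"MODELS_{suffix}",
    }


if __name__ == "__main__":
    print(vcnet_resource_dirs(immutable=False))
    print(vcnet_resource_dirs(immutable=True))
```

Keeping the flag and the folder suffix in one place avoids loading weights trained for one version with the hyperparameters of the other.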
import os
from carla import Benchmark
from carla.data.catalog import OnlineCatalog
from carla.evaluation.catalog import ConstraintViolation, Distance, SuccessRate, YNN
from carla.models.negative_instances import predict_negative_instances
from carla.self_explaining_model import VCNet
from carla.self_explaining_model.catalog.vcnet.library.utils import fix_seed
# Load a dataset from the carla framework
name = "adult"
data_catalog = OnlineCatalog(name,encoding_method="OneHot_drop_binary")
# Immutable features
immutable_features = data_catalog.immutables
# Fix the seed
fix_seed()
# Define hyperparameters for the counterfactual search
hyperparams = {
    "name": name,
    "vcnet_params": {
        "train": False,
        "lr": 1.14e-5,
        "batch_size": 91,
        "epochs": 174,
        "lambda_1": 0,
        "lambda_2": 0.93,
        "lambda_3": 1,
        "latent_size": 19,
        "latent_size_share": 304,
        "mid_reduce_size": 152,
        "kld_start_epoch": 0.112,
        "max_lambda_1": 0.034,
    },
}
# Instantiate a VCNet model
ml_model = VCNet(data_catalog,hyperparams,immutable_features,immutable=False)
# Select a subsample (here the first 100) of test instances that are predicted as class 0
test_features = data_catalog.df_test.drop(columns=[data_catalog.target])
factuals_drop = predict_negative_instances(ml_model, test_features).iloc[:100].reset_index(drop=True)
# Benchmark VCNet: the same object is passed as both the model and the recourse method
benchmark = Benchmark(ml_model, ml_model, factuals_drop)
# Load metrics from carla
distances = Distance(ml_model)
success_rate = SuccessRate()
constraint_violation = ConstraintViolation(ml_model)
ynn = YNN(ml_model,{"y" : 5, "cf_label" : 1})
# Run the benchmark
results = benchmark.run_benchmark([success_rate,distances,constraint_violation,ynn])
# Save the results
outname = "results_VCNet.csv"
outdir = f"./carla_results/vcnet/{data_catalog.name}"
os.makedirs(outdir, exist_ok=True)  # create the output folder if it does not exist
fullname = os.path.join(outdir, outname)
results.to_csv(fullname)
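Once the benchmark has written its CSV file, the per-instance results can be condensed into one average per metric. The sketch below uses only the standard library. `summarize_results` is a hypothetical helper, and it assumes the file has one row per factual, one column per metric, and an unnamed index column as written by `DataFrame.to_csv`. The actual column names depend on the metrics you passed to `run_benchmark`.

```python
import csv
from statistics import mean


def summarize_results(csv_path: str) -> dict:
    """Average every named column whose values all parse as floats."""
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    summary = {}
    if not rows:
        return summary
    for column in rows[0]:
        if not column:
            continue  # skip the unnamed index column
        try:
            values = [
                float(row[column]) for row in rows if row[column] not in ("", None)
            ]
        except ValueError:
            continue  # skip non-numeric columns
        if values:
            summary[column] = mean(values)
    return summary


if __name__ == "__main__":
    # Path pattern taken from the example above; adjust the dataset name as needed.
    print(summarize_results("./carla_results/vcnet/adult/results_VCNet.csv"))
```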
from carla.data.catalog import OnlineCatalog
from carla.models.catalog import MLModelCatalog
from carla.models.negative_instances import predict_negative_instances
from carla.recourse_methods.catalog.croco import CROCO
import os
from carla import Benchmark
from carla.evaluation.catalog import Distance, Redundancy, SuccessRate, AvgTime, ConstraintViolation, YNN, Robustness
# Load a dataset from the carla framework
name = "adult"
data_catalog = OnlineCatalog(name,encoding_method="OneHot_drop_binary")
# Load and train a machine learning model from the framework
model_type = "ann"
# Training parameters for the neural network, per dataset
training_params_ann = {
    "adult": {"lr": 0.002, "epochs": 30, "batch_size": 1024},
    "compas": {"lr": 0.002, "epochs": 25, "batch_size": 25},
    "give_me_some_credit": {"lr": 0.002, "epochs": 50, "batch_size": 2048},
    "breast_cancer": {"lr": 0.002, "epochs": 25, "batch_size": 32},
}
params = training_params_ann[name]
model = MLModelCatalog(
    data_catalog, model_type, load_online=False, backend="pytorch"
)
model.train(
    learning_rate=params["lr"],
    epochs=params["epochs"],
    batch_size=params["batch_size"],
    hidden_size=[50],
)
model.use_pipeline = False
# Select a subsample (here the first 10) of test instances that are predicted as class 0
test_features = data_catalog.df_test.drop(columns=[data_catalog.target])
factuals_drop = predict_negative_instances(model, test_features).iloc[:10].reset_index(drop=True)
# Hyperparameters for CROCO
hyperparams_croco = {
    "n_samples": 10_000,
    "binary_cat_features": True,
    "sigma2": 0.01,
    "n_iter": 200,
    "robustness_target": 0.35,
    "m": 0.1,
    "robustness_epsilon": 0.01,
    "lambda_param": 1,
    "distribution": "gaussian",
}
# CROCO method
croco = CROCO(model,hyperparams_croco)
# Benchmark CROCO
benchmark = Benchmark(model,croco,factuals_drop)
# Load metrics from carla
distances = Distance(model)
success_rate = SuccessRate()
constraint_violation = ConstraintViolation(model)
ynn = YNN(model,{"y" : 5, "cf_label" : 1})
# Robustness metric (recourse invalidation rate)
robustness = Robustness(model,{"n_samples" : 10_000, "sigma2" : 0.01, "distribution" : "gaussian" })
# Run the benchmark
results = benchmark.run_benchmark([success_rate,distances,constraint_violation,robustness,ynn])
# Save the results
outname = "results_CROCO.csv"
outdir = f"./carla_results/croco/{data_catalog.name}"
os.makedirs(outdir, exist_ok=True)  # create the output folder if it does not exist
fullname = os.path.join(outdir, outname)
results.to_csv(fullname)
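The robustness metric in the example is described as a recourse invalidation rate, parameterised by `n_samples`, `sigma2`, and a Gaussian distribution. One common way to estimate such a rate is by Monte Carlo sampling: perturb the counterfactual with noise and count how often the prediction leaves the target class. The sketch below illustrates that idea with the standard library only. It is an assumption about what the metric measures, not the implementation used by `Robustness` or CROCO, and `invalidation_rate` and `toy_classifier` are hypothetical names.

```python
import random
from typing import Callable, Sequence


def invalidation_rate(
    predict: Callable[[Sequence[float]], int],
    counterfactual: Sequence[float],
    n_samples: int = 10_000,
    sigma2: float = 0.01,
    cf_label: int = 1,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of how often Gaussian noise flips the counterfactual's class."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5  # standard deviation derived from the variance
    invalidated = 0
    for _ in range(n_samples):
        noisy = [value + rng.gauss(0.0, sigma) for value in counterfactual]
        if predict(noisy) != cf_label:
            invalidated += 1
    return invalidated / n_samples


def toy_classifier(x: Sequence[float]) -> int:
    """Hypothetical linear classifier: class 1 when x[0] + x[1] > 1."""
    return int(x[0] + x[1] > 1.0)


if __name__ == "__main__":
    near_boundary = [0.55, 0.5]      # barely inside class 1
    far_from_boundary = [1.5, 1.5]   # well inside class 1
    print(invalidation_rate(toy_classifier, near_boundary))
    print(invalidation_rate(toy_classifier, far_from_boundary))
```

A counterfactual close to the decision boundary is invalidated by a sizeable share of the perturbations, while one far from the boundary is almost never invalidated. A target such as `robustness_target` can be read as an upper bound on this kind of rate.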
- python3.7-venv (when not already shipped with python3.7)
- Recommended: GNU Make
Using make:
make requirements
Using python directly or within activated virtual environment:
pip install -U pip setuptools wheel
pip install -e .
Using make:
make test
Using python directly or within activated virtual environment:
pip install -r requirements-dev.txt
python -m pytest test/*
VCNet and CROCO are both under the MIT Licence. See the LICENCE for more details.
VCNet comes from a paper published at ECML/PKDD 2022. If you compare against it, please cite:
@inproceedings{Guyomard2022VCNetAS,
title={{VCNet}: A self-explaining model for realistic counterfactual generation},
author={Victor Guyomard and Françoise Fessant and Thomas Guyet},
booktitle={Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD)},
pages={437--453},
location={Grenoble, Fr},
year={2022}
}
CROCO comes from a paper published at ECML/PKDD 2023. If you compare against it, please cite:
@inproceedings{CROCO,
title={Generating robust counterfactual explanations},
author={Victor Guyomard and Françoise Fessant and Thomas Guyet},
booktitle={Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD)},
location={Turin, Italy},
year={2023}
}
The CARLA framework, which is used for the implementation, comes from a project accepted at NeurIPS 2021 (Benchmark & Data Sets Track). If you use this codebase, please cite:
@misc{pawelczyk2021carla,
title={CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms},
author={Martin Pawelczyk and Sascha Bielawski and Johannes van den Heuvel and Tobias Richter and Gjergji Kasneci},
year={2021},
eprint={2108.00783},
archivePrefix={arXiv},
primaryClass={cs.LG}
}