Implementation of a privacy evaluation framework for synthetic data publishing
The module attack_models so far includes:

- MIAttackClassifier: a privacy adversary that implements a membership inference attack (MIA) against a generative model and can be used to evaluate the risk of linkability. Given a single synthetic dataset output by a generative model, this adversary produces a binary label that predicts whether a target record belongs to the model's training set (see the sketch after this list).
- AttributeInferenceAttack: a privacy adversary that learns to predict the value of an unknown sensitive attribute from a set of known attributes, and uses this knowledge to guess a target record's sensitive value.
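As a rough illustration of how such a generative-model MIA works (a minimal sketch under assumed names, not the module's actual API): shadow generative models are trained on data that either does or does not include the target record, a feature vector is extracted from each resulting synthetic dataset, and a binary classifier learns to distinguish the two cases.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def naive_features(synthetic_df):
    # Column means of a synthetic dataset serve as a crude feature vector;
    # a real attack would use richer summary statistics.
    return synthetic_df.mean(numeric_only=True).to_numpy()

def fit_membership_classifier(shadow_synthetic_sets, target_in_training):
    # Each shadow synthetic dataset was generated from training data that
    # either contained the target record (label 1) or did not (label 0).
    X = np.stack([naive_features(s) for s in shadow_synthetic_sets])
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, np.asarray(target_in_training))
    return clf

# Given a published synthetic dataset, the adversary predicts membership:
# clf.predict(naive_features(published_df).reshape(1, -1))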
The module generative_models so far includes:

- IndependentHistogramModel: an independent histogram model adapted from Data Responsibly's DataSynthesiser
- BayesianNetModel: a generative model based on a Bayesian network, adapted from Data Responsibly's DataSynthesiser
- GaussianMixtureModel: a simple Gaussian mixture model taken from the sklearn library (illustrated in the sketch after this list)
- CTGAN: a conditional tabular generative adversarial network that integrates the CTGAN model from CTGAN
- PateGan: a model that builds on the Private Aggregation of Teacher Ensembles (PATE) framework to achieve differential privacy for GANs, adapted from PateGan
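Since GaussianMixtureModel wraps sklearn, the core fit-and-generate step can be illustrated directly with sklearn's API (a minimal sketch on toy numeric data; the wrapper's own interface may differ):

import numpy as np
from sklearn.mixture import GaussianMixture

# Toy numeric data standing in for a real tabular dataset.
rng = np.random.default_rng(0)
raw = np.concatenate([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])

# Fit a mixture to the raw data, then sample a synthetic dataset from it.
gmm = GaussianMixture(n_components=2, random_state=0).fit(raw)
synthetic, _ = gmm.sample(n_samples=200)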
The framework and its building blocks have been developed and tested on Python 3.6 and 3.7. We recommend creating a virtual environment to install all dependencies and run the code:
python3 -m venv pyvenv3
source pyvenv3/bin/activate
pip install -r requirements.txt
The PyTorch package to install depends on the version of CUDA (if any) installed on your system. Please refer to the PyTorch website to install the correct package in your virtual environment.
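After installing PyTorch, a quick sanity check (a minimal snippet using standard torch calls) confirms whether it can see your GPU:

import torch

# Reports whether PyTorch detected a usable CUDA device.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))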
The CTGAN model depends on a fork of the original model training algorithm that can be found here. To install the correct version, clone the repository above and run
cd CTGAN
make install
To test your installation, try to run

import ctgan

from within the Python interpreter of your virtual environment.
To run the test suite included in tests, run
python -m unittest discover
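For reference, unittest discovery picks up modules named test_*.py and runs every TestCase method whose name starts with test. A minimal, hypothetical example of that convention (the class and assertion are illustrative, not taken from the actual suite):

import unittest

class TestExample(unittest.TestCase):
    # Discovery runs every method whose name starts with "test".
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

if __name__ == "__main__":
    unittest.main()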
To run an example evaluation of the expected privacy gain with respect to the risk of linkability for all five generative models, run
python mia_cli.py -D data/germancredit -RC runconfig_mia_example.json -O .
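Conceptually, the privacy gain reported by this evaluation compares the adversary's success on the raw data with its success on the synthetic data. A minimal sketch of that comparison (the function names and inputs are assumptions for illustration, not the CLI's internals):

import numpy as np

def attack_accuracy(guesses, true_membership):
    # Fraction of target records whose membership the adversary guessed right.
    return float(np.mean(np.asarray(guesses) == np.asarray(true_membership)))

def privacy_gain(acc_on_raw, acc_on_synthetic):
    # How much publishing synthetic data instead of raw data reduces the
    # adversary's success rate; larger means more privacy gained.
    return acc_on_raw - acc_on_synthetic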
To run an example evaluation of the expected privacy gain with respect to the risk of attribute inference for all five generative models, run
python mleai_cli.py -D data/germancredit -RC runconfig_attr_example.json -O .
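For intuition, an attribute inference adversary of this kind can be approximated with any off-the-shelf classifier: it learns the sensitive column from the known columns and then queries that mapping for a target record. A minimal sketch using sklearn (the column names and data are illustrative assumptions, not taken from the germancredit dataset):

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical dataset with two known attributes and one sensitive column.
data = pd.DataFrame({
    "age": [25, 47, 33, 52, 61, 29],
    "hours_per_week": [40, 50, 38, 45, 30, 42],
    "has_bad_credit": [0, 1, 0, 1, 1, 0],  # the sensitive attribute
})

known = ["age", "hours_per_week"]
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(data[known], data["has_bad_credit"])

# The adversary guesses the target's sensitive value from its known attributes.
target = pd.DataFrame({"age": [45], "hours_per_week": [48]})
print(clf.predict(target))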