updated citation and use typings
liznerski committed Jan 29, 2021
1 parent d403ed8 commit 308dafb
Showing 28 changed files with 153 additions and 119 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -33,8 +33,9 @@ target/
# mypy
.mypy_cache/

# Pycharm
# Pycharm and VSCode
.idea
.vscode

# CMake
cmake-build-debug/
22 changes: 11 additions & 11 deletions README.md
@@ -2,23 +2,23 @@
Here we provide the implementation of *Fully Convolutional Data Description* (FCDD), an explainable approach to deep one-class classification.
The implementation is based on PyTorch 1.4.0 and Python 3.6.

Deep one-class classification variants for anomaly detection learn a mapping that concentrates nominal samples in feature space causing anomalies to be mapped away. Because this transformation is highly non-linear, finding interpretations poses a significant challenge. We present an explainable deep one-class classification method, *Fully Convolutional Data Description* (FCDD), where the mapped samples are themselves also an explanation heatmap. FCDD yields competitive detection performance and provides reasonable explanations on common anomaly detection benchmarks with CIFAR-10 and ImageNet. On MVTec-AD, a recent manufacturing dataset offering ground-truth anomaly maps, FCDD meets the state of the art in an unsupervised setting, and outperforms its competitors in a semi-supervised setting. The following image shows some of the FCDD explanation heatmaps for test samples of MVTec-AD:
Deep one-class classification variants for anomaly detection learn a mapping that concentrates nominal samples in feature space causing anomalies to be mapped away. Because this transformation is highly non-linear, finding interpretations poses a significant challenge. In this paper we present an explainable deep one-class classification method, *Fully Convolutional Data Description* (FCDD), where the mapped samples are themselves also an explanation heatmap. FCDD yields competitive detection performance and provides reasonable explanations on common anomaly detection benchmarks with CIFAR-10 and ImageNet. On MVTec-AD, a recent manufacturing dataset offering ground-truth anomaly maps, FCDD sets a new state of the art in the unsupervised setting. Our method can incorporate ground-truth anomaly maps during training and using even a few of these (∼5) improves performance significantly. Finally, using FCDD’s explanations we demonstrate the vulnerability of deep one-class classification models to spurious image features such as image watermarks. The following image shows some of the FCDD explanation heatmaps for test samples of MVTec-AD:

<img src="data/git_images/fcdd_explanations_mvtec.png?raw=true" height="373" width="633" >


## Cite the Paper
A preprint of our paper is available at: https://arxiv.org/abs/2007.01760
## Citation
A PDF of our ICLR 2021 paper is available at: https://openreview.net/forum?id=A5VV3UyIQz.

If you use our work, please also cite the current preprint:
If you use our work, please also cite the paper:
```
@misc{liznerski2020explainable,
@inproceedings{
liznerski2021explainable,
title={Explainable Deep One-Class Classification},
author={Philipp Liznerski and Lukas Ruff and Robert A. Vandermeulen and Billy Joe Franks and Marius Kloft and Klaus-Robert Müller},
year={2020},
eprint={2007.01760},
archivePrefix={arXiv},
primaryClass={cs.CV}
author={Philipp Liznerski and Lukas Ruff and Robert A. Vandermeulen and Billy Joe Franks and Marius Kloft and Klaus Robert Muller},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=A5VV3UyIQz}
}
```

@@ -293,5 +293,5 @@ Cf. one of the existing implementations of Outlier Exposure datasets, e.g. `fcdd


## Need help?
If you find any bugs, have questions, or need help modifying FCDD, feel free to write us an [email](mailto:p_liznersk13@cs.uni-kl.de)!
If you find any bugs, have questions, need help modifying FCDD, or want to get in touch in general, feel free to write us an [email](mailto:liznerski@cs.uni-kl.de)!

13 changes: 8 additions & 5 deletions python/fcdd/datasets/__init__.py
@@ -1,9 +1,12 @@
from copy import deepcopy
from typing import List

from fcdd.datasets.bases import TorchvisionDataset
from fcdd.datasets.cifar import ADCIFAR10
from fcdd.datasets.fmnist import ADFMNIST
from fcdd.datasets.mvtec import ADMvTec
from fcdd.datasets.imagenet import ADImageNet
from fcdd.datasets.mvtec import ADMvTec
from fcdd.datasets.pascal_voc import ADPascalVoc
from copy import deepcopy

DS_CHOICES = ('mnist', 'cifar10', 'fmnist', 'mvtec', 'imagenet', 'pascalvoc')
PREPROC_CHOICES = (
@@ -13,7 +16,7 @@

def load_dataset(dataset_name: str, data_path: str, normal_class: int, preproc: str,
supervise_mode: str, noise_mode: str, online_supervision: bool, nominal_label: int,
oe_limit: int, logger=None):
oe_limit: int, logger=None) -> TorchvisionDataset:
""" Loads the dataset with given preprocessing pipeline and supervise parameters """

assert dataset_name in DS_CHOICES
@@ -55,7 +58,7 @@ def load_dataset(dataset_name: str, data_path: str, normal_class: int, preproc:
return dataset


def no_classes(dataset_name):
def no_classes(dataset_name: str) -> int:
return {
'cifar10': 10,
'fmnist': 10,
@@ -65,7 +68,7 @@ def no_classes(dataset_name):
}[dataset_name]


def str_labels(dataset_name):
def str_labels(dataset_name: str) -> List[str]:
return {
'cifar10': ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'],
'fmnist': [
13 changes: 7 additions & 6 deletions python/fcdd/datasets/bases.py
@@ -1,14 +1,15 @@
from abc import ABC, abstractmethod
from typing import Tuple

import numpy as np
import torch
from fcdd.datasets.noise_modes import generate_noise
from fcdd.datasets.offline_supervisor import noise as apply_noise, malformed_normal as apply_malformed_normal
from fcdd.datasets.preprocessing import get_target_label_idx
from fcdd.util.logging import Logger
from torch.utils.data import DataLoader
from torch.utils.data import Subset
from torch.utils.data.dataset import Dataset
from fcdd.util.logging import Logger


class BaseADDataset(ABC):
@@ -31,8 +32,8 @@ def __init__(self, root: str, logger: Logger = None):
self.logger = logger

@abstractmethod
def loaders(self, batch_size: int, shuffle_train=True, shuffle_test=False, num_workers: int = 0) -> (
DataLoader, DataLoader):
def loaders(self, batch_size: int, shuffle_train=True, shuffle_test=False, num_workers: int = 0) -> Tuple[
DataLoader, DataLoader]:
""" Implement data loaders of type torch.utils.data.DataLoader for train_set and test_set. """
pass

@@ -62,7 +63,7 @@ def __init__(self, root: str, logger=None):
super().__init__(root, logger=logger)

def loaders(self, batch_size: int, shuffle_train=True, shuffle_test=False, num_workers: int = 0)\
-> (DataLoader, DataLoader):
-> Tuple[DataLoader, DataLoader]:
assert not shuffle_test, \
'using shuffled test raises problems with original GT maps for GT datasets, thus disabled atm!'
# classes = None means all classes
@@ -168,7 +169,7 @@ def _generate_artificial_anomalies_train_set(self, supervise_mode: str, noise_mo
if supervise_mode not in ['unsupervised', 'other']:
self.logprint('Artificial anomalies generated.')

def _generate_noise(self, noise_mode: str, size: torch.Size, oe_limit: int = None, datadir: str = None):
def _generate_noise(self, noise_mode: str, size: torch.Size, oe_limit: int = None, datadir: str = None) -> torch.Tensor:
generated_noise = generate_noise(noise_mode, size, oe_limit, logger=self.logger, datadir=datadir)
return generated_noise

@@ -223,7 +224,7 @@ def targets(self):
def data(self):
return self.ds.data

def __getitem__(self, index):
def __getitem__(self, index: int):
gtmap = self.extended_gtmaps[index]

if isinstance(self.ds, GTMapADDataset):
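
The change from `-> (DataLoader, DataLoader)` to `-> Tuple[DataLoader, DataLoader]` in this file can be illustrated with a small standalone sketch. A parenthesized pair in an annotation is an ordinary tuple object rather than a type, which static checkers such as mypy generally reject, whereas `typing.Tuple[...]` is the form they understand. The function names `make_pair_literal` and `make_pair_typed` are hypothetical, and `int`/`str` stand in for `DataLoader` so the sketch does not depend on PyTorch.

```python
from typing import Tuple, get_type_hints


def make_pair_literal(n: int) -> (int, str):
    # Old style: the annotation is a plain tuple object holding two classes.
    return n, str(n)


def make_pair_typed(n: int) -> Tuple[int, str]:
    # New style: the annotation is a proper typing construct.
    return n, str(n)


# Both functions behave identically at runtime; only the annotation differs.
literal_annotation = make_pair_literal.__annotations__["return"]
typed_annotation = get_type_hints(make_pair_typed)["return"]

print(isinstance(literal_annotation, tuple))   # the old annotation is just a tuple
print(typed_annotation == Tuple[int, str])     # the new one is a typing.Tuple
```

Runtime behaviour is unchanged by such a rewrite; the benefit is for type checkers and IDEs.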
4 changes: 3 additions & 1 deletion python/fcdd/datasets/cifar.py
@@ -1,3 +1,5 @@
from typing import Tuple

import numpy as np
import torch
import torchvision.transforms as transforms
@@ -171,7 +173,7 @@ def __init__(self, root, train=True, transform=None, target_transform=None,
self.all_transform = all_transform
self.normal_classes = normal_classes

def __getitem__(self, index):
def __getitem__(self, index: int) -> Tuple[torch.Tensor, int]:
"""
Args:
index (int): Index
4 changes: 3 additions & 1 deletion python/fcdd/datasets/fmnist.py
@@ -1,3 +1,5 @@
from typing import Tuple

import PIL.Image as Image
import torch
import torchvision.transforms as transforms
@@ -160,7 +162,7 @@ def __init__(self, root, train=True, transform=None, target_transform=None,
self.all_transform = all_transform
self.normal_classes = normal_classes

def __getitem__(self, index):
def __getitem__(self, index: int) -> Tuple[torch.Tensor, int]:
img, target = self.data[index], int(self.targets[index])

if self.target_transform is not None:
5 changes: 3 additions & 2 deletions python/fcdd/datasets/imagenet.py
@@ -1,6 +1,7 @@
import os
import os.path as pt
import random
from typing import List, Tuple

import numpy as np
import torch
@@ -182,7 +183,7 @@ def __init__(self, root, transform=None, target_transform=None,
self.all_transform = all_transform
self.split = split

def __getitem__(self, index):
def __getitem__(self, index: int) -> Tuple[torch.Tensor, int]:
target = self.targets[index]

if self.target_transform is not None:
@@ -206,5 +207,5 @@ def __getitem__(self, index):

return img, target

def get_class_idx(self, classes):
def get_class_idx(self, classes: List[str]):
return [self.class_to_idx[c] for c in classes]
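
As a rough illustration of the `get_class_idx` signature above, the following standalone sketch mimics a dataset that holds a `class_to_idx` mapping. The class `ToyDataset`, its example labels, and the `List[int]` return annotation are assumptions made for illustration; the commit itself annotates only the `classes` parameter.

```python
from typing import Dict, List


class ToyDataset:
    """Hypothetical stand-in for a dataset exposing a class_to_idx mapping."""

    def __init__(self, class_names: List[str]) -> None:
        # Map each class name to its position in the given list.
        self.class_to_idx: Dict[str, int] = {name: i for i, name in enumerate(class_names)}

    def get_class_idx(self, classes: List[str]) -> List[int]:
        # Same lookup as in the diff; the return annotation is an assumption.
        return [self.class_to_idx[c] for c in classes]


ds = ToyDataset(["airplane", "automobile", "bird"])
print(ds.get_class_idx(["bird", "airplane"]))  # prints [2, 0]
```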
9 changes: 5 additions & 4 deletions python/fcdd/datasets/mvtec_base.py
@@ -1,13 +1,14 @@
import gzip
import os
import shutil
import tarfile
import tempfile
import zipfile
import shutil
from typing import Callable
from typing import Tuple

import numpy as np
import torch
from typing import Callable
import torchvision.transforms as transforms
from PIL import Image
from fcdd.datasets.bases import GTMapADDataset
@@ -99,7 +100,7 @@ def __init__(self, root: str, split: str = 'train', target_transform: Callable =
assert -3 not in [self.nominal_label, self.anom_label]
print('Dataset complete.')

def __getitem__(self, index: int) -> (torch.Tensor, int, torch.Tensor):
def __getitem__(self, index: int) -> Tuple[torch.Tensor, int, torch.Tensor]:
img, label = self.data[index], self.targets[index]

if self.split == 'test_anomaly_label_target':
@@ -139,7 +140,7 @@ def __getitem__(self, index: int) -> (torch.Tensor, int, torch.Tensor):

return img, label, gt

def __len__(self):
def __len__(self) -> int:
return len(self.data)

def download(self, verbose=True, shape=None, cls=None):
11 changes: 7 additions & 4 deletions python/fcdd/datasets/noise.py
@@ -1,3 +1,5 @@
from typing import Tuple

import numpy as np
import torch
from kornia import gaussian_blur2d
@@ -40,10 +42,11 @@ def gkern(k: int, std: float = None):
return gkern2d


def confetti_noise(size: torch.Size, p: float = 0.01, blobshaperange: ((int, int), (int, int)) = ((3, 3), (5, 5)),
def confetti_noise(size: torch.Size, p: float = 0.01,
blobshaperange: Tuple[Tuple[int, int], Tuple[int, int]] = ((3, 3), (5, 5)),
fillval: int = 255, backval: int = 0, ensureblob: bool = True, awgn: float = 0.0,
clamp: bool = False, onlysquared: bool = True, rotation: int = 0,
colorrange: (int, int) = None) -> torch.Tensor:
colorrange: Tuple[int, int] = None) -> torch.Tensor:
"""
Generates "confetti" noise, as seen in the paper.
The noise is based on sampling randomly many rectangles (in the following called blobs) at random positions.
@@ -158,8 +161,8 @@ def confetti_noise(size: torch.Size, p: float = 0.01, blobshaperange: ((int, int
return res


def colorize_noise(img: torch.Tensor, color_min: (int, int, int) = (-255, -255, -255),
color_max: (int, int, int) = (255, 255, 255), p: float = 1) -> torch.Tensor:
def colorize_noise(img: torch.Tensor, color_min: Tuple[int, int, int] = (-255, -255, -255),
color_max: Tuple[int, int, int] = (255, 255, 255), p: float = 1) -> torch.Tensor:
"""
Colorizes given noise images by asserting random color values to pixels that are not black (zero).
:param img: torch tensor (n x c x h x w)
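
The nested `Tuple[Tuple[int, int], Tuple[int, int]]` annotation introduced for `blobshaperange` can be made easier to read with type aliases. The sketch below is only an illustration under assumptions: the aliases `IntPair` and `BlobShapeRange` and the helper `unpack_blob_range` are hypothetical, and wrapping `colorrange` in `Optional` is a suggestion for parameters that default to `None`, not something this commit does.

```python
from typing import Optional, Tuple

# Hypothetical aliases that name the nested tuple structure once.
IntPair = Tuple[int, int]
BlobShapeRange = Tuple[IntPair, IntPair]


def unpack_blob_range(blobshaperange: BlobShapeRange = ((3, 3), (5, 5)),
                      colorrange: Optional[IntPair] = None) -> Tuple[int, int, int, int, bool]:
    # Unpack the two inner pairs; no meaning is attached to them here.
    (a, b), (c, d) = blobshaperange
    # Report whether a color range was supplied at all.
    return a, b, c, d, colorrange is not None


print(unpack_blob_range())
print(unpack_blob_range(((1, 2), (3, 4)), colorrange=(0, 255)))
```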
4 changes: 2 additions & 2 deletions python/fcdd/datasets/noise_modes.py
@@ -1,9 +1,9 @@
import torch
from fcdd.datasets.noise import confetti_noise, colorize_noise, solid, smooth_noise
from fcdd.datasets.outlier_exposure.imagenet import OEImageNet, OEImageNet22k
from fcdd.datasets.outlier_exposure.cifar100 import OECifar100
from fcdd.datasets.outlier_exposure.emnist import OEEMNIST
from fcdd.datasets.outlier_exposure.imagenet import OEImageNet, OEImageNet22k
from fcdd.util.logging import Logger
import torch

MODES = [
'gaussian', 'uniform', 'blob', 'mixed_blob', 'solid', 'confetti', # Synthetic Anomalies
6 changes: 4 additions & 2 deletions python/fcdd/datasets/offline_supervisor.py
@@ -1,8 +1,10 @@
from typing import List

import torch
from torch.utils.data.dataset import Dataset


def noise(outlier_classes: [int], generated_noise: torch.Tensor, norm: torch.Tensor,
def noise(outlier_classes: List[int], generated_noise: torch.Tensor, norm: torch.Tensor,
nom_class: int, train_set: Dataset, gt: bool = False) -> Dataset:
"""
Creates a dataset based on the nominal classes of a given dataset and generated noise anomalies.
@@ -27,7 +29,7 @@ def noise(outlier_classes: [int], generated_noise: torch.Tensor, norm: torch.Ten
return train_set


def malformed_normal(outlier_classes: [int], generated_noise: torch.Tensor, norm: torch.Tensor, nom_class: int,
def malformed_normal(outlier_classes: List[int], generated_noise: torch.Tensor, norm: torch.Tensor, nom_class: int,
train_set: Dataset, gt: bool = False, brightness_threshold: float = 0.11*255) -> Dataset:
"""
Creates a dataset based on the nominal classes of a given dataset and generated noise anomalies.
5 changes: 3 additions & 2 deletions python/fcdd/datasets/online_supervisor.py
@@ -1,6 +1,7 @@
import random
import traceback
from itertools import cycle
from typing import List, Tuple

import numpy as np
import torch
@@ -16,7 +17,7 @@ class OnlineSupervisor(ImgGTTargetTransform):
invert_threshold = 0.025

def __init__(self, ds: TorchvisionDataset, supervise_mode: str, noise_mode: str, oe_limit: int = np.infty,
p: float = 0.5, exclude: [str] = ()):
p: float = 0.5, exclude: List[str] = ()):
"""
This class is used as a Transform parameter for torchvision datasets.
During training it randomly replaces a sample of the dataset retrieved via the get_item method
@@ -81,7 +82,7 @@ def __init__(self, ds: TorchvisionDataset, supervise_mode: str, noise_mode: str,
)

def __call__(self, img: torch.Tensor, gt: torch.Tensor, target: int,
replace: bool = None) -> (torch.Tensor, torch.Tensor, int):
replace: bool = None) -> Tuple[torch.Tensor, torch.Tensor, int]:
"""
Based on the probability defined in __init__, replaces (img, gt, target) with an artificial anomaly.
:param img: some torch tensor image
4 changes: 2 additions & 2 deletions python/fcdd/datasets/outlier_exposure/cifar100.py
@@ -78,10 +78,10 @@ def __init__(self, size: torch.Size, root: str = None, train: bool = True, limit
.format(limit_var, size[0], rep, len(self), size[0])
)

def data_loader(self):
def data_loader(self) -> DataLoader:
return DataLoader(dataset=self, batch_size=self.size[0], shuffle=True, num_workers=0)

def __getitem__(self, index):
def __getitem__(self, index: int) -> torch.Tensor:
sample, target = super().__getitem__(index)
sample = sample.mul(255).byte()

4 changes: 2 additions & 2 deletions python/fcdd/datasets/outlier_exposure/emnist.py
@@ -78,10 +78,10 @@ def __init__(self, size: torch.Size, root: str = None, split='letters', limit_va
.format(limit_var, size[0], rep, len(self), size[0])
)

def data_loader(self):
def data_loader(self) -> DataLoader:
return DataLoader(dataset=self, batch_size=self.size[0], shuffle=True, num_workers=0)

def __getitem__(self, index):
def __getitem__(self, index: int) -> torch.Tensor:
sample, target = super().__getitem__(index)
sample = sample.squeeze().mul(255).byte()

9 changes: 5 additions & 4 deletions python/fcdd/datasets/outlier_exposure/imagenet.py
@@ -15,14 +15,15 @@
from torchvision.datasets import DatasetFolder
from torchvision.datasets.folder import has_file_allowed_extension, default_loader, IMG_EXTENSIONS
from torchvision.datasets.vision import StandardTransform
from typing import List, Tuple


def ceil(x: float):
return int(np.ceil(x))


class OEImageNet(torchvision.datasets.ImageNet):
def __init__(self, size: torch.Size, root: str = None, split='val', limit_var: int = np.infty, exclude: [str] = ()):
def __init__(self, size: torch.Size, root: str = None, split='val', limit_var: int = np.infty, exclude: List[str] = ()):
"""
Outlier Exposure dataset for ImageNet.
:param size: size of the samples in n x c x h x w, samples will be resized to h x w. If n is larger than the
@@ -216,7 +217,7 @@ def __init__(self, root: str, size: torch.Size, exclude_imagenet1k=True, *args,
if exclude_imagenet1k:
self.samples = [s for s in self.samples if not any([idx in s[0] for idx in self.imagenet1k_idxs])]

def __getitem__(self, index: int) -> (torch.Tensor, int):
def __getitem__(self, index: int) -> Tuple[torch.Tensor, int]:
"""
Override the original method of the ImageFolder class to catch some errors (seems like a few of the 22k
images are broken).
@@ -284,10 +285,10 @@ def __init__(self, size: torch.Size, root: str = None, limit_var=np.infty, logge
def __len__(self):
return len(self.picks if self.picks is not None else self.samples)

def data_loader(self):
def data_loader(self) -> DataLoader:
return DataLoader(dataset=self, batch_size=self.size[0], shuffle=True, num_workers=0)

def __getitem__(self, index):
def __getitem__(self, index: int) -> torch.Tensor:
index = self.picks[index] if self.picks is not None else index

sample, target = super().__getitem__(index)