DeepOD is an open-source Python library for deep learning-based outlier detection and anomaly detection. It supports both tabular and time-series anomaly detection.
DeepOD includes 26 deep outlier detection / anomaly detection algorithms (in the unsupervised and weakly-supervised paradigms). More baseline algorithms will be included later.
DeepOD features:
- Unified APIs across various algorithms.
- SOTA models, including the latest reconstruction-based, representation-learning-based, and self-supervised deep learning methods.
- A comprehensive testbed that can be used to directly test different models on benchmark datasets (highly recommended for academic research).
- Versatility across data types, including tabular and time-series data (DeepOD will support other data types such as images, graphs, logs, and traces in the future; PRs are welcome 🔭).
- Diverse network structures that can be plugged into detection models. LSTM, GRU, TCN, Conv, and Transformer are currently supported for time-series data (PRs are welcome as well ✨).
If you are interested in our project, we would appreciate your stars and forks 👍 🍻.
The DeepOD framework can be installed via:
pip install deepod
Alternatively, install the development version (strongly recommended):
git clone https://github.com/xuhongzuo/DeepOD.git
cd DeepOD
pip install .
DeepOD can be used in a few lines of code. The API style is the same as that of scikit-learn and PyOD.
For tabular anomaly detection:
# unsupervised methods
from deepod.models.tabular import DeepSVDD
clf = DeepSVDD()
clf.fit(X_train, y=None)
scores = clf.decision_function(X_test)
# weakly-supervised methods
from deepod.models.tabular import DevNet
clf = DevNet()
clf.fit(X_train, y=semi_y) # semi_y uses 1 for known anomalies, and 0 for unlabeled data
scores = clf.decision_function(X_test)
# evaluation of tabular anomaly detection
from deepod.metrics import tabular_metrics
auc, ap, f1 = tabular_metrics(y_test, scores)
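The weakly-supervised example above expects a `semi_y` vector with 1 for known anomalies and 0 for unlabeled data. As a rough illustration only, the sketch below derives such a vector from fully labeled training data by revealing a few anomalies. The helper `make_semi_labels` and the toy labels are hypothetical and not part of DeepOD; the result is a plain Python list, which would likely need converting to a NumPy array before being passed to `fit`.

```python
import random


def make_semi_labels(y_true, n_known, seed=0):
    """Hypothetical helper: mark `n_known` randomly chosen true anomalies
    with 1 and leave every other sample as 0 (unlabeled)."""
    anomaly_idx = [i for i, y in enumerate(y_true) if y == 1]
    if n_known > len(anomaly_idx):
        raise ValueError("n_known exceeds the number of true anomalies")
    rng = random.Random(seed)
    known = set(rng.sample(anomaly_idx, n_known))
    return [1 if i in known else 0 for i in range(len(y_true))]


# Toy ground truth: 4 anomalies among 10 samples (made-up data).
y_train = [0, 0, 1, 0, 1, 0, 0, 1, 0, 1]
semi_y = make_semi_labels(y_train, n_known=2)
print(semi_y)
```

Unrevealed anomalies stay at 0, so the 0 class means "unlabeled" rather than "normal", which matches the convention stated in the comment above.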
For time-series anomaly detection:
# time series anomaly detection methods
from deepod.models.time_series import TimesNet
clf = TimesNet()
clf.fit(X_train)
scores = clf.decision_function(X_test)
# evaluation of time series anomaly detection
from deepod.metrics import ts_metrics
from deepod.metrics import point_adjustment # execute point adjustment for time series ad
eval_metrics = ts_metrics(labels, scores)
adj_eval_metrics = ts_metrics(labels, point_adjustment(labels, scores))
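To make the point-adjustment step above more concrete, here is a minimal pure-Python sketch of the idea as it is commonly described in the time-series anomaly detection literature: every point inside a ground-truth anomaly segment receives the highest score found in that segment, so detecting any point counts as detecting the whole segment. The function `point_adjust_sketch` is a hypothetical stand-in written for illustration; it is not claimed to match the implementation of `deepod.metrics.point_adjustment`.

```python
def point_adjust_sketch(labels, scores):
    """Illustrative point adjustment: within each contiguous run of
    label 1, replace every score by the maximum score of that run."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    adjusted = list(scores)
    n = len(labels)
    i = 0
    while i < n:
        if labels[i] == 1:
            # Find the end (exclusive) of the current anomaly segment.
            j = i
            while j < n and labels[j] == 1:
                j += 1
            peak = max(adjusted[i:j])
            for k in range(i, j):
                adjusted[k] = peak
            i = j
        else:
            i += 1
    return adjusted


# Made-up labels and scores with two anomaly segments.
toy_labels = [0, 1, 1, 1, 0, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.9, 0.3, 0.1, 0.4, 0.5, 0.6]
print(point_adjust_sketch(toy_labels, toy_scores))
```

Scores outside anomaly segments are left untouched, so only the labeled segments are affected by the adjustment.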
Testbed contains the whole process of testing an anomaly detection model, including data loading, preprocessing, anomaly detection, and evaluation.
Please refer to `testbed/`:
- `testbed/testbed_unsupervised_ad.py` is for testing unsupervised tabular anomaly detection models.
- `testbed/testbed_unsupervised_tsad.py` is for testing unsupervised time-series anomaly detection models.
Key arguments:
- `--input_dir`: name of the folder that contains the datasets (.csv, .npy)
- `--dataset`: "FULL" tests all the files within the folder; alternatively, pass a comma-separated list of dataset names (e.g., "10_cover*,20_letter*")
- `--model`: anomaly detection model name
- `--runs`: number of times the detection model is run; the average performance is reported together with the standard deviation
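One plausible reading of the `--dataset` argument is shell-style wildcard matching over the files in the input folder. The sketch below illustrates that reading with the standard-library `fnmatch` module. The function `select_datasets` and the file names are hypothetical, and the testbed scripts may resolve names differently.

```python
from fnmatch import fnmatch


def select_datasets(available, spec):
    """Hypothetical helper: return all files for "FULL", otherwise the
    files matching any of the comma-separated wildcard patterns."""
    if spec == "FULL":
        return list(available)
    patterns = [p.strip() for p in spec.split(",") if p.strip()]
    return [name for name in available
            if any(fnmatch(name, p) for p in patterns)]


# Made-up file names for illustration.
files = ["10_cover.npy", "20_letter.npy", "2_annthyroid.npy"]
selected = select_datasets(files, "10_cover*,20_letter*")
print(selected)
```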
Example:
- Download the ADBench datasets.
- Set the `dataset_root` variable to the directory of the dataset. `input_dir` is the name of a sub-folder of `dataset_root`, e.g., `Classical` or `NLP_by_BERT`.
- Run the following commands in bash:
cd DeepOD
pip install .
cd testbed
python testbed_unsupervised_ad.py --model DIF --runs 5 --input_dir ADBench
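With `--runs 5`, the model is run five times and an average with a standard deviation is reported. The following sketch shows how such a summary could be computed from per-run scores. The AUC values are made up, `summarize_runs` is a hypothetical helper, and the use of the population standard deviation is an assumption; the testbed may use a different estimator or output format.

```python
from statistics import mean, pstdev


def summarize_runs(values):
    """Hypothetical helper: return (mean, population std) of per-run
    metric values."""
    return mean(values), pstdev(values)


# Made-up AUC-ROC values from five independent runs.
auc_per_run = [0.91, 0.93, 0.92, 0.90, 0.94]
avg, std = summarize_runs(auc_per_run)
print(f"AUC-ROC over {len(auc_per_run)} runs: {avg:.4f} +/- {std:.4f}")
```

Reporting the spread alongside the mean matters for deep models, whose results can vary with random initialization across runs.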
Tabular Anomaly Detection models:
Model | Venue | Year | Type | Title |
---|---|---|---|---|
Deep SVDD | ICML | 2018 | unsupervised | Deep One-Class Classification [1] |
REPEN | KDD | 2018 | unsupervised | Learning Representations of Ultrahigh-dimensional Data for Random Distance-based Outlier Detection [2] |
RDP | IJCAI | 2020 | unsupervised | Unsupervised Representation Learning by Predicting Random Distances [3] |
RCA | IJCAI | 2021 | unsupervised | RCA: A Deep Collaborative Autoencoder Approach for Anomaly Detection [4] |
GOAD | ICLR | 2020 | unsupervised | Classification-Based Anomaly Detection for General Data [5] |
NeuTraL | ICML | 2021 | unsupervised | Neural Transformation Learning for Deep Anomaly Detection Beyond Images [6] |
ICL | ICLR | 2022 | unsupervised | Anomaly Detection for Tabular Data with Internal Contrastive Learning |
DIF | TKDE | 2023 | unsupervised | Deep Isolation Forest for Anomaly Detection |
SLAD | ICML | 2023 | unsupervised | Fascinating Supervisory Signals and Where to Find Them: Deep Anomaly Detection with Scale Learning |
DevNet | KDD | 2019 | weakly-supervised | Deep Anomaly Detection with Deviation Networks |
PReNet | KDD | 2023 | weakly-supervised | Deep Weakly-supervised Anomaly Detection |
Deep SAD | ICLR | 2020 | weakly-supervised | Deep Semi-Supervised Anomaly Detection |
FeaWAD | TNNLS | 2021 | weakly-supervised | Feature Encoding with AutoEncoders for Weakly-supervised Anomaly Detection |
RoSAS | IP&M | 2023 | weakly-supervised | RoSAS: Deep semi-supervised anomaly detection with contamination-resilient continuous supervision |
Time-series Anomaly Detection models:
Model | Venue | Year | Type | Title |
---|---|---|---|---|
DCdetector | KDD | 2023 | unsupervised | DCdetector: Dual Attention Contrastive Representation Learning for Time Series Anomaly Detection [9] |
TimesNet | ICLR | 2023 | unsupervised | TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis [8] |
AnomalyTransformer | ICLR | 2022 | unsupervised | Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy [7] |
TranAD | VLDB | 2022 | unsupervised | TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data |
COUTA | arXiv | 2022 | unsupervised | Calibrated One-class Classification for Unsupervised Time Series Anomaly Detection |
USAD | KDD | 2020 | unsupervised | USAD: UnSupervised Anomaly Detection on Multivariate Time Series |
DIF | TKDE | 2023 | unsupervised | Deep Isolation Forest for Anomaly Detection |
TcnED | TNNLS | 2021 | unsupervised | An Evaluation of Anomaly Detection and Diagnosis in Multivariate Time Series |
Deep SVDD (TS) | ICML | 2018 | unsupervised | Deep One-Class Classification |
DevNet (TS) | KDD | 2019 | weakly-supervised | Deep Anomaly Detection with Deviation Networks |
PReNet (TS) | KDD | 2023 | weakly-supervised | Deep Weakly-supervised Anomaly Detection |
Deep SAD (TS) | ICLR | 2020 | weakly-supervised | Deep Semi-Supervised Anomaly Detection |
NOTE:
- For Deep SVDD, DevNet, PReNet, and Deep SAD, we employ network structures that can handle time-series data. These model classes have a parameter named `network`; changing it selects a different network.
- We currently support 'TCN', 'GRU', 'LSTM', 'Transformer', 'ConvSeq', and 'DilatedConv'.
If you use this library in your work, please cite this paper:
Hongzuo Xu, Guansong Pang, Yijie Wang and Yongjun Wang, "Deep Isolation Forest for Anomaly Detection," in IEEE Transactions on Knowledge and Data Engineering, doi: 10.1109/TKDE.2023.3270293.
You can also use the BibTeX entry below for citation.
@ARTICLE{xu2023deep,
author={Xu, Hongzuo and Pang, Guansong and Wang, Yijie and Wang, Yongjun},
journal={IEEE Transactions on Knowledge and Data Engineering},
title={Deep Isolation Forest for Anomaly Detection},
year={2023},
volume={},
number={},
pages={1-14},
doi={10.1109/TKDE.2023.3270293}
}
[1] | Ruff, Lukas, et al. "Deep One-Class Classification". ICML. 2018. |
[2] | Pang, Guansong, et al. "Learning Representations of Ultrahigh-dimensional Data for Random Distance-based Outlier Detection". KDD (pp. 2041-2050). 2018. |
[3] | Wang, Hu, et al. "Unsupervised Representation Learning by Predicting Random Distances". IJCAI (pp. 2950-2956). 2020. |
[4] | Liu, Boyang, et al. "RCA: A Deep Collaborative Autoencoder Approach for Anomaly Detection". IJCAI (pp. 1505-1511). 2021. |
[5] | Bergman, Liron, and Yedid Hoshen. "Classification-Based Anomaly Detection for General Data". ICLR. 2020. |
[6] | Qiu, Chen, et al. "Neural Transformation Learning for Deep Anomaly Detection Beyond Images". ICML. 2021. |
[7] | Xu, Jiehui, et al. "Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy". ICLR. 2022. |
[8] | Wu, Haixu, et al. "TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis". ICLR. 2023. |
[9] | Yang, Yiyuan, et al. "DCdetector: Dual Attention Contrastive Representation Learning for Time Series Anomaly Detection". KDD. 2023. |