Skip to content
/ GAMA Public

GAMA: A Multi-graph-based Anomaly Detection Framework for Business Processes via Graph Neural Networks

Notifications You must be signed in to change notification settings

guanwei49/GAMA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

GAMA: A Multi-graph-based Anomaly Detection Framework for Business Processes via Graph Neural Networks

This is the source code of our paper 'GAMA: A Multi-graph-based Anomaly Detection Framework for Business Processes via Graph Neural Networks'. model

Requirements

Using Our Code

    python main.py --mode eval --TF FAP

Two modes have been implemented:

  • eval: Utilizing the anomalous event logs located in the eventlogs folder to obtain evaluation results (For reproducibility of the experiments).
  • test: Detecting anomalies in the event log with the 'xes' format and obtaining anomaly detection results (For practical application).

Three teacher forcing (TF) styles have been implemented:

  • AN: We consider that the current attribute value depends mainly on the current activity name. Therefore, at current event , the ground truth activity name is used to guide the prediction of the probability distribution.
  • PAV: We consider that the current attribute value depends mainly on the previous attribute value. Therefore, the previous ground truth attribute value is used to guide the prediction of the probability distribution.
  • FAP: We consider that the current attribute value depends both on the current activity name and the previous attribute value. Therefore, the fusion of current ground truth activity name and the previous ground truth attribute value is used to guide the prediction of the probability distribution.

Datasets

Six commonly used real-life logs:

i) Billing: This log contains events that pertain to the billing of medical services provided by a hospital.

ii) Receipt: This log contains records of the receiving phase of the building permit application process in an anonymous municipality.

iii) Sepsis: This log contains events of sepsis cases from a hospital.

iv) RTFMP: Real-life event log of an information system managing road traffic fines.

v) Permit: it contains events related to travel permits (including all related events of relevant prepaid travel cost declarations and travel declarations).

vi) Declaration: it contains events related to international travel declarations.

Eight synthetic logs: i.e., Paper, P2P, Small, Medium, Large, Huge, Gigantic, and Wide.

The summary of statistics for each event log is presented below:

Log #Activities #Traces #Events Max trace length Min trace length #Attributes #Attribute values
Gigantic 76-78 5000 28243-31989 11 3 1-4 70-363
Huge 54 5000 36377-42999 11 5 1-4 69-340
Large 42 5000 51099-56850 12 10 1-4 68-292
Medium 32 5000 28416-31372 8 3 1-4 66-276
P2p 13 5000 37941-42634 11 7 1-4 39-146
Paper 14 5000 49839-54390 12 9 1-4 36-128
Small 20 5000 42845-46060 10 7 1-4 39-144
Wide 23-34 5000 29128-31228 7 5-6 1-4 53-264
Billing 18 100000 451359 217 1 0 0
Receipt 27 1434 8577 25 1 2 58
Sepsis 16 1050 15214 185 3 1 26
RTFMP 11 150370 561470 20 2 0 0
Permit 51 7065 86581 90 3 2 10
Declaration 34 6449 72151 27 3 2 10

Logs containing artificial anomalies ranging from 5% to 45% are stored in the folder 'eventlogs'. The file names are formatted as log_name-anomaly_ratio-ID.

Experiment Results

Critical difference diagram over trace-level anomaly detection: img.png Critical difference diagram over attribute-level anomaly detection: img.png

F−scores over synthetic logs where 'T' and 'A' represent trace- and attribute-level anomaly detection respectively.

Paper Paper P2P P2P Small Small Medium Medium Large Large Huge Huge Gigantic Gigantic Wide Wide
T A T A T A T A T A T A T A T A
OC-SVM 0.498 - 0.480 - 0.522 - 0.446 - 0.480 - 0.446 - 0.462 - 0.460 -
Naive 0.866 - 0.850 - 0.898 - 0.691 - 0.715 - 0.690 - 0.574 - 0.779 -
Sampling 0.901 - 0.886 - 0.896 - 0.860 - 0.910 - 0.890 - 0.800 - 0.888 -
GAE 0.472 - 0.559 - 0.468 - 0.449 - 0.530 - 0.429 - 0.434 - 0.561 -
DAE 0.799 0.468 0.767 0.475 0.829 0.463 0.713 0.436 0.747 0.433 0.691 0.415 0.580 0.288 0.753 0.455
VAE 0.828 0.190 0.655 0.212 0.788 0.219 0.637 0.230 0.772 0.201 0.589 0.213 0.495 0.181 0.640 0.230
LAE 0.678 0.243 0.666 0.266 0.748 0.239 0.584 0.270 0.571 0.250 0.531 0.268 0.504 0.234 0.699 0.271
BINet 0.543 0.330 0.557 0.342 0.566 0.358 0.521 0.319 0.549 0.333 0.526 0.331 0.525 0.320 0.551 0.345
GAMA-AN 0.949 0.701 0.950 0.686 0.955 0.717 0.873 0.716 0.945 0.768 0.916 0.763 0.821 0.701 0.921 0.724
GAMA-PAV 0.976 0.675 0.974 0.664 0.981 0.663 0.903 0.654 0.944 0.678 0.909 0.663 0.809 0.614 0.950 0.670
GAMA-FAP 0.955 0.699 0.949 0.683 0.955 0.708 0.872 0.700 0.947 0.752 0.922 0.750 0.833 0.691 0.923 0.712

F−scores over real-life logs where 'T' and 'A' represent trace- and attribute-level anomaly detection respectively.

Billing Billing Receipt Receipt Sepsis Sepsis RTFMP RTFMP Permit Permit Declaration Declaration
T A T A T A T A T A T A
OC-SVM 0.340 - 0.464 - 0.415 - 0.507 - 0.405 - 0.449 -
Naive 0.668 - 0.638 - 0.392 - 0.776 - 0.462 - 0.495 -
Sampling 0.701 - 0.647 - 0.391 - 0.721 - 0.458 - 0.507 -
GAE 0.385 - 0.420 - 0.391 - 0.341 - 0.386 - 0.406 -
DAE 0.754 0.444 0.650 0.158 0.461 0.136 0.865 0.498 0.522 0.182 0.576 0.201
VAE 0.731 0.435 0.524 0.134 0.448 0.172 0.813 0.517 0.484 0.188 0.476 0.180
LAE 0.784 0.509 0.526 0.218 0.408 0.126 0.874 0.505 0.486 0.287 0.514 0.345
BINet 0.621 0.442 0.575 0.416 0.435 0.192 0.744 0.493 0.641 0.423 0.678 0.499
GAMA-AN 0.792 0.545 0.763 0.548 0.570 0.457 0.899 0.534 0.682 0.428 0.727 0.461
GAMA-PAV 0.791 0.544 0.778 0.538 0.510 0.376 0.914 0.574 0.634 0.369 0.669 0.406
GAMA-FAP 0.811 0.548 0.752 0.535 0.573 0.450 0.914 0.576 0.679 0.402 0.718 0.444

To Cite Our Paper

@article{guan2024gama,
  title={GAMA: A multi-graph-based anomaly detection framework for business processes via graph neural networks},
  author={Guan, Wei and Cao, Jian and Gu, Yang and Qian, Shiyou},
  journal={Information Systems},
  volume={124},
  pages={102405},
  year={2024},
  publisher={Elsevier}
}

About

GAMA: A Multi-graph-based Anomaly Detection Framework for Business Processes via Graph Neural Networks

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages