TVDiag

TVDiag: A Task-oriented and View-invariant Failure Diagnosis Framework with Multimodal Data

TVDiag is a multimodal failure diagnosis framework designed to locate the root cause and identify the failure type in microservice-based systems. This repository offers the core implementation of TVDiag.

Project Structure

.
├── core
│   ├── loss
│   │   ├── AutomaticWeightedLoss.py
│   │   ├── SupervisedContrastiveLoss.py
│   │   └── UnsupervisedContrastiveLoss.py
│   ├── model
│   │   ├── backbone
│   │   │   ├── FC.py
│   │   │   ├── sage.py
│   │   │   └── cnn1d.py
│   │   ├── Classifier.py
│   │   ├── Voter.py
│   │   ├── Encoder.py
│   │   └── MainModel.py
│   ├── aug.py
│   ├── ita.py
│   ├── multimodal_dataset.py
│   └── TVDiag.py
├── data
│   └── gaia
│       ├── tmp
│       ├── raw
│       └── label.csv
├── helper
│   ├── eval.py
│   ├── io_uitl.py
│   ├── logger.py
│   ├── scaler.py
│   └── time_util.py
├── process
│   ├── events
│   │   ├── fasttext_w2v.py
│   │   ├── cnn1d_w2v.py
│   │   └── lda_w2v.py
│   └── EventProcess.py
├── requirements.txt
├── README.md
├── train.sh
└── main.py

Dataset

We conducted experiments on two datasets:

GAIA. The GAIA dataset records metrics, traces, and logs of the MicroSS simulation system in July 2021. MicroSS consists of ten microservices and some middleware such as Redis, MySQL, and Zookeeper. The extracted events of GAIA are available from DiagFusion.
AIOps-22. The AIOps-22 dataset is derived from the training data released by the AIOps 2022 Challenge, where failures at three levels (node, service, and instance) were injected into Online-boutique, a web-based e-commerce platform.

Getting Started

Requirements

python=3.8.12
pytorch=2.1.1
fasttext=0.9.2
dgl=2.1.0.cu118 (built for CUDA 11.8; choose the build that matches your CUDA version)

Run

Run the following command:

sh train.sh

The parameters in main.py are described as follows:

Common args

dataset: The dataset that you want to use.
reconstruct: This parameter represents whether the events should be regenerated. (Default: False)
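
As a rough illustration of how these two arguments might be declared, the sketch below uses `argparse`. The flag spellings, the `str2bool` helper, and the `gaia` default are assumptions made for this example and are not taken from main.py; check main.py for the actual definitions.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse textual booleans; argparse's type=bool treats any non-empty string as True."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes", "y"}:
        return True
    if lowered in {"false", "0", "no", "n"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: the names mirror the documented parameters,
    # but the exact flags used in main.py may differ.
    parser = argparse.ArgumentParser(description="TVDiag common arguments (sketch)")
    parser.add_argument("--dataset", type=str, default="gaia",
                        help="dataset to use (the default here is an assumption)")
    parser.add_argument("--reconstruct", type=str2bool, default=False,
                        help="whether the events should be regenerated")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--dataset", "gaia", "--reconstruct", "true"])
    print(args.dataset, args.reconstruct)
```

The explicit `str2bool` converter is a design choice for this sketch: with `type=bool`, passing `--reconstruct False` on the command line would still evaluate to `True`, because any non-empty string is truthy.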

Model

TO: TO denotes whether the task-oriented learning module should be loaded. (Default: True)
CM: CM denotes whether the cross-modal association should be established. (Default: True)
dynamic_weight: dynamic_weight denotes whether weights are dynamically assigned for each loss. (Default: True)
guide_weight: This parameter adjusts the scale of the contrastive loss. (Default: 0.1)
temperature: This parameter adjusts the temperature $\tau$, which controls how much attention is paid to difficult samples. (Default: 0.3)
patience: This parameter adjusts the patience used in early stopping. (Default: 10)
aug_percent: The inactivation probability used for augmentation. (Default: 0.2)
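
To make the role of `temperature` more concrete, the sketch below computes the softmax weights that an InfoNCE-style contrastive loss places on negative samples at two values of $\tau$. It is a minimal illustration in plain Python, not the loss implemented in core/loss; the function name `negative_weights` and the similarity values are made up for this example.

```python
import math
from typing import List


def negative_weights(similarities: List[float], temperature: float) -> List[float]:
    """Softmax over similarity / temperature: the relative penalty on each negative."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in similarities]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative cosine similarities between an anchor and three negatives.
# The first negative is the "hard" one because it is most similar to the anchor.
sims = [0.9, 0.5, 0.1]

sharp = negative_weights(sims, temperature=0.3)   # the documented default
smooth = negative_weights(sims, temperature=1.0)  # a larger value for comparison

print([round(w, 3) for w in sharp])
print([round(w, 3) for w in smooth])
```

With these values, the hard negative receives about three quarters of the total weight at $\tau = 0.3$ but less than half at $\tau = 1.0$, so a smaller temperature concentrates the penalty on the samples that are hardest to separate.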
