See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers

License: MIT

This is the official implementation of TAMA in the following paper: See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers.

(Figure: flow chart of the TAMA framework)

1. Environment Setup

```shell
pip install -r requirements.txt
```

2. Add your API keys

Before you get started, you need API keys for the LLMs. Create a file named `api_keys.yaml` in the `BigModel/` directory, with the following format:

```yaml
openai:
  api_key: 'Your API Keys'
chatglm:
  api_key: 'Your API Keys'
```
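For illustration, here is a minimal sketch of how such a file could be read in Python. `load_api_keys` is a hypothetical helper, not part of the repo, and this tiny parser only handles the simple two-level format shown above; the actual code would more likely use a YAML library such as PyYAML.

```python
def load_api_keys(path):
    """Hypothetical helper: parse the simple two-level api_keys.yaml format.
    A real implementation would use a YAML library such as PyYAML."""
    keys, provider = {}, None
    with open(path) as f:
        for line in f:
            if not line.strip() or line.lstrip().startswith("#"):
                continue  # skip blank lines and comments
            if not line.startswith(" "):
                provider = line.strip().rstrip(":")  # top level, e.g. "openai:"
                keys[provider] = {}
            else:
                k, v = line.strip().split(":", 1)    # indented "api_key: '...'"
                keys[provider][k] = v.strip().strip("'\"")
    return keys
```

The keys can then be looked up per provider, e.g. `load_api_keys("BigModel/api_keys.yaml")["openai"]["api_key"]`.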

3. Prepare the datasets

We evaluate our framework on datasets from four domains; the details are listed below:

| Dataset | Domain | Source |
| --- | --- | --- |
| UCR | Industry | Current Time Series Anomaly Detection Benchmarks are Flawed and are Creating the Illusion of Progress |
| NASA-SMAP | Industry | Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding |
| NASA-MSL | Industry | Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding |
| NormA | Industry | Unsupervised and scalable subsequence anomaly detection in large data series |
| SMD | Web service | Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network |
| Dodgers | Transportation | Dodgers Loop Sensor |
| ECG | Health care | TSB-UAD: an end-to-end benchmark suite for univariate time-series anomaly detection |
| Synthetic | - | We have uploaded this dataset to Google Drive. ( Link ) |

It is recommended to create a `data` directory before downloading the datasets. The file tree should look like this:

```text
./data/
├── Anomaly_Classification -> /nas/datasets/Anomaly_Classification
├── Dodgers
├── ECG
├── NASA-MSL
├── NASA-SMAP
├── NormA
├── SMD
├── UCR
│   ├── 135_labels.npy
│   ├── 135_test.npy
│   └── ...
└── synthetic_datasets
```

4. Run

```shell
# convert sequence data into images
python3 make_dataset.py --dataset UCR --mode train --modality image --window_size 600 --stride 200
python3 make_dataset.py --dataset UCR --mode test --modality image --window_size 600 --stride 200
# convert sequence data into text
python3 make_dataset.py --dataset UCR --mode train --modality text --window_size 600 --stride 200
python3 make_dataset.py --dataset UCR --mode test --modality text --window_size 600 --stride 200
```
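The `--window_size` and `--stride` arguments slice each series into overlapping windows before conversion. A pure-Python sketch of that slicing (`sliding_windows` is an illustrative helper, not a function from the repo):

```python
def sliding_windows(series, window_size=600, stride=200):
    """Illustrative helper: split a sequence into overlapping windows,
    mirroring the --window_size/--stride arguments of make_dataset.py."""
    return [series[s:s + window_size]
            for s in range(0, len(series) - window_size + 1, stride)]

# A 1000-point series with window_size=600 and stride=200 yields
# windows starting at indices 0, 200, and 400.
windows = sliding_windows(list(range(1000)))
print(len(windows))  # 3
```

Each resulting window is then rendered as an image (image modality) or serialized as text (text modality).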

```shell
# Image-modality
python3 main_cli.py --dataset UCR --normal_reference 3 --LLM 'GPT-4o'
# Text-modality
python3 main_cli_text.py --dataset UCR --normal_reference 1 --LLM 'GPT-4o'
```

5. Results Analysis

```shell
# evaluation
python3 evaluation.py
# ablation study
python3 ablation_eval.py
```
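`evaluation.py` aggregates detection metrics over the datasets. As a hedged sketch, the standard point-wise F1 computation looks like the following (`point_f1` is illustrative; the script may additionally apply adjustments such as point-adjusted scoring):

```python
def point_f1(pred, label):
    """Point-wise F1 over binary prediction/label sequences (illustrative;
    evaluation.py may report adjusted variants of this metric)."""
    tp = sum(1 for p, l in zip(pred, label) if p and l)
    fp = sum(1 for p, l in zip(pred, label) if p and not l)
    fn = sum(1 for p, l in zip(pred, label) if not p and l)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

print(point_f1([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```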

The quantitative results across seven datasets are reported in Table 8 of the paper.

Tip: Feel free to explore our other papers in the field of time series! [Time Series Forecasting] [Time Series Anomaly Detection]
