Hackathon: https://events.kaspersky.com/hackathon/
Authors: https://github.com/aguschin, https://github.com/canorbal, https://github.com/ohld
Check out our elective course on Data Mining at MIPT: https://github.com/vkantor/MIPT_Data_Mining_In_Action_2016
Multivariate time series classification ("normal" TS vs. TS with anomalies), based on the Tennessee Eastman Problem: http://users.abo.fi/khaggblo/RS/McAvoy.pdf
A detailed task description can be found in README.pdf.
Data can be downloaded from https://yadi.sk/d/LzWCsMmo3GvWrt
- Train an LSTM to predict the time series 10 ticks ahead, using "normal" TS as training data (lstm_baseline_nextstep.ipynb).
- Use the trained LSTM to predict all TS from Train and Test, and compute new features from prediction-error statistics.
- Train XGBoost in xgboost_baseline.ipynb (producing xgb_best_4_knn.csv).
- Train ExtraTrees in extratrees_baseline-window-lstm.ipynb (producing et_window_250_lstm.csv).
- Train KNN in KNN_baseline.ipynb (producing knn_best.csv and the final mixed submission knn_xgb_et_RANKS_FINAL_002.csv).
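The name of the final submission (knn_xgb_et_RANKS_FINAL_002.csv) suggests the three model outputs are blended by averaging ranks rather than raw scores. A minimal sketch of such a rank-based blend, with hypothetical example scores (the actual blending weights and code live in KNN_baseline.ipynb and may differ):

```python
import numpy as np

def rank_blend(predictions):
    """Blend model scores by averaging their ranks.

    Ranking makes the blend insensitive to each model's score scale:
    only the ordering of the test files matters.
    """
    # argsort of argsort turns scores into ranks 0..n-1 (ties broken arbitrarily)
    ranks = [np.argsort(np.argsort(p)) for p in predictions]
    blended = np.mean(ranks, axis=0)
    # rescale to [0, 1] so the blend can be read as an anomaly score
    return blended / (len(predictions[0]) - 1)

# Hypothetical per-file anomaly scores from the three models
knn = np.array([0.1, 0.9, 0.4, 0.7])
xgb = np.array([0.2, 0.8, 0.3, 0.9])
et  = np.array([0.0, 0.7, 0.5, 0.6])
blended = rank_blend([knn, xgb, et])
```

For proper tie handling, `scipy.stats.rankdata` could be used instead of the double `argsort`.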
All three models share the same features: various statistics computed over the columns of each file and over their derivatives. All columns of a file belong to the same recording and thus share the same label (1 for anomalous TS, 0 for "normal" TS). ExtraTrees additionally uses "error features" derived from the LSTM predictions.
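The shared features and the LSTM "error features" described above can be sketched as follows. This is a hypothetical reconstruction: the particular statistics (mean, std, min, max) and the error aggregation are assumptions, and the exact feature set used in the notebooks may differ.

```python
import numpy as np

def shared_features(ts):
    """Per-file features shared by KNN, XGBoost, and ExtraTrees.

    ts: array of shape (n_ticks, n_channels) holding one multivariate TS.
    Returns statistics over each column and its first derivative.
    """
    deriv = np.diff(ts, axis=0)  # first difference along the time axis
    parts = []
    for block in (ts, deriv):
        parts += [block.mean(axis=0), block.std(axis=0),
                  block.min(axis=0), block.max(axis=0)]
    return np.concatenate(parts)

def lstm_error_features(ts, predicted):
    """Extra features for ExtraTrees: statistics of the LSTM prediction error.

    predicted: the LSTM's forecast of ts, same shape as ts.
    """
    err = np.abs(ts - predicted)
    return np.concatenate([err.mean(axis=0), err.std(axis=0), err.max(axis=0)])
```

With these feature vectors computed per file, each classifier is trained on one row per file, labeled 0 or 1.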