- Introduction
- Examples
- Hands-on: Small problem for students
- (Break)
- Decision Trees: Concept and characteristics, Implementation, and Example of application
- Information gain
- Notion of supervised learning
- Classification
- Algorithms and implementation principles: J48, VFDT
- Hands-on with cyber-security data
- First example: without normalization
- Normalization process
- Examples using Weka with and without normalization
- Analysis of results
- Bayesian networks: Concept and characteristics, Implementation, and Example of application
- Probability and inference
- Algorithms and implementation principles: Naive Bayes
- Examples of application in cyber-security using Weka
- Clustering: Concept and characteristics, implementation
- Distance metrics
- Notion of unsupervised learning
- Algorithms and implementation principles: K-means, KNN
- Examples of application in cyber-security using Weka
- Binarization
- Concept and characteristics, the perceptron model
- Feed-Forward (FF) Networks, Multiple Feed-Forward Networks
- Back-propagation and learning
- Deep learning
- Algorithms and implementation principles: Linear regression, MLP
- Implementation and example of application in cyber-security using Weka
- Concept and characteristics: data streams, single pass, concept drift
- Implementation and example of application in cyber-security
- Algorithms and implementation principles: Novelty detection
- Implementation and example of application in cyber-security using MOA
- Phases of knowledge discovery in data (KDD)
- Pre-processing: normalization, binarization
- Data types for cybersecurity: packets, flows, log files
- Public datasets for cybersecurity: CIC-IDS, CTU-13, Kyoto, ISCX-Botnet, ISCX-SlowDoS
- Reduction of false positives, tradeoff between precision and recall
- Ensembles
- Bagging
- Boosting
- An overview of tools
- Tools for learning/testing techniques and algorithms (Weka, MOA)
- Deep learning frameworks (PyTorch, TensorFlow, etc.)
- Big data (Hadoop, Spark, Flink)
- Data Mining: Practical Machine Learning Tools and Techniques, 4th Edition. Ian H. Witten, Eibe Frank, Mark A. Hall, Christopher J. Pal. Morgan Kaufmann, 2017.
- Machine Learning, Tom Mitchell. McGraw-Hill, 1997.
Scientific papers:
- Buczak, Anna L., and Erhan Guven. "A survey of data mining and machine learning methods for cyber security intrusion detection." IEEE Communications Surveys & Tutorials 18.2 (2015): 1153-1176.
- Viegas, Eduardo, et al. "BigFlow: Real-time and reliable anomaly-based intrusion detection for high-speed networks." Future Generation Computer Systems 93 (2019): 473-485.