Terraform stack to deploy ELK Threat Hunting on Amazon AWS.

apolloclark/argos

Argos

Description

WARNING: LAUNCHING THIS WILL COST YOU MONEY

Argos is an end-to-end encrypted, auto-scaling, multi-tier LAMP web stack on AWS, with ELK metrics and log monitoring, osquery integration, and multiple AWS security features. It lets groups deploy a secured web stack and perform threat hunting. It is deployed with:

Security requirements for:

Components:

The firewall rules are set to only allow your personal IP address. You can customize the configuration by changing the var.yml files in the ansible folders. The central Terraform config file is here.
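
Because the firewall rules are scoped to a single address, it can help to check what that address looks like as a CIDR block before deploying. The snippet below is only a sketch and is not taken from the project's scripts: `to_cidr` is a hypothetical helper, and the commented usage line assumes the AWS-hosted https://checkip.amazonaws.com service for looking up your public IPv4 address.

```shell
# Hypothetical helper: turn a bare IPv4 address into a single-host CIDR block.
to_cidr() {
  ip="$1"
  # Accept only four dot-separated groups of 1-3 digits (no range check).
  if printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    printf '%s/32\n' "$ip"
  else
    printf 'not an IPv4 address: %s\n' "$ip" >&2
    return 1
  fi
}

# Usage (needs network access, so it is left commented out):
# to_cidr "$(curl -s https://checkip.amazonaws.com)"
```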




Before You Begin

If you haven't already configured the AWS CLI, or another SDK, on the machine where you will run Terraform, follow these instructions to set up the AWS CLI and create a credential profile, which Terraform will use for authentication: http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
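
One way to confirm that a credential profile is in place is to ask AWS which identity the current credentials map to. This is a minimal sketch, not part of the project: `check_aws_credentials` is a hypothetical helper name, and it relies on `aws sts get-caller-identity`, which fails when no valid credentials are configured.

```shell
# Hypothetical helper: print the ARN of the identity behind the current
# AWS credentials, or fail if the CLI or the credentials are missing.
check_aws_credentials() {
  if ! command -v aws >/dev/null 2>&1; then
    echo "aws CLI not found on PATH" >&2
    return 1
  fi
  aws sts get-caller-identity --query 'Arn' --output text
}

# Usage:
# check_aws_credentials || echo "configure a credential profile first"
```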


Single-step Deploy

# you trust rando scripts from the internet, right? 😎
curl -s https://raw.githubusercontent.com/apolloclark/argos/master/single_step_deploy.sh | bash -

Deploy

# install Terraform, Packer, Ansible, Serverspec

# create an EC2 keypair named "packer"
aws ec2 create-key-pair --key-name packer --query "KeyMaterial" \
   --output text > ~/.ssh/packer.pem

# configure key file permissions
chmod 0600 ~/.ssh/packer.pem

# add the newly created key to the keychain
ssh-add ~/.ssh/packer.pem
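
The first comment in the Deploy steps asks for several tools to be installed. A quick pre-flight check could look like the sketch below. `missing_tools` is a hypothetical helper, and the binary names in the usage line are my assumption about what needs to be on the PATH, not a list taken from the project.

```shell
# Hypothetical helper: print every named command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage: an empty result means every listed command was found.
# missing_tools terraform packer ansible-playbook aws git
```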



# download project, and the nested submodules
git clone --recurse-submodules https://github.com/apolloclark/argos
cd ./argos/aws-ec2

# use Packer to build the AWS AMIs, takes ~ 40 minutes
./build_packer_aws.sh

# deploy AWS infrastructure with Terraform, takes ~ 5 minutes
./build_terraform.sh



# Add the packer.pem to the local ssh agent
chmod 0600 ~/.ssh/packer.pem
ssh-add -k ~/.ssh/packer.pem
ssh-add -l | grep "packer"

# Retrieve the Bastion host public IP, and the Kafka and Logstash host private IPs
export BASTION_IP=$(aws ec2 describe-addresses --filters 'Name=tag:Name,Values=tf-bastion_eip' --query 'Addresses[].PublicIp' --output text);
export KAFKA_IP=$(aws ec2 describe-addresses --filters 'Name=tag:Name,Values=tf-kafka_eip' --query 'Addresses[].PrivateIpAddress' --output text);
export LOGSTASH_IP=$(aws ec2 describe-addresses --filters 'Name=tag:Name,Values=tf-logstash_eip' --query 'Addresses[].PrivateIpAddress' --output text);
printenv | grep "_IP"

# SSH into the Bastion host, forwarding the packer key
ssh -A ubuntu@$BASTION_IP

# check the user data startup script log
sudo nano /var/log/cloud-init-output.log
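
The `aws ec2 describe-addresses` lookups above can come back empty, for example when no address matches the tag filter, which leaves the variable blank and makes the later `ssh` commands fail in confusing ways. The sketch below guards against that. `require_env` is a hypothetical helper and not part of the project.

```shell
# Hypothetical helper: fail if any named environment variable is unset or empty.
require_env() {
  status=0
  for name in "$@"; do
    # printenv only sees exported variables, so export them first.
    if [ -z "$(printenv "$name")" ]; then
      printf 'missing or empty: %s\n' "$name" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage:
# require_env BASTION_IP KAFKA_IP LOGSTASH_IP && ssh -A ubuntu@"$BASTION_IP"
```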



# SSH into the Bastion host, creating a tunnel to view Kibana
ssh -L 5601:$KAFKA_IP:5601 ubuntu@$BASTION_IP

# Open a Browser, to view Kibana
google-chrome 127.0.0.1:5601

Update

# update submodules
git submodule update --recursive --remote

Custom website

# edit the contents of the webapp "EC2 User Data" startup script:
https://github.com/apolloclark/argos/blob/master/terraform/webapp/userdata.sh#L21

OR

# fork this repo, and have your website baked into the base EC2 webapp AMI
https://github.com/apolloclark/packer-aws-webapp/blob/master/ansible/playbook.yml




Network Diagram

AWS Network Diagram


Overview

Terraform project for deploying and monitoring a multi-tier webservice including:

Roadmap:






Tech Debt

Log Files

authlog

nano /var/log/auth.log

apache

service apache2 status | cat
nano /var/log/apache2/access.log
nano /var/log/apache2/audit.log
nano /var/log/apache2/error.log

mysql

nano /var/log/mysql/audit.log

osquery

nano /var/log/osquery/osqueryd.results.log
nano /var/log/osquery/osqueryd.INFO
nano /var/log/osquery/osqueryd.WARNING

Filebeat

service filebeat status | cat
/usr/share/filebeat/bin/filebeat version
nano /etc/filebeat/filebeat.yml
nano /var/log/filebeat/filebeat.log
tail -f /var/log/filebeat/filebeat.log

Metricbeat

service metricbeat status | cat
/usr/share/metricbeat/bin/metricbeat version
nano /etc/metricbeat/metricbeat.yml
nano /var/log/metricbeat/metricbeat.log

Heartbeat

service heartbeat status | cat
/usr/share/heartbeat/bin/heartbeat version
nano /etc/heartbeat/heartbeat.yml
nano /var/log/heartbeat/heartbeat.log
tail -f /var/log/heartbeat/heartbeat.log

Packetbeat

service packetbeat status | cat
/usr/share/packetbeat/bin/packetbeat version
nano /etc/packetbeat/packetbeat.yml
nano /var/log/packetbeat/packetbeat.log
tail -f /var/log/packetbeat/packetbeat.log

Auditbeat

service auditbeat status | cat
/usr/share/auditbeat/bin/auditbeat version
nano /etc/auditbeat/auditbeat.yml
nano /var/log/auditbeat/auditbeat
tail -f /var/log/auditbeat/auditbeat

Logstash

service logstash status | cat
/usr/share/logstash/bin/logstash --version
nano /etc/logstash/logstash.yml
nano /var/log/logstash/logstash-plain.log
tail -f /var/log/logstash/logstash-plain.log

Elasticsearch

# Elasticsearch 5.x cheat sheet
# https://gist.github.com/apolloclark/c9eb0c1a01798ac2e48492ceeb367a4f

service elasticsearch status
/usr/share/elasticsearch/bin/elasticsearch --version
nano /etc/elasticsearch/elasticsearch.yml
nano /var/log/elasticsearch/elasticsearch.log
tail -f /var/log/elasticsearch/elasticsearch.log

# list indices
curl -XGET 'http://127.0.0.1:9200/_cat/indices?v'
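
Alongside listing indices, the cluster health color is often the first thing worth checking. This is a small sketch using the Elasticsearch `_cat/health` endpoint. `es_health` is a hypothetical helper, and it assumes Elasticsearch answers on port 9200 without authentication, as the command above does.

```shell
# Hypothetical helper: print the cluster health color (green, yellow, or red).
# Prints an empty line if the request fails or returns nothing.
es_health() {
  host="${1:-127.0.0.1}"
  curl -s "http://${host}:9200/_cat/health?h=status" | tr -d '[:space:]'
  echo
}

# Usage:
# es_health            # local node
# es_health 10.0.1.20  # placeholder address for another node
```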

Kibana

service kibana status
/usr/share/kibana/bin/kibana --version
nano /etc/kibana/kibana.yml
nano /var/log/kibana/kibana.log
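
The per-service status commands above can be rolled into one loop when you just want an overview. The sketch below is not part of the project: `report_services` is a hypothetical helper, and it assumes `service <name> status` exits with 0 only when the service is running, which is the usual convention but worth confirming on your image.

```shell
# Hypothetical helper: print "<name>: running" or "<name>: NOT running"
# for each service name given, based on the exit code of "service ... status".
report_services() {
  for svc in "$@"; do
    if service "$svc" status >/dev/null 2>&1; then
      printf '%s: running\n' "$svc"
    else
      printf '%s: NOT running\n' "$svc"
    fi
  done
}

# Usage:
# report_services filebeat metricbeat heartbeat packetbeat auditbeat
```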




VM Images, Ansible Roles

packer-aws-beats


  • packer-aws-kafka
    • Kafka node
  • packer-aws-logstash
    • Logstash node
  • packer-aws-es_node
    • Elasticsearch node

  • packer-aws-repo
    • gitlab
    • apt-get repo
  • packer-aws-builder
    • Jenkins
    • Packer
  • packer-aws-builder_node
    • Jenkins node
  • packer-aws-stresser
    • JMeter
  • packer-aws-stresser_node
    • JMeter node

  • packer-aws-http_cache
    • Varnishcache
  • packer-aws-tcp_proxy
    • HAProxy
  • packer-aws-sql_proxy
    • ProxySQL


Dashboards

1. Inventory Management

If you want to protect a network, you need to protect everything everywhere. Attacks target the oldest and most obscure systems on your network.

2. Access Management

Even if you're running the latest patched version of everything, a default login, weak login, or reused login is easily exploited.

3. Patch Management

If you're running End-of-Life (EOL) operating systems, or packages that have gone more than 1 month without updates, you are likely exposed to known CRITICAL and HIGH severity vulnerabilities.

4. Configuration Management

Accounting for everything, keeping logins secure, and patching everything, is useless if the service is configured in an insecure way. All configuration changes need to be tracked.

5. Metrics and Logging

How will you know something is being attacked if you're not monitoring it? Logs provide a lot of value, and should be easily correlated with metrics to add more context to the event.

6. Alerts

Baseline the system, and create alerts for anything that's out of the ordinary.

7. Automated Remediation

After the alerts have been proven reliable, responses can be automated.


Secrets Management

Deploying infrastructure at scale requires automation. Hostnames, IP addresses, usernames, passwords, security certificates, keys, and other credentials should not be hardcoded into VM images. They should be easy to revoke, and all access to them should be monitored and logged. There are multiple services available, such as:

Initially I was going to use the AWS Key Management Service, but it has a limit of 1024 bytes per key, which is too small for things like SSL certificates. Instead I chose the AWS Systems Manager Parameter Store (released Jan 2017), which allows up to 4096 bytes.
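
Given the size limit mentioned above, it may be worth checking a file before trying to store it. The sketch below is an illustration, not the project's actual process: `fits_parameter_store` is a hypothetical helper, the 4096-byte figure is the one quoted above, and the parameter name and file name in the usage lines are placeholders.

```shell
# Size limit quoted above for a Parameter Store value, in bytes.
PARAM_MAX_BYTES=4096

# Hypothetical helper: succeed only if the file fits within the limit.
fits_parameter_store() {
  # Arithmetic expansion strips any padding that wc adds on some systems.
  size=$(($(wc -c < "$1")))
  [ "$size" -le "$PARAM_MAX_BYTES" ]
}

# Usage (placeholder names; requires configured AWS credentials):
# fits_parameter_store server.crt && \
#   aws ssm put-parameter --name "/example/ssl/cert" --type SecureString \
#     --value "file://server.crt"
# aws ssm get-parameter --name "/example/ssl/cert" --with-decryption \
#   --query 'Parameter.Value' --output text
```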

Process:

Deployment Details

Steps:

  • VPC
  • Security Groups
  • KMS
    • custom key
  • IAM
    • Roles
    • Policy Documents
    • Role Policies
    • Instance Profiles
  • RDS Aurora
  • ELK Master
    • Security Groups
    • Elastic IP
    • EC2 Instance
  • Parameter Store
    • save RDS config
    • save ELK config
  • Bastion host
    • Security Groups
    • Elastic IP
    • EC2 Instance
  • Webapp ASG
    • Security Groups
    • Application Load Balancer
    • Launch Configuration
    • Auto Scaling Group
    • Auto Scaling Triggers
  • Private Network


Project Values

I've been developing websites since 2001. Here are the values I've built this project around:

1. Measurable value

  • usage analytics
  • time spent on feature
  • completed transactions
  • monthly, weekly, daily, hourly reports

2. Maintainable

  • automated code quality tests
  • easy for a new engineer to use
  • single step deploy
  • follows RFC standards
  • follows programming language standards
  • follows OS standards
  • code is less than 80 chars wide
  • use as few lines as possible
  • use Ansible instead of Bash
  • Bash scripts are less than 50 lines
  • Packer scripts are less than 100 lines
  • documentation is 1/4 or more of code
  • documentation is standardized, consistent
  • decompose components into separate projects
  • doesn't use complex lambda, regex, or obscure functions
  • uses popular libraries
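
The width rule in the list above is easy to check mechanically. This is a minimal sketch, assuming "less than 80 chars wide" means a line of 80 or more characters is a violation. `long_lines` is a hypothetical helper and not part of the project's tests.

```shell
# Hypothetical helper: print "file:line" for every line that is
# 80 characters or wider in the given files.
long_lines() {
  awk 'length($0) >= 80 { print FILENAME ":" FNR }' "$@"
}

# Usage:
# long_lines build_packer_aws.sh build_terraform.sh
```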

3. Resilient

  • log monitoring
  • input validation
  • error condition handling
  • withstands "Big List of Naughty Strings"
  • error logging, alerts

4. Performant

  • metrics monitoring, alerts
  • withstands simulated peak usage
  • auto-scales

5. Secure

  • security log monitoring, alerts
  • access control
  • no bypass without authentication
  • doesn't allow privileged access between users
  • no SQL injection
  • no XSS


Contact

twitter - @apolloclark
email - [email protected]

I live in Boston, so I am generally available 9 AM EST to 5 PM EST, Mon - Fri.


References
