Terraform Rancher HA on AWS

This project contains Terraform modules that provision a full Rancher HA setup on a secure infrastructure. It was heavily inspired by the work in the following repo: https://github.com/nextrevision/terraform-rancher-ha-example

This project uses Terraform Remote State to store state remotely in an AWS S3 bucket and to share output values across modules. This way the system is split into 3 independent components, which helps with security, maintainability, and composability/reuse.
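As a rough sketch of how this sharing could look, one module might keep its state in S3 while another reads its outputs through the `terraform_remote_state` data source. The bucket name, key, and region below are placeholder assumptions rather than values from this project, and the syntax shown is for Terraform 0.12 or later (older versions use a slightly different form).

```hcl
# In the network module: store state in S3.
# Bucket, key, and region are hypothetical placeholders.
terraform {
  backend "s3" {
    bucket = "example-terraform-states"
    key    = "network/terraform.tfstate"
    region = "eu-west-1"
  }
}

# In a separate configuration (e.g. the compute module):
# read the network module's outputs from the same bucket.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-states"
    key    = "network/terraform.tfstate"
    region = "eu-west-1"
  }
}

# Example use of a shared output (vpc_id is listed under the Network outputs).
locals {
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```

Keeping each module's state under its own key is what lets the three components be planned and applied independently.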

Dependencies

This project depends on:

Versioning

We follow semver to tag our repos

Available Modules

Compute

Rancher HA compute module, composed of:

  • ELB
  • Bastion
  • EC2 instances

Inputs

| Name | Description | Default | Required |
|------|-------------|---------|----------|
| ami | AMI ID | - | yes |
| aws_account_id | AWS account ID to prevent you from mistakenly using an incorrect one (and potentially end up destroying a live environment) | - | yes |
| bucket_name | The name of the S3 bucket | - | yes |
| db_name | The name for your database of up to 8 alpha-numeric characters. If you do not provide a name | - | yes |
| db_pass | Password for the DB user | - | yes |
| db_user | Username for the DB user | - | yes |
| domain | The domain of the certificate to look up | - | yes |
| environment | The environment where we are building the resource | production | no |
| instance_count | The number of instances to create | 3 | no |
| instance_type | The type of instance to start | - | yes |
| key_name | The name of the SSH key to use on the instance, e.g. moltin | - | yes |
| key_path | The path of the public SSH key to use on the instance, e.g. ~/.ssh/id_rsa.pub | - | yes |
| name | The prefix name for all resources | - | yes |
| rancher_version | The version of Rancher to install | - | yes |
| region | The region where all the resources will be created | - | yes |
| role_arn | The ARN of the role to assume | - | yes |
| session_name | The session name to use when making the AssumeRole call | - | yes |
| user_data_path | The path of the template used to generate the user data provided when launching the instance | - | yes |
| vpc_cidr | VPC CIDR block | - | yes |
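The `aws_account_id`, `role_arn`, and `session_name` inputs map naturally onto the AWS provider's `allowed_account_ids` and `assume_role` settings. The following is a minimal sketch of that wiring, assuming the module configures the provider this way; the source does not show the actual provider block.

```hcl
variable "aws_account_id" {}
variable "region" {}
variable "role_arn" {}
variable "session_name" {}

provider "aws" {
  region = var.region

  # Terraform refuses to run against any account not listed here,
  # which guards against applying to the wrong environment.
  allowed_account_ids = [var.aws_account_id]

  # Credentials are obtained by assuming this role.
  assume_role {
    role_arn     = var.role_arn
    session_name = var.session_name
  }
}
```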

Outputs

| Name | Description |
|------|-------------|
| bastion_instance_private_ip | Private IP address to associate with the bastion instance in a VPC |
| bastion_instance_public_ip | The public IP address assigned to the bastion instance |
| bastion_user | User to access bastion |
| elb_dns_name | The DNS name of the ELB |
| elb_zone_id | The canonical hosted zone ID of the ELB (to be used in a Route 53 Alias record) |
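The two ELB outputs are the values a Route 53 Alias record needs. A possible sketch follows, assuming the outputs are passed in as variables; `hosted_zone_id` and `rancher_fqdn` are hypothetical names introduced only for illustration.

```hcl
# Values taken from the compute module outputs.
variable "elb_dns_name" {}
variable "elb_zone_id" {}

# Hypothetical inputs: the hosted zone to write to and the record name to create.
variable "hosted_zone_id" {}
variable "rancher_fqdn" {}

resource "aws_route53_record" "rancher" {
  zone_id = var.hosted_zone_id
  name    = var.rancher_fqdn
  type    = "A"

  # Alias records point at the ELB's DNS name and its own hosted zone ID.
  alias {
    name                   = var.elb_dns_name
    zone_id                = var.elb_zone_id
    evaluate_target_health = false
  }
}
```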

Database

Database module that will create:

  • RDS Cluster
  • Rancher Membership Security Group

Inputs

| Name | Description | Default | Required |
|------|-------------|---------|----------|
| aws_account_id | AWS account ID to prevent you from mistakenly using an incorrect one (and potentially end up destroying a live environment) | - | yes |
| backup_retention_period | The backup retention period. This is the minimum recommended retention period for backups; however, it will depend on our needs | 7 | no |
| bucket_name | The name of the S3 bucket | - | yes |
| db_name | The name for your database of up to 8 alpha-numeric characters. If you do not provide a name | - | yes |
| db_pass | Password for the DB user | - | yes |
| db_user | Username for the DB user | - | yes |
| environment | The environment where we are building the resource | production | no |
| final_snapshot_identifier | The name of your final DB snapshot when this DB cluster is deleted. If omitted, no final snapshot will be made | - | yes |
| name | The prefix name for all resources | - | yes |
| preferred_backup_window | The time window in which backups will be made (HH:mm-HH:mm). We chose a 2-hour window early in the morning, between 2am and 4am, ahead of the maintenance window so the two do not collide and affect each other | 02:00-04:00 | no |
| preferred_maintenance_window | The weekly time range during which system maintenance can occur, in UTC, e.g. wed:04:00-wed:04:30. We have chosen this window as the default because it fits our needs and the most important regions won't be affected | wed:06:00-wed:06:30 | no |
| region | The region where all the resources will be created | - | yes |
| role_arn | The ARN of the role to assume | - | yes |
| session_name | The session name to use when making the AssumeRole call | - | yes |
| skip_final_snapshot | | false | no |
| vpc_cidr | VPC CIDR block | - | yes |

Outputs

| Name | Description |
|------|-------------|
| rds_cluster_endpoint | The DNS address of the RDS instance |
| rds_cluster_port | The port on which the DB accepts connections |
| sg_membership_rancher_id | The ID of the Rancher Membership Security Group |

Network

Network module that will create:

  • VPC
  • Public Subnet
  • Private Subnet
  • Internet Gateway

Inputs

| Name | Description | Default | Required |
|------|-------------|---------|----------|
| aws_account_id | AWS account ID to prevent you from mistakenly using an incorrect one (and potentially end up destroying a live environment) | - | yes |
| name | The prefix name for all resources | - | yes |
| private_subnet_azs | A list of availability zones in which to place the private subnets | - | yes |
| private_subnet_cidrs | A list of CIDR blocks to use in the private subnets | - | yes |
| public_subnet_azs | A list of availability zones in which to place the public subnets | - | yes |
| public_subnet_cidrs | A list of CIDR blocks to use in the public subnets | - | yes |
| region | The region where all the resources will be created | - | yes |
| role_arn | The ARN of the role to assume | - | yes |
| session_name | The session name to use when making the AssumeRole call | - | yes |
| vpc_cidr | VPC CIDR block | - | yes |

Outputs

| Name | Description |
|------|-------------|
| private_subnet_ids | A list of private subnet IDs |
| public_subnet_ids | A list of public subnet IDs |
| vpc_id | The ID of the VPC |
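The paired `*_cidrs` and `*_azs` lists suggest one subnet per list entry. Below is a minimal sketch of how the public side could be built with `count`, assuming both lists have the same length; resource names and tags are illustrative, and the private subnets would follow the same pattern.

```hcl
variable "name" {}
variable "vpc_cidr" {}
variable "public_subnet_cidrs" { type = list(string) }
variable "public_subnet_azs" { type = list(string) }

resource "aws_vpc" "this" {
  cidr_block = var.vpc_cidr

  tags = { Name = var.name }
}

resource "aws_internet_gateway" "this" {
  vpc_id = aws_vpc.this.id
}

# One public subnet per CIDR block, each placed in the AZ at the same list index.
resource "aws_subnet" "public" {
  count             = length(var.public_subnet_cidrs)
  vpc_id            = aws_vpc.this.id
  cidr_block        = var.public_subnet_cidrs[count.index]
  availability_zone = var.public_subnet_azs[count.index]

  tags = { Name = "${var.name}-public-${count.index}" }
}

output "vpc_id" {
  value = aws_vpc.this.id
}

output "public_subnet_ids" {
  value = aws_subnet.public[*].id
}
```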

Upgrade Rancher Server version

At the moment there is no way to upgrade Rancher HA automatically without downtime. We use cloud-config to initialize each instance with a rancher-server container; however, if we change the cloud-config file, Terraform marks the user_data field as changed and therefore destroys the instance.

Terraform has no option to stop the instance and apply the user_data change in place, which would be the approach to follow. Even with lifecycle { create_before_destroy = true } there will be downtime, because the rancher-server container takes time to start working again. So for the moment we need to manually stop each instance, update the Rancher version under AWS Instance Settings -> View/Change User Data, and wait for each instance to be working again before moving on to the next one. You can follow the progress at http://myrancher.com/admin/ha

Ideally we would have a configuration management tool like Ansible to handle this for us, or we could use lifecycle { create_before_destroy = true } on our instances and simply accept a maintenance window to perform the upgrade.
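The second option could look roughly like the sketch below. It assumes the user data template takes a `rancher_version` variable and uses the `templatefile` function from Terraform 0.12 or later; the resource name is illustrative. Newer AWS provider releases may handle user_data changes differently from the behaviour described above, so this should be checked against the provider version in use.

```hcl
variable "ami" {}
variable "instance_type" {}
variable "instance_count" { default = 3 }
variable "rancher_version" {}
variable "user_data_path" {}

resource "aws_instance" "rancher" {
  count         = var.instance_count
  ami           = var.ami
  instance_type = var.instance_type

  # Bumping rancher_version changes the rendered user data. Under the behaviour
  # described above, that change causes Terraform to replace the instance.
  user_data = templatefile(var.user_data_path, {
    rancher_version = var.rancher_version
  })

  # Create the replacement before destroying the old instance. This shortens
  # the gap but does not remove it, since rancher-server takes time to start.
  lifecycle {
    create_before_destroy = true
  }
}
```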

Authors

  • Israel Sotomayor - Initial work - zot24

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License - see the LICENSE file for details

Resources