Feat: Full Proxmox Support (Application Deployment Part) (#28)
* added proxmox to ansible hosts

* fix: try to fix some timing issues with ansible

* fix: fixed error with list vs maps

* fix: fixed a bug that forced gcp to recreate some network related resources on every apply

- added the network URI instead of name or id

* tested state that can deploy all docker images on proxmox

* docs: updated the documentation

* fix: changed default checkout branch to main

* pull again after git checkout in order to ensure all changes are present

Co-authored-by: ciklista <[email protected]>
JulianLegler and ciklista authored Jan 26, 2021
1 parent de5cd85 commit b9d1b25
Showing 8 changed files with 156 additions and 88 deletions.
47 changes: 30 additions & 17 deletions terraform/README.md
@@ -7,11 +7,24 @@ We define one module per cloud provider, with each possibly having more submodul
Create and populate a ``terraform.tfvars`` file in this directory. Use
[terraform.tfvars.example](terraform.tfvars.example) as a template.

Set the ``git_checkout_branch`` local in the top-level ``main.tf`` to the branch that should be checked out on the Ansible deployer machine.

To build the entire infrastructure:
```
terraform apply
```

After the process is finished you should be able to access the following dashboards:
* Grafana Dashboard: http://<hetzner.gateway_ipv4_address>:3000
* CockroachDB Dashboard: http://\<any public ip that is not a gateway\>:8080

A full build of the infrastructure takes over 30 minutes because the Azure gateway is very slow.
You can deploy only part of the infrastructure by using the `-target` option.
The following subset should take less than 30 minutes to complete:
```
terraform apply -target=module.hetzner -target=module.gcp -target=module.tooling
```

### Terraform modules
To use a module, simply call it in your terraform script and provide respective variable names.
This will create all the resources listed in that module when executing ``terraform apply``.
@@ -37,40 +50,44 @@ terraform apply

## Infrastructure
The current infrastructure consists of the following components:
- build on GCP, Azure and Hetzner:
- build on GCP, Azure, Hetzner and Proxmox:
- internal networks
- external network / resinfra net
- worker vms and deployment of distributed service
- build on Hetzner:
- build on Hetzner and Proxmox:
- gateway vm
- builder vm that triggers provisioning
- monitoring
- build only on Hetzner:
- builder vm that triggers provisioning

### Internal networks
On all providers, a new virtual private network is created including a subnet that holds the worker vms. Depending
on the implementation requirements of the specific cloud provider, an additional subnet is required for the gateways
that will connect to other nodes in the resinfra net. All subnets are disjoint as they will be unified to one virtual
network for the resinfra net. The cidr distribution currently looks as follows.
```
azure_cidr = 10.1.0.0/16 (Azure does not allow to add overlapping subnets when creating vpn routes)
azure_vm_subnet_cidr = 10.1.0.0/24
azure_gateway_subnet_cidr = 10.1.1.0/24
azure_cidr = cidrsubnet(var.vpc_cidr, 8, 1) # 10.1.0.0/16 (Azure does not allow to add overlapping subnets when creating vpn routes)
azure_vm_subnet_cidr = cidrsubnet(var.vpc_cidr, 16, 256) # 10.1.0.0/24
azure_gateway_subnet_cidr = cidrsubnet(var.vpc_cidr, 16, 257) # 10.1.1.0/24
gcp_cidr = cidrsubnet(var.vpc_cidr, 8, 2) # 10.2.0.0/16
gcp_vm_subnet_cidr = cidrsubnet(var.vpc_cidr, 16, 512) # 10.2.0.0/24
gcp_cidr = 10.2.0.0/16
gcp_vm_subnet_cidr = 10.2.0.0/24
hetzner_cidr = var.vpc_cidr # 10.0.0.0/8 (Hetzner needs to have all subnets included in the big VPN)
hetzner_vm_subnet_cidr = cidrsubnet(var.vpc_cidr, 16, 768) # 10.3.0.0/24
hetzner_cidr = 10.0.0.0/8 (Hetzner needs to have all subnets included in the big VPN)
hetzner_vm_subnet_cidr = 10.3.0.0/24
proxmox_cidr = cidrsubnet(var.vpc_cidr, 8, 4) # 10.4.0.0/16
proxmox_vm_subnet_cidr = cidrsubnet(local.proxmox_cidr, 8, 0) # 10.4.0.0/24
```
These are defined in the top-level ``main.tf`` file.
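
The comments in the block above can be checked against how `cidrsubnet(prefix, newbits, netnum)` works: it extends the prefix length by `newbits` and selects the `netnum`-th network of that size. A minimal sketch, assuming `var.vpc_cidr` is `10.0.0.0/8` (as the Hetzner comment implies); the local and output names here are illustrative and not taken from the repository:

```hcl
# Sketch only: assumes the VPC range is 10.0.0.0/8, as implied by the comments above.
locals {
  vpc_cidr = "10.0.0.0/8"

  # /8 + 8 new bits = /16; network number 4 selects 10.4.0.0/16.
  example_provider_cidr = cidrsubnet(local.vpc_cidr, 8, 4)

  # /8 + 16 new bits = /24; network number 257 (256 * 1 + 1) selects 10.1.1.0/24.
  example_gateway_subnet = cidrsubnet(local.vpc_cidr, 16, 257)

  # Nesting: /16 + 8 new bits = /24; network number 0 selects 10.4.0.0/24.
  example_vm_subnet = cidrsubnet(local.example_provider_cidr, 8, 0)
}

output "example_cidrs" {
  value = {
    provider_cidr  = local.example_provider_cidr
    gateway_subnet = local.example_gateway_subnet
    vm_subnet      = local.example_vm_subnet
  }
}
```

With 16 new bits on a /8, the network number works out to 256 × (second octet) + (third octet), which is why the VM subnets use 256, 512 and 768.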

### External network (resinfra net)
The network between the internal networks is created by creating site-to-site IPsec tunnels. These tunnels are created
between gateways that then forward requests to the specific machine within their network. As a result, all machines can
reach each other from the set of private subnet cidrs (`10.1.0.0/24, 10.2.0.0/24, 10.3.0.0/24`).
reach each other from the set of private subnet cidrs (`10.1.0.0/24, 10.2.0.0/24, 10.3.0.0/24, 10.4.0.0/24`).
![network](../docs/networking/img/network.png)
For Azure and GCP, managed gateway services are used while a gateway machine is manually created and set up for hetzner
using [StrongSwan](https://wiki.strongswan.org/projects/strongswan).
using [StrongSwan](https://wiki.strongswan.org/projects/strongswan) and for Proxmox using [Libreswan](https://libreswan.org/).

### Worker vms and deployment of cockroachdb
A fixed number of worker vms is created on each cloud provider that will be used to run the distributed service, namely
@@ -82,11 +99,7 @@ vm is created on one of the providers (currently Hetzner) to trigger all necessa
Monitoring is set up through the builder vm and ansible scripts. A docker container running the
[node exporter](https://prometheus.io/docs/guides/node-exporter/) is installed on all worker vms. On the monitoring
machine a prometheus instance is installed to collect metrics from respective targets. Metrics are then
displayed in Grafana.

## Caveats
The modules [aws](modules/aws) and [proxmox](modules/proxmox) modules are still in development and therefore
not referenced in the top-level ``main.tf`` file.
displayed in Grafana.



68 changes: 37 additions & 31 deletions terraform/main.tf
@@ -14,6 +14,9 @@ locals {

path_private_key = "~/.ssh/ri_key"
path_public_key = "~/.ssh/ri_key.pub"

// This branch will get checked out on the ansible deployer machine
git_checkout_branch = "main"
}

module "hetzner" {
@@ -35,27 +38,27 @@ module "hetzner" {
}

module "azure" {
source = "./modules/azure"
subscription_id = var.subscription_id
client_id = var.client_id
client_secret = var.client_secret
tenant_id = var.tenant_id
location = "eastus"
vm_size = "Standard_D2s_v3" # Standard_D2s_v3, Standard_B2s | For more info https://azureprice.net/
path_private_key = local.path_private_key
path_public_key = local.path_public_key
azure_gateway_subnet_cidr = local.azure_gateway_subnet_cidr
azure_vm_subnet_cidr = local.azure_vm_subnet_cidr
azure_vpc_cidr = local.azure_cidr
gcp_gateway_ipv4_address = module.gcp.gcp_gateway_ipv4_address
gcp_vm_subnet_cidr = local.gcp_vm_subnet_cidr
hcloud_gateway_ipv4_address = module.hetzner.gateway_ipv4_address
hcloud_vm_subnet_cidr = local.hetzner_vm_subnet_cidr
source = "./modules/azure"
subscription_id = var.subscription_id
client_id = var.client_id
client_secret = var.client_secret
tenant_id = var.tenant_id
location = "eastus"
vm_size = "Standard_D2s_v3" # Standard_D2s_v3, Standard_B2s | For more info https://azureprice.net/
path_private_key = local.path_private_key
path_public_key = local.path_public_key
azure_gateway_subnet_cidr = local.azure_gateway_subnet_cidr
azure_vm_subnet_cidr = local.azure_vm_subnet_cidr
azure_vpc_cidr = local.azure_cidr
gcp_gateway_ipv4_address = module.gcp.gcp_gateway_ipv4_address
gcp_vm_subnet_cidr = local.gcp_vm_subnet_cidr
hcloud_gateway_ipv4_address = module.hetzner.gateway_ipv4_address
hcloud_vm_subnet_cidr = local.hetzner_vm_subnet_cidr
proxmox_gateway_ipv4_address = module.proxmox.gateway_ipv4_address
proxmox_vm_subnet_cidr = local.proxmox_vm_subnet_cidr
shared_key = var.shared_key
prefix = var.prefix
instances = var.instances
proxmox_vm_subnet_cidr = local.proxmox_vm_subnet_cidr
shared_key = var.shared_key
prefix = var.prefix
instances = var.instances
}

module "gcp" {
@@ -101,15 +104,18 @@ module "proxmox" {
}

module "tooling" {
source = "./modules/hetzner/tooling"
hcloud_token = var.hcloud_token
path_private_key = local.path_private_key
path_public_key = local.path_public_key
prefix = var.prefix
azure_worker_hosts = module.azure.azure_private_ip_addresses
gcp_worker_hosts = module.gcp.gcp_private_ip_addresses
hetzner_worker_hosts = module.hetzner.hcloud_private_ip_addresses
hetzner_subnet_id = module.hetzner.hcloud_subnet_id
hcloud_ssh_key_id = module.hetzner.hcloud_ssh_key_id
strongswan_ansible_updated = module.hetzner.ansible_strongswan_updated
source = "./modules/hetzner/tooling"
git_checkout_branch = local.git_checkout_branch
hcloud_token = var.hcloud_token
path_private_key = local.path_private_key
path_public_key = local.path_public_key
prefix = var.prefix
proxmox_worker_hosts = module.proxmox.proxmox_private_ip_addresses
azure_worker_hosts = module.azure.azure_private_ip_addresses
gcp_worker_hosts = module.gcp.gcp_private_ip_addresses
hetzner_worker_hosts = module.hetzner.hcloud_private_ip_addresses
hetzner_subnet_id = module.hetzner.hcloud_subnet_id
hcloud_ssh_key_id = module.hetzner.hcloud_ssh_key_id
hcloud_strongswan_ansible_updated = module.hetzner.ansible_strongswan_updated
proxmox_strongswan_ansible_updated = module.proxmox.ansible_strongswan_updated
}
40 changes: 20 additions & 20 deletions terraform/modules/gcp/main.tf
@@ -27,13 +27,13 @@ resource "google_compute_subnetwork" "vms" {
name = "${var.prefix}-internal-${random_id.id.hex}"
ip_cidr_range = var.gcp_subnet_cidr
region = var.gcp_region
network = google_compute_network.main.id
network = google_compute_network.main.self_link
}

# create firewall rule for port 22 (ssh)
resource "google_compute_firewall" "allow_ssh" {
name = "${var.prefix}-network-internal-allow-ssh-${random_id.id.hex}"
network = google_compute_network.main.name
network = google_compute_network.main.self_link


allow {
@@ -47,7 +47,7 @@ resource "google_compute_firewall" "allow_ssh" {
# create firewall rule for ports 80 and 443 (HTTP/HTTPS)
resource "google_compute_firewall" "allow_internet" {
name = "${var.prefix}-network-internal-allow-internet-${random_id.id.hex}"
network = google_compute_network.main.name
network = google_compute_network.main.self_link


allow {
@@ -61,7 +61,7 @@ resource "google_compute_firewall" "allow_internet" {
# allow all internal traffic (10.0.0.0/8)
resource "google_compute_firewall" "allow_internal" {
name = "${var.prefix}-network-internal-allow-internal-${random_id.id.hex}"
network = google_compute_network.main.name
network = google_compute_network.main.self_link


allow {
@@ -74,7 +74,7 @@ resource "google_compute_firewall" "allow_internal" {
# allow icmp
resource "google_compute_firewall" "allow_icmp" {
name = "${var.prefix}-network-internal-allow-icmp-${random_id.id.hex}"
network = google_compute_network.main.name
network = google_compute_network.main.self_link


allow {
@@ -100,7 +100,7 @@ resource "google_compute_address" "gateway_ip_address" {
# Create classic VPN
resource "google_compute_vpn_gateway" "main" {
name = "${var.prefix}-vpn"
network = google_compute_network.main.id
network = google_compute_network.main.self_link
}

# create VPN forwarding routes
@@ -111,23 +111,23 @@ resource "google_compute_forwarding_rule" "fr_esp" {
name = "fr-esp"
ip_protocol = "ESP"
ip_address = google_compute_address.gateway_ip_address.address
target = google_compute_vpn_gateway.main.id
target = google_compute_vpn_gateway.main.self_link
}

resource "google_compute_forwarding_rule" "fr_udp500" {
name = "fr-udp500"
ip_protocol = "UDP"
port_range = "500"
ip_address = google_compute_address.gateway_ip_address.address
target = google_compute_vpn_gateway.main.id
target = google_compute_vpn_gateway.main.self_link
}

resource "google_compute_forwarding_rule" "fr_udp4500" {
name = "fr-udp4500"
ip_protocol = "UDP"
port_range = "4500"
ip_address = google_compute_address.gateway_ip_address.address
target = google_compute_vpn_gateway.main.id
target = google_compute_vpn_gateway.main.self_link
}

# Create the tunnel & route traffic to remote networks through tunnel
@@ -137,7 +137,7 @@ resource "google_compute_vpn_tunnel" "azure_tunnel" {
peer_ip = var.azure_gateway_ipv4_address
shared_secret = var.shared_key

target_vpn_gateway = google_compute_vpn_gateway.main.id
target_vpn_gateway = google_compute_vpn_gateway.main.self_link
local_traffic_selector = [var.gcp_subnet_cidr]
remote_traffic_selector = [var.azure_subnet_cidr]

@@ -149,10 +149,10 @@ resource "google_compute_vpn_tunnel" "azure_tunnel" {
}
resource "google_compute_route" "azure-route" {
name = "${var.prefix}-azure-route-${random_id.id.hex}"
network = google_compute_network.main.name
network = google_compute_network.main.self_link
dest_range = var.azure_subnet_cidr

next_hop_vpn_tunnel = google_compute_vpn_tunnel.azure_tunnel.id
next_hop_vpn_tunnel = google_compute_vpn_tunnel.azure_tunnel.self_link
}

# Create the tunnel & route traffic to remote networks through tunnel
@@ -162,7 +162,7 @@ resource "google_compute_vpn_tunnel" "hetzner_tunnel" {
peer_ip = var.hetzner_gateway_ipv4_address
shared_secret = var.shared_key

target_vpn_gateway = google_compute_vpn_gateway.main.id
target_vpn_gateway = google_compute_vpn_gateway.main.self_link
local_traffic_selector = [var.gcp_subnet_cidr]
remote_traffic_selector = [var.hetzner_subnet_cidr]

@@ -174,10 +174,10 @@ resource "google_compute_vpn_tunnel" "hetzner_tunnel" {
}
resource "google_compute_route" "hetzner-route" {
name = "${var.prefix}-hetzner-route-${random_id.id.hex}"
network = google_compute_network.main.name
network = google_compute_network.main.self_link
dest_range = var.hetzner_subnet_cidr

next_hop_vpn_tunnel = google_compute_vpn_tunnel.hetzner_tunnel.id
next_hop_vpn_tunnel = google_compute_vpn_tunnel.hetzner_tunnel.self_link
}

# Create the tunnel & route traffic to remote networks through tunnel
@@ -187,7 +187,7 @@ resource "google_compute_vpn_tunnel" "proxmox_tunnel" {
peer_ip = var.proxmox_gateway_ipv4_address
shared_secret = var.shared_key

target_vpn_gateway = google_compute_vpn_gateway.main.id
target_vpn_gateway = google_compute_vpn_gateway.main.self_link
local_traffic_selector = [var.gcp_subnet_cidr]
remote_traffic_selector = [var.proxmox_subnet_cidr]

@@ -199,10 +199,10 @@ resource "google_compute_vpn_tunnel" "proxmox_tunnel" {
}
resource "google_compute_route" "proxmox-route" {
name = "${var.prefix}-proxmos-route-${random_id.id.hex}"
network = google_compute_network.main.name
network = google_compute_network.main.self_link
dest_range = var.proxmox_subnet_cidr

next_hop_vpn_tunnel = google_compute_vpn_tunnel.proxmox_tunnel.id
next_hop_vpn_tunnel = google_compute_vpn_tunnel.proxmox_tunnel.self_link
}

/*
@@ -238,8 +238,8 @@ resource "google_compute_instance" "worker_vm" {
}

network_interface {
network = google_compute_network.main.id
subnetwork = google_compute_subnetwork.vms.id
network = google_compute_network.main.self_link
subnetwork = google_compute_subnetwork.vms.self_link
access_config {
nat_ip = google_compute_address.static[count.index].address
}
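
The commit message attributes the repeated re-creation of GCP network resources to referencing the network by name or id, and this diff switches those references to `self_link`, the resource's full URI. A minimal sketch of the pattern, with hypothetical resource names, region, and CIDR chosen for illustration (none are taken from the repository):

```hcl
# Sketch only: assumes a configured "google" provider; all names and values are hypothetical.
resource "google_compute_network" "example" {
  name                    = "example-network"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "example" {
  name          = "example-subnet"
  ip_cidr_range = "10.2.0.0/24"
  region        = "europe-west3"

  # Reference the network by its full URI rather than by name or id.
  network = google_compute_network.example.self_link
}
```

Whether this resolves a perpetual diff depends on the provider version in use, so running `terraform plan` twice after an apply is a reasonable way to confirm that no resources are scheduled for replacement.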
3 changes: 3 additions & 0 deletions terraform/modules/hetzner/tooling/cockroach_host.ini.tpl
@@ -12,6 +12,9 @@ ${host}
%{ for host in hetzner_hosts ~}
${host}
%{ endfor ~}
%{ for host in proxmox_hosts ~}
${host}
%{ endfor ~}

[deployer_server]
${deployer_vm}
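
The added loop uses Terraform's template directives (`%{ for ... ~}` and `%{ endfor ~}`), where the trailing `~` strips the whitespace and newline that follow the directive. A minimal self-contained sketch of the same pattern in a heredoc, assuming a hypothetical list of Proxmox worker addresses and a hypothetical section name:

```hcl
locals {
  # Hypothetical worker addresses, for illustration only.
  example_proxmox_hosts = ["10.4.0.2", "10.4.0.3"]

  # Renders one line per host, mirroring the loop added to the template.
  example_inventory = <<EOT
[example_workers]
%{ for host in local.example_proxmox_hosts ~}
${host}
%{ endfor ~}
EOT
}

output "example_inventory" {
  value = local.example_inventory
}
```

In the repository the same directives live in `cockroach_host.ini.tpl`, which is presumably rendered with the host lists passed in from the `tooling` module; the rendering call itself is not part of this diff, so that is an assumption.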