Infrastructure As Code with Terraform
Creating a virtual server (VS) in the cloud is easy. Open the account dashboard, click a few buttons, and in less than four minutes you can spin up a new server or fire up a managed database cluster.
But then you need to add a load balancer. And a firewall. And another VS. And then repeat all of it twice more for the QA and Dev environments, precisely the same. Now the task is getting complicated, and the dashboard becomes less practical. How would you automate that?
The concept of Infrastructure as Code introduces the idea of managing your infrastructure just as you would software delivery - by creating, testing, deploying, and maintaining code. You get version control, CI/CD, code reviews, and fewer errors.
To create the "code" part, you can write deployment scripts. The downside is the extensive modifications or added checks required whenever a resource needs to be added, removed, or changed. If the script creates one database node and you now need three nodes in total, re-running the same script should result in three nodes, not four.
Another option is to use a tool that automatically takes care of the deltas.
Terraform is one such tool, built for infrastructure automation, specifically provisioning and deployment. Terraform analyzes the resources described in the configuration file, connects to the provider's APIs via plugins, and produces a change plan to create environments per your specifications.
The nice thing about the resource description approach is that Terraform compares what you want (the configuration file) with what you have (the currently managed live environment) and figures out the difference. Making a small modification and re-running the plan will not result in double the number of servers.
Another bonus is the ability to call into multiple providers. For disaster recovery, cost, and security reasons, it may make sense not to tie yourself to a single cloud provider and instead take a multi-cloud approach.
Because provider capabilities vary, it's not possible to use the same configuration file across all hosts, but having the entire configuration in one spot still reduces complexity.
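As a sketch of what that looks like, a single configuration can declare several providers side by side. The AWS provider block and versions below are illustrative, not part of this example:

```hcl
# Hypothetical multi-provider setup: Digital Ocean plus AWS in one configuration.
terraform {
  required_providers {
    digitalocean = {
      source  = "digitalocean/digitalocean"
      version = "1.22.2"
    }
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
}

provider "digitalocean" {}

# Each provider authenticates separately (here, via environment variables).
provider "aws" {
  region = "us-east-1"
}
```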
Terraform CLI vs Cloud
Terraform can be used as an entirely local installation, as Terraform Cloud, or as a hybrid: the CLI installed on a local machine, interacting with the Cloud account.
The last option allows the most flexibility - a Cloud account can be shared with the team, everybody runs the same version, and options not available from the Cloud interface can be accessed via the CLI.
I prefer to use Cloud whenever possible instead of running local installs.
Taking Terraform for a Spin
Let's say I need to create a droplet on Digital Ocean with its own set of SSH keys and a PostgreSQL managed cluster.
Prerequisites
- Digital Ocean account
- Cloud git account (GitHub/BitBucket)
- Terraform Cloud account
Initial Setup
Creating a Repository Project
Terraform can be triggered from the command line, by git repository updates, or via API. For this example, I want to use the repository method: an API trigger is overkill here, and I want plans to be analyzed immediately after commits.
The Terraform configuration file (the "Code" part of Infrastructure as Code) needs to reside in the cloud repository.
- Confirm the latest version of the DigitalOcean provider by checking the documentation on the Terraform website.
- Create a new empty repository with a single main.tf file in it.
- Specify the required provider and version.
Provider setup requires an authentication token, but because we are using Terraform Cloud, the token will be stored in an Environment variable and does not need to be included in the configuration file.
terraform {
  required_providers {
    digitalocean = {
      source  = "digitalocean/digitalocean"
      version = "1.22.2"
    }
  }
}

# setup digital ocean provider
provider "digitalocean" {
}
Creating Digital Ocean Application Token
To call into the Digital Ocean API, Terraform needs to authenticate using an application token.
The token can be generated on the Digital Ocean account page under the API->Tokens/Keys tab. Copy the token, as it needs to be added to the Terraform workspace.
Creating a Terraform Workspace
After creating and verifying the Terraform Cloud account, click on the New Workspace button and specify the newly created repository as a source. Under the Variables tab, add an Environment variable called DIGITALOCEAN_TOKEN and paste the DO application token.
At this point, commits to the git repository will trigger plan analysis by Terraform. When the plan is applied, Terraform will execute the resource creation commands by calling into the Digital Ocean API.
Defining resources
For this simple infrastructure, we are going to create:
- A separate development project (to keep everything organized).
- A new SSH key pair for connecting to the droplet.
- A managed PostgreSQL database instance.
- A droplet. The smallest size is OK.
Creating variables - variables.tf
All resources will be created in the NYC1 region. However, a different region may be chosen later, so it makes sense to make the region a variable instead of copy/pasting the value multiple times in the configuration file.
variable "region" {
  type    = string
  default = "nyc1"
}
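As a small extension (a sketch, not required for this example), the variable could also carry a description and a validation rule - supported in Terraform 0.13 - to catch typos in region slugs early. The allowed-region list here is illustrative:

```hcl
variable "region" {
  type        = string
  default     = "nyc1"
  description = "Digital Ocean region slug used for all regional resources"

  validation {
    condition     = contains(["nyc1", "nyc3", "sfo3"], var.region)
    error_message = "The region must be one of the supported slugs."
  }
}
```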
Creating Output - output.tf
For now, I want the script to print the IP of the new droplet for convenience; that makes it easy to immediately SSH into it and confirm that the keys were created successfully.
Per the DO documentation, ipv4_address is a valid attribute of a droplet resource, so the output file will look like this:
output "web_droplet" {
  value = digitalocean_droplet.web.ipv4_address
}
Creating SSH Public Key
There are a few approaches that can be taken here:
- Create an SSH key pair on the local computer, then provide the public key to Terraform for the creation of a new SSH key on Digital Ocean
- Use the previously created key
- Create an SSH key pair on the fly, then output the private key in the output script and save it to the local computer.
I want to keep separate keys for different environments, so option two won't work. Printing private keys in the output is a potential security issue if my account is ever compromised, so option three is out as well. For this example, I'll go with the first option and generate a new key pair.
ssh-keygen -t rsa -f terraform
The resulting terraform.pub key is then saved to the repository to be referenced by the configuration file.
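For comparison, option three (generating the key pair on the fly) could be sketched with the hashicorp/tls provider. Note the trade-off mentioned above: the generated private key ends up stored in the Terraform state. This block is hypothetical and not used in this example:

```hcl
# Hypothetical on-the-fly key generation (option three).
resource "tls_private_key" "generated" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "digitalocean_ssh_key" "generated_key" {
  name       = "Generated Key"
  public_key = tls_private_key.generated.public_key_openssh
}

# The private key lives in state and would be printed on apply -
# the security concern that rules this option out here.
output "private_key" {
  value = tls_private_key.generated.private_key_pem
}
```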
Creating the main configuration file - main.tf
For more extensive infrastructure, it makes sense to split the data storage and VS definitions into separate files to keep things easy to find. But because this example is small, I'll keep everything in one file.
terraform {
  required_providers {
    digitalocean = {
      source  = "digitalocean/digitalocean"
      version = "1.22.2"
    }
  }
}

# setup digital ocean provider
provider "digitalocean" {
}

# create ssh key
resource "digitalocean_ssh_key" "terraform_key" {
  name       = "Terraform Key"
  public_key = file("./terraform.pub")
}

# vs
resource "digitalocean_droplet" "web" {
  image    = "ubuntu-18-04-x64"
  name     = "web"
  region   = var.region
  size     = "s-1vcpu-1gb"
  ssh_keys = [digitalocean_ssh_key.terraform_key.fingerprint]
}

# database cluster
resource "digitalocean_database_cluster" "postgres" {
  name       = "postgresql-12-datastore"
  engine     = "pg"
  version    = "11"
  size       = "db-s-1vcpu-1gb"
  region     = var.region
  node_count = 1
}

resource "digitalocean_project" "terraform" {
  name        = "Terraform"
  description = "Terraform Project"
  purpose     = "Web Application"
  environment = "Development"
  resources = [
    digitalocean_droplet.web.urn,
    digitalocean_database_cluster.postgres.urn
  ]
}
Performing Infrastructure Change
Analyzing the plan
Once changes are committed, the workspace will attempt to validate the configuration and create a plan. Validation will fail if any misconfiguration errors occur.
Terraform v0.13.2
Configuring remote state backend...
Initializing Terraform configuration...

Error: Unsupported attribute

  on outputs.tf line 2, in output "web_droplet":
   2:   value = digitalocean_droplet.web.ip4_address

This object has no argument, nested block, or exported attribute named
"ip4_address". Did you mean "ipv4_address"?
Terraform produces a plan outline indicating which resources will be added, updated, or deleted; review it for accuracy. An unexpected removal of a vital resource is not a good surprise.
If everything checks out, select "Confirm And Apply" to initialize the execution.
Applying the plan
After a few minutes (it can take up to 5 to create the DB cluster), the process completes, and the droplet's ipv4 address is printed to the console as requested:
Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

Outputs:

web_droplet = xxx.xxx.xxx.xxx
To verify the SSH keys, use the private key to log in:
ssh root@xxx.xxx.xxx.xxx -i /Users/user/.ssh/terraform
Modifying the plan
I now want to add Spaces to enable a CDN. Spaces require a separate key and secret; create them from the Digital Ocean Account->API tab.
Copy both values to Terraform Environment Variables as SPACES_ACCESS_KEY_ID and SPACES_SECRET_ACCESS_KEY
Now a new resource definition needs to be added to main.tf. The resource should belong to our dev project. Because Spaces are currently available in NY Region 3 only, the definition cannot use the existing region variable. For now, I'll hardcode the region to nyc3.
# spaces bucket
resource "digitalocean_spaces_bucket" "terrastorage" {
  name          = "terrastorage"
  region        = "nyc3"
  force_destroy = true
}

resource "digitalocean_project" "terraform" {
  name        = "Terraform"
  description = "Terraform Project"
  purpose     = "Web Application"
  environment = "Development"
  resources = [
    digitalocean_droplet.web.urn,
    digitalocean_database_cluster.postgres.urn,
    digitalocean_spaces_bucket.terrastorage.urn
  ]
}
Committing the file to git triggers Terraform to start analyzing the new plan. The result shows one new resource creation (Spaces) and an in-place update for the project.
  # digitalocean_spaces_bucket.terrastorage will be created
  + resource "digitalocean_spaces_bucket" "terrastorage" {
      + acl                = "private"
      + bucket_domain_name = (known after apply)
      + force_destroy      = true
      + id                 = (known after apply)
      + name               = "terrastorage"
      + region             = "nyc3"
      + urn                = (known after apply)
    }

Plan: 1 to add, 1 to change, 0 to destroy.
After applying the changes, the log confirms the creation and update of resources.
Terraform v0.13.2
Initializing plugins and modules...
digitalocean_spaces_bucket.terrastorage: Creating...
digitalocean_spaces_bucket.terrastorage: Creation complete after 9s [id=terrastorage]
digitalocean_project.terraform: Modifying... [id=f633ed61-5ac5-4ebc-89a4-96840e3466c7]
digitalocean_project.terraform: Modifications complete after 3s [id=f633ed61-5ac5-4ebc-89a4-96840e3466c7]

Apply complete! Resources: 1 added, 1 changed, 0 destroyed.
Note: the Spaces bucket name should not contain underscores. An underscore is resolved to a space when creating the name, which results in an error.
Reviewing the state
After each successful execution, Terraform saves the current state, which can be viewed under the States tab on the Terraform Cloud console. The file contains the list of all managed resources, along with IPs, database user names and passwords, and key references. States are persisted, so you can audit the changes made to the infrastructure after each plan is applied.
This information can be pulled programmatically using the Terraform API to integrate into a dashboard view or infrastructure report. It's nice to have all resources consolidated in one spot instead of building multiple integrations with providers.
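Within Terraform itself, another configuration can read this workspace's outputs through the terraform_remote_state data source. A minimal sketch, with placeholder organization and workspace names:

```hcl
# Hypothetical read-only view of another workspace's state.
data "terraform_remote_state" "infra" {
  backend = "remote"

  config = {
    organization = "my-org"
    workspaces = {
      name = "my-workspace"
    }
  }
}

# Outputs exported by that workspace can then be referenced, e.g.:
# data.terraform_remote_state.infra.outputs.web_droplet
```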
It's worth pointing out the amount of sensitive information stored in the state file. Make sure to turn on 2FA for the Terraform account.
Deleting the resource
To decommission a resource, remove it from the configuration file, commit the change, and wait for Terraform to analyze it. Here is the resulting output after deleting the database cluster:
Terraform will perform the following actions: # digitalocean_database_cluster.postgres will be destroyed
Deleting the plan
Sometimes it makes sense to create new infrastructure and delete it after a short period. Instead of removing all resources from the configuration file, pick the "Destroy Plan" option from Settings->Destruction And Deletion in the Terraform Cloud dashboard menu.
After applying the plan, all resources are removed from the provider.
Apply complete! Resources: 0 added, 0 changed, 4 destroyed.
Additional Options
This example is elementary. For more complex infrastructure, there are additional options:
Modules
Modules allow reusing grouped resources. For example, Dev, QA, and Prod environments may be the same, but reside in different regions and have different VM sizes. Instead of copy/pasting the same code three times in each configuration file, it's possible to create a single module and then import it with environment-specific variables.
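A sketch of what that reuse might look like; the module path and variable names are hypothetical:

```hcl
# Hypothetical module reuse across environments.
module "dev" {
  source       = "./modules/environment"
  region       = "nyc1"
  droplet_size = "s-1vcpu-1gb"
}

module "prod" {
  source       = "./modules/environment"
  region       = "sfo3"
  droplet_size = "s-2vcpu-4gb"
}
```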
Data Sources
Data sources give information on currently existing resources, making it possible to create a configuration plan that combines managed and unmanaged resources.
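For example, a configuration could look up an SSH key that was created by hand in the DO dashboard and reference it when creating a droplet. The key name below is a placeholder:

```hcl
# Hypothetical lookup of an unmanaged, manually created SSH key.
data "digitalocean_ssh_key" "manual" {
  name = "Manually Added Key"
}

# It can then be referenced like a managed resource, e.g.:
# ssh_keys = [data.digitalocean_ssh_key.manual.fingerprint]
```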
Policy as Code
Mistakes and miscommunications do happen. Using Sentinel, it's possible to set the maximum number of VSs created at one time, restrict the regions, only allow plan changes to be applied on the weekend, or require tags on each resource.
Server Provisioning
After a VS is created, it needs to be configured and set up. While Terraform provides Provisioners (which allow executing commands on the server), this approach is not recommended for server configuration. Instead, pass user_data metadata containing a cloud-init file during droplet creation, or use another tool like Ansible.
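A minimal sketch of the user_data approach; the inlined cloud-init payload and package list are illustrative:

```hcl
resource "digitalocean_droplet" "web" {
  image  = "ubuntu-18-04-x64"
  name   = "web"
  region = var.region
  size   = "s-1vcpu-1gb"

  # cloud-init processes this on first boot.
  user_data = <<-EOF
    #cloud-config
    packages:
      - nginx
  EOF
}
```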