This aims to be a gentle introduction to Infrastructure as Code using HashiCorp’s Terraform. All examples herein target Amazon’s AWS, as it is the most widely used Infrastructure-as-a-Service provider (with roughly 33% market share, according to Synergy Research Group), but Terraform supports many other providers (among them Google Cloud Platform and Microsoft’s Azure). Before we dive into some code examples, we should briefly talk about what we mean when we say “Infrastructure as Code”.

Wikipedia says the term refers to “the process of managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools.” That’s a mouthful; let me paraphrase: rather than configuring our machines, databases, load balancers and certificates (and whatever else we manage) manually (via a web interface or CLI), we express our desired infrastructure as code (that’s where the name comes from!).

Now, this code may be written using different paradigms. Terraform’s paradigm is declarative, which means that rather than expressing how Terraform should accomplish our desired state, we only define the end state and let Terraform worry about how to get there (kinda like telling the taxi driver the destination and letting them handle the rest).

Before we start, you need to install Terraform; there are several ways of obtaining the Terraform binary depending on the OS you’re using, but if there’s no package for your OS or your OS doesn’t use a package manager, there’s always the Terraform downloads page. Make sure the binary is in your PATH so you can execute it comfortably.
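To verify the installation, you can run terraform version; it should print the version of the binary you just installed:

$ terraform version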

Before I lose you by rambling too much, here’s an example: let’s assume I was planning on installing WordPress for my personal blog (sorry, Medium!). The requirements call for a web server and a database server; on AWS, these could correspond to an EC2 instance and an RDS instance. Rather than risking tenosynovitis using AWS’ web interface, I decide to rely on Terraform to accomplish this task.

In Terraform, components of our infrastructure we’d like to manage are called “resources”. You declare every resource in a separate block, like so:

resource "aws_instance" "web-server" {
  ami           = "ami-0f5dbc86dd9cbf7a8"
  instance_type = "t2.micro"
}

Let’s dissect this: the first line defines the type of resource we’d like to manage (in this case aws_instance, which is Terraform-speak for an EC2 instance) and an internal name, which we can use to refer to this resource in our Terraform code. Resources have attributes, both required and optional ones. In this case, we supply Terraform with the ID of the AMI our instance should be launched from (hardcoding the AMI is bad practice, as AMI IDs are region-specific, but this is just an example) and the instance type.

Now that we’ve picked up steam, let’s quickly define the second resource, the RDS instance:

resource "aws_db_instance" "db" {
  allocated_storage = 10
  engine            = "postgres"
  instance_class    = "db.t2.small"
  username          = "superuser"
  password          = "thisismyverysecurepasswordplsdon'thackme"
}

As we’ve learned, we define an aws_db_instance resource with the internal name db. But we’ve got lots of attributes; all of them are required if we wish to define an instance this way (which is a bad way for several reasons, most importantly because we hardcode the superuser’s credentials in plaintext — ouch!). Apart from the credentials, we tell Terraform that this instance should use the PostgreSQL engine and have 10 gibibytes (10 × 2³⁰ bytes, or roughly 10.7 GB) of storage allocated to it.

So far so good, but how do we get from code to infrastructure? Before we can use Terraform productively, it needs some information about the provider we use, like the region our resources should live in by default or, more importantly, what access credentials to use when accessing AWS.

As these credentials should be handled very carefully, I don’t feel comfortable presenting even an example where they’re provided inline. Suffice it to say, there are two other ways to provide Terraform with credentials: via environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) or by having Terraform read them from a credentials file (which, by default, lives at $HOME/.aws/credentials); AWS provides a short article on how to use the credentials file.
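For reference, the credentials file uses a simple INI-like format; the values below are AWS’s official documentation placeholders, not real keys:

[default]
aws_access_key_id     = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXYtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY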

Now that we’ve talked about the sensitive topic of credential management, let’s provide Terraform with some basic information like the region and what version of Terraform we require:

provider "aws" {
  region = "us-west-1"
}

terraform {
  required_version = "> 0.8.0"
}

Throwing everything together, we get the following file, which we save as wordpress.tf:

provider "aws" {
  region = "us-west-1"
}

terraform {
  required_version = "> 0.8.0"
}

resource "aws_instance" "web-server" {
  ami           = "ami-0f5dbc86dd9cbf7a8"
  instance_type = "t2.micro"
}

resource "aws_db_instance" "db" {
  allocated_storage = 10
  engine            = "postgres"
  instance_class    = "db.t2.small"
  username          = "superuser"
  password          = "thisismyverysecurepasswordplsdon'thackme"
}

So, again: how do we make this run? Terraform by itself doesn’t understand how to interact with the AWS API; each provider’s functionality is provided by what Terraform calls a plugin. Luckily, Terraform downloads the AWS plugin automatically when you run terraform init. After the download completes successfully, we can run terraform plan, which is a dry run: it lists the changes Terraform would make once applied. It shows you a diff: + signifies creating resources, ~ signifies modifying existing resources and - signifies deleting resources. That’s right: Terraform supports modifying existing resources to match your requirements. It does so by keeping a state of the resources it manages and only deleting and recreating resources when it absolutely has to.

By default, Terraform saves its state in a file called terraform.tfstate. If you’re the only one managing your infrastructure, that’s fine; if more people execute your Terraform scripts, this becomes a problem: how does Terraform on computer A know about the state of the infrastructure if that infrastructure was previously managed from computer B? You could commit the state file to your repository, but that’s not a good solution, as it defeats the purpose of using a decentralised VCS (also, locking the state file to prevent parallel access is near impossible). Luckily, Terraform supports storing the state remotely and locking it, for example using AWS S3. As this is a somewhat advanced topic, I will write a short guide on how to do this in a separate article.
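To give you a rough idea of what’s coming, remote state is configured inside the terraform block; here is a minimal sketch using an S3 backend (the bucket and key names are made up, and this particular syntax requires Terraform 0.9 or later):

terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "wordpress/terraform.tfstate"
    region = "us-west-1"
  }
}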

For now, let’s just create our resources with terraform apply. After Terraform finishes, we check our AWS Console, and sure enough, both the EC2 instance and the RDS instance were created. To destroy the resources again, just use terraform destroy.
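In summary, the whole lifecycle boils down to four commands, run from the directory containing wordpress.tf:

$ terraform init     # download the provider plugins
$ terraform plan     # dry run: show what would change
$ terraform apply    # create (or modify) the resources
$ terraform destroy  # tear everything down again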

As, in most cases, not all resources are created and managed by Terraform (at least not in the beginning), Terraform supports referencing entities it doesn’t manage itself; these are called Data Sources. For example, in the EC2 example above we used a hardcoded AMI ID to specify the image we’d like our server to run. As this is bad practice (maybe we want to automatically use the most recent build of Ubuntu 16.04), we can instruct Terraform to query AWS and use the resulting ID in our script. To do this, we use a data block, like this:

data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*"]
  }
}

The syntax is familiar: we specify the type of entity we’re querying for (aws_ami) and an internal name used to reference the result. We tell Terraform to fetch the most recent matching image (most_recent = true) and filter the search results by name: we match only AMIs whose name matches ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-* (pinning owners to Canonical’s account ID additionally ensures we don’t pick up a third-party AMI with a similar name). There’s more than the name we could filter for; consult Terraform’s excellent documentation to find out more.
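For instance, we could also constrain the virtualisation type by adding a second filter block inside the data block; this particular filter mirrors the example in Terraform’s aws_ami documentation:

filter {
  name   = "virtualization-type"
  values = ["hvm"]
}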

To use this newly created entity, Terraform supports string interpolation. For example, to reference the Ubuntu image’s ID, we could write ${data.aws_ami.ubuntu.id}. You might have noticed that I snuck something in there: objects in Terraform expose fields, like id in this case (that’s why we give our objects internal names). Let’s refactor the example from above to use the most recent version of Ubuntu 16.04-amd64-server:

data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*"]
  }
}

resource "aws_instance" "web-server" {
  ami           = "${data.aws_ami.ubuntu.id}"
  instance_type = "t2.micro"
}

To confirm this works, just run terraform plan. You can see that Terraform queries the data source and saves the result to the state to keep track of it.

The same way we can interpolate fields of our resources or data sources into our scripts, we can use user-supplied values, which are, unsurprisingly, called variables. The declaration of a variable looks like this:

variable "stage" {
  type = "string"
  default = "development"
  description = "the stage this server serves"
}

Simple enough, right? We declare a new variable called stage, which is of type string (at the time of writing, the only supported variable types are string, list, and map). We also supply a default value and a description.
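For completeness, here is what declarations of the other two types could look like; these are hypothetical examples, not part of our WordPress setup:

variable "availability_zones" {
  type    = "list"
  default = ["us-west-1a", "us-west-1b"]
}

variable "instance_tags" {
  type    = "map"
  default = {
    Name  = "web-server"
    Stage = "development"
  }
}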

As this was just the declaration of a variable, how do we define it? There are three ways to supply Terraform with variable definitions:

  1. Var-file: This is (as far as I know) the most common method: you supply Terraform with a var-file containing the definitions of the variables you’re using by setting the command line argument -var-file. Example: if I had a file called vars.tfvars containing the definition stage = "testing", I’d invoke Terraform like this: $ terraform plan -var-file=vars.tfvars.

  2. Environment variables: You can define variables when invoking Terraform at the command line by prefixing the variable name with TF_VAR_. For example: if I wanted to invoke Terraform with the stage variable set to testing, I could write $ TF_VAR_stage='testing' terraform plan.

  3. Interactively: If a variable is defined via neither of the other methods and has no default set, Terraform prompts you for its definition at runtime.

Let’s refactor our Wordpress infrastructure example to reflect what we’ve learned thus far:

# vars.tfvars

aws_region         = "eu-central-1"
ec2_instance_type  = "t2.medium"
rds_storage        = "10"
rds_engine         = "postgres"
rds_instance_class = "db.t2.micro"
rds_username       = "mysuperuser"
rds_password       = "verysecretnobodycanhackme"
variable "aws_region" {
  type = "string"
  description = "region for resources to live in"
}

terraform {
  required_version = "> 0.8.0"
}

provider "aws" {
  region = "${var.aws_region}"
}

variable "ami_filter" {
  type = "string"
  description = "name to filter AMIs for"
  default = "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*"
}

variable "ec2_instance_type" {
  type = "string"
  description = "the identifier of the instance type to use for the EC2 instance"
}

variable "rds_storage" {
  type = "string"
  description = "how many gibibytes to allocate to the RDS instance"
}

variable "rds_engine" {
  type = "string"
  description = "the identifier of the RDS engine"
}

variable "rds_instance_class" {
  type = "string"
  description = "the identifier of the instance class to be used for the RDS instance"
}

variable "rds_username" {
  type = "string"
  description = "the username of the superuser for the RDS instance"
}

variable "rds_password" {
  type = "string"
  description = "the password of the superuser for the RDS instance"
}

data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["${var.ami_filter}"]
  }
}

resource "aws_instance" "web-server" {
  ami           = "${data.aws_ami.ubuntu.id}"
  instance_type = "${var.ec2_instance_type}"
}

resource "aws_db_instance" "db" {
  allocated_storage = "${var.rds_storage}"
  engine            = "${var.rds_engine}"
  instance_class    = "${var.rds_instance_class}"
  username          = "${var.rds_username}"
  password          = "${var.rds_password}"
}

As you can see, we extracted most of the arguments into variables, which we define in a separate file (vars.tfvars). You probably noticed that this becomes very messy: we only have two resources, yet roughly 70 lines of configuration. Also, we are mixing data sources, resources, and variable declarations. To clean this up, I recommend grouping entities by type (variable, resource, data, etc.) and extracting each group into its own file (maybe variables.tf, resources.tf, data.tf?). The choice of file names is yours; Terraform parses every file in the current module. That raises the question: what’s a module?

This is an easy question to answer: modules in Terraform are just folders. Every file that is a direct child of the folder (i.e., files in subfolders are not included) is considered part of the module.
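For our WordPress example, one possible layout (the file names are just suggestions) would look like this:

wordpress/
├── data.tf
├── resources.tf
├── variables.tf
└── vars.tfvars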

The same way we can reference data sources’ attributes, we can use our resources’ attributes to print a summary of useful information about the created resources and the data sources used once Terraform finishes; Terraform provides output blocks for that:

output "ec2_public_ip" {
  value = "${aws_instance.web-server.public_ip}"
}

output "rds_connection_string" {
  value = "host=${aws_db_instance.db.address} port=${aws_db_instance.db.port} user=${aws_db_instance.db.username}"
}

output "ami_instance_id" {
  value = "${data.aws_ami.ubuntu.id}"
}

In this example, I print the public IP of the EC2 instance we created, a PostgreSQL conninfo string for the database, and the ID of the AMI we ended up using.
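After an apply, you can also query individual outputs from the command line, which comes in handy for scripting:

$ terraform output ec2_public_ip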

That’s it for this part, see you in part 2.