This aims to be a gentle introduction to Infrastructure as Code using HashiCorp's Terraform. All examples herein target Amazon's AWS, as it is the most widely used Infrastructure as a Service provider (with roughly 33% market share, as Synergy Research Group reports), but Terraform supports many other providers (among them Google Cloud Platform and Microsoft's Azure). Before we dive into some code examples, we should briefly talk about what we mean when we say "Infrastructure as Code".
Wikipedia says the term refers to "the process of managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools." That's a mouthful; let me paraphrase: rather than configuring our machines, databases, load balancers and certificates (and whatever else we manage) manually, via a web interface or CLI, we express our desired infrastructure as code (that's where the name comes from!).
Now, this code may be written using different paradigms. Terraform's paradigm is declarative: rather than telling Terraform how to accomplish our desired state, we only define the end state and let Terraform worry about how to get there (kind of like telling the taxi driver the destination and letting them handle the rest).
Before we start, you need to install Terraform. There are several ways of obtaining the Terraform binary, depending on the OS you're using, but if there's no package for your OS or your OS doesn't use a package manager, there's always the Terraform downloads page. Make sure the binary is on your PATH so you can execute it comfortably.
Before I lose you by rambling too much, here's an example: let's assume I was planning on installing WordPress for my personal blog (sorry, Medium!). The requirements call for a web server and a database server; on AWS, these could correspond to an EC2 instance and an RDS instance. Rather than risking tenosynovitis clicking through AWS's web interface, I decide to rely on Terraform to accomplish this task.
In Terraform, components of our infrastructure we’d like to manage are called “resources”. You declare every resource in a separate block, like so:
resource "aws_instance" "web-server" {
ami = "ami-0f5dbc86dd9cbf7a8"
instance_type = "t2.micro"
}
Let's dissect this: the first line defines the type of resource we'd like to manage (in this case aws_instance, which is Terraform-speak for an EC2 instance) and an internal name, which we can use to refer to this resource elsewhere in our Terraform code. Resources have attributes, both required and optional ones. In this case, we supply Terraform with the ID of the AMI we'd like to launch the instance from (hardcoding the AMI is bad practice, as AMI IDs are region-specific, but this is just an example) and the instance type.
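To illustrate the difference: ami and instance_type are required for an aws_instance, while something like tags is optional. Here's a minimal sketch; the tag values are, of course, made up:
resource "aws_instance" "web-server" {
  ami           = "ami-0f5dbc86dd9cbf7a8" # required: which image to launch
  instance_type = "t2.micro"              # required: the hardware class

  # optional: arbitrary key/value metadata attached to the instance
  tags {
    Name  = "wordpress-web"
    Stage = "development"
  }
}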
Now that we’ve picked up steam, let’s quickly define the second resource, the RDS instance:
resource "aws_db_instance" "db" {
allocated_storage = 10
engine = "postgres"
instance_class = "db.t2.small"
username = "superuser"
password = "thisismyverysecurepasswordplsdon'thackme"
}
As we've learned, we define an aws_db_instance resource with the internal name db. This time we've got lots of attributes, all of them required if we wish to define an instance this way (which is a bad way for several reasons, most importantly because we hardcode the superuser's credentials in plaintext; ouch!). Apart from the credentials, we tell Terraform that this instance should use the PostgreSQL engine and have 10 gibibytes (10 × 2³⁰ bytes, or roughly 10.7 GB) of storage allocated to it.
So far so good, but how do we get from code to infrastructure? Before we can use Terraform productively, it needs some information about the provider we use, like the region our resources should live in by default and, more importantly, what credentials to use when accessing AWS.
As these credentials should be handled very carefully, I don't feel comfortable devising even an example where they're supposed to be provided inline. Suffice it to say, there are two other ways to provide Terraform with credentials: either via environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) or by having Terraform read them from a credentials file (which, by default, lives at $HOME/.aws/credentials); AWS provides a short article on how to use the credentials file.
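For reference, the credentials file uses a simple INI-style format; the values below are AWS's documented example placeholders, not real keys:
[default]
aws_access_key_id     = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY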
Now that we’ve talked about the sensitive topic of credential management, let’s provide Terraform with some basic information like the region and what version of Terraform we require:
provider "aws" {
region = "us-west-1"
}
terraform {
required_version = "> 0.8.0"
}
Throwing everything together, we get the following file, which we save as wordpress.tf:
provider "aws" {
region = "us-west-1"
}
terraform {
required_version = "> 0.8.0"
}
resource "aws_instance" "web-server" {
ami = "ami-0f5dbc86dd9cbf7a8"
instance_type = "t2.micro"
}
resource "aws_db_instance" "db" {
allocated_storage = 10
engine = "postgres"
instance_class = "db.t2.small"
username = "superuser"
password = "thisismyverysecurepasswordplsdon'thackme"
}
So, again: how do we make this run? Terraform by itself doesn't understand how to interact with the AWS API, as each provider's functionality is provided by what Terraform calls a plugin. Luckily, Terraform downloads the AWS plugin automatically when we run terraform init. After the download successfully completes, we can run terraform plan, which is a dry run: it lists what changes Terraform would make once applied. It shows you a diff: + signifies creating resources, ~ signifies modifying existing resources and - signifies deleting resources. That's right: Terraform supports modifying existing resources to match your requirements. It does so by keeping a state of what resources it manages and only deleting and recreating resources when it absolutely has to.
By default, Terraform saves its state into a file called terraform.tfstate. If you're the only one managing your infrastructure, that's fine; if more people execute your Terraform scripts, this becomes a problem: how does Terraform on computer A know about the state of the infrastructure if that infrastructure was previously managed from computer B? You could commit the state file to your repository, but that's not a good solution, as it defeats the purpose of using a decentralised VCS (also, locking the state file to prevent parallel access is near impossible). Luckily, Terraform supports storing the state remotely and mechanisms to lock it, among others using AWS S3. As this is a somewhat advanced topic, I will write a short guide on how to do this in a separate article.
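Just to give you a taste (details in said article): on Terraform 0.9 and later, an S3 backend can be configured roughly like this; the bucket and table names here are made up:
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"          # hypothetical S3 bucket holding the state
    key            = "wordpress/terraform.tfstate" # path of the state file within the bucket
    region         = "us-west-1"
    dynamodb_table = "terraform-locks"             # hypothetical DynamoDB table used for locking
  }
}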
For now, let's just create our resources with terraform apply. After Terraform finishes, we check our AWS Console and, sure enough, both the EC2 instance and the RDS instance were created. To destroy our resources again, just use terraform destroy.
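To recap the whole workflow from the command line:
$ terraform init    # download the AWS provider plugin
$ terraform plan    # dry run: show the diff of pending changes
$ terraform apply   # create (or modify) the resources
$ terraform destroy # tear everything down again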
As not all resources are created and managed by Terraform in most cases (at least not in the beginning), Terraform supports referencing entities that are not managed by Terraform itself, called Data Sources. For example, in the EC2 example above we used a hardcoded AMI ID to specify the image we'd like our server to run. As this is bad practice (maybe we want to automatically use the most recent build of Ubuntu 16.04), we can instruct Terraform to query AWS and use the resulting ID in our script. To do this, we use a data block, like this:
data "aws_ami" "ubuntu" {
most_recent = true
filter {
name = "name"
values = ["ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*"]
}
}
The syntax is familiar: we specify the type of resource we're querying for (aws_ami) and an internal name used to reference the resulting entity. While we tell Terraform to fetch the most recent version matching (most_recent = true), we also filter our search results by name: we match only AMIs whose name matches ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*. There's more than the name we can filter on; use Terraform's excellent documentation to find out more.
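For example, we could also pin the image owner and the virtualization type; a sketch (099720109477 is Canonical's well-known AWS account ID):
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # only consider AMIs published by Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type" # restrict the search to HVM images
    values = ["hvm"]
  }
}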
To use this newly created entity, Terraform supports string interpolation. For example, to reference the Ubuntu image's ID, we could write ${data.aws_ami.ubuntu.id}. You might have noticed that I snuck something in there: objects in Terraform can expose fields, like id in this case (that's why we give our objects internal names). Let's refactor the example from above to use the most recent version of Ubuntu 16.04-amd64-server:
data "aws_ami" "ubuntu" {
most_recent = true
filter {
name = "name"
values = ["ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*"]
}
}
resource "aws_instance" "web-server" {
ami = "${data.aws_ami.ubuntu}"
instance_type = "t2.micro"
}
To confirm this works, just run terraform plan. You can see that Terraform queries for the data source and saves it to the state to keep track of it.
The same way we can interpolate fields of our resources or data sources into our scripts, we can use user-supplied values, which are, unsurprisingly, called variables. The declaration of a variable looks like this:
variable "stage" {
type = "string"
default = "development"
description = "the stage this server serves"
}
Simple enough, right? We declare a new variable called stage which is of type string (at the time of writing, the only supported variable types are string, list, and map). We also supply a default value and a description.
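A quick sketch of the other two types; the names and values here are made up:
variable "availability_zones" {
  type    = "list"
  default = ["us-west-1a", "us-west-1b"]
}

variable "instance_types" {
  type = "map"
  default = {
    development = "t2.micro"
    production  = "t2.medium"
  }
}
A list element can be referenced with ${element(var.availability_zones, 0)}, and a map entry can be looked up with ${lookup(var.instance_types, var.stage)}.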
As this was just the declaration of a variable, how do we define it? There are three ways to supply Terraform with variable definitions:
Var file: This is the most used method (afaik): you supply Terraform with a var file containing the definitions of the variables you're using by setting the command-line argument -var-file. Example: if I had a file called vars.tfvars containing the definition stage="testing", I'd invoke Terraform like this: $ terraform plan -var-file=vars.tfvars.
Environment variables: You can define variables when invoking Terraform at the command line by prefixing the variable name with TF_VAR_. For example: if I wanted to invoke Terraform with the stage variable set to testing, I could write $ TF_VAR_stage='testing' terraform plan.
Interactively: If a variable is not defined via the other methods and doesn't have a default set, Terraform prompts the user for its definition at runtime.
Let’s refactor our Wordpress infrastructure example to reflect what we’ve learned thus far:
# vars.tfvars
aws_region         = "eu-central-1"
ec2_instance_type  = "t2.medium"
rds_storage        = "10"
rds_engine         = "postgres"
rds_instance_class = "db.t2.micro"
rds_username       = "mysuperuser"
rds_password       = "verysecretnobodycanhackme"
variable "aws_region" {
type = "string"
description = "region for resources to live in"
}
terraform {
required_version = "> 0.8.0"
}
provider "aws" {
region = "${var.aws_region}"
}
variable "ami_filter" {
type = "string"
description = "name to filter AMIs for"
default = "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*"
}
variable "ec2_instance_type" {
type = "string"
description = "the identifier of the instance type to use for the EC2 instance"
}
variable "rds_storage" {
type = "string"
description = "how many gibibytes to allocate to the RDS instance"
}
variable "rds_engine" {
type = "string"
description = "the identifier of the RDS engine"
}
variable "rds_instance_class" {
type = "string"
description = "the identifier of the instance class to be used for the RDS instance"
}
variable "rds_username" {
type = "string"
description = "the username of the superuser for the RDS instance"
}
variable "rds_password" {
type = "string"
description = "the password of the superuser for the RDS instance"
}
data "aws_ami" "ubuntu" {
most_recent = true
filter {
name = "name"
values = ["${var.ami_filter}"]
}
}
resource "aws_instance" "web-server" {
ami = "${data.aws_ami.ubuntu.id}"
instance_type = "${var.ec2_instance_type}"
}
resource "aws_db_instance" "db" {
allocated_storage = "${var.rds_storage}"
engine = "${var.rds_engine}"
instance_class = "${var.rds_instance_class}"
username = "${var.rds_username}"
password = "${var.rds_password}"
}
As you can see, we extracted most of the arguments into variables, which we define in a separate file (vars.tfvars). You probably noticed that this becomes very messy: we only have two resources right now but 65 lines of code. Also, we are mixing data sources, resources and variable declarations. To clean it up, I recommend grouping the entities by type (variable, resource, data, etc.) and extracting them into different files (maybe variables.tf, resource.tf, data.tf?). The choice of file name is yours; Terraform parses every file in the current module. That raises the question: what's a module?
This is an easy question to answer: modules in Terraform are just folders. Every file (direct child, i.e. not recursively) in that folder is considered part of the module.
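To pull one folder into another as a module, Terraform provides a module block; here's a minimal sketch (the source path and the stage argument are hypothetical):
module "wordpress" {
  # hypothetical path to a folder containing *.tf files like the ones above
  source = "./modules/wordpress"

  # arguments passed to a module define the variables it declares
  stage = "development"
}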
The same way we can reference data sources' attributes, we can use our resources' attributes to give us a summary of useful information about the resources created and data sources used after Terraform is finished; Terraform provides output blocks for that:
output "ec2_public_ip" {
value = "${aws_instance.web-server.public_ip}"
}
output "rds_connection_string" {
value = "host=${aws_db_instance.db.address} port=${aws_db_instance.db.port} user=${aws_db_instance.db.username}"
}
output "ami_instance_id" {
value = "${data.aws_ami.ubuntu.id}"
}
In this example, I print the public IP of the EC2 instance we created, a PostgreSQL conninfo string and the AMI ID we ended up using.
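Outputs are printed after every terraform apply, and you can query them from the state at any time:
$ terraform output                # print all outputs
$ terraform output ec2_public_ip  # print a single output by name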
That’s it for this part, see you in part 2.