HCL Inside Out: Blocks, Data Types, Expressions

Kai · 7 min read

In the last two articles we wrote HCL by imitation: copying the syntax from examples and changing the values. That works, but once you need more complex expressions (a list of subnets generated automatically, a resource name assembled from several variables) you have to understand the language rather than guess. This article dissects HCL from the ground up: what a block is made of, what data types exist, how expressions work. We'll use terraform console to try each thing on the spot, without creating any resources.

Goal

Be able to read any .tf file and know exactly what each line is: which part is a block, which is an argument, what type the value is, what the expression yields. Also grasp what the terraform{} block can declare beyond required_providers.

The two basic constructs of HCL syntax

HCL (HashiCorp Configuration Language) is built around two basic constructs: arguments and blocks.

An argument assigns a value to a name: region = "ap-southeast-1". Left of the = is an identifier, right of it is an expression. The docs define it succinctly: "an argument assigns a value to a particular name."

A block is a container for other content. It consists of a type (block type), zero or more labels, and a body inside curly braces. Take the resource from the last article and inspect each part:

resource  "aws_s3_bucket"  "first"  {
   │           │              │      └── body (block body)
   │           │              └───────── label 2: local name
   │           └──────────────────────── label 1: resource type
   └──────────────────────────────────── type: block kind

    bucket_prefix = "tf-series-bai2-"
    └── identifier ┘ └─ expression ─┘
    └──────────── argument ────────────┘

    tags = {                # a map value
      Project = "terraform-series"
    }
}

The number and meaning of labels depend on the block kind. resource needs two labels (resource type and local name). provider needs one ("aws"). terraform needs no labels. This is why Article 2 wrote resource "aws_s3_bucket" "first" with two strings: they aren't parameters, they're the block's identifying labels.
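To make the label counts concrete, here is a small sketch that puts the three block kinds mentioned above side by side. The version constraint and the region are placeholder values chosen for illustration, not settings taken from the series.

```hcl
# terraform: no labels
terraform {
  required_version = ">= 1.5.0" # placeholder constraint (assumption)
}

# provider: one label (the provider's local name)
provider "aws" {
  region = "ap-southeast-1"
}

# resource: two labels (resource type, then local name)
resource "aws_s3_bucket" "first" {
  bucket_prefix = "tf-series-bai2-"
}
```

Elsewhere in the configuration the resource is addressed by joining its two labels: aws_s3_bucket.first.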

An identifier (argument name, block type name, variable name) consists of letters, digits, underscores and hyphens, and must not start with a digit. Comments come in three kinds: # for a single line (the default, recommended style), // also for a single line (an alternative that automatic formatting tools may rewrite to #), and /* */ for comments that span multiple lines.
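A short sketch of the three comment styles and the identifier rules in one place. The local names here are made up purely to show which characters are allowed.

```hcl
# Single-line comment: the default style.
// Also a single-line comment, in the alternative style.
/*
  A multi-line comment,
  handy for temporarily disabling a block.
*/
locals {
  log-bucket_2 = "logs" # valid: letters, digits, "_" and "-"
  # 2nd_bucket = "x"    # would be invalid: starts with a digit
}
```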

terraform console: a lab for expressions

Before talking about data types, get to know the tool for trying them. terraform console opens an interactive prompt; type in HCL expressions and it prints the result. This is the fastest way to learn, because you don't apply anything:

$ echo 'upper("hello")' | terraform console
"HELLO"
$ echo '5 + 3 * 2' | terraform console
11

The console respects operator precedence (3 * 2 first, then + 5), so 5 + 3 * 2 yields 11, not 16. Every result below comes from the real console.

The six value types

Terraform has six value types. Three primitive types:

string: a sequence of Unicode characters, enclosed in double quotes, like "ap-southeast-1".

number: a numeric value, covering both whole numbers (15) and fractional values (6.283).

bool: true or false.

$ echo 'true && false' | terraform console
false

Two groups of types that aggregate multiple values:

list / tuple: an ordered sequence of values, indexed from 0, e.g. ["us-east-1a", "us-east-1c"].

map / object: a group of values labeled by name, e.g. { name = "web", port = 443 }.

The console prints a map across multiple lines:

$ echo '{ name = "web", port = 443 }' | terraform console
{
  "name" = "web"
  "port" = 443
}
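Reading a single element back out follows from the two definitions: lists are indexed by position, maps and objects by name. A minimal sketch, using a locals block with hypothetical names (azs, service) that do not appear elsewhere in the series:

```hcl
locals {
  azs     = ["us-east-1a", "us-east-1c"]
  service = { name = "web", port = 443 }

  first_az     = local.azs[0]          # "us-east-1a" (indexing starts at 0)
  service_port = local.service.port    # 443, attribute-style access
  service_name = local.service["name"] # "web", index-style access by key
}
```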

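The difference between list and tuple, and between map and object, lies in whether the elements share a type: list and map require every element to be the same type, while tuple and object allow a different type at each position or attribute. When writing configuration you just type [...] and {...}, and Terraform infers the concrete type; this distinction only matters when declaring a type for a variable (Article 9). Terraform also has a set type (an unordered collection of unique values), but it has no literal syntax of its own; you get one by conversion, for example with toset([...]).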
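As a preview of where the distinction shows up, here is a sketch of variable declarations with explicit type constraints. The variable names are illustrative assumptions; the series introduces variables properly later.

```hcl
# list: every element must be a string
variable "azs" {
  type    = list(string)
  default = ["us-east-1a", "us-east-1c"]
}

# tuple: a fixed number of positions, each with its own type
variable "endpoint" {
  type    = tuple([string, number])
  default = ["web", 443]
}

# map: every value must be a string
variable "tags" {
  type    = map(string)
  default = { Project = "terraform-series" }
}

# object: named attributes, each with its own type
variable "service" {
  type = object({
    name = string
    port = number
  })
  default = { name = "web", port = 443 }
}
```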

The sixth entry is null: "a value that represents absence or omission." Strictly speaking, the docs treat null as a special value with no type of its own rather than as a type. When you assign null to an argument, Terraform treats it as if you hadn't written that argument at all: it uses the default if there is one, or raises an error if the argument is required. null is quite different from the empty string "" or the number 0: it means "no value," not "a value equal to empty."
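One way to see null at work is an optional override. In the sketch below, bucket_name is a hypothetical variable: when it is left at null, the bucket argument behaves as if it were never written and the prefix is used instead; when a name is supplied, the prefix is the one that gets omitted.

```hcl
variable "bucket_name" {
  type    = string
  default = null # "no value", not the empty string
}

resource "aws_s3_bucket" "first" {
  # null here means: treat this argument as omitted
  bucket = var.bucket_name

  # only set a prefix when no explicit name was given
  bucket_prefix = var.bucket_name == null ? "tf-series-bai2-" : null
}
```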

Expressions: tying values together

Arguments are rarely just constants. Expressions let you compute a value from other values.

Arithmetic and logical operators work as expected. Comparison and logic return a bool. The ternary operator condition ? a : b picks one of two branches:

$ echo '1 == 1 ? "yes" : "no"' | terraform console
"yes"

The ternary is extremely common for toggling configuration by environment (e.g. prod enables delete-protection, dev doesn't).
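A minimal sketch of that per-environment toggle, assuming a hypothetical env variable and made-up values for each branch:

```hcl
variable "env" {
  type    = string
  default = "dev"
}

locals {
  is_prod = var.env == "prod" # a comparison already yields a bool

  # pick one of two values depending on the environment
  backup_retention_days = local.is_prod ? 30 : 1
  instance_type         = local.is_prod ? "m5.large" : "t3.micro"
}
```

Both branches of a ternary should have the same type (two numbers, two strings), so that the result has one well-defined type.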

String interpolation embeds an expression inside a string with ${...}:

$ echo '"web-${1 + 1}"' | terraform console
"web-2"

In practice you'll write "${var.env}-web" to splice the environment name into a resource name. The ${...} part is computed first, then spliced into the string.
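Here is how that looks in a configuration, again assuming a hypothetical env variable. The comments show the values with the default of "dev".

```hcl
variable "env" {
  type    = string
  default = "dev"
}

locals {
  web_name  = "${var.env}-web"           # "dev-web"
  logs_name = "${var.env}-logs-${1 + 1}" # "dev-logs-2"

  # When the whole value is a single reference, skip the quotes and ${}:
  env_only = var.env
}
```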

Functions are where most of HCL's expressive range comes from. Terraform has more than a hundred built-in functions for strings, numbers, collections, encoding, networking, time. A few examples:

$ echo 'length(["a", "b", "c"])' | terraform console
3
$ echo 'tostring(42)' | terraform console
"42"
$ echo 'cidrsubnet("10.0.0.0/16", 8, 2)' | terraform console
"10.0.2.0/24"

cidrsubnet is worth noting: it carves a CIDR block into smaller subnets. Here it takes 10.0.0.0/16, adds 8 bits to the prefix (making a /24), then picks subnet number 2, counting from 0, which yields 10.0.2.0/24. This kind of function is what lets us generate network ranges automatically in the later VPC article, instead of typing each subnet by hand. The Terraform language doesn't let you define your own functions in configuration; you use the built-in set, and you'll look them up frequently. (Since Terraform 1.8, providers can also ship functions of their own.)
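To hint at how several ranges can be generated at once, the sketch below combines cidrsubnet with range and a for expression. The for expression is not covered in this article, and the local names are assumptions made for the example.

```hcl
locals {
  vpc_cidr = "10.0.0.0/16"

  # one /24 per subnet number 0, 1, 2
  subnet_cidrs = [for i in range(3) : cidrsubnet(local.vpc_cidr, 8, i)]
  # => ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]
}
```

Pasting the bracketed expression into terraform console, with the literal "10.0.0.0/16" in place of local.vpc_cidr, is a quick way to check the result.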

The terraform{} block: declarations about Terraform itself

The terraform {} block does not describe infrastructure. It declares settings about how Terraform runs this configuration, and takes static values (you can't use variables in it). Six things it can contain:

required_version — "which Terraform CLI versions are permitted to run this configuration." Set it so users on a too-old version are blocked immediately.

required_providers — "all the provider plugins needed to create and manage the resources in the configuration." This is what we used in Article 2 to declare hashicorp/aws ~> 6.0.

backend is where the state file is stored (the default is local; Article 6 switches to S3).

cloud configures the use of HCP Terraform instead of a backend.

experiments enables experimental language features.

provider_meta holds metadata for a provider and is rarely used directly.

Across the whole series, the first three are the parts you touch: required_version, required_providers, and backend. Knowing the other three exist is enough.
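Put together, those three settings might look like the sketch below. The version constraint and the state path are placeholder assumptions; the provider requirement is the one from Article 2, and the backend is spelled out as local only to show where it goes.

```hcl
terraform {
  # block CLI versions older than this (placeholder constraint)
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.0"
    }
  }

  # explicit form of the default: state in a local file
  backend "local" {
    path = "terraform.tfstate"
  }
}
```

Every value here is a literal, in line with the rule that this block cannot reference variables.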

Why line order doesn't matter

A common stumbling point coming from imperative languages: in HCL, the order in which you declare blocks does not determine the order of execution. You can write an output before the resource it reads, or declare a bucket after the resource that references it; the result doesn't change. The reason lies in the language's declarative nature: Terraform doesn't read the file top to bottom and act line by line. It loads the whole configuration, builds a dependency graph from the references between resources (aws_s3_bucket.first.arn creates a dependency edge), and only then decides the order. You describe what needs to exist, not what steps to follow. That dependency graph is the subject of Article 5.
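A small sketch that would look wrong in an imperative language but is perfectly valid here: the output is declared before the bucket it reads. The output name is an assumption made for the example.

```hcl
# Declared first, yet evaluated only after the bucket exists,
# because the reference below creates a dependency edge.
output "bucket_arn" {
  value = aws_s3_bucket.first.arn
}

resource "aws_s3_bucket" "first" {
  bucket_prefix = "tf-series-bai2-"
}
```

Swapping the two blocks, or moving them into separate .tf files in the same directory, produces the same plan.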

HCL also has a JSON syntax variant (.tf.json files) for cases where configuration is machine-generated. When writing by hand, always use the native HCL syntax shown above. Knowing the JSON variant exists is enough; you will rarely need it.

Wrap-up

HCL has only two constructs: arguments (name = expression) and blocks (type, labels, body). The six value types are string, number, bool, list/tuple, map/object and null, where null means "omit the argument." Expressions tie values together via operators, the ternary, ${...} interpolation and built-in functions, and terraform console is the fastest place to try them. The terraform{} block declares things about the runtime environment: version, providers, backend. Line order doesn't matter because Terraform builds a graph from references rather than running sequentially.

We've mentioned state a few times without dissecting it. The next article digs into the state file: what exactly it stores, why Terraform needs it instead of asking AWS directly each time, what the state list / state show commands read, and why this file must be kept carefully.