Terraform: Data Sources, Functions, for Expressions and Dynamic Blocks

In the previous article the configuration started receiving values from outside via variables. But many values we need shouldn't be typed by hand at all: the id of the latest Amazon Linux AMI changes constantly, the list of available zones depends on the region, and the account number shouldn't be hardcoded. Those things already exist on AWS; our job is to read them. This article brings together three tools that make configuration truly flexible: data sources to read, the for expression to transform data, and the dynamic block to generate repeated configuration.

Goal

Read live information from AWS instead of hardcoding it, transform and filter collections with for, and generate nested blocks (like a security group's ingress) with dynamic instead of copying by hand.

data source: read instead of create

A resource creates and manages infrastructure. A data source reads information that already exists, creates nothing, and brings nothing under management. Three commonly used data sources:

data "aws_caller_identity" "current" {}

data "aws_availability_zones" "available" {
  state = "available"
}

data "aws_ami" "al2023" {
  most_recent = true
  owners      = ["amazon"]
  filter {
    name   = "name"
    values = ["al2023-ami-2023.*-x86_64"]
  }
}

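aws_caller_identity returns the identity you're using (account id, ARN). aws_availability_zones lists the AZs in the region, and state = "available" limits the result to zones in that state. aws_ami with most_recent = true, owners = ["amazon"], and a name filter finds the latest Amazon Linux 2023 AMI published by Amazon; this is the right way to always use an up-to-date image instead of pinning an id that goes stale. Terraform reads data sources during plan whenever it can and defers the read to apply only when an argument depends on a value that isn't known yet. You reference the results with data.<type>.<name>.<attribute>.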
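The terraform output listing that follows needs output blocks, which the snippets above don't include. A minimal sketch, assuming the output names match that listing and using the attributes these data sources export (account_id, names, id):

```hcl
# Assumed output definitions; the names are taken from the terraform output listing.
output "account_id" {
  value = data.aws_caller_identity.current.account_id
}

output "az_names" {
  value = data.aws_availability_zones.available.names
}

output "latest_al2023_ami" {
  value = data.aws_ami.al2023.id
}
```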

Apply, then see what they return (the real account number replaced with 111122223333):

$ terraform apply -auto-approve
$ terraform output
account_id        = "111122223333"
az_names          = tolist([
  "ap-southeast-1a",
  "ap-southeast-1b",
  "ap-southeast-1c",
])
latest_al2023_ami = "ami-05b741ae2ab9f1742"

None of these three values appear in the code; Terraform reads them straight from AWS at run time. The next time Amazon releases a new AMI, latest_al2023_ami picks up the new id on the following plan or apply, without anyone editing the configuration.
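To show where such a value typically ends up, here is a hedged sketch of an instance that consumes the AMI id. The resource name and instance type are hypothetical and not part of this article's applied configuration. Because a changed ami forces the instance to be replaced, the optional lifecycle block shows one common way to opt out of that.

```hcl
# Hypothetical instance for illustration; not part of the article's configuration.
resource "aws_instance" "example" {
  ami           = data.aws_ami.al2023.id
  instance_type = "t3.micro" # assumed size, chosen only for the example

  lifecycle {
    # Optional: keep the AMI the instance was created with instead of
    # replacing the instance whenever a newer image is published.
    ignore_changes = [ami]
  }
}
```

Whether to ignore AMI changes is a design choice: leaving the lifecycle block out means every new image shows up as a planned replacement, which some teams want and others don't.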

Functions and for expressions: transform data

Article 3 tried functions in terraform console. They become much more useful when you pair them with the for expression, the tool for transforming and filtering collections. The syntax has two forms: producing a list uses square brackets, producing a map uses curly braces.

variable "allowed_ports" {
  type    = list(number)
  default = [80, 443, 22]
}

locals {
  # list: take the ports, filter out 22 with an if clause
  web_ports = [for p in var.allowed_ports : p if p != 22]

  # map: port -> description
  port_desc = { for p in var.allowed_ports : p => "allow port ${p}" }
}

The trailing if clause filters elements. The result:

web_ports = [80, 443]          # 22 has been filtered out

port_desc = {
  "22"  = "allow port 22"
  "443" = "allow port 443"
  "80"  = "allow port 80"
}

The list form of for keeps order and can filter; the map form turns a list into key-value pairs. Map keys are always strings, so the numeric ports are converted to "22", "443" and "80" and shown in lexical order rather than list order. This is how you mold input data into exactly the shape a resource needs, right in the configuration.
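The section above uses for on its own; the sketch below pairs it with built-in functions (format, upper) and shows the two-variable forms. It reuses var.allowed_ports and local.port_desc from the article, while the new local names are hypothetical.

```hcl
locals {
  # format() applied to each element; order follows var.allowed_ports.
  # With the default [80, 443, 22] this gives ["tcp/80", "tcp/443", "tcp/22"].
  port_labels = [for p in var.allowed_ports : format("tcp/%d", p)]

  # Two-variable form over a list: i is the index, p is the element.
  # With the default this gives { "rule-0" = 80, "rule-1" = 443, "rule-2" = 22 }.
  port_by_rule = { for i, p in var.allowed_ports : "rule-${i}" => p }

  # Two-variable form over a map: k is the key, v is the value.
  # Keeps the keys of port_desc and upper-cases each description.
  desc_upper = { for k, v in local.port_desc : k => upper(v) }
}
```

Several locals blocks can coexist in one configuration, so this can sit next to the block above and be inspected in terraform console.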

dynamic block: generate repeated nested blocks

Many resources have nested blocks that repeat: a security group has multiple ingress blocks, CloudFront has multiple origin blocks. Writing each block by hand isn't workable when the count changes with the input. A dynamic block generates them from a collection. The docs say a dynamic block "acts much like a for expression, but produces nested blocks instead of a complex typed value."

resource "aws_security_group" "web" {
  name_prefix = "tf-series-bai10-"
  vpc_id      = data.aws_vpc.default.id

  dynamic "ingress" {
    for_each = local.web_ports
    content {
      description = "HTTP/HTTPS ${ingress.value}"
      from_port   = ingress.value
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

The label "ingress" is the name of the block to be generated, and by default it is also the name of the iterator variable. for_each iterates over local.web_ports (already filtered down to [80, 443]). Inside content, ingress.value is the current element. Note that the egress block is written statically as usual; not every block needs to be dynamic. Apply, then inspect the rules that were generated (output trimmed to the relevant attributes):

$ terraform state show aws_security_group.web
        description = "HTTP/HTTPS 443"
        from_port   = 443
        to_port     = 443
        description = "HTTP/HTTPS 80"
        from_port   = 80
        to_port     = 80

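Exactly two rules, for 80 and 443, and none for 22 (filtered out earlier by for). They are listed as 443 then 80 because the provider stores ingress rules as an unordered set, so the order of the input list is not preserved. Change var.allowed_ports and the number of rules changes with it, without editing the block.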
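The security group references data.aws_vpc.default, which none of the snippets in this article declare. Assuming the series uses the region's default VPC, the declaration could look like this:

```hcl
# Assumption: the security group is created in the region's default VPC.
data "aws_vpc" "default" {
  default = true
}
```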

The docs come with a warning worth heeding: "overuse of dynamic blocks can make configuration hard to read and maintain." The recommendation is to write nested blocks statically whenever the count is fixed, using dynamic only when the count truly varies with input. A dynamic block that just copies a variable's attributes verbatim is usually a sign of needless abstraction.
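For comparison, this is roughly what the static form looks like when the port list is fixed at 80 and 443. The resource name and name_prefix are hypothetical, and it assumes data.aws_vpc.default is declared as in the dynamic example.

```hcl
# Hypothetical static equivalent of the dynamic example, for a fixed port list.
resource "aws_security_group" "web_static" {
  name_prefix = "tf-series-bai10-static-"
  vpc_id      = data.aws_vpc.default.id

  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTPS"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

It is longer, but every rule is visible at a glance, which is the trade-off the docs' warning points at.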

🧹 Cleanup

$ terraform destroy -auto-approve
Destroy complete! Resources: 1 destroyed.

Wrap-up

Data sources read existing information on AWS (latest AMI, available AZs, account identity) instead of hardcoding it, and are referenced via data.<type>.<name>.<attribute>. The for expression transforms and filters collections to mold data into the shape a resource needs: the list form is [for x in ... : ... if ...] and the map form is {for k, v in ... : ... => ...}. The dynamic block generates repeated nested blocks from a collection, but should only be used when the count genuinely varies; otherwise write them statically for readability.

Both the dynamic block and creating multiple copies of a resource rest on the idea of iteration. The next article goes into the two mechanisms for creating multiple resources — count and for_each — along with the real-world pitfalls of picking the wrong one, and templatefile to generate file content from variables.
