State: What Terraform Stores, Why It's Needed, and Drift

Kai · 7 min read

In Article 2 we glanced at the state file and saw that it maps local names to real resources. This article digs deeper, because state is what trips up beginners the most. We'll answer three questions: why Terraform needs a separate file instead of asking AWS directly each time, what exactly it stores, and what happens when reality deviates from state. For the last one, we'll create the deviation by hand and watch how Terraform reacts.

Goal

Understand the role of state well enough not to fear it: know what it stores, why it's needed, be able to read it with state list / state show, and grasp the refresh mechanism that leads to drift detection.

Why Terraform needs state

The natural question: Terraform already knows the configuration you want, AWS knows what actually exists, so what's the state file in the middle for? The HashiCorp docs give four reasons.

Mapping to the real world. Your configuration declares aws_s3_bucket.demo, while AWS only knows a bucket named tf-series-bai4-2026.... State is the bridge between those two names: it records that the local resource named demo corresponds to this bucket ID. Without this mapping, Terraform wouldn't know which bucket on AWS to touch when you change the demo block. Terraform's earliest prototypes had no state file and relied on tags to identify resources; that approach failed because not every resource supports tags.

Dependency metadata. State also stores the dependency relationships between resources. This matters most when deleting: once you remove a resource from the configuration, Terraform can no longer infer the deletion order from the file, so that order must already be recorded in state. Without it, Terraform could delete in the wrong order (for example, trying to delete a subnet before the instance inside it).
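To make the deletion-order problem concrete, here is a minimal sketch of the subnet and instance pair mentioned above. The resource names, CIDR ranges, and AMI ID are placeholders invented for illustration; they are not part of this series' configuration.

```hcl
# Hypothetical example: names, CIDR blocks, and the AMI ID are placeholders.
resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "example" {
  vpc_id     = aws_vpc.example.id
  cidr_block = "10.0.1.0/24"
}

resource "aws_instance" "example" {
  ami           = "ami-00000000000000000" # placeholder, replace with a real AMI ID
  instance_type = "t3.micro"

  # This reference is what tells Terraform the instance depends on the subnet.
  subnet_id = aws_subnet.example.id
}
```

While these blocks are in the configuration, the subnet_id reference is enough for Terraform to work out the order. If you delete the blocks from the file, that reference disappears with them, and the dependency recorded in state is what lets Terraform destroy the instance before the subnet.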

Performance. State caches the attributes of every resource. By default Terraform still refreshes each resource on every plan, but with large infrastructure (hundreds or thousands of resources), querying each one over the API is too slow because of network latency and rate limits. In that case you can skip the refresh with -refresh=false and let Terraform compute the plan from the cached state.

Team synchronization. When multiple people work on the same infrastructure, state placed in a shared (remote) location ensures everyone works on the same copy, with locking so two people don't apply over each other. This is why Part II moves state up to S3.
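As a rough preview only (Part II covers the real setup), a shared state location is declared with a backend block. The bucket name, key, and region below are hypothetical placeholders, and locking is configured separately and is not shown here.

```hcl
terraform {
  # Hypothetical values: the bucket must already exist and should be private.
  backend "s3" {
    bucket  = "example-terraform-state"     # placeholder bucket name
    key     = "tf-series/terraform.tfstate" # placeholder path of the state object
    region  = "us-east-1"                   # assumption, use your own region
    encrypt = true                          # server-side encryption of the state object
  }
}
```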

Taken together, state exists so that Terraform doesn't have to "rediscover" the entire infrastructure from scratch on every run, which is both slow and not always possible.

What state stores

Recreate a bucket to inspect (this article adds an Env = "dev" tag):

resource "aws_s3_bucket" "demo" {
  bucket_prefix = "tf-series-bai4-"
  force_destroy = true

  tags = {
    Project = "terraform-series"
    Env     = "dev"
  }
}

After apply, view state with commands instead of opening the raw JSON file:

$ terraform state list
aws_s3_bucket.demo

$ terraform state show aws_s3_bucket.demo
    bucket                      = "tf-series-bai4-20260525025632034200000001"
    hosted_zone_id              = "Z3O0J2DXBE1FTB"
    id                          = "tf-series-bai4-20260525025632034200000001"
    tags                        = {
        "Env"     = "dev"
        "Project" = "terraform-series"
    }
    ...

state list lists every resource Terraform manages by local address. state show <address> prints all the attributes of one resource exactly as recorded in state. This is not Terraform asking AWS — it reads from the cached state file. With large infrastructure, that's the difference between instant and waiting several minutes.

Refresh: comparing three ways

On each plan (and apply), before computing the diff, Terraform performs a refresh step: for each resource in state, it calls the provider to ask AWS what that resource currently looks like, then updates its in-memory copy of the state. It then compares three things:

   main.tf                terraform.tfstate            AWS (reality)
   (what you WANT)        (what was last KNOWN)        (what EXISTS now)
   Env = "dev"            Env = "dev"                  Env = "dev"
        │                       │                            │
        └───────────┬───────────┴──────────────┬────────────┘
                    ▼                           ▼
              refresh: read AWS, update the in-memory copy
                    │
                    ▼
              compare config ⟷ reality  →  diff

When all three match, the diff is empty and the plan reports "No changes" (seen in Article 2). The mechanism earns its keep when they don't match.

Drift: when reality deviates from the configuration

Drift is when someone modifies infrastructure outside Terraform: clicking in the console during an incident, running the aws CLI by hand, or letting another tool touch it. Let's create drift to see how Terraform reacts. Change the Env tag from dev to production with the AWS CLI, entirely behind Terraform's back:

$ aws s3api put-bucket-tagging --bucket tf-series-bai4-2026... \
    --tagging 'TagSet=[{Key=Project,Value=terraform-series},{Key=Env,Value=production}]'

Now run plan:

$ terraform plan
aws_s3_bucket.demo: Refreshing state... [id=tf-series-bai4-20260525025632034200000001]

  ~ update in-place

  # aws_s3_bucket.demo will be updated in-place
  ~ resource "aws_s3_bucket" "demo" {
        id   = "tf-series-bai4-20260525025632034200000001"
      ~ tags = {
          ~ "Env"     = "production" -> "dev"
            "Project" = "terraform-series"
        }
    }

Plan: 0 to add, 1 to change, 0 to destroy.

The ~ symbol means "modify in place." Terraform sees reality is production but the configuration says dev, so it proposes pulling reality back to dev: "production" -> "dev". This is the default behavior and the core point of the declarative model — the configuration is the source of truth, and any manual change gets flattened back to exactly what you wrote on the next apply. If you run terraform apply now, the tag returns to dev.

Two options when you hit drift

You don't always want to wipe out a manual change. Sometimes that change is correct and you want to accept it. Terraform has a -refresh-only mode: it only syncs state to match reality, without modifying infrastructure.

$ terraform plan -refresh-only
Note: Objects have changed outside of Terraform

  # aws_s3_bucket.demo has changed
  ~ resource "aws_s3_bucket" "demo" {
      ~ tags = {
          ~ "Env"     = "dev" -> "production"
            "Project" = "terraform-series"
        }
    }

This is a refresh-only plan, so Terraform will not take any actions to undo
these. If you were expecting these changes then you can apply this plan to
record the updated values in the Terraform state without changing any remote
objects.

Notice the arrow direction is reversed: "dev" -> "production". This time Terraform isn't going to modify AWS; it proposes writing the real value (production) into state. The distinction between the two modes:

Normal plan. Treats the configuration as truth and proposes pulling reality back to match it (applying it modifies AWS).

-refresh-only plan. Treats reality as truth and only updates state to match (applying it touches neither AWS nor the configuration, so you must update the configuration yourself afterward to stay in sync).

The right choice depends on the situation. If the manual change was a mistake, use a normal plan and apply to flatten it. If the manual change was intentional, run terraform apply -refresh-only to record it in state, then update the configuration to match; until you do, the next normal plan will still propose reverting the change.
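If you decide to keep the manual change, the follow-up might look like this sketch: after recording the drift with terraform apply -refresh-only, edit the configuration so it states the value that now exists. This assumes production really is the value you want for this bucket.

```hcl
resource "aws_s3_bucket" "demo" {
  bucket_prefix = "tf-series-bai4-"
  force_destroy = true

  tags = {
    Project = "terraform-series"
    Env     = "production" # was "dev"; now matches the value set outside Terraform
  }
}
```

With the configuration, state, and AWS all saying production, a normal terraform plan should report no changes again.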

🧹 Cleanup

$ terraform destroy -auto-approve
aws_s3_bucket.demo: Destruction complete after 0s
Destroy complete! Resources: 1 destroyed.

State is plaintext

One thing to repeat because it's serious: the state file contains every attribute of a resource in plain text, including sensitive values like database passwords or private keys. Anyone who can read the state file can read them. The practical consequence: absolutely never commit terraform.tfstate to git, and when working as a team you must keep state somewhere encrypted, with access control. That's exactly what Part II does with remote state on S3, plus state locking so multiple people don't overwrite each other.
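One common misunderstanding is worth a small sketch: marking a value as sensitive hides it in CLI output but does not keep it out of state. The variable and output names below are hypothetical.

```hcl
# Hypothetical names, for illustration only.
variable "db_password" {
  type      = string
  sensitive = true # redacts the value in plan/apply output
}

output "db_password" {
  value     = var.db_password
  sensitive = true # required, because the value derives from a sensitive variable
}
```

After an apply, the output value is still written to terraform.tfstate in plain text; sensitive = true only changes what the CLI prints. That is why protecting the state file itself matters.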

Wrap-up

State exists for four reasons: mapping local names to real resources, remembering dependencies to delete in the right order, caching attributes for speed, and synchronizing when working as a team. Each plan begins with a refresh — read reality then compare three ways across config, state and AWS. When they deviate, a normal plan pulls reality back to the configuration (~ ... -> config_value), while -refresh-only writes reality into state without modifying infrastructure. The state file is plaintext, so it must be kept private.

We've mentioned several times that Terraform "infers the order" and "builds a dependency graph." The next article dissects that very graph: where implicit dependencies come from, when you need depends_on, and how to view the graph Terraform builds with the graph command.