Terraform Data Source (AWS)

Published: (December 9, 2025 at 08:51 AM EST)
3 min read
Source: Dev.to

What Are Terraform Data Sources?

A data source in Terraform is a read‑only lookup of an existing resource. Instead of creating something new, Terraform queries the cloud provider (AWS in this case) and returns attributes that you can reference elsewhere in your configuration.
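
As a minimal sketch of the general shape, the block below looks up the AWS account that Terraform is authenticated against and exposes one attribute as an output. The local name current and the output name account_id are illustrative choices, not part of the original article.

# Read-only lookup: nothing is created or changed in AWS.
data "aws_caller_identity" "current" {}

# Attributes are referenced as data.<TYPE>.<NAME>.<ATTRIBUTE>.
output "account_id" {
  value = data.aws_caller_identity.current.account_id
}

Every data source in this article follows the same reference pattern.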

When to Use Data Sources

  • A resource is already created (e.g., shared VPCs, existing AMIs).
  • Another team manages the resource (network or security team).
  • Your Terraform module should not own or recreate the resource.
  • You need the latest or filtered version of something (e.g., the newest AMI).
  • You want to avoid hard‑coding identifiers such as IDs or ARNs.

Using data sources keeps environment‑specific identifiers out of your configuration, so the same code can run against different accounts and regions.

Example 1: Fetching VPC ID Using a Data Source

In many organizations, networking is centralized: the VPC already exists, and your Terraform code only deploys application resources inside it.

data "aws_vpc" "vpc_name" {
  filter {
    name   = "tag:Name"
    values = ["default-vpc"]
  }
}

  • Searches for a VPC whose Name tag is default-vpc.
  • Exposes the VPC’s attributes, including its ID as data.aws_vpc.vpc_name.id.
  • Fails with an error unless exactly one VPC matches, so the filter must be specific.
  • Removes the need to record or maintain the VPC ID by hand.
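
To show where the looked-up ID typically ends up, here is a hedged sketch of a security group created inside that VPC. The resource name app, the group name app-sg, and the HTTPS rule are assumptions made for illustration; only data.aws_vpc.vpc_name comes from the example above.

# Hypothetical security group placed in the VPC found by the data source.
resource "aws_security_group" "app" {
  name        = "app-sg"
  description = "Application security group in the shared VPC"
  vpc_id      = data.aws_vpc.vpc_name.id

  # Allow HTTPS only from addresses inside the VPC's own CIDR range.
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.vpc_name.cidr_block]
  }
}

Both the ID and the CIDR block come from the same lookup, so neither value is written into the configuration.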

Example 2: Fetching Subnet ID from a Specific VPC

Once the VPC is retrieved, you often need a subnet inside it.

data "aws_subnet" "shared_subnet" {
  filter {
    name   = "tag:Name"
    values = ["subnet-a"]
  }
  vpc_id = data.aws_vpc.vpc_name.id
}

  • Fetches a subnet whose Name tag is subnet-a.
  • Restricts the search to the VPC fetched earlier.
  • Returns a single subnet, whose ID is usable as data.aws_subnet.shared_subnet.id.

This lets Terraform deploy EC2 or Lambda resources into the correct shared subnet without hard‑coding its ID.
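
The aws_subnet data source expects exactly one match. When you need every subnet in the VPC instead, the plural aws_subnets data source returns a list of IDs. A minimal sketch, assuming the VPC lookup from Example 1; the names in_vpc and subnet_ids are illustrative:

# Returns the IDs of all subnets in the VPC found earlier.
data "aws_subnets" "in_vpc" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.vpc_name.id]
  }
}

# Expose the list, for example to pass to a load balancer or Auto Scaling group.
output "subnet_ids" {
  value = data.aws_subnets.in_vpc.ids
}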

Example 3: Fetching the Latest Amazon Linux 2 AMI

AMI IDs differ from region to region and change whenever a new image is published, so a hard‑coded ID can point to an outdated image or fail outright in another region.

data "aws_ami" "linux2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }
}
  • Retrieves the latest Amazon Linux 2 AMI.
  • Limits results to official images owned by the Amazon account.
  • Ensures the AMI matches the required architecture and virtualization type.

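This is a good example of how data sources keep images up to date: each plan resolves the newest matching AMI. Keep in mind that when a newer AMI is published, the ami value on any instance that uses it changes, and Terraform will plan to replace that instance.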
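
Because the resolved image can change between runs, it helps to see which AMI a plan actually picked. A small sketch, assuming the linux2 data source above; the output names are illustrative:

# Surface the AMI that most_recent = true resolved to in this run.
output "ami_id" {
  value = data.aws_ami.linux2.id
}

output "ami_name" {
  value = data.aws_ami.linux2.name
}

output "ami_creation_date" {
  value = data.aws_ami.linux2.creation_date
}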

Using the Data Sources to Launch an EC2 Instance

After fetching the VPC, subnet, and AMI, you can provision an EC2 instance using those dynamic values:

resource "aws_instance" "ec2_one" {
  ami           = data.aws_ami.linux2.id
  instance_type = var.instance_type
  subnet_id     = data.aws_subnet.shared_subnet.id
  tags          = var.tags
}
  • Uses the AMI from the data source.
  • Places the EC2 instance inside the shared subnet.
  • Applies the user‑provided instance type and tags.
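
The resource refers to var.instance_type and var.tags, which the article does not declare. A minimal sketch of the two declarations follows; the descriptions and default values are assumptions, so adjust them to your environment.

# Hypothetical declarations for the variables used by aws_instance.ec2_one.
variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro" # assumed default, not from the original article
}

variable "tags" {
  description = "Tags applied to the instance"
  type        = map(string)
  default     = {}
}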

The result is a configuration that can be reused across environments, because no ID in it is tied to a particular account or region.
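
One caveat: a change to the ami argument forces Terraform to replace the instance, and a most_recent lookup changes whenever a newer image appears. If you would rather keep a running instance on the AMI it was launched with, one option is a variant of the same resource that ignores later AMI changes. This is a sketch of that trade-off, not the original author's configuration:

# Variant of the resource above, not an additional resource.
resource "aws_instance" "ec2_one" {
  ami           = data.aws_ami.linux2.id
  instance_type = var.instance_type
  subnet_id     = data.aws_subnet.shared_subnet.id
  tags          = var.tags

  lifecycle {
    # Use the latest AMI at creation time, then ignore newer AMI IDs.
    ignore_changes = [ami]
  }
}

With this setting, new instances still pick up the latest image, while existing ones are only replaced when you choose to.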

Why Data Sources Matter

  1. Avoids Hard‑Coding – No need to store IDs, ARNs, or AMIs manually.
  2. Enables Multi‑Team, Multi‑Account Use – Teams can reference central resources with read‑only access, without needing permission to modify them.
  3. Improves Reusability – Modules become generic and can work across dev, test, and prod as long as the lookups match resources in each environment.
  4. Supports Dynamic and Automated Infrastructure – Fetching the latest AMIs helps keep new instances on current, patched images.
  5. Reduces Human Error – Eliminates error‑prone copy‑pasting of IDs.
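
The reusability point can be sketched by turning the tag value from Example 1 into an input, so each environment supplies its own VPC name. The variable vpc_name and the local name selected are hypothetical names introduced for this illustration.

# Each environment passes its own VPC name instead of a hard-coded ID.
variable "vpc_name" {
  description = "Value of the Name tag on the VPC to deploy into"
  type        = string
}

data "aws_vpc" "selected" {
  filter {
    name   = "tag:Name"
    values = [var.vpc_name]
  }
}

The same code can then be applied per environment, for example with terraform apply -var="vpc_name=default-vpc".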

Conclusion

Terraform data sources are essential for building dynamic, production‑ready infrastructure. They let your code read existing AWS resources (VPCs, subnets, AMIs, and more) without recreating them. The examples above reflect common real‑world patterns, especially in shared network environments where another team owns the VPC. Used well, data sources make a Terraform setup easier to scale and maintain.
