Reverse Engineering Your AWS Estate into Terraform Using CloudToRepo
If you have ever inherited an AWS estate, you know the feeling before you can even describe it. Hundreds of resources spread across regions you did not know were enabled. Lambdas with no source repos. Config rules that predate the current team. IAM roles that look like they were generated by a sleep-deprived octopus at 2am during a compliance audit.
Eventually someone asks the question: “Can we just put all of this into Terraform?”
You can. But the tooling situation is messier than most guides let on, and the common advice to reach for Terraformer or Former2 is increasingly stale. This guide covers the current state of the tooling landscape honestly, then walks through CloudToRepo, a purpose-built open source script that automates the tedious parts correctly.
Main site: https://cloudtorepo.com
GitHub repo: https://github.com/andrewbakercloudscale/cloudtorepo
Why This Is Hard (And Why Most Guides Get It Wrong)
The instinct is to think of this as “exporting Terraform.” It is not. What you are actually doing is closer to reverse compilation: discovering all resources across accounts and regions, generating Terraform configuration from live infrastructure, reconstructing dependencies, capturing state, and then refactoring everything into something a human can maintain.
Two problems make this harder than it looks.
The tooling is older than it appears. Terraformer, the tool most guides recommend, was built by the Waze engineering team and has not had meaningful maintenance in years. It works, but it predates Terraform’s native import blocks and generates output that needs significant cleanup. Former2 is primarily a browser-based tool, and the CLI variant is a separate community project with limited coverage. Both are fine for getting a rough baseline, but neither should be your primary strategy in 2026.
AWS was never designed to be reverse-compiled. Resources reference each other in ways that tooling will not always catch. Some services do not map cleanly to Terraform resources no matter what you do. IAM is particularly brutal, and the relationship between roles, policies, attachments, and instance profiles is rarely clean in a lived-in estate. Accept these rough edges going in and you will be far less surprised.
The Tooling Landscape in 2026
Before reaching for a third-party tool, it is worth understanding what is actually available today.
Native Terraform Import Blocks (Terraform 1.5+) are the most important thing missing from older guides. Since Terraform 1.5, you can declare imports directly in your configuration as first-class HCL:
```hcl
import {
  to = aws_s3_bucket.my_bucket
  id = "my-existing-bucket"
}

resource "aws_s3_bucket" "my_bucket" {}
```

Combine this with the -generate-config-out flag and Terraform will populate the resource block from live state automatically:

```shell
terraform plan -generate-config-out=generated.tf
```

This is version-controlled, reviewable in pull requests, previewed before apply, and supports for_each for bulk imports. For targeted imports of known resources, it is now the right default. The limitation is discovery: it does not tell you what resources exist. That is where scripting fills the gap.
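For the bulk case, import blocks accept for_each (this needs Terraform 1.7 or newer); a sketch, where the bucket names and resource labels are illustrative:

```hcl
locals {
  buckets = ["logs-prod", "assets-prod", "backups-prod"]
}

# One import block covers all three buckets.
import {
  for_each = toset(local.buckets)
  to       = aws_s3_bucket.managed[each.key]
  id       = each.key
}

resource "aws_s3_bucket" "managed" {
  for_each = toset(local.buckets)
  bucket   = each.key
}
```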
Terraformer is still useful for bulk discovery. It supports AWS, GCP, and Azure, and generates both HCL and state. The caveats: it is largely unmaintained, the output requires significant cleanup, and it predates provider version 5.x so generated code often needs attribute corrections. Use it as a starting point, not an end state.
Former2 is a browser-based tool that scans your account via the AWS JavaScript SDK and generates HCL, CloudFormation, or Troposphere. Genuinely useful for targeted exports of specific resources. It requires a browser extension to bypass CORS on some services. There is no reliable CLI variant; treat any guide that uses former2 generate in a shell script with scepticism.
For anything beyond a handful of resources, the workflow that actually works is: use scripted AWS CLI discovery to enumerate every resource across every account and region, generate native Terraform import blocks from that discovery, run terraform plan -generate-config-out to populate the HCL from live state, and commit the messy baseline before refactoring incrementally. This is what CloudToRepo automates.
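The discovery-to-import-block step can be sketched in a few lines of Bash. The helper name emit_s3_imports below is made up for illustration; CloudToRepo's internals are more thorough, but the shape is the same:

```shell
# emit_s3_imports: turn a newline-separated list of bucket names into
# Terraform import blocks (hypothetical helper, not part of CloudToRepo).
emit_s3_imports() {
  while IFS= read -r bucket; do
    [ -z "$bucket" ] && continue
    # Sanitise the bucket name into a valid Terraform identifier.
    local label=${bucket//[^a-zA-Z0-9_]/_}
    printf 'import {\n  to = aws_s3_bucket.%s\n  id = "%s"\n}\n' \
      "$label" "$bucket"
  done
}

# In a real run the list would come from discovery, e.g.:
#   aws s3api list-buckets --query 'Buckets[].Name' --output text | tr '\t' '\n'
printf 'my-logs-bucket\n' | emit_s3_imports
```

The same pattern generalizes to any service with a list or describe call, which is exactly why scripted discovery scales where hand-written imports do not.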
Introducing CloudToRepo
CloudToRepo is an open source Bash script that sweeps your AWS estate across accounts and regions and generates ready-to-use Terraform import blocks and skeleton resource configurations, structured by account, region, and service.
GitHub: https://github.com/andrewbakercloudscale/cloudtorepo
Main Site: https://cloudtorepo.com
Out of the box it handles 65+ services across every major category found in production AWS estates:
| Category | Services |
|---|---|
| Compute | ec2, ebs, ecs, eks (clusters, node groups, addons, Fargate profiles), lambda |
| Networking | vpc (VPCs, subnets, security groups, route tables, IGWs, NAT gateways), elb, cloudfront, route53, acm, transitgateway, vpcendpoints |
| Data | rds, dynamodb, elasticache, msk, s3, efs, opensearch, redshift, documentdb |
| Streaming | kinesis (Data Streams and Firehose) |
| Integration | sqs, sns, apigateway, eventbridge, stepfunctions, ses |
| Security & Compliance | iam (roles, instance profiles, OIDC providers), kms, secretsmanager, wafv2, config, cloudtrail, guardduty |
| Platform & CI/CD | ecr, ssm, cloudwatch, backup, codepipeline, codebuild |
| Auth | cognito (user pools, clients, identity pools, fully paginated) |
| ETL | glue (jobs, crawlers, databases, connections) |
| Storage & Transfer | fsx (Windows, Lustre, ONTAP, OpenZFS), transfer (SFTP/FTPS) |
| App Platform | elasticbeanstalk, apprunner, lightsail |
| Analytics | athena (workgroups and data catalogs), lakeformation, memorydb |
| Governance | servicecatalog (portfolios and products), organizations, servicequotas, ram |
| Other | emr, sagemaker, xray, appconfig, bedrock, connect |
Adding a new service is two steps: write an export_<service>() function and add one line to the dispatch table. Nothing else changes.
The requirements are AWS CLI v2, jq, Terraform >= 1.5, Bash 4+, and appropriate IAM permissions. ReadOnlyAccess is sufficient for discovery.
Getting Started
Install by cloning the repository and making the scripts executable:
```shell
git clone https://github.com/andrewbakercloudscale/cloudtorepo.git
cd cloudtorepo
chmod +x cloudtorepo.sh reconcile.sh drift.sh run.sh import.sh
```

Verify your dependencies are in place:

```shell
aws sts get-caller-identity
terraform version   # must be >= 1.5
jq --version
```

The basic usage pattern is:
```
./cloudtorepo.sh [OPTIONS]

Options:
  --accounts "123456789012,987654321098"   Comma-separated account IDs
  --regions "us-east-1,eu-west-1"          Comma-separated regions
  --services "ec2,s3,rds"                  Comma-separated services, or "list" to print all
  --profile "prod-readonly"                Named AWS profile
  --role "OrganizationAccountAccessRole"   Cross-account role to assume
  --state-bucket "my-tf-state-bucket"      S3 bucket for remote state backend
  --state-region "us-east-1"               Region of the state bucket
  --output "./tf-output"                   Output root directory
  --parallel 5                             Max concurrent service scans
  --exclude-services "config,guardduty"    Services to skip
  --tags "Env=prod,Team=sre"               Only import resources with these tags
  --since "2025-01-01"                     Only include resources created after this date
  --resume                                 Skip account/region/service combos already written
  --output-format json                     Also write summary.json alongside summary.txt
  --dry-run                                Print resource counts; do not write files
  --debug                                  Verbose logging
  --version                                Print version and exit
```

Example Runs
Always start with a dry run. It will tell you exactly what the script intends to export without touching the filesystem:
```shell
./cloudtorepo.sh \
  --regions "us-east-1" \
  --services "ec2,vpc,rds" \
  --dry-run
```

Output:
```
[10:14:02] [INFO ] Dependencies OK (terraform 1.7.4)
[10:14:02] [WARN ] DRY-RUN mode -- no files will be written
[10:14:02] [INFO ] ========================================
[10:14:02] [INFO ] Account: 123456789012
[10:14:02] [INFO ] ========================================
[10:14:02] [INFO ] Processing region: us-east-1
[10:14:03] [INFO ] [vpc] scanning...
[10:14:04] [INFO ] [vpc] 23 resources found
[10:14:04] [INFO ] [ec2] scanning...
[10:14:06] [INFO ] [ec2] 14 resources found
[10:14:06] [INFO ] [rds] scanning...
[10:14:08] [INFO ] [rds] 6 resources found
[10:14:08] [INFO ] [DRY-RUN] Would write: ./tf-output/123456789012/us-east-1/vpc/backend.tf
[10:14:08] [INFO ] [DRY-RUN] Would write: ./tf-output/123456789012/us-east-1/vpc/imports.tf
[10:14:08] [INFO ] [DRY-RUN] Would write: ./tf-output/123456789012/us-east-1/vpc/resources.tf
[10:14:08] [INFO ] [DRY-RUN] Would write: ./tf-output/123456789012/us-east-1/ec2/backend.tf
[10:14:08] [INFO ] [DRY-RUN] Would write: ./tf-output/123456789012/us-east-1/ec2/imports.tf
[10:14:08] [INFO ] [DRY-RUN] Would write: ./tf-output/123456789012/us-east-1/ec2/resources.tf
[10:14:08] [INFO ] [DRY-RUN] Would write: ./tf-output/123456789012/us-east-1/rds/backend.tf
[10:14:08] [INFO ] [DRY-RUN] Would write: ./tf-output/123456789012/us-east-1/rds/imports.tf
[10:14:08] [INFO ] [DRY-RUN] Would write: ./tf-output/123456789012/us-east-1/rds/resources.tf
```

For a single account with targeted services and remote state:
```shell
./cloudtorepo.sh \
  --regions "us-east-1,eu-west-1" \
  --services "ec2,eks,rds,s3,vpc" \
  --state-bucket my-tf-state-prod \
  --state-region us-east-1 \
  --output ./tf-output
```

For a multi-account org sweep:
```shell
./cloudtorepo.sh \
  --accounts "123456789012,234567890123,345678901234" \
  --role OrganizationAccountAccessRole \
  --regions "us-east-1,eu-west-1,ap-southeast-2" \
  --state-bucket my-tf-state-org \
  --output ./tf-output \
  --debug
```

For a named AWS profile:
```shell
./cloudtorepo.sh \
  --profile prod-readonly \
  --regions "eu-west-1" \
  --services "ec2,vpc,rds,eks" \
  --output ./tf-output
```

What the Output Looks Like
After a run, your output directory is structured like this:
```
tf-output/
├── summary.txt
└── 123456789012/
    ├── us-east-1/
    │   ├── ec2/
    │   │   ├── backend.tf
    │   │   ├── imports.tf
    │   │   └── resources.tf
    │   ├── eks/
    │   │   ├── backend.tf
    │   │   ├── imports.tf
    │   │   └── resources.tf
    │   ├── lambda/
    │   │   ├── backend.tf
    │   │   ├── imports.tf
    │   │   ├── resources.tf
    │   │   └── _packages/
    │   │       ├── my-auth-function.zip
    │   │       └── my-worker-function.zip
    │   └── rds/
    │       ├── backend.tf
    │       ├── imports.tf
    │       └── resources.tf
    └── eu-west-1/
        └── ...
```

Each service directory is a self-contained Terraform root module. The three generated files serve distinct purposes. backend.tf contains remote state configuration and provider setup, pre-populated with the correct S3 key path for this account/region/service combination:
```hcl
terraform {
  backend "s3" {
    bucket         = "my-tf-state-prod"
    key            = "123456789012/us-east-1/eks/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}
```

imports.tf contains native Terraform 1.5+ import blocks, one per discovered resource:
```hcl
import {
  to = aws_eks_cluster.cluster_production
  id = "production"
}

import {
  to = aws_eks_node_group.ng_production_application
  id = "production:application"
}

import {
  to = aws_eks_addon.addon_production_coredns
  id = "production:coredns"
}
```

resources.tf contains skeleton resource blocks ready for terraform plan -generate-config-out to populate:
```hcl
resource "aws_eks_cluster" "cluster_production" {
  # Auto-generated skeleton -- run:
  #   terraform plan -generate-config-out=generated.tf
  # to populate all attributes from live state.
}

resource "aws_eks_node_group" "ng_production_application" {}

resource "aws_eks_addon" "addon_production_coredns" {}
```

Once the import blocks are in place, run this in any service directory:
```shell
cd tf-output/123456789012/us-east-1/eks
terraform init
terraform plan -generate-config-out=generated.tf
```

Terraform will query every resource via the provider and write a fully populated generated.tf. Review it, clean up any computed attributes Terraform cannot track, and run terraform apply to bind the live resource to state. Then run terraform plan once more; a clean run with no changes means the import is complete.
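The "clean plan" check can be scripted with Terraform's -detailed-exitcode flag, which exits 0 for no changes, 1 for errors, and 2 for pending changes. The classify_plan helper below is a hypothetical wrapper for illustration, not part of CloudToRepo:

```shell
# classify_plan: label the exit status of `terraform plan -detailed-exitcode`.
# By Terraform's contract: 0 = no changes, 1 = error, 2 = changes pending.
classify_plan() {
  case "$1" in
    0) echo "import-complete" ;;
    2) echo "drift-remaining" ;;
    *) echo "plan-error" ;;
  esac
}

# Typical usage after applying the imports:
#   terraform plan -detailed-exitcode >/dev/null 2>&1; classify_plan $?
classify_plan 0   # prints "import-complete"
```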
Automating the Plan Step with run.sh
Instead of running terraform init and terraform plan manually in every service directory, run.sh walks the entire output tree and does this for every service directory in one command:
```shell
# Process every service directory under tf-output
./run.sh --output ./tf-output

# Limit to specific accounts, regions, or services
./run.sh --output ./tf-output --regions "us-east-1" --services "ec2,eks,rds"

# Only run terraform init
./run.sh --output ./tf-output --init-only

# Preview which directories would be processed
./run.sh --output ./tf-output --dry-run

# Run up to 5 terraform processes in parallel (default: 3)
./run.sh --output ./tf-output --parallel 5
```

Each service directory gets a .run.log file. A summary at the end shows which directories succeeded, had no changes, or failed.
Running terraform import with import.sh
After reviewing generated.tf, use import.sh to run terraform import for every resource block in the output tree. It checks terraform state list first and skips anything already managed, so it is safe to re-run at any time:
```shell
# Preview what would be imported
./import.sh --output ./tf-output --dry-run

# Import everything
./import.sh --output ./tf-output

# Import with parallel workers and automatic terraform init
./import.sh --output ./tf-output --parallel 4 --init

# Limit to specific accounts, regions, or services
./import.sh --output ./tf-output \
  --accounts "123456789012" \
  --regions "us-east-1" \
  --services "ec2,eks,rds"
```

Each service directory gets a .import.log file. A summary at the end shows resources imported, skipped (already in state), and failed.
Does It Catch Everything?
No tool can guarantee full coverage of a large AWS estate, and this one is honest about that. After running the export, use the included reconcile.sh script to diff what was exported against what AWS Resource Explorer sees as the full account inventory:
```shell
./reconcile.sh --output ./tf-output --index-region us-east-1
```

This queries Resource Explorer's aggregator index, compares every discovered resource against the generated import blocks, and produces a report grouped by region and service:
```
Summary
-------
Total resources (Resource Explorer): 847
Matched to exported import blocks:   801
Potentially missed:                   46
Coverage: 94%
```

If you do not have Resource Explorer enabled, the --local flag bypasses it entirely and prints a per-service import block count directly from the output directory:
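The coverage figure is simple integer arithmetic over the two counts; reproducing the numbers from the sample report:

```shell
# Coverage = matched import blocks / total Resource Explorer resources,
# reported as an integer percentage.
total=847
matched=801
echo "$(( matched * 100 / total ))%"   # prints "94%"
```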
```shell
./reconcile.sh --output ./tf-output --local
```

Each entry in the missed list is a decision point: add it to the Terraform estate, mark it intentionally unmanaged, or open a PR to add the service exporter. The report tells you exactly which --services flag to use and points to CONTRIBUTING.md for adding new exporters.
A Note on Lambda Packages
Lambdas deserve special attention and the script handles this automatically. For every discovered function, it downloads the deployment package into _packages/:
```
lambda/
├── _packages/
│   ├── auth-service.zip
│   ├── payment-processor.zip
│   └── notification-worker.zip
├── backend.tf
├── imports.tf
└── resources.tf
```

In many inherited estates, the deployment package in AWS is the only surviving copy of the code. The script retrieves it before you start making changes. You will thank yourself later.
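Under the hood, this retrieval needs only two calls: lambda get-function returns a presigned URL for the deployment package, and a plain HTTP download fetches it. A sketch of that shape (the helper name is illustrative; CloudToRepo's implementation may differ):

```shell
# Download one Lambda deployment package via its presigned S3 URL.
# Assumes AWS CLI v2 credentials are configured.
fetch_lambda_package() {
  local fn="$1" dest_dir="$2"
  mkdir -p "$dest_dir"
  local url
  url=$(aws lambda get-function --function-name "$fn" \
          --query 'Code.Location' --output text)
  curl -sSL "$url" -o "$dest_dir/$fn.zip"
}

# Example: fetch_lambda_package auth-service ./lambda/_packages
```

Note that the presigned URL expires quickly, so the download has to happen immediately after the get-function call.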
The Parts That Will Not Go Smoothly
No guide should pretend this is clean. Here is what to expect.
IAM will be the hardest part. The relationship between roles, policies, and attachments in a lived-in account is rarely clean. The script imports what it can enumerate, but the dependencies between resources will likely need manual untangling. Budget time for it.
terraform plan will show drift. After importing, running terraform plan almost always reveals differences between what Terraform infers from the API and what is in your generated configuration. This is normal. Work through each difference; some will be computed attributes you can remove, some will be genuine configuration drift that needs a decision.
Some resources will be missed. The script covers 65+ services found in production estates. The reconciliation report will surface the gaps. If you have unusual services, add an exporter following the pattern in CONTRIBUTING.md.
Provider version mismatches. Always pin your provider version before running any of this. The backend.tf files generated by the script pin to ~> 5.0; verify this matches your environment.
The Workflow That Works
The pattern that gets this over the finish line is the same regardless of estate size:

1. Run the script with --dry-run first and verify the resource counts look right.
2. Run for real, starting with a single region and your most important services.
3. Use run.sh to execute terraform init and terraform plan -generate-config-out=generated.tf across all service directories automatically.
4. Review generated.tf in each directory and remove or fix anything that causes a non-empty second plan.
5. Run import.sh --dry-run to preview, then import.sh to execute terraform import for all resources not yet in state.
6. Run reconcile.sh to identify gaps, or use --local if Resource Explorer is not enabled.
7. Commit everything as a baseline-import branch; messy and imperfect is fine.
8. Refactor incrementally from there, one pull request at a time.
The failure mode on projects like this is always the same: waiting for the output to be clean before committing anything. It will never be clean enough if you set that as the bar. Commit the messy baseline. That is the whole point.
drift.sh – Detecting Changes After Day One
Once your baseline is committed, resources will be created or deleted outside Terraform over time. drift.sh re-scans AWS and diffs the results against your imports.tf files; no AWS Resource Explorer required, just the AWS CLI. Resources found in AWS but missing from imports.tf are flagged as NEW. Resources in imports.tf that no longer exist in AWS are flagged as REMOVED. With --apply, new import blocks are appended and stale ones are commented out with a timestamp. Drop it into a nightly CI job and pipe the output to Slack for continuous governance without a commercial tool. A GitHub Actions workflow (drift-check.yml) is included in the repository that does exactly this, opening a GitHub issue automatically when drift is detected.
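If you run the nightly job outside GitHub Actions, the Slack piping can be a small wrapper like the one below. It assumes the report text contains the words NEW or REMOVED when drift exists and that SLACK_WEBHOOK_URL points at an incoming webhook; both are assumptions for illustration, not guarantees from the repo:

```shell
# Post a Slack notification only when the drift report actually flags drift.
notify_if_drift() {
  local report="$1"
  if printf '%s\n' "$report" | grep -qE 'NEW|REMOVED'; then
    curl -sS -X POST -H 'Content-type: application/json' \
      --data '{"text": "cloudtorepo drift detected -- see nightly job log"}' \
      "$SLACK_WEBHOOK_URL"
    return 0
  fi
  return 1
}

# Nightly: notify_if_drift "$(./drift.sh --output ./tf-output)"
```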
```shell
# Report only
./drift.sh --output ./tf-output --regions "us-east-1"

# Apply changes to imports.tf in place
./drift.sh --output ./tf-output --regions "us-east-1" --apply

# Save the report to a file
./drift.sh --output ./tf-output --apply --report ./drift-report.txt
```

report.sh – Generating a Summary Report
report.sh generates a Markdown summary of any output directory, useful for sharing estate coverage with a team or for tracking progress over time. It produces an executive summary table, per-region and per-service import block counts sorted by size, and an optional drift section when a drift.sh --report file is supplied:
```shell
./report.sh --output ./tf-output --title "Production AWS Baseline"
./report.sh --output ./tf-output --drift ./drift-report.txt --out ./summary.md
```

When the output directory spans more than one account, the report also generates an Account Totals table and a Cross-Account Service Totals table sorted by import block count.
Testing
The tests/ directory contains a bats-core test suite covering 112 tests across 7 suites. All tests use a mock AWS CLI and mock Terraform binary; no real credentials or state are needed. The suites cover the main scanner, drift detection, the import runner, the reconcile coverage calculation, the Markdown report generator, the Terraform plan runner, and the shared helper library. Run the suite with bats tests/ after installing bats-core (brew install bats-core on macOS).
Contributing
The script is designed to be extended. Each service is a self-contained function following a consistent pattern. To add a new service:

1. Write an export_<service>() function following the conventions in the existing exporters.
2. Add one line to the SERVICE_MAP dispatch table.
3. Add the service name to the ALL_SERVICES array.
4. Add a matching scan_<service>() function in drift.sh.
5. Add bats tests in tests/cloudtorepo.bats and tests/drift.bats.
6. Open a pull request.

See CONTRIBUTING.md in the repository for the full guide.
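To make the pattern concrete, a hypothetical exporter for a new service might look like the sketch below. The service choice, query path, and SERVICE_MAP line are illustrative assumptions; follow the repo's actual conventions in CONTRIBUTING.md:

```shell
# Hypothetical exporter: emit one import block per Amazon MQ broker.
export_mq() {
  aws mq list-brokers --query 'BrokerSummaries[].BrokerId' --output text |
    tr '\t' '\n' |
    while IFS= read -r id; do
      [ -z "$id" ] && continue
      # Sanitise the broker ID into a valid Terraform identifier.
      printf 'import {\n  to = aws_mq_broker.%s\n  id = "%s"\n}\n' \
        "${id//[^a-zA-Z0-9_]/_}" "$id"
    done
}

# ...plus one line in the dispatch table:
# SERVICE_MAP[mq]=export_mq
```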
GitHub: https://github.com/andrewbakercloudscale/cloudtorepo