Fixing Terraform remote-exec SSH Timeouts and Connection Errors

Intermediate | Terraform | 2026-06-14 | Terraform 1.0+, AWS/Azure/GCP, Linux Instances (Ubuntu, RHEL, Amazon Linux)

Error Message

Error: timeout - last error: dial tcp 10.0.1.5:22: connect: connection refused
#terraform #aws #ssh #devops #troubleshooting

### The Provisioner Trap

You’ve finally nailed the Terraform configuration. You run terraform apply, watch the instance spin up, and then... silence. The terminal hangs for five minutes (the default connection timeout) before crashing with a familiar failure:

Error: timeout - last error: dial tcp 10.0.1.5:22: connect: connection refused

This usually means your Terraform runner, whether it’s your laptop, a GitHub Actions agent, or another CI/CD pipeline, cannot open a connection to the instance's SSH port. It’s a frustrating roadblock, but usually easy to clear. Here is how to debug it without losing your mind.

### TL;DR: The 60-Second Audit

- Security Groups: Is port 22 open for your specific IP? Check the ingress rules.
- IP Selection: Are you trying to hit a private IP from the public internet? Switch the host to public_ip.
- Boot Lag: The VM might be 'running' in the AWS console, but sshd might still be initializing.
- Usernames: Are you using ubuntu for an Amazon Linux AMI? (It should be ec2-user).

### 1. The Routing Gap: Public vs. Private IPs

Look closely at the IP in your error: 10.0.1.5. That is a private (RFC 1918) address that only exists inside the VPC. If you are running Terraform from a local machine, you can't route traffic to it. Terraform connects to whatever the connection block's host argument resolves to, and configurations frequently point it at the internal address (for example self.private_ip).

The remote-exec provisioner needs a direct line of sight. Unless you are on a VPN or using a bastion host, you must use the instance's public IP.

The Solution: Force the connection block to use the public IP attribute.

resource "aws_instance" "web" {
  # ... configuration ...

  provisioner "remote-exec" {
    connection {
      type        = "ssh"
      user        = "ubuntu"
      private_key = file("~/.ssh/deploy_key")
      host        = self.public_ip # Don't leave this to chance
    }

    inline = ["sudo apt-get update"]
  }
}
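
If the instance has to stay in a private subnet, the connection block can hop through a bastion instead of using a public IP. The following is a minimal sketch, not a definitive setup: the variable bastion_public_ip, the resource name private_web, and the reuse of the same key and ubuntu user on the bastion are all assumptions made for illustration.

```hcl
# Hypothetical input: public address of a bastion host you already operate.
variable "bastion_public_ip" {
  type = string
}

resource "aws_instance" "private_web" {
  # ... configuration (instance in a private subnet) ...

  provisioner "remote-exec" {
    connection {
      type        = "ssh"
      user        = "ubuntu"
      private_key = file("~/.ssh/deploy_key")
      host        = self.private_ip # only reachable from inside the VPC

      # Terraform connects to the bastion first, then reaches host through it.
      bastion_host        = var.bastion_public_ip
      bastion_user        = "ubuntu" # assumption: bastion also runs Ubuntu
      bastion_private_key = file("~/.ssh/deploy_key")
    }

    inline = ["sudo apt-get update"]
  }
}
```

With this layout, only the bastion needs port 22 open to your runner; the private instance only needs to accept SSH from the bastion.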

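### 2. Security Groups: The Invisible Wall

Even with the correct IP, the last error in the message tells you where the connection died. An i/o timeout suggests a firewall is silently dropping your packets. A connection refused, however, means the host (or something in front of it) answered but nothing accepted the connection on port 22, which usually points at sshd not running yet or a host firewall rejecting the port. Most cloud providers default to 'deny all' for inbound traffic.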

Your Next Step: Verify your Security Group or Network Security Group allows TCP port 22. For testing, you might use 0.0.0.0/0, but for production, restrict this to your specific IP range (e.g., 203.0.113.5/32).

resource "aws_security_group" "ssh_access" {
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["203.0.113.5/32"] # example address; replace with your runner's public IP
  }
}
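
A rule only helps if the group is attached to the instance and the CIDR is well formed. The sketch below shows one way to wire that up; the names admin_cidr and ssh_from_admin are hypothetical, and it assumes the default VPC (no vpc_id is set). The egress block is included because the AWS provider removes the default allow-all outbound rule on groups it creates, and commands such as apt-get update need outbound access.

```hcl
# Hypothetical variable: the CIDR of the machine running Terraform.
variable "admin_cidr" {
  type        = string
  description = "CIDR allowed to reach SSH, for example 203.0.113.5/32"

  validation {
    # cidrhost() errors on malformed input, so can() returns false for it.
    condition     = can(cidrhost(var.admin_cidr, 0))
    error_message = "The admin_cidr value must be a valid CIDR block, such as 203.0.113.5/32."
  }
}

resource "aws_security_group" "ssh_from_admin" {
  name_prefix = "ssh-from-admin-"

  ingress {
    description = "SSH from the Terraform runner"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.admin_cidr]
  }

  # Outbound access so package managers on the instance can reach mirrors.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "web" {
  # ... configuration ...
  vpc_security_group_ids = [aws_security_group.ssh_from_admin.id]
}
```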

### 3. The Race Condition: Boot Time Realities

Cloud APIs are fast; Linux kernels are slower. An AWS t3.micro might report as 'Running' within 20 seconds, but cloud-init and sshd often need another 40 to 60 seconds to fully start. If Terraform attempts to connect too early, it might exhaust its retries.

How to Fix it: Increase the connection timeout to give the OS room to breathe. The default is 5 minutes; raising it to 10 minutes is usually safe for most standard images.

connection {
  type        = "ssh"
  user        = "ec2-user"
  private_key = file("~/.ssh/deploy_key")
  host        = self.public_ip
  timeout     = "10m" # default is 5m
}
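
A longer timeout covers sshd coming up, but the first provisioner command can still collide with cloud-init, which may hold the package manager lock. One possible approach, sketched here under the assumption of an Ubuntu image that ships cloud-init, is to block on cloud-init before doing any real work:

```hcl
resource "aws_instance" "web" {
  # ... configuration ...

  provisioner "remote-exec" {
    connection {
      type        = "ssh"
      user        = "ubuntu"
      private_key = file("~/.ssh/deploy_key")
      host        = self.public_ip
      timeout     = "10m"
    }

    inline = [
      # Blocks until cloud-init has finished its boot stages.
      "cloud-init status --wait",
      "sudo apt-get update",
    ]
  }
}
```

Note that cloud-init status exits non-zero if cloud-init itself reported an error, which would fail the provisioner; whether that is desirable depends on how strict you want the bootstrap to be.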

### 4. Local Firewalls (UFW/Firewalld)

Some hardened AMIs come with internal firewalls enabled. Even if the cloud-level Security Group is wide open, ufw (on Ubuntu) or firewalld (on RHEL) might be blocking port 22.

Run a manual check to rule out Terraform-specific issues:

ssh -i ~/.ssh/key.pem user@1.2.3.4

If this manual command fails, your problem is the network or the OS, not your Terraform code.
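
If the manual test points at a host firewall, you cannot SSH in to fix it, so the rule has to be opened at boot. A rough sketch, assuming an Ubuntu image with ufw and a hypothetical resource name hardened, is to do it from user_data:

```hcl
resource "aws_instance" "hardened" {
  # ... configuration ...

  user_data = <<-EOF
    #!/bin/bash
    # Allow inbound SSH through the host firewall (ufw on Ubuntu).
    ufw allow 22/tcp
    # On RHEL-family images with firewalld, the equivalent would be:
    #   firewall-cmd --permanent --add-service=ssh && firewall-cmd --reload
    EOF
}
```

This only runs on first boot, so an existing instance needs to be recreated (or fixed through a console session) for it to take effect.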

### The Pro Move: Ditch Provisioners

HashiCorp considers provisioners a 'last resort.' Terraform cannot model what a provisioner does as part of a plan, and the results aren't tracked in state, which makes them brittle. If network flakiness is a recurring theme, move your setup logic into user_data.

Cloud-init runs locally on the machine. It doesn't need an SSH connection from your laptop, which removes the network path as a failure point when bootstrapping instances.

resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"
  user_data     = <<-EOF
              #!/bin/bash
              yum update -y
              yum install -y httpd
              systemctl start httpd
              EOF
}

### Final Checklist

- Subnet: Is the instance in a private subnet? If so, you'll need a Bastion or VPN.
- Key Permissions: Ensure your private key is protected (chmod 400).
- Username: Double-check the AMI defaults. Using root or admin when the OS expects ubuntu will fail every time.
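
To keep the username check from being a manual step, the default login user can be looked up from a small map. This is only a sketch: the os_family variable and the map keys are hypothetical names, and the values reflect the usual defaults for official AWS images (ubuntu for Ubuntu, ec2-user for Amazon Linux and RHEL).

```hcl
# Hypothetical variable naming the image family in use.
variable "os_family" {
  type    = string
  default = "ubuntu"
}

locals {
  # Default SSH login users for common official AWS images.
  ssh_users = {
    ubuntu       = "ubuntu"
    amazon_linux = "ec2-user"
    rhel         = "ec2-user"
  }

  # Fails at plan time if os_family is not one of the keys above.
  ssh_user = local.ssh_users[var.os_family]
}
```

The connection block can then use user = local.ssh_user, so switching AMIs means changing one variable rather than hunting through provisioners.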
