EC2 SSH 'Connection Timed Out': The Definitive Security Group Diagnosis Guide

TL;DR — Fix It in 60 Seconds

An SSH "Connection timed out" error means your TCP packets are being silently dropped before they reach the instance. The most common culprit is a missing or misconfigured inbound rule in the instance's EC2 Security Group. Work through this checklist:

| Check          | What to Verify                       | Correct Value                           |
|----------------|--------------------------------------|-----------------------------------------|
| Protocol       | Security Group Inbound Rule          | TCP                                     |
| Port           | SSH daemon port                      | 22 (or custom port in sshd_config)      |
| Source CIDR    | Your public IP or trusted range      | YOUR_IP/32 (not 0.0.0.0/0 in prod)      |
| Subnet Route   | Public subnet has IGW route          | 0.0.0.0/0 → igw-xxxxxxxx                |
| Instance State | Instance is running & has public IP  | State: running, Public IPv4 assigned    |

Why 'Connection Timed Out' — Not 'Connection Refused'

This distinction is critical for fast diagnosis. Think of it like a postal system:

  • Connection Refused = Your letter reached the building, but the mailbox slot is sealed. The OS received the packet and actively rejected it (port closed or no listener).
  • Connection Timed Out = Your letter never arrived. It was intercepted at the city gate (the Security Group or Network ACL) and silently discarded. No response is sent back to the client — the connection just hangs until the client gives up.

A Security Group is a stateful firewall that operates at the instance boundary. When an inbound rule does not match, AWS drops the packet with no RST or ICMP response — hence the timeout, not a refusal.
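
The refused-versus-timeout distinction can be checked from the client before touching any AWS configuration. The following is a minimal sketch using only the Python standard library; the function name probe_tcp and its return labels are illustrative, not part of any AWS tooling.

```python
import socket


def probe_tcp(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Classify a TCP connect attempt as 'open', 'refused', 'timeout', or 'error: ...'."""
    try:
        # create_connection completes the TCP handshake or raises.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # An RST came back: the packet reached a host with no listener on this port.
        return "refused"
    except socket.timeout:
        # Nothing came back at all: consistent with a Security Group or NACL drop.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar conditions.
        return f"error: {exc}"
```

Calling probe_tcp("203.0.113.45", 22) against your instance's public IP and getting "timeout" points at the network path (Security Group, NACL, routing); "refused" points at the instance itself.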


Data Flow: Where the Packet Dies

Your Machine (port 22 SYN)
  |
  v
Internet Gateway (IGW)
  |
  v
Network ACL (stateless — check inbound AND outbound rules)
  |
  [PASS or DROP]
  |
  v
Security Group (stateful — check inbound rules only)
  |
  [PASS → instance NIC]  OR  [DROP → timeout]
  |
  v
EC2 Instance OS (sshd listening on port 22)
  |
  v
SSH Session Established

In the vast majority of new-instance timeout cases, the packet dies at the Security Group layer: a VPC's default Security Group only admits inbound traffic from other members of the same group, so nothing permits port 22 from your IP.


Step-by-Step Diagnosis & Fix (CLI-First)

Step 1 — Identify Your Security Group

aws ec2 describe-instances \
  --instance-ids i-0123456789abcdef0 \
  --query "Reservations[].Instances[].SecurityGroups" \
  --output table

Step 2 — Inspect Current Inbound Rules

aws ec2 describe-security-groups \
  --group-ids sg-0123456789abcdef0 \
  --query "SecurityGroups[].IpPermissions" \
  --output table
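
Reading the rule table by eye is error-prone when a group has many entries. As a rough sketch, the check can be automated by re-running the command with --output json and passing one group's IpPermissions list to a helper like the one below. The function name rule_allows is hypothetical, and the sketch only inspects CIDR-based rules; rules that reference other security groups or prefix lists are ignored.

```python
import ipaddress


def rule_allows(ip_permissions: list, client_ip: str, port: int = 22) -> bool:
    """Return True if any CIDR-based inbound rule admits TCP `port` from `client_ip`."""
    client = ipaddress.ip_address(client_ip)
    for perm in ip_permissions:
        proto = perm.get("IpProtocol")
        if proto == "-1":
            port_ok = True  # "-1" means all protocols and all ports
        elif proto == "tcp":
            port_ok = perm.get("FromPort", 0) <= port <= perm.get("ToPort", 65535)
        else:
            port_ok = False
        if not port_ok:
            continue
        # Collect IPv4 and IPv6 source ranges attached to this rule.
        cidrs = [r["CidrIp"] for r in perm.get("IpRanges", [])]
        cidrs += [r["CidrIpv6"] for r in perm.get("Ipv6Ranges", [])]
        for cidr in cidrs:
            net = ipaddress.ip_network(cidr, strict=False)
            if client.version == net.version and client in net:
                return True
    return False
```

If this returns False for the IP you get in Step 3, the Security Group is the likely cause of the timeout.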

Step 3 — Get Your Current Public IP

curl -s https://checkip.amazonaws.com

Step 4 — Add the Correct Inbound Rule (Least Privilege)

Replace 203.0.113.45 with your actual IP from Step 3:

aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 22 \
  --cidr 203.0.113.45/32

Never use 0.0.0.0/0 as the source in production. Exposing port 22 to the entire internet invites automated brute-force attacks within minutes of instance launch.
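
If you script this step, two small guards help: one that turns the raw output of Step 3 (which ends in a newline) into a single-host CIDR, and one that refuses a world-open source. This is a sketch with hypothetical helper names, using only the standard ipaddress module.

```python
import ipaddress


def host_cidr(ip_text: str) -> str:
    """Turn a bare IP string into a single-host CIDR (/32 for IPv4, /128 for IPv6)."""
    addr = ipaddress.ip_address(ip_text.strip())  # raises ValueError on garbage input
    prefix = 32 if addr.version == 4 else 128
    return f"{addr}/{prefix}"


def is_world_open(cidr: str) -> bool:
    """True for 0.0.0.0/0 or ::/0, which admit every address."""
    return ipaddress.ip_network(cidr, strict=False).prefixlen == 0
```

A wrapper script could call host_cidr on the checkip response and abort whenever is_world_open is true for the value it is about to pass to --cidr.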

Step 5 — Verify the Route Table (Public Subnet Check)

aws ec2 describe-route-tables \
  --filters "Name=association.subnet-id,Values=subnet-0123456789abcdef0" \
  --query "RouteTables[].Routes" \
  --output table

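You must see a route with DestinationCidrBlock: 0.0.0.0/0 and a GatewayId starting with igw-. If this route is missing, your subnet is private and SSH from the internet is architecturally impossible without a Bastion Host or VPN. If the command returns no route table at all, the subnet has no explicit association and falls back to the VPC's main route table, so inspect that table instead.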
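
The same check can be expressed in code. The sketch below assumes you re-ran the command with --output json and pass in the Routes list of a single route table; the function name is hypothetical, and the field names follow the describe-route-tables output.

```python
def has_igw_default_route(routes: list) -> bool:
    """Return True if the routes include an active 0.0.0.0/0 route via an Internet Gateway."""
    for route in routes:
        is_default = route.get("DestinationCidrBlock") == "0.0.0.0/0"
        via_igw = route.get("GatewayId", "").startswith("igw-")
        # A "blackhole" route points at a target that no longer exists.
        is_active = route.get("State", "active") == "active"
        if is_default and via_igw and is_active:
            return True
    return False
```

A default route that points at a NAT gateway (NatGatewayId) returns False here, which is the intended result: such a subnet can reach the internet outbound but cannot accept inbound SSH.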


Infrastructure-as-Code: Terraform Snippet

For repeatable, auditable infrastructure, define the rule in Terraform instead of clicking through the console:

resource "aws_security_group" "ec2_sg" {
  name        = "ec2-ssh-access"
  description = "Allow SSH from trusted IP only"
  vpc_id      = var.vpc_id

  ingress {
    description = "SSH from admin IP"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["203.0.113.45/32"] # Replace with your IP
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "ec2-ssh-sg"
  }
}

IAM: Minimum Required Permissions

To diagnose and fix Security Group rules, the operator's IAM identity needs only these actions — nothing more:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeSecurityGroups",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:DescribeRouteTables"
      ],
      "Resource": "*"
    }
  ]
}

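In production, scope AuthorizeSecurityGroupIngress to a specific Security Group by putting that group's ARN in the statement's Resource element. The Describe* actions do not support resource-level permissions, so keep them in a separate statement with Resource "*".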
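
One way to produce such a scoped policy is to generate it, so the ARN is never typed by hand. The sketch below assumes the standard EC2 security-group ARN format; the function name and the region, account ID, and group ID shown in the usage note are placeholders. Validate the result in your own account before relying on it.

```python
import json


def scoped_policy(region: str, account_id: str, group_id: str) -> dict:
    """Build a policy that allows ingress changes on one Security Group only."""
    sg_arn = f"arn:aws:ec2:{region}:{account_id}:security-group/{group_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read-only Describe calls cannot be scoped to a resource.
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeInstances",
                    "ec2:DescribeSecurityGroups",
                    "ec2:DescribeRouteTables",
                ],
                "Resource": "*",
            },
            {
                # The write action is limited to the single group.
                "Effect": "Allow",
                "Action": "ec2:AuthorizeSecurityGroupIngress",
                "Resource": sg_arn,
            },
        ],
    }
```

For example, json.dumps(scoped_policy("us-east-1", "123456789012", "sg-0123456789abcdef0"), indent=2) yields a document you can attach as an inline or managed policy.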


Cost Impact

Security Groups themselves are free — AWS does not charge for Security Group rules or evaluations. However, the architectural choices around SSH access do carry indirect costs:

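  • Bastion Host / Jump Server: Running a dedicated EC2 instance for SSH access incurs standard EC2 compute costs (a t4g.nano runs roughly $3/month at us-east-1 on-demand rates, plus storage and the public IPv4 address charge). AWS Systems Manager Session Manager provides shell access with no inbound port open and no bastion to pay for. EC2 Instance Connect is also free, but it still connects over SSH, so port 22 must remain reachable from the service's IP range or through an EC2 Instance Connect Endpoint.
  • AWS Systems Manager Session Manager: No additional charge for connecting to EC2 instances. It eliminates the need to open port 22 entirely, provided the SSM Agent is running and the instance has an instance profile that permits Systems Manager.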

Secondary Root Causes (If Security Group Is Already Correct)

If your Security Group rule is verified correct and the connection still times out, work down this checklist:

  • Network ACL (NACL): NACLs are stateless. You need an explicit inbound rule for port 22 AND an outbound rule for ephemeral ports (1024–65535) to allow the response traffic back.
  • No Public IP: Run aws ec2 describe-instances --instance-ids i-xxx --query "Reservations[].Instances[].PublicIpAddress". If null, the instance has no public IP. Allocate and associate an Elastic IP.
  • Private Subnet: No IGW route means the instance is not internet-reachable. Use SSM Session Manager or a Bastion in a public subnet.
  • Instance Not Running: Verify state is running, not pending or stopped.
  • Host-Based Firewall: If the instance OS has iptables or firewalld rules blocking port 22, the packet reaches the OS but is dropped there. Check via the EC2 Serial Console.
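
The NACL item is the one that most often surprises people, because a correct inbound rule is not enough. The following is a simplified model of stateless NACL evaluation, not AWS code: rules are checked in ascending rule-number order, the first match decides, and anything unmatched is denied. The class and function names are hypothetical, and the model only considers TCP ports and IPv4/IPv6 CIDRs.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class NaclRule:
    number: int      # lower numbers are evaluated first
    allow: bool      # True = ALLOW, False = DENY
    cidr: str        # remote address range the rule applies to
    from_port: int
    to_port: int


def nacl_allows(rules, remote_ip: str, port: int) -> bool:
    """Evaluate one direction of a NACL for a given remote IP and port."""
    addr = ipaddress.ip_address(remote_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_range = addr in ipaddress.ip_network(rule.cidr, strict=False)
        if in_range and rule.from_port <= port <= rule.to_port:
            return rule.allow  # first matching rule decides
    return False  # the implicit final "*" rule denies everything else


def ssh_round_trip_ok(inbound, outbound, client_ip: str, client_port: int,
                      ssh_port: int = 22) -> bool:
    """SSH needs the SYN allowed in (to port 22) and the reply allowed out
    (to the client's ephemeral source port), because NACLs keep no state."""
    return (nacl_allows(inbound, client_ip, ssh_port)
            and nacl_allows(outbound, client_ip, client_port))
```

With an inbound allow for port 22 but an outbound rule that only covers port 443, ssh_round_trip_ok returns False: the SYN arrives, the SYN-ACK is dropped on the way out, and the client sees a timeout.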

Glossary

| Term                   | Definition |
|------------------------|------------|
| Security Group         | A stateful, instance-level virtual firewall in AWS that evaluates inbound and outbound rules; return traffic for allowed connections is automatically permitted. |
| Network ACL (NACL)     | A stateless, subnet-level firewall in a VPC that requires explicit rules for both request and response traffic directions. |
| Internet Gateway (IGW) | A horizontally scaled VPC component that enables communication between instances in a public subnet and the internet. |
| Ephemeral Ports        | Temporary, high-numbered ports (1024–65535) that the client's OS assigns as the source port of a connection; the server's replies are addressed to that port. |
| Elastic IP (EIP)       | A static, public IPv4 address allocated to your AWS account that can be associated with an EC2 instance to provide a persistent internet-routable address. |
