EC2 SSH 'Connection Timed Out': The Definitive Security Group Diagnosis Guide
TL;DR — Fix It in 60 Seconds
An SSH Connection timed out error means your TCP packets are being silently dropped before reaching the instance. The most common culprit is a missing or misconfigured inbound rule in your EC2 Security Group. Here is the exact checklist:
| Check | What to Verify | Correct Value |
|---|---|---|
| Protocol | Security Group Inbound Rule | TCP |
| Port | SSH daemon port | 22 (or custom port in sshd_config) |
| Source CIDR | Your public IP or trusted range | YOUR_IP/32 (not 0.0.0.0/0 in prod) |
| Subnet Route | Public subnet has IGW route | 0.0.0.0/0 → igw-xxxxxxxx |
| Instance State | Instance is running & has public IP | State: running, Public IPv4 assigned |
Why 'Connection Timed Out' — Not 'Connection Refused'
This distinction is critical for fast diagnosis. Think of it like a postal system:
- Connection Refused = Your letter reached the building, but the mailbox slot is sealed. The OS received the packet and actively rejected it (port closed or no listener).
- Connection Timed Out = Your letter never arrived. It was intercepted at the city gate (the Security Group or Network ACL) and silently discarded. No response is sent back to the client — the connection just hangs until the client gives up.
A Security Group is a stateful firewall that operates at the instance boundary. When an inbound rule does not match, AWS drops the packet with no RST or ICMP response — hence the timeout, not a refusal.
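To see which of the two failure modes you are hitting, a quick probe from your own machine can help. The sketch below is a minimal example, assuming bash and the coreutils `timeout` command are installed; `probe_ssh` is a hypothetical helper name, not an AWS tool.

```shell
# probe_ssh HOST [PORT] [SECONDS]: classify one TCP connection attempt.
# Hypothetical helper; assumes bash and coreutils `timeout` are available.
probe_ssh() {
  local host="$1" port="${2:-22}" wait="${3:-5}"
  # bash opens a TCP socket when a redirection targets /dev/tcp/HOST/PORT.
  # `timeout` ends the attempt with status 124 if nothing answers in time.
  timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
  case $? in
    0)   echo "open" ;;                     # handshake completed
    124) echo "timed out" ;;                # packets silently dropped
    *)   echo "refused or unreachable" ;;   # something answered with a rejection
  esac
}
```

Running `probe_ssh 203.0.113.10 22` against your instance's public IP and getting "timed out" points at a Security Group, NACL, or routing problem; "refused or unreachable" suggests the packet got further and something actively rejected it.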
Data Flow: Where the Packet Dies
    Your Machine (port 22 SYN)
          |
          v
    Internet Gateway (IGW)
          |
          v
    Network ACL (stateless: check inbound AND outbound rules)
          |  [PASS or DROP]
          v
    Security Group (stateful: check inbound rules only)
          |  [PASS → instance NIC] OR [DROP → timeout]
          v
    EC2 Instance OS (sshd listening on port 22)
          |
          v
    SSH Session Established
In the vast majority of new-instance timeout cases, the packet dies at the Security Group layer: a VPC's default Security Group only admits traffic from other members of the same group, so nothing permits port 22 from your IP until you add a rule.
Step-by-Step Diagnosis & Fix (CLI-First)
Step 1 — Identify Your Security Group
    aws ec2 describe-instances \
      --instance-ids i-0123456789abcdef0 \
      --query "Reservations[].Instances[].SecurityGroups" \
      --output table
Step 2 — Inspect Current Inbound Rules
    aws ec2 describe-security-groups \
      --group-ids sg-0123456789abcdef0 \
      --query "SecurityGroups[].IpPermissions" \
      --output table
Step 3 — Get Your Current Public IP
    curl -s https://checkip.amazonaws.com
Step 4 — Add the Correct Inbound Rule (Least Privilege)
Replace 203.0.113.45 with your actual IP from Step 3:
    aws ec2 authorize-security-group-ingress \
      --group-id sg-0123456789abcdef0 \
      --protocol tcp \
      --port 22 \
      --cidr 203.0.113.45/32
Never use 0.0.0.0/0 as the source in production. Exposing port 22 to the entire internet invites automated brute-force attacks within minutes of instance launch.
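Steps 3 and 4 can be chained so the rule always carries a /32 built from the address you just looked up. This is a small sketch rather than a hardened script: `to_cidr32` is a hypothetical helper that only checks the string looks like a dotted-quad IPv4 address before appending the mask.

```shell
# to_cidr32 IP: print IP/32 if IP is a dotted-quad IPv4 address, else fail.
# Hypothetical helper; it guards against an empty or malformed lookup result.
to_cidr32() {
  if printf '%s\n' "$1" | awk -F. '
      NF != 4 { exit 1 }
      { for (i = 1; i <= 4; i++) if ($i !~ /^[0-9]+$/ || $i > 255) exit 1 }'
  then
    printf '%s/32\n' "$1"
  else
    echo "not an IPv4 address: $1" >&2
    return 1
  fi
}

# Usage (touches the network and your AWS account, so it is left commented out):
#   cidr=$(to_cidr32 "$(curl -s https://checkip.amazonaws.com)") &&
#     aws ec2 authorize-security-group-ingress \
#       --group-id sg-0123456789abcdef0 --protocol tcp --port 22 --cidr "$cidr"
```

Because the `aws` call only runs when the lookup produced a valid address, a failed `curl` cannot turn into a malformed or overly broad rule.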
Step 5 — Verify the Route Table (Public Subnet Check)
    aws ec2 describe-route-tables \
      --filters "Name=association.subnet-id,Values=subnet-0123456789abcdef0" \
      --query "RouteTables[].Routes" \
      --output table
You must see a route with DestinationCidrBlock: 0.0.0.0/0 and a GatewayId starting with igw-. If this route is missing, your subnet is private and SSH from the internet is architecturally impossible without a Bastion Host or VPN.
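If you would rather get a yes/no answer than read the table, the same call can be narrowed to just the default route's target. The following is a sketch under a few assumptions: `has_igw_route` is a hypothetical wrapper name, and the `association.subnet-id` filter only matches subnets with an explicit route table association, so a subnet that falls back to the VPC's main route table will report no route here even if the main table has one.

```shell
# has_igw_route SUBNET_ID: report whether the subnet's explicitly associated
# route table sends 0.0.0.0/0 to an Internet Gateway. Hypothetical wrapper.
has_igw_route() {
  local subnet="$1" gw
  # Keep only the default route and print its GatewayId as plain text.
  gw=$(aws ec2 describe-route-tables \
    --filters "Name=association.subnet-id,Values=$subnet" \
    --query "RouteTables[].Routes[?DestinationCidrBlock=='0.0.0.0/0'].GatewayId" \
    --output text)
  case "$gw" in
    igw-*) echo "public: default route via $gw" ;;
    *)     echo "no IGW default route found"; return 1 ;;
  esac
}
```

Call it as `has_igw_route subnet-0123456789abcdef0`; a non-zero exit status means you should check the main route table or treat the subnet as private.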
Infrastructure-as-Code: Terraform Snippet
For repeatable, auditable infrastructure, define the rule in Terraform instead of clicking through the console:
    resource "aws_security_group" "ec2_sg" {
      name        = "ec2-ssh-access"
      description = "Allow SSH from trusted IP only"
      vpc_id      = var.vpc_id

      ingress {
        description = "SSH from admin IP"
        from_port   = 22
        to_port     = 22
        protocol    = "tcp"
        cidr_blocks = ["203.0.113.45/32"] # Replace with your IP
      }

      egress {
        from_port   = 0
        to_port     = 0
        protocol    = "-1"
        cidr_blocks = ["0.0.0.0/0"]
      }

      tags = {
        Name = "ec2-ssh-sg"
      }
    }

IAM: Minimum Required Permissions
To diagnose and fix Security Group rules, the operator's IAM identity needs only these actions — nothing more:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "ec2:DescribeInstances",
            "ec2:DescribeSecurityGroups",
            "ec2:AuthorizeSecurityGroupIngress",
            "ec2:DescribeRouteTables"
          ],
          "Resource": "*"
        }
      ]
    }

In production, scope AuthorizeSecurityGroupIngress to a specific Security Group by moving it into its own statement whose Resource is that group's ARN. The Describe actions do not support resource-level permissions, so they keep "Resource": "*".
Cost Impact
Security Groups themselves are free — AWS does not charge for Security Group rules or evaluations. However, the architectural choices around SSH access do carry indirect costs:
- Bastion Host / Jump Server: Running a dedicated EC2 instance for SSH access incurs standard EC2 compute costs (a t4g.nano costs roughly $3/month on-demand in us-east-1, before storage and public IPv4 charges). AWS Systems Manager Session Manager is a zero-open-port alternative for SSH-equivalent access. EC2 Instance Connect removes the need for long-lived SSH keys, but it still requires port 22 to be reachable unless you use an EC2 Instance Connect Endpoint.
- AWS Systems Manager Session Manager: No additional charge for sessions to EC2 instances. Eliminates the need to open port 22 entirely.
Secondary Root Causes (If Security Group Is Already Correct)
If your Security Group rule is verified correct and you still timeout, work down this checklist:
- Network ACL (NACL): NACLs are stateless. You need an explicit inbound rule for port 22 AND an outbound rule for ephemeral ports (1024–65535) to allow the response traffic back.
- No Public IP: Run `aws ec2 describe-instances --instance-ids i-xxx --query "Reservations[].Instances[].PublicIpAddress"`. If the result is empty or null, the instance has no public IP. Allocate and associate an Elastic IP.
- Private Subnet: No IGW route means the instance is not internet-reachable. Use SSM Session Manager or a Bastion in a public subnet.
- Instance Not Running: Verify state is `running`, not `pending` or `stopped`.
- Host-Based Firewall: If the instance OS has `iptables` or `firewalld` rules blocking port 22, the packet reaches the OS but is dropped there. Check via the EC2 Serial Console.
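The instance-state and public-IP checks from this list can be combined into a single call. This is a minimal sketch, assuming the AWS CLI prints `None` for a missing value in text output; `check_instance` is a hypothetical helper name.

```shell
# check_instance INSTANCE_ID: confirm the instance is running and has a
# public IPv4 address. Hypothetical helper built on describe-instances.
check_instance() {
  local out state ip
  out=$(aws ec2 describe-instances --instance-ids "$1" \
    --query "Reservations[].Instances[].[State.Name,PublicIpAddress]" \
    --output text)
  # Text output is whitespace-separated: STATE then PUBLIC_IP.
  state=$(printf '%s\n' "$out" | awk 'NR == 1 { print $1 }')
  ip=$(printf '%s\n' "$out" | awk 'NR == 1 { print $2 }')
  if [ "$state" != "running" ]; then
    echo "instance state is '$state', not running"
    return 1
  fi
  if [ -z "$ip" ] || [ "$ip" = "None" ]; then
    echo "running, but no public IPv4"
    return 1
  fi
  echo "running with public IP $ip"
}
```

For example, `check_instance i-0123456789abcdef0` either prints the address to SSH to or names which of the two conditions failed.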
Glossary
| Term | Definition |
|---|---|
| Security Group | A stateful, instance-level virtual firewall in AWS that evaluates inbound and outbound rules; return traffic for allowed connections is automatically permitted. |
| Network ACL (NACL) | A stateless, subnet-level firewall in a VPC that requires explicit rules for both request and response traffic directions. |
| Internet Gateway (IGW) | A horizontally-scaled VPC component that enables communication between instances in a public subnet and the internet. |
| Ephemeral Ports | Temporary, high-numbered ports (1024–65535) that the client OS assigns dynamically as the source port of a TCP connection; the server's replies are addressed to this port. |
| Elastic IP (EIP) | A static, public IPv4 address allocated to your AWS account that can be associated with an EC2 instance to provide a persistent internet-routable address. |