Security Group vs. Network ACL: Stateful vs. Stateless Traffic Filtering in AWS VPC

When designing network security in AWS VPC, engineers routinely confuse Security Groups and Network ACLs — both filter traffic, but they operate at fundamentally different layers, with different state-tracking behaviors that can cause subtle, hard-to-debug connectivity failures if misunderstood.

TL;DR — Quick Comparison

| Attribute | Security Group (SG) | Network ACL (NACL) |
| --- | --- | --- |
| Applies To | ENI (instance/resource level) | Subnet level |
| State | ✅ Stateful | ❌ Stateless |
| Rule Direction | Inbound & Outbound (return traffic auto-allowed) | Inbound & Outbound rules evaluated independently |
| Rule Type | Allow only (implicit deny) | Allow & Deny (explicit) |
| Rule Evaluation | All rules evaluated together | Rules evaluated in numbered order (lowest first) |
| Default Behavior | Deny all inbound, allow all outbound | Default NACL: allow all; Custom NACL: deny all |
| Scope | Must be explicitly associated with each resource | Automatically applies to all resources in the subnet |

The Core Distinction: Where Does Filtering Happen?

Think of your VPC network as a corporate office building. The Network ACL is the security checkpoint at the entrance to each floor (subnet): every person (packet) entering or leaving the floor is checked, regardless of which desk (instance) they are heading to. The Security Group is the lock on each individual office door: it controls who can enter that specific room, and once a visitor has been let in, it remembers them so they can leave freely.

Analogy: A NACL is a stateless border customs officer — they check your passport both when you enter and when you leave the country, independently. A Security Group is a stateful hotel key card — once the system grants you access, it remembers your session and lets you back out without re-checking credentials.

Architecture: Traffic Flow Through Both Layers

The diagram below illustrates how inbound and outbound traffic traverses both NACL and Security Group layers for an EC2 instance inside a subnet.

graph TD
    IGW["🌐 Internet Gateway"]
    NACL_IN["NACL<br/>Inbound Rules<br/>(Stateless - Ordered)"]
    SG_IN["Security Group<br/>Inbound Rules<br/>(Stateful)"]
    EC2["🖥️ EC2 Instance<br/>(ENI)"]
    SG_OUT["Security Group<br/>Outbound<br/>(Auto-Allowed - Stateful)"]
    NACL_OUT["NACL<br/>Outbound Rules<br/>(Stateless - Must Explicitly Allow<br/>Ephemeral Ports)"]
    IGW -->|"Inbound Packet"| NACL_IN
    NACL_IN -->|"Rule Match: ALLOW"| SG_IN
    NACL_IN -->|"No Match: DENY"| DROP1["🚫 Dropped"]
    SG_IN -->|"Rule Match: ALLOW"| EC2
    SG_IN -->|"No Match: DENY"| DROP2["🚫 Dropped"]
    EC2 -->|"Response Packet"| SG_OUT
    SG_OUT -->|"Connection Tracked: Auto ALLOW"| NACL_OUT
    NACL_OUT -->|"Outbound Rule Match: ALLOW"| IGW
    NACL_OUT -->|"No Outbound Rule: DENY"| DROP3["🚫 Dropped"]
    style IGW fill:#FF9900,color:#fff
    style NACL_IN fill:#1A73E8,color:#fff
    style NACL_OUT fill:#1A73E8,color:#fff
    style SG_IN fill:#34A853,color:#fff
    style SG_OUT fill:#34A853,color:#fff
    style EC2 fill:#6c3483,color:#fff
    style DROP1 fill:#c0392b,color:#fff
    style DROP2 fill:#c0392b,color:#fff
    style DROP3 fill:#c0392b,color:#fff
  1. Internet Gateway: Entry point for traffic originating from the public internet.
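  2. NACL Inbound Check: The first filter. Rules are evaluated in ascending numeric order, and the first match decides. If no numbered rule matches, the final * rule denies the packet; in a custom NACL with no allow rules yet, that means all traffic is denied.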
  3. Security Group Inbound Check: Applied at the ENI of the EC2 instance. All allow rules are evaluated; if none match, traffic is dropped.
  4. EC2 Instance: Receives and processes the request, then generates a response.
  5. Security Group Outbound (Auto-allowed): Because SGs are stateful, the return traffic for an established inbound connection is automatically permitted — no explicit outbound rule needed for the response.
  6. NACL Outbound Check: Because NACLs are stateless, the return traffic is evaluated against outbound rules independently. You MUST have an outbound rule permitting the client's ephemeral port range, because that is the destination port of the response.
  7. Internet Gateway: Return traffic exits back to the client.

Deep Dive: Stateful vs. Stateless — The Critical Difference

Security Groups — Stateful

A Security Group tracks connection state using a flow table. When an inbound connection is permitted, the SG automatically allows the corresponding outbound return traffic, and vice versa. You never need to write a mirror rule for the response.

Example: You allow inbound TCP port 443. A client connects from an ephemeral source port (e.g., one in 49152–65535). The EC2 response, sent from port 443 back to that client port, is automatically allowed outbound; no explicit outbound rule is required for that return flow.
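The flow-table idea can be illustrated with a small Python toy. This is only a sketch of the concept, not how AWS implements connection tracking: the `ToySecurityGroup` class, the `Packet` tuple, and the IP addresses are all hypothetical, and the model deliberately has no outbound rules so that only tracked replies can leave.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class ToySecurityGroup:
    """Toy stateful filter: inbound allow rules plus a table of tracked flows."""

    def __init__(self, allowed_inbound_ports: set[int]) -> None:
        self.allowed_inbound_ports = allowed_inbound_ports
        self.flows: set[tuple[str, int, str, int]] = set()

    def inbound(self, pkt: Packet) -> bool:
        """Admit a packet if an inbound rule allows its destination port."""
        if pkt.dst_port in self.allowed_inbound_ports:
            # Remember the flow so the reply can leave without an outbound rule.
            self.flows.add((pkt.src_ip, pkt.src_port, pkt.dst_ip, pkt.dst_port))
            return True
        return False

    def outbound(self, pkt: Packet) -> bool:
        """Admit a reply only if it mirrors a tracked inbound flow."""
        return (pkt.dst_ip, pkt.dst_port, pkt.src_ip, pkt.src_port) in self.flows


sg = ToySecurityGroup(allowed_inbound_ports={443})
request = Packet("203.0.113.10", 54321, "10.0.1.5", 443)
reply = Packet("10.0.1.5", 443, "203.0.113.10", 54321)
unsolicited = Packet("10.0.1.5", 443, "203.0.113.99", 40000)

request_allowed = sg.inbound(request)
reply_allowed = sg.outbound(reply)
unsolicited_allowed = sg.outbound(unsolicited)
print(request_allowed, reply_allowed, unsolicited_allowed)
```

The reply is admitted purely because it mirrors a remembered flow, while the unsolicited packet to a different client is refused. A real Security Group would additionally consult its outbound rules for traffic the instance initiates.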

Network ACLs — Stateless

NACLs have no memory of connections. Every packet is evaluated independently against both inbound and outbound rule sets. This means you must explicitly configure rules in both directions for any communication to succeed.

The Ephemeral Port Trap: This is the most common NACL misconfiguration. When a client initiates a TCP connection to your server on port 443, the client's OS picks an ephemeral source port for it (typically somewhere in 1024–65535, OS-dependent), and the server's response is addressed back to that port. Your NACL outbound rules must permit this range as a destination, or responses will be silently dropped.

sequenceDiagram
    participant Client as "🌐 Client<br/>(Port 54321)"
    participant NACL as "NACL<br/>(Stateless)"
    participant SG as "Security Group<br/>(Stateful)"
    participant EC2 as "🖥️ EC2<br/>(Port 443)"
    Note over Client,EC2: Inbound Request Flow
    Client->>NACL: "TCP SYN: src=54321, dst=443"
    NACL->>NACL: "Evaluate Inbound Rules (ordered)"
    NACL->>SG: "Rule 100: ALLOW TCP dst=443 ✅"
    SG->>SG: "Evaluate all Inbound Rules"
    SG->>EC2: "ALLOW TCP dst=443 ✅"
    EC2->>EC2: "Process Request"
    Note over Client,EC2: Outbound Response Flow
    EC2->>SG: "TCP Response: src=443, dst=54321"
    SG->>NACL: "Stateful: Connection tracked → Auto ALLOW ✅"
    NACL->>NACL: "Stateless: Evaluate Outbound Rules independently"
    alt "Outbound Rule for ephemeral ports EXISTS"
        NACL->>Client: "Rule 200: ALLOW TCP dst=1024-65535 ✅"
    else "Outbound Rule MISSING"
        NACL->>NACL: "Default DENY * → Response DROPPED 🚫"
    end
  1. Client Request (Inbound): Client on port 54321 (ephemeral) → Server port 443. NACL inbound rule must allow TCP destination port 443.
  2. Server Response (Outbound): Server port 443 → Client port 54321 (ephemeral). NACL outbound rule must allow TCP destination port range 1024–65535 (or the specific ephemeral range). Without this, the response is dropped at the subnet boundary.
  3. Security Group: Handles the same flow statefully. Only the inbound allow rule for port 443 is needed; return traffic is tracked automatically.
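The request/response walk-through above can be reproduced with a few lines of Python. This is a simplified model, assuming rules match on destination port only; the `Rule` tuple and `evaluate` helper are hypothetical names for illustration, not an AWS API. The point it makes is that the outbound check knows nothing about the request that was just allowed.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    number: int
    port_from: int
    port_to: int
    action: str  # "allow" or "deny"


def evaluate(rules: list[Rule], dst_port: int) -> str:
    """Return the action of the lowest-numbered matching rule, else deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port_from <= dst_port <= rule.port_to:
            return rule.action
    return "deny"  # stands in for the final "*" rule


inbound_rules = [Rule(100, 443, 443, "allow")]
outbound_without_ephemeral: list[Rule] = []
outbound_with_ephemeral = [Rule(200, 1024, 65535, "allow")]

client_port, server_port = 54321, 443

# Request: destination is the server port, checked against inbound rules.
request_action = evaluate(inbound_rules, server_port)

# Response: destination is the client's ephemeral port, checked against
# outbound rules with no memory that the request was allowed.
response_without_rule = evaluate(outbound_without_ephemeral, client_port)
response_with_rule = evaluate(outbound_with_ephemeral, client_port)

print(request_action, response_without_rule, response_with_rule)
```

With the empty outbound rule set the response is denied even though the request got in, which is exactly the symptom of the ephemeral port trap.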

Rule Evaluation Logic

NACL: Ordered, First-Match Wins

NACL rules are numbered (e.g., 100, 200, 300) and evaluated from lowest to highest. The first matching rule is applied and evaluation stops. A common pattern is to use increments of 100 to leave room for future rules.

🔽 Example NACL Rule Table (Inbound)
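| Rule # | Type | Protocol | Port Range | Source | Action |
| --- | --- | --- | --- | --- | --- |
| 100 | HTTPS | TCP | 443 | 0.0.0.0/0 | ALLOW |
| 200 | Custom TCP | TCP | 1024–65535 | 0.0.0.0/0 | ALLOW |
| 300 | SSH | TCP | 22 | 10.0.0.0/8 | ALLOW |
| * | All traffic | All | All | 0.0.0.0/0 | DENY |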
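To make first-match evaluation concrete, here is a minimal Python sketch that walks the example inbound table in rule-number order. It assumes TCP only and matches on source CIDR and destination port; `NaclRule` and `first_match` are illustrative names, and rule 50 is a hypothetical addition that is not part of the table above.

```python
import ipaddress
from typing import NamedTuple, Optional


class NaclRule(NamedTuple):
    number: int
    cidr: str
    port_from: int
    port_to: int
    action: str  # "allow" or "deny"


def first_match(
    rules: list[NaclRule], src_ip: str, dst_port: int
) -> tuple[Optional[int], str]:
    """Walk rules in ascending number order; the first match decides."""
    addr = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = addr in ipaddress.ip_network(rule.cidr)
        in_ports = rule.port_from <= dst_port <= rule.port_to
        if in_cidr and in_ports:
            return rule.number, rule.action
    return None, "deny"  # the final "*" rule


# The three numbered rules from the example table.
rules = [
    NaclRule(100, "0.0.0.0/0", 443, 443, "allow"),
    NaclRule(200, "0.0.0.0/0", 1024, 65535, "allow"),
    NaclRule(300, "10.0.0.0/8", 22, 22, "allow"),
]

https_result = first_match(rules, "198.51.100.7", 443)
ssh_internal = first_match(rules, "10.1.2.3", 22)
ssh_external = first_match(rules, "198.51.100.7", 22)

# Hypothetical rule 50: a lower-numbered deny shadows the later allow.
with_block = [NaclRule(50, "198.51.100.0/24", 0, 65535, "deny")] + rules
blocked = first_match(with_block, "198.51.100.7", 443)

print(https_result, ssh_internal, ssh_external, blocked)
```

Because evaluation stops at the first match, the position of a deny rule matters: the same deny placed at number 150 would never be reached for port 443 traffic, since rule 100 would already have allowed it.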

Security Group: All Rules Evaluated, Most Permissive Wins

Security Groups do not use numbered rules. All rules are evaluated simultaneously, and if any rule permits the traffic, it is allowed. There is no explicit deny — only an implicit deny if no rule matches. This means you cannot use a Security Group to block a specific IP that would otherwise be allowed by another rule.

🔽 Example Security Group Inbound Rules (JSON)
{
  "GroupId": "sg-0abc123def456",
  "IpPermissions": [
    {
      "IpProtocol": "tcp",
      "FromPort": 443,
      "ToPort": 443,
      "IpRanges": [
        { "CidrIp": "0.0.0.0/0", "Description": "Allow HTTPS from internet" }
      ]
    },
    {
      "IpProtocol": "tcp",
      "FromPort": 22,
      "ToPort": 22,
      "IpRanges": [
        { "CidrIp": "10.0.0.0/8", "Description": "Allow SSH from internal network only" }
      ]
    }
  ]
}
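The "any allow rule wins" behavior can be sketched in Python against the two inbound rules from the JSON above. This is a simplified model: each permission is flattened to a single `CidrIp` field rather than the nested `IpRanges` list, and `sg_allows` is a hypothetical helper, not an AWS API.

```python
import ipaddress

# The two inbound rules from the JSON above, flattened for illustration.
ip_permissions = [
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "CidrIp": "0.0.0.0/0"},
    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "CidrIp": "10.0.0.0/8"},
]


def sg_allows(permissions: list[dict], src_ip: str, dst_port: int,
              protocol: str = "tcp") -> bool:
    """Allow if ANY rule matches; order is irrelevant and nothing can deny."""
    addr = ipaddress.ip_address(src_ip)
    return any(
        p["IpProtocol"] == protocol
        and p["FromPort"] <= dst_port <= p["ToPort"]
        and addr in ipaddress.ip_network(p["CidrIp"])
        for p in permissions
    )


https_anywhere = sg_allows(ip_permissions, "198.51.100.7", 443)
ssh_internal = sg_allows(ip_permissions, "10.1.2.3", 22)
ssh_external = sg_allows(ip_permissions, "198.51.100.7", 22)

# Reordering the rules changes nothing.
same_when_reversed = sg_allows(list(reversed(ip_permissions)), "198.51.100.7", 443)

# Adding a narrower allow cannot take access away from anyone else.
narrower = ip_permissions + [
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "CidrIp": "203.0.113.0/24"}
]
still_allowed = sg_allows(narrower, "198.51.100.7", 443)

print(https_anywhere, ssh_internal, ssh_external, same_when_reversed, still_allowed)
```

Since the model has no way to express a deny, the only way to keep `198.51.100.7` off port 443 here would be to narrow the `0.0.0.0/0` rule itself or to block the address at the NACL.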

When to Use Each — Practical Decision Framework

graph TD
    START(["Need to filter network traffic?"])
    Q1{"Block a specific IP/CIDR<br/>across entire subnet?"}
    Q2{"Apply rule to all resources<br/>in a subnet automatically?"}
    Q3{"Need explicit DENY rule?"}
    Q4{"Fine-grained per-resource<br/>access control?"}
    Q5{"Service-to-service auth<br/>within VPC?"}
    NACL_REC["✅ Use Network ACL<br/>(Subnet-level, Stateless)"]
    SG_REC["✅ Use Security Group<br/>(Resource-level, Stateful)"]
    BOTH_REC["✅ Use BOTH<br/>(Defense in Depth)"]
    START --> Q1
    Q1 -->|"Yes"| NACL_REC
    Q1 -->|"No"| Q2
    Q2 -->|"Yes"| NACL_REC
    Q2 -->|"No"| Q3
    Q3 -->|"Yes"| NACL_REC
    Q3 -->|"No"| Q4
    Q4 -->|"Yes"| Q5
    Q5 -->|"Yes - SG Referencing"| SG_REC
    Q5 -->|"No"| SG_REC
    Q4 -->|"Need both layers"| BOTH_REC
    style NACL_REC fill:#1A73E8,color:#fff
    style SG_REC fill:#34A853,color:#fff
    style BOTH_REC fill:#FF9900,color:#fff
  • Use NACLs for subnet-wide, coarse-grained controls: Blocking a known malicious IP range across an entire subnet, enforcing compliance boundaries between subnet tiers (public vs. private), or adding an explicit deny that Security Groups cannot provide.
  • Use Security Groups for fine-grained, resource-level controls: Allowing only your application server's SG to talk to your database SG (SG referencing), controlling per-instance access, and leveraging stateful tracking to simplify rule management.
  • Use both in combination (Defense in Depth): NACLs as the outer perimeter guard, Security Groups as the inner door lock. This is the AWS Well-Architected Framework recommendation for layered security.

SG Referencing — A Security Group Superpower

Security Groups support referencing another Security Group as a source/destination, instead of a CIDR block. This is a powerful pattern for service-to-service communication within a VPC — it automatically adapts as instances scale in/out without requiring IP-based rule updates.

🔽 AWS CLI: Allow App Tier SG to Access DB Tier SG on Port 5432
# Allow inbound PostgreSQL from the application security group
aws ec2 authorize-security-group-ingress \
  --group-id sg-0db123456789abcde \
  --protocol tcp \
  --port 5432 \
  --source-group sg-0app987654321fedcb \
  --region us-east-1
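Why a group reference adapts to scaling while a CIDR rule does not can be shown with a toy Python model. Everything here is hypothetical (the `sg-app` and `sg-db` names, the IP addresses, the `db_allows` helper); it only illustrates that the rule is resolved against current group membership at evaluation time.

```python
# Toy model: which security groups each instance's ENI currently belongs to.
memberships: dict[str, set[str]] = {
    "10.0.1.10": {"sg-app"},
    "10.0.2.20": {"sg-db"},
    "10.0.3.30": {"sg-batch"},
}

# Ingress rule on the database group: tcp/5432 from members of sg-app.
db_ingress = [{"port": 5432, "source_group": "sg-app"}]


def db_allows(src_ip: str, dst_port: int) -> bool:
    """Allow if the sender is currently a member of a referenced group."""
    groups = memberships.get(src_ip, set())
    return any(
        rule["port"] == dst_port and rule["source_group"] in groups
        for rule in db_ingress
    )


app_allowed = db_allows("10.0.1.10", 5432)
batch_allowed = db_allows("10.0.3.30", 5432)

# A new app instance scales out; the rule itself is never edited.
new_before_join = db_allows("10.0.1.11", 5432)
memberships["10.0.1.11"] = {"sg-app"}
new_after_join = db_allows("10.0.1.11", 5432)

print(app_allowed, batch_allowed, new_before_join, new_after_join)
```

The new instance gains database access the moment it is attached to the application group, with no IP-based rule update.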

Common Pitfalls & Gotchas

| Pitfall | Root Cause | Fix |
| --- | --- | --- |
| Response traffic dropped at subnet boundary | NACL outbound rule missing ephemeral port range | Add outbound ALLOW for TCP 1024–65535 |
| Cannot block a specific IP with SG | SGs have no explicit deny capability | Use NACL deny rule for that IP/CIDR |
| Custom NACL blocks all traffic unexpectedly | Custom NACLs start with a default deny-all; default NACL allows all | Explicitly add required allow rules after creating a custom NACL |
| SG rule allows traffic but instance unreachable | NACL deny rule (lower number) is blocking before SG is reached | Check NACL rule order; a lower-numbered deny overrides higher-numbered allows |
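The first pitfall in the table lends itself to an automated check. The sketch below scans NACL descriptions for a missing ephemeral-port egress rule. It assumes dictionaries shaped like the output of `aws ec2 describe-network-acls` (`NetworkAclId`, `Entries`, `Egress`, `Protocol`, `RuleAction`, `PortRange`); the ACL IDs and sample entries are made up, and the check is deliberately coarse: it ignores CIDRs and any lower-numbered deny that might shadow the allow.

```python
EPHEMERAL_FROM, EPHEMERAL_TO = 1024, 65535


def covers_ephemeral(entry: dict) -> bool:
    """True if this entry is an egress allow spanning TCP 1024-65535."""
    if not entry["Egress"] or entry["RuleAction"] != "allow":
        return False
    if entry["Protocol"] == "-1":  # "-1" means all protocols
        return True
    if entry["Protocol"] != "6":  # "6" is the protocol number for TCP
        return False
    ports = entry.get("PortRange")
    return (
        ports is not None
        and ports["From"] <= EPHEMERAL_FROM
        and ports["To"] >= EPHEMERAL_TO
    )


def acls_missing_ephemeral_egress(network_acls: list[dict]) -> list[str]:
    """Return IDs of ACLs with no egress allow covering the ephemeral range."""
    return [
        acl["NetworkAclId"]
        for acl in network_acls
        if not any(covers_ephemeral(e) for e in acl["Entries"])
    ]


# Made-up sample data for illustration.
sample_acls = [
    {
        "NetworkAclId": "acl-aaaa1111",
        "Entries": [
            {"RuleNumber": 100, "Egress": False, "Protocol": "6",
             "RuleAction": "allow", "PortRange": {"From": 443, "To": 443}},
            {"RuleNumber": 100, "Egress": True, "Protocol": "6",
             "RuleAction": "allow", "PortRange": {"From": 1024, "To": 65535}},
        ],
    },
    {
        "NetworkAclId": "acl-bbbb2222",
        "Entries": [
            {"RuleNumber": 100, "Egress": False, "Protocol": "6",
             "RuleAction": "allow", "PortRange": {"From": 443, "To": 443}},
            {"RuleNumber": 100, "Egress": True, "Protocol": "6",
             "RuleAction": "allow", "PortRange": {"From": 443, "To": 443}},
        ],
    },
]

missing = acls_missing_ephemeral_egress(sample_acls)
print(missing)
```

Here only the second ACL is flagged, because its sole egress allow covers port 443 rather than the ephemeral range. Treat any hit as a prompt for manual review; verify the field names against real output before relying on a check like this.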

Glossary

| Term | Definition |
| --- | --- |
| Stateful Firewall | Tracks the state of active connections; return traffic is automatically permitted without an explicit rule. |
| Stateless Firewall | Evaluates each packet independently with no connection memory; both directions require explicit rules. |
| ENI (Elastic Network Interface) | A virtual network card attached to an EC2 instance; Security Groups are applied at the ENI level. |
| Ephemeral Ports | Short-lived, dynamically assigned ports used by clients for the return leg of a TCP/UDP connection (typically 1024–65535, OS-dependent). |
| Defense in Depth | A layered security strategy where multiple independent controls (e.g., NACL + SG) protect the same resource, so a misconfiguration in one layer doesn't fully expose the system. |

Next Steps

  • 📖 AWS Docs: Security Groups for your VPC
  • 📖 AWS Docs: Network ACLs
  • 🏗️ Apply the Defense in Depth pattern: use NACLs to block known-bad CIDRs at the subnet boundary, and Security Groups to enforce least-privilege access at the resource level.
  • 🔍 Audit existing NACLs for missing ephemeral port outbound rules, a common cause of silent connectivity failures in VPC configurations.
