Security Group vs Network ACL: Stateful vs Stateless Traffic Filtering in AWS VPC

When traffic enters your VPC, two distinct filtering layers stand between the packet and your workload — Security Groups and Network ACLs. Engineers routinely misconfigure one while relying on the other, and the resulting behavior is often silent: traffic drops with no obvious error, or rules that look correct on paper fail in production because the stateless layer never saw the return path.

TL;DR: Security Group vs Network ACL

| Dimension | Security Group | Network ACL |
| --- | --- | --- |
| Attachment level | ENI (instance/resource) | Subnet |
| State tracking | Stateful | Stateless |
| Rule evaluation | All rules evaluated, most permissive wins | Rules evaluated in order; first match wins |
| Allow/Deny | Allow only | Allow and Deny |
| Return traffic | Automatically permitted | Requires an explicit rule for the return direction |
| Default behavior | Deny all inbound, allow all outbound | Allow all inbound and outbound (default NACL) |

How VPC Traffic Filtering Works

A packet entering a subnet crosses the Network ACL boundary first. If the NACL permits it, the packet reaches the target ENI, where the Security Group is evaluated. Outbound traffic takes the reverse path: the Security Group at the ENI first, then the NACL at the subnet boundary. These two layers are not redundant: they operate at different scopes and with fundamentally different state models. A Security Group tracks connection state, so once an inbound connection is allowed, the response traffic is automatically permitted regardless of outbound rules. A Network ACL has no memory of prior packets. Each packet, including TCP ACKs and response payloads, is evaluated independently against the rule list.

graph TD
  IGW["Internet Gateway / VPN"]
  RT["Route Table"]
  NACL["Network ACL (Subnet boundary, Stateless)"]
  SG["Security Group (ENI boundary, Stateful)"]
  EC2["EC2 / Resource"]
  IGW --> RT
  RT --> NACL
  NACL -->|"Inbound: rule order match; Outbound: rule order match"| SG
  SG -->|"Allow rules evaluated; Return traffic auto-permitted"| EC2
  style NACL fill:#f0a500,color:#000
  style SG fill:#1a73e8,color:#fff

  1. Internet Gateway / VPN — packet enters the VPC.
  2. Route Table — determines which subnet the packet targets.
  3. Network ACL (subnet boundary) — stateless evaluation; rules checked in ascending numeric order, first match wins. Both inbound and outbound directions require explicit rules.
  4. Security Group (ENI boundary) — stateful evaluation; all allow rules are checked, no explicit deny rules exist. Return traffic for tracked connections is automatically allowed.
  5. EC2 / Resource — packet delivered to the workload.
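
The path above can be sketched as a toy model in Python. This is an illustration under simplifying assumptions, not how AWS implements filtering: rules are reduced to port ranges, and the class and function names (`StatelessAcl`, `StatefulGroup`, `round_trip`) are hypothetical. It shows the one behavior that matters here: the stateful layer remembers the flow and the stateless layer does not.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    src: str
    sport: int
    dst: str
    dport: int


@dataclass
class StatelessAcl:
    """Hypothetical NACL model: one allowed port range per direction, no memory."""
    inbound_ports: range
    outbound_ports: range

    def permits(self, pkt: Packet, direction: str) -> bool:
        ports = self.inbound_ports if direction == "in" else self.outbound_ports
        return pkt.dport in ports


@dataclass
class StatefulGroup:
    """Hypothetical SG model: inbound allow range plus a connection-tracking set.
    It deliberately has no outbound rules at all."""
    inbound_ports: range
    tracked: set = field(default_factory=set)

    def permits_inbound(self, pkt: Packet) -> bool:
        if pkt.dport in self.inbound_ports:
            self.tracked.add((pkt.src, pkt.sport, pkt.dst, pkt.dport))
            return True
        return False

    def permits_outbound(self, pkt: Packet) -> bool:
        # A response is the reverse 4-tuple of a tracked inbound flow.
        return (pkt.dst, pkt.dport, pkt.src, pkt.sport) in self.tracked


def round_trip(acl: StatelessAcl, sg: StatefulGroup, request: Packet) -> str:
    # Inbound: subnet boundary first, then the ENI.
    if not acl.permits(request, "in"):
        return "request dropped by NACL"
    if not sg.permits_inbound(request):
        return "request dropped by SG"
    response = Packet(request.dst, request.dport, request.src, request.sport)
    # Outbound: the ENI first, then the subnet boundary.
    if not sg.permits_outbound(response):
        return "response dropped by SG"
    if not acl.permits(response, "out"):
        return "response dropped by NACL"
    return "delivered"


request = Packet("203.0.113.10", 50000, "10.0.1.5", 443)
with_return_rule = StatelessAcl(range(443, 444), range(1024, 65536))
without_return_rule = StatelessAcl(range(443, 444), range(0))

print(round_trip(with_return_rule, StatefulGroup(range(443, 444)), request))
print(round_trip(without_return_rule, StatefulGroup(range(443, 444)), request))
```

The Security Group in this sketch has no outbound rules, yet the response passes it because the flow is tracked. The same response fails at the NACL unless an outbound range covers the client's source port.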

Security Group Deep Dive: Stateful Filtering at the ENI

A Security Group is attached to an Elastic Network Interface, not to an instance directly. By default, one ENI can have up to five Security Groups associated simultaneously (an adjustable quota), and all rules across all attached groups are evaluated together, with no ordering between groups. The effective policy is the union of all allow rules. Because Security Groups are stateful, the VPC connection tracking layer records outbound-initiated flows and permits the inbound response automatically, and vice versa for inbound-initiated flows.

Think of a Security Group as a stateful firewall built into the NIC itself. The OS never sees a packet the SG drops, and the SG never forgets an established connection mid-flight.

Security Groups support only allow rules. There is no mechanism to explicitly deny traffic from a specific IP at the Security Group layer — that capability belongs to the NACL. This is a hard architectural boundary, not a configuration gap.
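
A minimal sketch of the union semantics, assuming rules reduced to a source CIDR and a port range. `AllowRule` and `sg_permits` are hypothetical names used only for illustration; the point is that the result depends on whether any rule in any attached group matches, never on group order, and that there is nothing to express a deny.

```python
from ipaddress import ip_address, ip_network
from typing import NamedTuple


class AllowRule(NamedTuple):
    cidr: str
    port_from: int
    port_to: int


def sg_permits(groups: dict, src_ip: str, dport: int) -> bool:
    """True if ANY allow rule in ANY attached group matches (union of rules)."""
    addr = ip_address(src_ip)
    return any(
        addr in ip_network(rule.cidr) and rule.port_from <= dport <= rule.port_to
        for rules in groups.values()
        for rule in rules
    )


# Two groups attached to the same (hypothetical) ENI.
attached = {
    "web-sg": [AllowRule("0.0.0.0/0", 443, 443)],
    "admin-sg": [AllowRule("10.0.0.0/16", 22, 22)],
}

print(sg_permits(attached, "198.51.100.7", 443))  # HTTPS from anywhere
print(sg_permits(attached, "198.51.100.7", 22))   # SSH from outside the VPC range
print(sg_permits(attached, "10.0.3.4", 22))       # SSH from inside the VPC range
```

Blocking a single address inside `0.0.0.0/0` cannot be written in this model, which mirrors the allow-only constraint described above.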

Minimal Security Group for an HTTPS web server

# Create the security group and capture its ID
# (the VPC ID is a placeholder; substitute your own)
SG_ID=$(aws ec2 create-security-group \
  --group-name web-sg \
  --description "HTTPS inbound for web tier" \
  --vpc-id vpc-0abcd1234ef567890 \
  --query 'GroupId' \
  --output text)

# Allow inbound HTTPS from any IPv4 address
aws ec2 authorize-security-group-ingress \
  --group-id "$SG_ID" \
  --protocol tcp \
  --port 443 \
  --cidr 0.0.0.0/0

No explicit outbound rule is needed for the HTTPS response — connection tracking handles it. The default outbound rule (allow all) is present unless you explicitly remove it.

Network ACL Deep Dive: Stateless Filtering at the Subnet

A Network ACL is associated with a subnet, and every resource in that subnet is subject to it — there is no per-instance opt-out. Rules are numbered and evaluated in ascending order; the first matching rule terminates evaluation. A trailing implicit deny (*) drops all traffic not matched by a numbered rule.
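
The ordered, first-match evaluation can be sketched as follows. This is a simplified model, assuming TCP-only entries described by a CIDR and a port range; `AclEntry` and `nacl_decision` are hypothetical names. For an inbound list the address compared is the packet's source, for an outbound list its destination.

```python
from ipaddress import ip_address, ip_network
from typing import NamedTuple, Optional, Tuple


class AclEntry(NamedTuple):
    number: int
    action: str  # "allow" or "deny"
    cidr: str
    port_from: int
    port_to: int


def nacl_decision(entries, peer_ip: str, port: int) -> Tuple[str, Optional[int]]:
    """Return (action, matching rule number). None means the implicit deny (*)."""
    addr = ip_address(peer_ip)
    for entry in sorted(entries, key=lambda e: e.number):
        in_cidr = addr in ip_network(entry.cidr)
        in_ports = entry.port_from <= port <= entry.port_to
        if in_cidr and in_ports:
            return entry.action, entry.number  # first match ends evaluation
    return "deny", None


inbound = [
    AclEntry(100, "allow", "0.0.0.0/0", 443, 443),
    AclEntry(90, "deny", "198.51.100.0/24", 0, 65535),
]

print(nacl_decision(inbound, "198.51.100.7", 443))  # deny 90 shadows allow 100
print(nacl_decision(inbound, "203.0.113.9", 443))   # falls through to allow 100
print(nacl_decision(inbound, "203.0.113.9", 22))    # nothing matches
```

Rule 90 wins for the blocked range even though rule 100 would allow the same packet, which is why rule numbering deserves as much review as the rule content.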

Because NACLs are stateless, you must account for both directions of every flow. For TCP traffic initiated from outside, you need an inbound allow rule for the destination port and an outbound allow rule covering the ephemeral port range the client uses for its source port. Forgetting the ephemeral port range on the outbound side is the single most common NACL misconfiguration in production.

The ephemeral port range varies by OS. Linux kernels typically use 32768–60999; Windows uses 49152–65535. AWS recommends allowing 1024–65535 on the outbound side of a NACL serving public clients to cover all common ranges.
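
As a quick sanity check on an outbound rule, the ranges quoted above can be compared against the rule's port range. A minimal sketch: the `EPHEMERAL_RANGES` table holds only the two typical ranges mentioned in this section, and `uncovered_clients` is a hypothetical helper, not an AWS API.

```python
# Typical ephemeral source-port ranges quoted in the text above.
EPHEMERAL_RANGES = {
    "linux": (32768, 60999),
    "windows": (49152, 65535),
}


def uncovered_clients(egress_from: int, egress_to: int, client_ranges=EPHEMERAL_RANGES):
    """Return the client types whose ephemeral range is not fully inside the rule."""
    return sorted(
        name
        for name, (low, high) in client_ranges.items()
        if not (egress_from <= low and high <= egress_to)
    )


print(uncovered_clients(1024, 65535))   # the broad range covers both
print(uncovered_clients(32768, 60999))  # a Linux-only range misses Windows clients
```

An empty result means the rule covers every listed client type. Any name in the result is a client population whose responses would be dropped.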

NACL rules for a public subnet serving HTTPS

# Allow inbound HTTPS (rule 100)
aws ec2 create-network-acl-entry \
  --network-acl-id acl-0123456789abcdef0 \
  --rule-number 100 \
  --protocol tcp \
  --rule-action allow \
  --ingress \
  --cidr-block 0.0.0.0/0 \
  --port-range From=443,To=443

# Allow outbound ephemeral ports for return traffic
# (egress rule 100; inbound and outbound rules are numbered independently)
aws ec2 create-network-acl-entry \
  --network-acl-id acl-0123456789abcdef0 \
  --rule-number 100 \
  --protocol tcp \
  --rule-action allow \
  --egress \
  --cidr-block 0.0.0.0/0 \
  --port-range From=1024,To=65535

sequenceDiagram
  participant Client
  participant NACL as Network ACL
  participant SG as Security Group
  participant RDS as Server
  Client->>NACL: SYN (dst:443)
  NACL->>SG: Rule 100 matched, ALLOW
  SG->>RDS: Port 443 rule matched, ALLOW + track
  RDS->>SG: SYN-ACK (src:443, dst:ephemeral)
  SG->>NACL: Tracked connection, auto ALLOW
  alt Ephemeral outbound rule EXISTS
    NACL->>Client: Rule matched, ALLOW
  else Ephemeral outbound rule MISSING
    NACL-->>RDS: Implicit DENY, packet dropped silently
  end

  1. Client SYN arrives at the NACL inbound — matched by rule 100 (port 443), allowed.
  2. Packet reaches the Security Group — inbound port 443 rule matches, allowed. Connection tracked.
  3. Server SYN-ACK leaves the ENI — Security Group allows it automatically (tracked connection).
  4. Packet hits the NACL outbound — must match an explicit rule. Rule 100 covers ephemeral ports, allowed.
  5. If the outbound ephemeral rule is missing, the SYN-ACK is silently dropped at the NACL. The Security Group never caused the problem — the NACL did.
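
The walkthrough above can be turned into an offline check against the `Entries` list that `aws ec2 describe-network-acls` returns. This is a sketch under stated assumptions: it handles IPv4 TCP only, treats protocol `"-1"` as "all protocols", and the function names `first_match` and `diagnose` are hypothetical. The sample entries are hand-written to resemble the API output, not captured from a real account.

```python
from ipaddress import ip_address, ip_network


def first_match(entries, egress: bool, peer_ip: str, port: int) -> str:
    """Action of the lowest-numbered entry matching a TCP packet in one direction."""
    addr = ip_address(peer_ip)
    in_direction = (e for e in entries if e["Egress"] == egress)
    for entry in sorted(in_direction, key=lambda e: e["RuleNumber"]):
        if entry["Protocol"] not in ("6", "-1"):  # 6 = TCP, -1 = all protocols
            continue
        if "CidrBlock" not in entry:  # skip IPv6-only entries in this sketch
            continue
        if addr not in ip_network(entry["CidrBlock"]):
            continue
        ports = entry.get("PortRange")  # absent on all-protocol entries
        if ports is None or ports["From"] <= port <= ports["To"]:
            return entry["RuleAction"]
    return "deny"


def diagnose(entries, client_ip: str, service_port: int, client_port: int) -> str:
    """Check the request direction, then the return direction, for one flow."""
    if first_match(entries, False, client_ip, service_port) != "allow":
        return "request blocked inbound"
    if first_match(entries, True, client_ip, client_port) != "allow":
        return "response blocked outbound"
    return "round trip permitted"


default_deny = [
    {"CidrBlock": "0.0.0.0/0", "Egress": False, "Protocol": "-1",
     "RuleAction": "deny", "RuleNumber": 32767},
    {"CidrBlock": "0.0.0.0/0", "Egress": True, "Protocol": "-1",
     "RuleAction": "deny", "RuleNumber": 32767},
]
https_in = {"CidrBlock": "0.0.0.0/0", "Egress": False, "Protocol": "6",
            "RuleAction": "allow", "RuleNumber": 100,
            "PortRange": {"From": 443, "To": 443}}
ephemeral_out = {"CidrBlock": "0.0.0.0/0", "Egress": True, "Protocol": "6",
                 "RuleAction": "allow", "RuleNumber": 100,
                 "PortRange": {"From": 1024, "To": 65535}}

print(diagnose(default_deny + [https_in], "203.0.113.10", 443, 50000))
print(diagnose(default_deny + [https_in, ephemeral_out], "203.0.113.10", 443, 50000))
```

Running the check for a representative client address and source port before attaching a NACL to a subnet catches the missing return rule without waiting for dropped traffic.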

The Misdiagnosis That Costs Hours: A Production Pattern

The alert fires: an internal service in a private subnet can no longer reach an RDS instance. The on-call engineer checks the RDS Security Group — port 5432 is open to the application subnet CIDR. The application Security Group allows all outbound. Everything looks correct. The engineer restarts the application, checks VPC Flow Logs, and sees ACCEPT on the inbound side of the RDS ENI but no response reaching the application.

The misdiagnosis: the Security Group must be wrong. The actual cause: two days earlier, a deny rule had been added to the outbound side of the database's private subnet NACL to block a specific IP range. The rule number was set to 90, lower than the existing allow rules, and the CIDR range was broader than intended, covering the application subnet. The NACL still accepted the inbound SYN (a separate inbound rule covered port 5432), but the response to the application's ephemeral port matched deny rule 90 before evaluation reached any outbound allow rule. The RDS response was silently dropped at the subnet boundary.

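VPC Flow Logs do not label which layer made a decision, but the pattern is diagnostic. The request appears as ACCEPT and the response for the same flow appears as REJECT. A Security Group would have permitted the response to a tracked connection automatically, so a rejected response points at the NACL. Filtering for action = REJECT on traffic destined for the application subnet surfaces the dropped response packets immediately.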

Query Flow Logs for NACL-level rejects (CloudWatch Logs Insights)

fields @timestamp, srcAddr, dstAddr, srcPort, dstPort, action, logStatus
| filter action = "REJECT"
| filter dstAddr like "10.0.2."
| sort @timestamp desc
| limit 50
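
The same pattern can be checked offline on exported records. The sketch below assumes the default (version 2) flow log format with its 14 space-separated fields; the sample lines are fabricated for illustration, and `suspect_nacl_drops` is a hypothetical helper. It flags any REJECT record whose reverse direction was ACCEPTed on the same interface.

```python
# Field order of the default (version 2) VPC flow log record format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def parse(line: str) -> dict:
    return dict(zip(FIELDS, line.split()))


def suspect_nacl_drops(lines):
    """Return REJECT records that are the reverse of an ACCEPTed flow."""
    records = [parse(line) for line in lines if line.strip()]
    accepted = {
        (r["interface_id"], r["srcaddr"], r["srcport"], r["dstaddr"], r["dstport"])
        for r in records
        if r["action"] == "ACCEPT"
    }
    suspects = []
    for r in records:
        reverse = (r["interface_id"], r["dstaddr"], r["dstport"], r["srcaddr"], r["srcport"])
        if r["action"] == "REJECT" and reverse in accepted:
            suspects.append(r)
    return suspects


# Illustrative records only: app 10.0.2.15 connects to a database at 10.0.3.20.
sample = [
    "2 123456789012 eni-0a1b2c3d4e5f67890 10.0.2.15 10.0.3.20 41234 5432 6 3 180 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d4e5f67890 10.0.3.20 10.0.2.15 5432 41234 6 3 180 1700000000 1700000060 REJECT OK",
]

for record in suspect_nacl_drops(sample):
    print(record["srcaddr"], record["srcport"], "->", record["dstaddr"], record["dstport"])
```

A non-empty result is a strong hint, not proof: it narrows the search to the NACL's rules for the return direction.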

Choosing the Right Layer: Decision Guide

graph TD
  Start(["Need to filter VPC traffic"])
  Q1{"Explicit DENY required?"}
  Q2{"Subnet-wide policy or per-resource?"}
  Q3{"Stateless return path acceptable?"}
  UseSG["Use Security Group (stateful, ENI-level)"]
  UseNACL["Use Network ACL (stateless, subnet-level)"]
  UseBoth["Use Both: NACL for subnet deny, SG for resource allow"]
  Start --> Q1
  Q1 -->|No| Q2
  Q1 -->|Yes| UseNACL
  Q2 -->|Per-resource| UseSG
  Q2 -->|Subnet-wide| Q3
  Q3 -->|Yes| UseNACL
  Q3 -->|No| UseBoth
  style UseSG fill:#1a73e8,color:#fff
  style UseNACL fill:#f0a500,color:#000
  style UseBoth fill:#34a853,color:#fff

The practical rule: use Security Groups as your primary access control mechanism for all resource-to-resource and client-to-resource flows. Add NACL rules only when you need subnet-wide explicit deny — blocking a known malicious CIDR, enforcing a hard network boundary between subnets, or satisfying a compliance requirement that mandates a stateless perimeter control.

IAM Permissions Required to Manage These Resources

Managing Security Groups and NACLs requires EC2 API permissions scoped to the VPC. Follow least privilege — separate the read path from the write path, and restrict modify actions to specific VPC resources where possible.

IAM policy for Security Group and NACL management

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DescribeNetworkResources",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeSecurityGroupRules",
        "ec2:DescribeNetworkAcls",
        "ec2:DescribeVpcs",
        "ec2:DescribeSubnets"
      ],
      "Resource": "*"
    },
    {
      "Sid": "ModifySecurityGroups",
      "Effect": "Allow",
      "Action": [
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:AuthorizeSecurityGroupEgress",
        "ec2:RevokeSecurityGroupIngress",
        "ec2:RevokeSecurityGroupEgress",
        "ec2:UpdateSecurityGroupRuleDescriptionsIngress",
        "ec2:UpdateSecurityGroupRuleDescriptionsEgress"
      ],
      "Resource": "arn:aws:ec2:us-east-1:123456789012:security-group/*"
    },
    {
      "Sid": "ModifyNetworkAcls",
      "Effect": "Allow",
      "Action": [
        "ec2:CreateNetworkAclEntry",
        "ec2:DeleteNetworkAclEntry",
        "ec2:ReplaceNetworkAclEntry",
        "ec2:ReplaceNetworkAclAssociation"
      ],
      "Resource": "arn:aws:ec2:us-east-1:123456789012:network-acl/*"
    }
  ]
}

Note: Describe actions for EC2 network resources require "Resource": "*" — resource-level restrictions are not supported for those actions. Verify current support in the AWS Service Authorization Reference.

Wrap-Up: Security Group vs Network ACL in Production

The stateful vs. stateless distinction is not academic — it directly determines whether you need to write return-path rules and whether an explicit deny is even possible. Security Groups give you stateful, ENI-scoped allow rules that cover the vast majority of access control needs. Network ACLs give you subnet-wide, stateless rules with explicit deny capability, at the cost of requiring bidirectional rule authoring for every flow.

When traffic mysteriously drops and the Security Group looks clean, check the NACL rules for the return direction and their ephemeral port coverage before anything else. That single check explains many silent-drop incidents in VPC environments.

For further reading, see the official AWS documentation: Security Groups for your VPC and Network ACLs.

Glossary

| Term | Definition |
| --- | --- |
| Stateful filtering | The firewall tracks connection state; return traffic for an established flow is automatically permitted without an explicit rule. |
| Stateless filtering | Each packet is evaluated independently with no memory of prior packets; both directions of a flow require explicit rules. |
| ENI (Elastic Network Interface) | A virtual network card attached to an EC2 instance or other resource; Security Groups attach at this level. |
| Ephemeral ports | Short-lived source ports (typically within 1024–65535) that a client picks for an outgoing TCP/UDP connection; return traffic is addressed to them. |
| VPC Flow Logs | A VPC feature that captures metadata about IP traffic flowing through network interfaces, including ACCEPT/REJECT decisions. |
