Breaking the Loop: How to Prevent Recursive Lambda Triggers on S3

TL;DR

Writing a processed file back to the same S3 bucket that triggers your Lambda creates an infinite recursive loop. Every write fires a new invocation, which writes again, which fires again — until your account hits concurrency limits or your bill explodes.

Strategy                               | Complexity | Cost Impact                              | Best For
Write to a separate output bucket      | Low        | Minimal (extra bucket, no extra compute) | Most use cases — cleanest solution
Prefix/suffix filtering on S3 trigger  | Low        | None                                     | When a single bucket is a hard requirement
Object metadata/tag check in code      | Medium     | Slight (extra S3 API calls)              | Defense-in-depth layer

Core Fix: Use a dedicated output bucket, or apply S3 event filter rules so the trigger only fires on input/ prefix objects, never on processed output.

Why This Happens: The Recursive Trigger Loop

S3 event notifications are bucket-scoped and prefix/suffix filtered. When you configure a trigger with no filter (or a filter that matches both input and output paths), every s3:ObjectCreated:* event — regardless of who wrote the object — fires the notification.

The data flow looks like this:

User uploads file
  -> S3 bucket receives object (e.g., report.csv)
  -> S3 fires ObjectCreated event
  -> Lambda invoked (Invocation #1)
  -> Lambda processes file
  -> Lambda writes processed-report.csv to SAME bucket
  -> S3 fires ObjectCreated event again
  -> Lambda invoked (Invocation #2)
  -> Lambda writes again...
  -> [LOOP CONTINUES]

Lambda does not natively detect that it caused the triggering event. From S3's perspective, a new object appeared — it has no concept of "who wrote it." This is the root cause.
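To make the runaway behavior concrete, here is a toy Python simulation (not AWS code — the bucket, events, and handler are all modeled in-process) of the event-to-write cycle. Each ObjectCreated event invokes the handler, which writes a new object, which fires another event; the loop only terminates here because we cap it artificially.

```python
from collections import deque

def simulate(initial_key: str, max_invocations: int = 10) -> int:
    """Toy model of an unfiltered S3 trigger: every write fires a new event."""
    events = deque([initial_key])        # pending ObjectCreated events
    invocations = 0
    while events and invocations < max_invocations:
        key = events.popleft()
        invocations += 1                 # Lambda invoked for this event
        output_key = f"processed-{key}"  # handler writes back to the SAME bucket
        events.append(output_key)        # ...which fires yet another event
    return invocations

print(simulate("report.csv"))  # 10 — always exhausts the cap; unbounded without it
```

The event queue never drains: every invocation removes one event and adds one, so only the artificial cap (in real life, your concurrency limit or your budget) stops it.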

Analogy: Think of it like a fax machine that automatically prints every incoming fax and then faxes the printout back to the sender. The sender's machine receives it, prints it, and faxes it back again. Neither machine knows to stop — they're just doing their job correctly.

Solution 1: Separate Input and Output Buckets (Recommended)

This is the architecturally cleanest solution. The Lambda trigger is bound to the input bucket only. The output bucket has no Lambda trigger attached.

input-bucket  (trigger: s3:ObjectCreated:*)
  -> Lambda reads from input-bucket
  -> Lambda writes to output-bucket  (NO trigger)
  -> Loop is structurally impossible

AWS CLI: Create Output Bucket and Update Lambda Permissions

# Create the output bucket
aws s3api create-bucket \
  --bucket my-app-output-bucket \
  --region us-east-1

# Block public access on output bucket
aws s3api put-public-access-block \
  --bucket my-app-output-bucket \
  --public-access-block-configuration \
    "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"

IAM: Least Privilege Policy for Lambda Execution Role

Grant read access on the input bucket and write-only access on the output bucket. Avoid granting broad s3:* permissions on either bucket.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadFromInputBucket",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::my-app-input-bucket/*"
    },
    {
      "Sid": "WriteToOutputBucket",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-app-output-bucket/*"
    }
  ]
}

Apply this policy via CLI:

aws iam put-role-policy \
  --role-name my-lambda-execution-role \
  --policy-name S3LeastPrivilegePolicy \
  --policy-document file://lambda-s3-policy.json

Solution 2: S3 Event Filter by Prefix (Single-Bucket Constraint)

If you must use one bucket, enforce a strict folder structure and filter the S3 trigger to only fire on the input/ prefix. Write all processed output under output/. The trigger will never match output/ objects.

Bucket structure:
  my-bucket/
    input/     <- S3 trigger watches ONLY this prefix
    output/    <- Lambda writes here, trigger ignores it
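The filter semantics are plain string matching on the object key. The check S3 effectively performs can be sketched like this (illustrative only — S3 evaluates this internally; you never write this in production):

```python
def trigger_fires(key: str, prefix: str = "input/") -> bool:
    """Mirror of S3's prefix filter rule: the notification is delivered
    only when the object key starts with the configured prefix."""
    return key.startswith(prefix)

print(trigger_fires("input/report.csv"))   # True  -> Lambda invoked
print(trigger_fires("output/report.csv"))  # False -> no invocation, no loop
```

Because the match is on the full key prefix, anything the Lambda writes under output/ can never satisfy the input/ rule.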

CloudFormation: Lambda with Prefix-Filtered S3 Trigger

Resources:
  MyLambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: file-processor
      Runtime: python3.12
      Handler: index.handler
      Role: !GetAtt LambdaExecutionRole.Arn
      Code:
        ZipFile: |
          def handler(event, context):
              pass

  LambdaInvokePermission:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !GetAtt MyLambdaFunction.Arn
      Action: lambda:InvokeFunction
      Principal: s3.amazonaws.com
      SourceArn: !Sub arn:aws:s3:::my-single-bucket

  MyS3Bucket:
    Type: AWS::S3::Bucket
    # The invoke permission must exist before S3 validates the notification,
    # or stack creation fails with "Unable to validate destination configuration"
    DependsOn: LambdaInvokePermission
    Properties:
      BucketName: my-single-bucket
      NotificationConfiguration:
        LambdaConfigurations:
          - Event: s3:ObjectCreated:*
            Filter:
              S3Key:
                Rules:
                  - Name: prefix
                    Value: input/
            Function: !GetAtt MyLambdaFunction.Arn

Solution 3: Metadata Guard in Lambda Code (Defense-in-Depth)

Never rely on this as your only safeguard, but it is a valid secondary layer. Before processing, check whether the object carries a user-defined metadata entry (stored by S3 as an x-amz-meta-* header) marking it as already processed. If it does, exit immediately.

import urllib.parse

import boto3

s3 = boto3.client('s3')

def handler(event, context):
    record = event['Records'][0]
    bucket = record['s3']['bucket']['name']
    # S3 event keys are URL-encoded (e.g., spaces arrive as '+')
    key = urllib.parse.unquote_plus(record['s3']['object']['key'])

    # Guard: check for processed metadata entry
    response = s3.head_object(Bucket=bucket, Key=key)
    metadata = response.get('Metadata', {})

    if metadata.get('processed') == 'true':
        print(f"Skipping already-processed object: {key}")
        return {'statusCode': 200, 'body': 'Already processed'}

    # --- Your processing logic here ---

    # Write output with processed metadata entry
    s3.put_object(
        Bucket=bucket,
        Key=f"output/{key}",
        Body=b"processed content",
        Metadata={'processed': 'true'}
    )
    return {'statusCode': 200, 'body': f'Processed {key}'}

Warning: This approach still incurs Lambda invocation costs for every recursive trigger before the guard exits. It reduces damage but does not eliminate the loop structurally.

Cost Impact Analysis

  • Unmitigated loop: A single upload can generate thousands of Lambda invocations within seconds. At scale, this hits the default regional concurrency limit (1,000 concurrent executions) and generates significant Lambda + S3 PUT request charges. AWS Support has documented cases of four-figure bills from a single runaway loop.
  • Separate bucket solution: Adds one S3 bucket (negligible cost: ~$0.023/GB storage + request pricing). Zero additional Lambda invocations.
  • Prefix filter solution: Zero additional cost. Pure configuration change.
  • Metadata guard: Adds one s3:HeadObject API call per invocation ($0.0004 per 1,000 requests). Minimal, but the loop still runs until the guard fires.
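As a rough back-of-envelope, the per-request charges of a runaway loop grow linearly with invocation count. The prices below are assumed us-east-1 list prices at the time of writing (check the Lambda and S3 pricing pages; they change), and the memory and duration figures are illustrative:

```python
LAMBDA_PER_MILLION_REQ = 0.20   # USD per 1M Lambda requests (assumed list price)
S3_PUT_PER_THOUSAND = 0.005     # USD per 1,000 S3 Standard PUTs (assumed)
GB_SECOND = 0.0000166667        # USD per GB-second of Lambda duration (assumed)

def loop_cost(invocations: int, mem_gb: float = 0.128, secs: float = 1.0) -> float:
    """Request + PUT + duration charges for a runaway loop of N invocations."""
    requests = invocations * LAMBDA_PER_MILLION_REQ / 1_000_000
    puts = invocations * S3_PUT_PER_THOUSAND / 1_000
    duration = invocations * mem_gb * secs * GB_SECOND
    return requests + puts + duration

# 10 million invocations at 128 MB / 1 s each:
print(f"${loop_cost(10_000_000):,.2f}")  # ≈ $73.33 at these assumed prices
```

Even this modest figure assumes a small function running for one second; heavier functions, longer durations, and larger payloads scale the bill proportionally, which is how unchecked loops reach four figures.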

Recommendation: Set a Lambda concurrency limit as a circuit breaker while you implement the structural fix.

# Set reserved concurrency to cap runaway invocations during debugging
aws lambda put-function-concurrency \
  --function-name file-processor \
  --reserved-concurrent-executions 5

Glossary

S3 Event Notification: A configuration that instructs S3 to send a message to a target (Lambda, SQS, SNS) when a specified event (e.g., ObjectCreated) occurs on a bucket.
Recursive Trigger: A loop where a function's output re-triggers the same function, causing unbounded invocations without an explicit termination condition.
S3 Key Filter Rule: A prefix or suffix condition applied to an S3 event notification to restrict which object keys will fire the notification.
Reserved Concurrency: A Lambda setting that caps the maximum number of simultaneous executions for a specific function, acting as a circuit breaker against runaway loops.
Least Privilege: An IAM security principle where a role or user is granted only the minimum permissions required to perform its specific task, nothing more.
