Breaking the Loop: How to Prevent Recursive Lambda Triggers on S3
TL;DR
Writing a processed file back to the same S3 bucket that triggers your Lambda creates an infinite recursive loop. Every write fires a new invocation, which writes again, which fires again — until your account hits concurrency limits or your bill explodes.
| Strategy | Complexity | Cost Impact | Best For |
|---|---|---|---|
| Write to a separate output bucket | Low | Minimal (extra bucket, no extra compute) | Most use cases — cleanest solution |
| Prefix/suffix filtering on S3 trigger | Low | None | When a single bucket is a hard requirement |
| Object metadata/tag check in code | Medium | Slight (extra S3 API calls) | Defense-in-depth layer |
Core Fix: Use a dedicated output bucket, or apply S3 event filter rules so the trigger only fires on input/ prefix objects, never on processed output.
Why This Happens: The Recursive Trigger Loop
S3 event notifications are bucket-scoped and prefix/suffix filtered. When you configure a trigger with no filter (or a filter that matches both input and output paths), every s3:ObjectCreated:* event — regardless of who wrote the object — fires the notification.
The data flow looks like this:
```text
User uploads file
  -> S3 bucket receives object (e.g., report.csv)
  -> S3 fires ObjectCreated event
  -> Lambda invoked (Invocation #1)
  -> Lambda processes file
  -> Lambda writes processed-report.csv to the SAME bucket
  -> S3 fires ObjectCreated event again
  -> Lambda invoked (Invocation #2)
  -> Lambda writes again...
  [LOOP CONTINUES]
```
Lambda does not inherently know that it caused the triggering event. From S3's perspective, a new object simply appeared — it has no concept of "who wrote it." This is the root cause. (AWS has since added recursive loop detection as a safety net for some event source chains, but it is a backstop, not a design pattern — don't rely on it.)
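The mechanics are easy to reproduce without AWS at all. This toy simulation (no real S3 calls, all names illustrative) shows that the loop has no natural termination condition — only an external cap stops it:

```python
# Minimal simulation of the recursive trigger loop: every object write
# fires the handler, and the handler writes a new object, which fires again.
def run_loop(max_invocations=10):
    invocations = 0
    pending = ["report.csv"]  # the initial user upload
    while pending and invocations < max_invocations:
        key = pending.pop()
        invocations += 1                    # Lambda fires
        pending.append(f"processed-{key}")  # handler writes back -> new event
    return invocations

# The loop never terminates on its own; only the cap ends it.
print(run_loop(10))  # -> 10 (the cap, not a natural stopping point)
```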
Analogy: Think of it like a fax machine that automatically prints every incoming fax and then faxes the printout back to the sender. The sender's machine receives it, prints it, and faxes it back again. Neither machine knows to stop — they're just doing their job correctly.
Solution 1: Separate Input and Output Buckets (Recommended)
This is the architecturally cleanest solution. The Lambda trigger is bound to the input bucket only. The output bucket has no Lambda trigger attached.
```text
input-bucket (trigger: s3:ObjectCreated:*)
  -> Lambda reads from input-bucket
  -> Lambda writes to output-bucket (NO trigger)
  -> Loop is structurally impossible
```
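The routing rule can be sketched as a pure function (bucket names are placeholders matching the examples in this section, not real resources):

```python
# Two-bucket routing rule: the trigger only ever delivers events from the
# input bucket, and writes always target the trigger-free output bucket.
INPUT_BUCKET = "my-app-input-bucket"
OUTPUT_BUCKET = "my-app-output-bucket"

def route(record):
    """Map one S3 event record to a read/write plan."""
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]
    if bucket != INPUT_BUCKET:
        raise ValueError(f"unexpected source bucket: {bucket}")
    # Writing to a different bucket makes the loop structurally impossible
    return {"read": (INPUT_BUCKET, key), "write": (OUTPUT_BUCKET, key)}
```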
AWS CLI: Create Output Bucket and Update Lambda Permissions
```bash
# Create the output bucket
aws s3api create-bucket \
  --bucket my-app-output-bucket \
  --region us-east-1

# Block public access on the output bucket
aws s3api put-public-access-block \
  --bucket my-app-output-bucket \
  --public-access-block-configuration \
    "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"
```
IAM: Least Privilege Policy for Lambda Execution Role
Grant read access on the input bucket and write-only access on the output bucket. Never grant s3:* on both buckets.
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadFromInputBucket",
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::my-app-input-bucket/*"
    },
    {
      "Sid": "WriteToOutputBucket",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::my-app-output-bucket/*"
    }
  ]
}
```
Apply this policy via CLI:
```bash
aws iam put-role-policy \
  --role-name my-lambda-execution-role \
  --policy-name S3LeastPrivilegePolicy \
  --policy-document file://lambda-s3-policy.json
```
Solution 2: S3 Event Filter by Prefix (Single-Bucket Constraint)
If you must use one bucket, enforce a strict folder structure and filter the S3 trigger to only fire on the input/ prefix. Write all processed output under output/. The trigger will never match output/ objects.
Bucket structure:
```text
my-bucket/
  input/    <- S3 trigger watches ONLY this prefix
  output/   <- Lambda writes here; the trigger ignores it
```
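S3 key filter rules are plain prefix/suffix string matches — no regex, no globbing. A pure-Python equivalent of the `input/`-only rule (a hypothetical helper, not an AWS API) makes the matching behavior easy to reason about:

```python
def trigger_matches(key, prefix="input/", suffix=""):
    """Mimic an S3 event notification key filter: prefix AND suffix must match."""
    return key.startswith(prefix) and key.endswith(suffix)

print(trigger_matches("input/report.csv"))             # -> True: Lambda fires
print(trigger_matches("output/processed-report.csv"))  # -> False: no event
```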
CloudFormation: Lambda with Prefix-Filtered S3 Trigger
```yaml
Resources:
  MyLambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: file-processor
      Runtime: python3.12
      Handler: index.handler
      Role: !GetAtt LambdaExecutionRole.Arn
      Code:
        ZipFile: |
          def handler(event, context):
              pass

  LambdaInvokePermission:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !GetAtt MyLambdaFunction.Arn
      Action: lambda:InvokeFunction
      Principal: s3.amazonaws.com
      SourceArn: arn:aws:s3:::my-single-bucket

  MyS3Bucket:
    Type: AWS::S3::Bucket
    # The invoke permission must exist before S3 validates the notification target
    DependsOn: LambdaInvokePermission
    Properties:
      BucketName: my-single-bucket
      NotificationConfiguration:
        LambdaConfigurations:
          - Event: s3:ObjectCreated:*
            Filter:
              S3Key:
                Rules:
                  - Name: prefix
                    Value: input/
            Function: !GetAtt MyLambdaFunction.Arn
```
Solution 3: Metadata Guard in Lambda Code (Defense-in-Depth)
Never rely on this as your only safeguard, but it is a valid secondary layer. Before processing, check if the object already has a custom metadata tag indicating it was processed. If yes, exit immediately.
```python
import urllib.parse

import boto3

s3 = boto3.client('s3')

def handler(event, context):
    record = event['Records'][0]
    bucket = record['s3']['bucket']['name']
    # Object keys arrive URL-encoded in S3 events (e.g. spaces become '+')
    key = urllib.parse.unquote_plus(record['s3']['object']['key'])

    # Guard: check for the 'processed' metadata tag before doing any work
    response = s3.head_object(Bucket=bucket, Key=key)
    metadata = response.get('Metadata', {})
    if metadata.get('processed') == 'true':
        print(f"Skipping already-processed object: {key}")
        return {'statusCode': 200, 'body': 'Already processed'}

    # --- Your processing logic here ---

    # Write output with the 'processed' metadata tag set
    s3.put_object(
        Bucket=bucket,
        Key=f"output/{key}",
        Body=b"processed content",
        Metadata={'processed': 'true'}
    )
    return {'statusCode': 200, 'body': f'Processed {key}'}
```
Warning: This approach still incurs Lambda invocation costs for every recursive trigger before the guard exits. It reduces damage but does not eliminate the loop structurally.
Cost Impact Analysis
- Unmitigated loop: A single upload can generate thousands of Lambda invocations within seconds. At scale, this saturates the default regional concurrency limit (1,000 concurrent executions) and racks up Lambda request, Lambda duration, and S3 PUT charges. Runaway S3-to-Lambda loops are a well-documented source of four-figure surprise bills.
- Separate bucket solution: Adds one S3 bucket (negligible cost: ~$0.023/GB storage + request pricing). Zero additional Lambda invocations.
- Prefix filter solution: Zero additional cost. Pure configuration change.
- Metadata guard: Adds one `s3:HeadObject` API call per invocation (about $0.0004 per 1,000 requests). Minimal, but the loop still runs until the guard fires.
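To put rough numbers on the points above, here is a back-of-the-envelope estimator covering request pricing only (Lambda duration charges, which usually dominate, are excluded; the default prices are us-east-1 ballpark figures, so treat them as illustrative):

```python
def loop_cost(invocations, lambda_req_per_million=0.20, s3_put_per_thousand=0.005):
    """Request-pricing-only cost of N loop iterations (duration charges excluded)."""
    lambda_cost = invocations * lambda_req_per_million / 1_000_000
    put_cost = invocations * s3_put_per_thousand / 1_000
    return lambda_cost + put_cost

# One million recursive iterations from a single upload:
print(round(loop_cost(1_000_000), 2))  # -> 5.2 (before duration charges)
```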
Recommendation: Set a Lambda concurrency limit as a circuit breaker while you implement the structural fix.
```bash
# Set reserved concurrency to cap runaway invocations during debugging
aws lambda put-function-concurrency \
  --function-name file-processor \
  --reserved-concurrent-executions 5
```
Glossary
| Term | Definition |
|---|---|
| S3 Event Notification | A configuration that instructs S3 to send a message to a target (Lambda, SQS, SNS) when a specified event (e.g., ObjectCreated) occurs on a bucket. |
| Recursive Trigger | A loop where a function's output re-triggers the same function, causing unbounded invocations without an explicit termination condition. |
| S3 Key Filter Rule | A prefix or suffix condition applied to an S3 event notification to restrict which object keys will fire the notification. |
| Reserved Concurrency | A Lambda setting that caps the maximum number of simultaneous executions for a specific function, acting as a circuit breaker against runaway loops. |
| Least Privilege | An IAM security principle where a role or user is granted only the minimum permissions required to perform its specific task, nothing more. |