Eliminating Lambda Cold Starts: A Deep Dive into Provisioned Concurrency & SnapStart
Your Lambda function responds in milliseconds on every subsequent call — but that first invocation after a period of inactivity takes 2–3 seconds, silently breaking SLAs and degrading user experience. This is the cold start problem, and understanding its root cause is the prerequisite to fixing it correctly.
TL;DR
| Concept | What It Is | Best For | Key Trade-off |
|---|---|---|---|
| Cold Start | Latency from bootstrapping a new execution environment | Understanding the problem | Unavoidable without mitigation |
| Provisioned Concurrency | Pre-initializes N execution environments, keeping them warm | All runtimes; latency-sensitive APIs | Billed even when idle |
| SnapStart | Snapshots the initialized environment; restores from snapshot | Java (Corretto 11, 17, 21) runtimes only | Restore latency; uniqueness considerations |
What Exactly Is a Cold Start?
AWS Lambda runs your code inside a lightweight microVM managed by Firecracker, AWS's open-source virtualization technology. When no warm execution environment exists for your function, Lambda must perform a full bootstrap sequence before your handler even begins executing. This sequence has two distinct phases:
- Platform Init: AWS provisions the MicroVM, downloads your deployment package or container image, and starts the runtime process (JVM, Node.js, Python interpreter, etc.).
- Function Init: Your initialization code outside the handler runs — importing libraries, establishing DB connections, loading ML models, etc.
Only after both phases complete does your handler receive the event. The combined duration is the cold start latency you observe. End to end, an invocation proceeds like this:
- Invoke Request: An event triggers the Lambda function (API Gateway, EventBridge, etc.).
- Environment Check: Lambda's control plane checks for an available warm execution environment.
- Cold Path: If no warm environment exists, Platform Init + Function Init must complete before the handler runs — this is the cold start penalty.
- Warm Path: A reused environment skips both init phases and goes directly to handler execution.
- Response: Handler result is returned to the caller.
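The two init phases map directly onto how handler code is structured: module-scope code runs once per execution environment during Function Init, while the handler body runs on every invocation. A minimal Python sketch (the setup function and counter are illustrative stand-ins, not AWS APIs):

```python
# --- Function Init: runs ONCE per execution environment (cold start) ---
# Expensive setup lives at module scope so warm invocations skip it.
INIT_COUNT = 0

def _expensive_setup():
    """Stand-in for DB pools, SDK clients, ML model loads, etc."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"config": "loaded"}

CONFIG = _expensive_setup()

# --- Handler: runs on EVERY invocation ---
def handler(event, context):
    # Warm invocations reuse CONFIG; only a cold start re-runs setup.
    return {"config": CONFIG["config"], "cold_inits": INIT_COUNT}
```

Calling the handler repeatedly in the same environment never re-runs the setup; only a new environment (a cold start) pays that cost again.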
Why Java Is the Worst Offender
Cold start duration is heavily influenced by runtime startup time and initialization code complexity. The JVM's class loading and JIT compilation make Java functions notorious for cold starts of 2–5 seconds, while interpreted runtimes like Python and Node.js typically land in the low hundreds of milliseconds. This is why AWS built SnapStart specifically for Java runtimes.
| Runtime | Typical Cold Start Range | Primary Driver |
|---|---|---|
| Java (Corretto) | 1,000ms – 5,000ms+ | JVM startup + class loading |
| Python | 100ms – 700ms | Interpreter + package imports |
| Node.js | 100ms – 500ms | V8 engine + module loading |
| Go (provided.al2) | 50ms – 200ms | Binary startup (compiled) |
Note: These are representative ranges. Actual values depend on deployment package size, VPC configuration, and initialization code complexity. Always measure your specific function.
Solution 1: Provisioned Concurrency
Provisioned Concurrency instructs Lambda to pre-initialize and keep a specified number of execution environments in a ready state. These environments have already completed both Platform Init and Function Init. When an invocation arrives, it is dispatched to a pre-warmed environment with zero init overhead.
- Configuration: You set Provisioned Concurrency on a specific function version or alias (not `$LATEST`).
- Pre-warming: Lambda proactively initializes the specified number of environments, running your init code.
- Invocation Routing: Incoming requests are routed to pre-warmed environments first.
- Overflow: If requests exceed provisioned capacity, Lambda spins up additional on-demand environments — these will cold start.
- Billing: You are billed for provisioned concurrency hours regardless of invocation volume.
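The routing and overflow behavior above can be captured in a toy model (purely illustrative; this is not AWS's actual scheduler):

```python
def route_requests(concurrent_requests: int, provisioned: int) -> dict:
    """Toy model of Lambda routing: provisioned environments absorb
    requests first; any overflow spins up on-demand environments,
    each paying the cold start penalty."""
    warm_hits = min(concurrent_requests, provisioned)
    cold_starts = max(0, concurrent_requests - provisioned)
    return {"warm": warm_hits, "cold": cold_starts}

# 10 provisioned environments, burst of 14 concurrent requests:
# 10 hit warm environments, 4 overflow and cold start.
```

The takeaway: Provisioned Concurrency eliminates cold starts only up to the capacity you configure; size it against your peak concurrency, not average traffic.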
Configuring Provisioned Concurrency (AWS CLI)
CLI: Publish version & set Provisioned Concurrency

```bash
# Step 1: Publish an immutable version (required — cannot use $LATEST)
aws lambda publish-version \
  --function-name my-api-function \
  --description "v1 - production release"

# Step 2: Set Provisioned Concurrency on the published version
# Replace '1' with your published version number from Step 1 output
aws lambda put-provisioned-concurrency-config \
  --function-name my-api-function \
  --qualifier 1 \
  --provisioned-concurrent-executions 10

# Step 3: Poll until status is READY (not IN_PROGRESS)
aws lambda get-provisioned-concurrency-config \
  --function-name my-api-function \
  --qualifier 1
```
Auto-Scaling Provisioned Concurrency
Keeping a fixed number of warm environments wastes money during off-peak hours. Use Application Auto Scaling to scale provisioned concurrency based on a schedule or utilization metric.
CLI: Register scalable target & attach scheduled scaling

```bash
# Register the Lambda function version as a scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace lambda \
  --resource-id function:my-api-function:1 \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --min-capacity 2 \
  --max-capacity 50

# Scale UP at 08:00 UTC (business hours start)
aws application-autoscaling put-scheduled-action \
  --service-namespace lambda \
  --resource-id function:my-api-function:1 \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --scheduled-action-name scale-up-morning \
  --schedule "cron(0 8 * * ? *)" \
  --scalable-target-action MinCapacity=10,MaxCapacity=50

# Scale DOWN at 20:00 UTC (off-peak)
aws application-autoscaling put-scheduled-action \
  --service-namespace lambda \
  --resource-id function:my-api-function:1 \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --scheduled-action-name scale-down-evening \
  --schedule "cron(0 20 * * ? *)" \
  --scalable-target-action MinCapacity=2,MaxCapacity=10
```
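Scheduled actions handle predictable daily cycles; Application Auto Scaling can also target-track provisioned concurrency utilization (in-use divided by provisioned, commonly targeted around 0.7). The sizing math it converges toward can be sketched roughly as follows (simplified: the real service also applies cooldowns and scale-in dampening):

```python
import math

def desired_provisioned(in_use: int, target_utilization: float,
                        min_cap: int, max_cap: int) -> int:
    """Approximate the capacity a target-tracking policy converges to:
    just enough environments that in_use / capacity sits near the
    target, clamped to the registered min/max capacity."""
    raw = math.ceil(in_use / target_utilization) if in_use > 0 else min_cap
    return max(min_cap, min(max_cap, raw))

# 10 environments busy with a 0.5 utilization target -> 20 provisioned
```

A target below 1.0 deliberately over-provisions, leaving headroom so traffic growth hits warm environments instead of cold starts.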
Solution 2: Lambda SnapStart (Java Only)
SnapStart takes a fundamentally different approach. Instead of keeping environments perpetually warm, it snapshots the initialized state of the execution environment after Function Init completes, then restores from that snapshot on subsequent cold starts. The expensive JVM startup and class-loading happens once at deployment time, not at invocation time.
- Publish Version: When you publish a new function version with SnapStart enabled, Lambda runs Function Init once.
- Snapshot: Lambda takes a memory and disk snapshot of the fully initialized Firecracker MicroVM.
- Cache: The snapshot is encrypted and cached in a tiered storage layer managed by AWS.
- Restore: On a cold start, Lambda restores from the snapshot instead of re-running init — dramatically reducing latency.
- Hook Execution: `beforeCheckpoint` and `afterRestore` lifecycle hooks allow you to handle state that must be refreshed (e.g., re-establishing DB connections, re-seeding random number generators).
Enabling SnapStart (AWS CLI)
CLI: Enable SnapStart on a Java function

```bash
# Update function configuration to enable SnapStart
# Supported runtimes: java11, java17, java21
aws lambda update-function-configuration \
  --function-name my-java-api-function \
  --snap-start ApplyOn=PublishedVersions

# Publish a new version — snapshot is taken at this point
aws lambda publish-version \
  --function-name my-java-api-function \
  --description "SnapStart enabled - v2"
```
Implementing Lifecycle Hooks in Java
SnapStart snapshots state — which means any state that must be unique per environment (random seeds, timestamps, open network connections) must be handled in the afterRestore hook.
Java: SnapStart lifecycle hook implementation

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import org.crac.Core;
import org.crac.Resource;

// MyEvent, MyResponse, and DatabaseConnection are application-defined
// placeholder types.
public class MyHandler implements RequestHandler<MyEvent, MyResponse>, Resource {

    private DatabaseConnection dbConnection;

    public MyHandler() {
        // This runs during Function Init (before the snapshot)
        // Safe: load configs, initialize static data, warm up classes
        Core.getGlobalContext().register(this);
        System.out.println("Init: Loading static configuration...");
    }

    @Override
    public void beforeCheckpoint(org.crac.Context<? extends Resource> context) {
        // Called BEFORE the snapshot is taken
        // Close any connections that should NOT be snapshotted
        if (dbConnection != null) {
            dbConnection.close();
            dbConnection = null;
        }
        System.out.println("beforeCheckpoint: Closed DB connection before snapshot.");
    }

    @Override
    public void afterRestore(org.crac.Context<? extends Resource> context) {
        // Called AFTER restore from snapshot, BEFORE handler invocation
        // Re-establish connections, re-seed randomness, refresh tokens
        this.dbConnection = DatabaseConnection.create();
        System.out.println("afterRestore: Re-established DB connection after restore.");
    }

    @Override
    public MyResponse handleRequest(MyEvent event, Context context) {
        // Handler runs with a fully restored, connection-ready environment
        return dbConnection.query(event.getId());
    }
}
```
Choosing the Right Solution
Analogy: Think of Provisioned Concurrency as keeping N taxis idling at the airport rank 24/7 — always ready, but burning fuel constantly. SnapStart is like a taxi that can be flash-frozen mid-shift and instantly thawed when a passenger appears — you pay for the freeze once, not for continuous idling.
| Criteria | Provisioned Concurrency | SnapStart |
|---|---|---|
| Runtime Support | All Lambda runtimes | Java (Corretto 11, 17, 21) only |
| Cold Start Elimination | Complete (for provisioned capacity) | Significant reduction (not always zero) |
| Cost Model | Billed per provisioned concurrency-hour | No additional charge beyond standard Lambda pricing |
| Scales to Zero | No (provisioned environments always running) | Yes |
| State Complexity | None (standard init) | Requires lifecycle hook management |
| Deployment Trigger | Manual or auto-scaling configuration | Automatic on publish-version |
Additional Optimizations (Runtime-Agnostic)
Provisioned Concurrency and SnapStart address the platform layer, but your Function Init code is equally important:
- Minimize deployment package size: Smaller packages download faster. Use Lambda Layers for shared dependencies. Avoid bundling unused libraries.
- Lazy initialization: Defer expensive object creation to the first handler invocation if the resource is not always needed.
- Avoid VPC unless necessary: Lambda functions inside a VPC historically had higher cold start latency due to ENI attachment. AWS has significantly improved this with Hyperplane ENIs, but VPC still adds overhead. Only attach to a VPC if your function genuinely requires private resource access.
- Use ARM64 (Graviton2): Graviton2-based Lambda functions can offer better price-performance and in some cases lower cold start times compared to x86_64 for the same workload.
- Increase memory allocation: Lambda allocates CPU proportionally to memory. More memory means faster initialization code execution, which reduces Function Init duration.
Measuring Cold Starts with CloudWatch
Before optimizing, measure. Lambda reports initialization duration in CloudWatch Logs Insights. Use the following query to identify cold start frequency and duration:
CloudWatch Logs Insights: Cold start analysis query

```
# Run this in CloudWatch Logs Insights against your Lambda log group,
# e.g., /aws/lambda/my-api-function
filter @type = "REPORT"
| parse @message "Init Duration: * ms" as initDuration
| filter ispresent(initDuration)
| stats
    count() as coldStartCount,
    avg(initDuration) as avgInitMs,
    max(initDuration) as maxInitMs,
    pct(initDuration, 95) as p95InitMs,
    pct(initDuration, 99) as p99InitMs
  by bin(1h)
```
The Init Duration field only appears in REPORT log lines for cold start invocations. A high coldStartCount relative to total invocations indicates your function is not retaining warm environments — a signal to consider Provisioned Concurrency or traffic pattern analysis.
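The query keys off the `Init Duration` field of Lambda's REPORT log line; the same field can be extracted programmatically when post-processing exported logs. A small sketch (the sample log line is illustrative but follows the standard REPORT format):

```python
import re

SAMPLE_REPORT = (
    "REPORT RequestId: 3f8a1c2d-example Duration: 102.25 ms "
    "Billed Duration: 103 ms Memory Size: 512 MB "
    "Max Memory Used: 85 MB Init Duration: 1432.11 ms"
)

def init_duration_ms(report_line: str):
    """Return the Init Duration in ms, or None for warm invocations,
    whose REPORT lines omit the field entirely."""
    match = re.search(r"Init Duration: ([\d.]+) ms", report_line)
    return float(match.group(1)) if match else None
```

Counting the lines where this returns a value versus `None` gives the same cold start rate the Logs Insights query reports.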
Glossary
| Term | Definition |
|---|---|
| Execution Environment | The isolated Firecracker MicroVM that hosts a single concurrent Lambda invocation. Reused across warm invocations. |
| Provisioned Concurrency | A Lambda feature that pre-initializes a set number of execution environments, eliminating cold starts for that capacity. |
| SnapStart | A Lambda feature for Java runtimes that snapshots the post-init execution environment and restores from it on cold starts. |
| Init Duration | The time Lambda spent on Function Init (your initialization code) during a cold start, reported in CloudWatch REPORT logs. |
| CRaC (Coordinated Restore at Checkpoint) | The OpenJDK project API used by SnapStart lifecycle hooks (beforeCheckpoint, afterRestore) to manage stateful resources across snapshots. |
Next Steps
- 📖 Official Docs: Lambda Provisioned Concurrency | Lambda SnapStart
- 🔬 Measure first: Run the CloudWatch Logs Insights query above to quantify your cold start rate before choosing a solution.
- 💰 Cost model: Use the AWS Lambda Pricing page to model Provisioned Concurrency costs against your traffic patterns before committing.
- 🏗️ IaC: Manage Provisioned Concurrency and SnapStart via AWS SAM (the `ProvisionedConcurrencyConfig` and `SnapStart` properties) or Terraform (the `aws_lambda_provisioned_concurrency_config` resource) for repeatable deployments.