S3 Glacier Storage Classes: Choosing the Right Tier for Long-Term Archival

If you're storing compliance records, audit logs, or historical datasets that you access once a year or less, paying for S3 Standard is like renting a downtown office for a filing cabinet you open annually. AWS S3 Glacier storage classes are purpose-built for exactly this scenario — delivering durable, long-term archival at a fraction of the cost, with retrieval options tuned to your urgency requirements.

TL;DR — Glacier Storage Class Comparison

| Storage Class | Min. Storage Duration | Retrieval Time | Best For |
|---|---|---|---|
| S3 Glacier Instant Retrieval | 90 days | Milliseconds | Quarterly access, instant response needed |
| S3 Glacier Flexible Retrieval | 90 days | 1–5 min (Expedited), 3–5 hrs (Standard), 5–12 hrs (Bulk) | Annual audits, flexible retrieval window acceptable |
| S3 Glacier Deep Archive | 180 days | Up to 12 hrs (Standard), up to 48 hrs (Bulk) | 7–10 year compliance retention, rarely accessed |

Analogy: Think of Glacier Instant Retrieval as a climate-controlled off-site vault with a courier on standby. Glacier Flexible Retrieval is the same vault but you call ahead and wait for the truck. Glacier Deep Archive is a deep underground archive — the cheapest to store in, but you plan your retrieval days in advance.

Architecture & Data Flow

Understanding how data moves into and out of Glacier is critical for designing a cost-effective archival pipeline. The diagram below shows a lifecycle-driven archival flow from ingestion to retrieval.

graph LR
  A["Application / Service"] -->|"Upload"| B["S3 Standard"]
  B -->|"Lifecycle Rule (Day 30)"| C["S3 Glacier Flexible Retrieval"]
  C -->|"Lifecycle Rule (Day 365)"| D["S3 Glacier Deep Archive"]
  E["Auditor / Admin"] -->|"RestoreObject API (Tier: Standard)"| C
  C -->|"3-5 hrs later: temporary copy accessible (billed at S3 Standard rates)"| F["Restored Object (Accessible State)"]
  E -->|"Download within expiry window"| F
  style A fill:#4A90D9,color:#fff
  style B fill:#F5A623,color:#fff
  style C fill:#7B68EE,color:#fff
  style D fill:#2C3E50,color:#fff
  style E fill:#27AE60,color:#fff
  style F fill:#1ABC9C,color:#fff

  1. Ingestion: Objects are uploaded to S3 Standard initially, where they are immediately accessible.
  2. Lifecycle Transition: An S3 Lifecycle Policy automatically transitions objects to the target Glacier class after a defined number of days (e.g., Glacier Flexible Retrieval at day 30, then Deep Archive at day 365).
  3. Archival State: Objects in Glacier Flexible Retrieval or Deep Archive cannot be read directly; a restore operation must be initiated first. (Glacier Instant Retrieval objects stay readable without a restore.)
  4. Restore Request: You issue a RestoreObject API call, specifying the retrieval tier (Expedited, Standard, or Bulk; Expedited is not available for Deep Archive) and the number of days the restored copy should remain accessible.
  5. Accessible State: Once the restore completes, a temporary copy becomes accessible and is billed at S3 Standard rates for the number of days specified. The original archived object remains in Glacier and continues to be billed at the archive rate.
  6. Audit / Access: Your application or auditor downloads the restored object within the availability window.
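
The restore request in step 4 can be sketched as a small helper that builds the request body and rejects tier combinations the storage class does not offer. This is a minimal illustration rather than an official client: the function name `build_restore_request` and the `ALLOWED_TIERS` table are assumptions made for this example, and the returned dictionary simply mirrors the JSON shape accepted by `aws s3api restore-object --restore-request`.

```python
import json

# Retrieval tiers per archive class. Deep Archive has no Expedited tier.
# The keys are the S3 API storage class names (GLACIER = Flexible Retrieval).
ALLOWED_TIERS = {
    "GLACIER": {"Expedited", "Standard", "Bulk"},
    "DEEP_ARCHIVE": {"Standard", "Bulk"},
}


def build_restore_request(storage_class: str, tier: str, days: int) -> dict:
    """Return a RestoreObject request body (hypothetical helper).

    Raises ValueError when the tier is not offered for the storage class
    or when the availability window is not a positive number of days.
    """
    if storage_class not in ALLOWED_TIERS:
        raise ValueError(f"unsupported storage class for this sketch: {storage_class}")
    if tier not in ALLOWED_TIERS[storage_class]:
        raise ValueError(f"tier {tier!r} is not available for {storage_class}")
    if days < 1:
        raise ValueError("days must be at least 1")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


if __name__ == "__main__":
    # Print a body that could be passed to --restore-request.
    print(json.dumps(build_restore_request("GLACIER", "Standard", 7), indent=2))
```

Validating the tier before calling the API keeps an unsupported Expedited request against Deep Archive from reaching S3 at all.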

Choosing the Right Glacier Class for Annual Audits

For data accessed once a year, the decision narrows to two candidates:

Option A: S3 Glacier Flexible Retrieval

  • Retrieval in 3–5 hours (Standard tier) is perfectly acceptable for a planned annual audit.
  • 90-day minimum storage duration — ensure objects are not deleted before this threshold to avoid early deletion charges.
  • Lower storage cost than Glacier Instant Retrieval; higher than Deep Archive.
  • Verdict: Best fit if your audit window is flexible and you can plan 3–5 hours ahead.

Option B: S3 Glacier Deep Archive

  • Lowest storage cost of all S3 classes — designed for data retained for 7–10 years.
  • 180-day minimum storage duration.
  • Standard retrieval up to 12 hours; Bulk retrieval up to 48 hours.
  • Verdict: Best fit if your retention period is multi-year and you can initiate retrieval 12–48 hours before the audit begins.
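
To make the planning window concrete, the sketch below works backwards from an audit start time to the latest moment a restore should be initiated. It uses the worst-case retrieval times from the comparison table; the function name `latest_restore_start` and the `safety_margin_hours` buffer are assumptions for illustration, not AWS guidance.

```python
from datetime import datetime, timedelta

# Worst-case restore latency in hours, taken from the comparison table.
MAX_RESTORE_HOURS = {
    ("GLACIER", "Expedited"): 5 / 60,  # 1-5 minutes
    ("GLACIER", "Standard"): 5,        # 3-5 hours
    ("GLACIER", "Bulk"): 12,           # 5-12 hours
    ("DEEP_ARCHIVE", "Standard"): 12,  # up to 12 hours
    ("DEEP_ARCHIVE", "Bulk"): 48,      # up to 48 hours
}


def latest_restore_start(audit_start: datetime, storage_class: str, tier: str,
                         safety_margin_hours: float = 2.0) -> datetime:
    """Return the latest time to call RestoreObject so data is ready in time.

    safety_margin_hours is a hypothetical buffer for downloads and retries.
    """
    hours = MAX_RESTORE_HOURS[(storage_class, tier)] + safety_margin_hours
    return audit_start - timedelta(hours=hours)


if __name__ == "__main__":
    audit = datetime(2024, 3, 4, 9, 0)
    for key in MAX_RESTORE_HOURS:
        print(key, "->", latest_restore_start(audit, *key))
```

Remember that the `Days` value in the restore request must also cover the full length of the audit, since the temporary copy disappears when that window expires.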

Lifecycle Policy Implementation

The most operationally sound approach is to automate transitions using S3 Lifecycle Policies — no manual intervention, no forgotten objects sitting in expensive tiers.

S3 Lifecycle Policy JSON (Flexible Retrieval + Deep Archive):
{
  "Rules": [
    {
      "ID": "ArchiveAuditLogs",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "audit-logs/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "GLACIER"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "NoncurrentVersionTransitions": [
        {
          "NoncurrentDays": 30,
          "StorageClass": "GLACIER"
        }
      ]
    }
  ]
}

Key parameters:

  • Prefix: Scope the rule to a specific folder/prefix (e.g., audit-logs/) to avoid transitioning unintended objects.
  • GLACIER: Maps to S3 Glacier Flexible Retrieval in the API/CLI.
  • DEEP_ARCHIVE: Maps to S3 Glacier Deep Archive.
  • Transition from Standard → Glacier Flexible at 30 days, then to Deep Archive at 365 days for a tiered cost-reduction strategy.
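
If you manage several archival prefixes, generating the rule from code can be less error-prone than hand-editing JSON. The sketch below builds one rule with the same shape as the policy above. The function name `build_archive_rule` and its parameters are assumptions for this example, and the 90-day gap check is a conservative choice made here to avoid early deletion charges, not a restriction imposed by S3.

```python
import json


def build_archive_rule(rule_id, prefix=None, tags=None,
                       glacier_days=30, deep_archive_days=365):
    """Build one S3 Lifecycle rule as a dict (hypothetical helper).

    The rule is scoped by prefix, by tags, or by both.
    """
    tags = tags or {}
    if not prefix and not tags:
        raise ValueError("scope the rule with a prefix, tags, or both")
    # Guard chosen for this sketch: keep objects in Flexible Retrieval for
    # at least its 90-day minimum before moving them to Deep Archive.
    if deep_archive_days - glacier_days < 90:
        raise ValueError("transition gap under 90 days would incur early deletion charges")

    tag_list = [{"Key": k, "Value": v} for k, v in sorted(tags.items())]
    if prefix and tag_list:
        rule_filter = {"And": {"Prefix": prefix, "Tags": tag_list}}
    elif prefix:
        rule_filter = {"Prefix": prefix}
    elif len(tag_list) == 1:
        rule_filter = {"Tag": tag_list[0]}
    else:
        rule_filter = {"And": {"Tags": tag_list}}

    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": rule_filter,
        "Transitions": [
            {"Days": glacier_days, "StorageClass": "GLACIER"},
            {"Days": deep_archive_days, "StorageClass": "DEEP_ARCHIVE"},
        ],
    }


if __name__ == "__main__":
    config = {"Rules": [build_archive_rule("ArchiveAuditLogs", prefix="audit-logs/")]}
    print(json.dumps(config, indent=2))
```

The printed document can be saved to a file and applied with `aws s3api put-bucket-lifecycle-configuration`.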

Initiating a Restore — CLI Example

RestoreObject CLI Command:
# Restore an object from Glacier Flexible Retrieval
# Standard retrieval: 3-5 hours, available for 7 days
aws s3api restore-object \
  --bucket my-audit-archive-bucket \
  --key audit-logs/2023/annual-report.zip \
  --restore-request '{
    "Days": 7,
    "GlacierJobParameters": {
      "Tier": "Standard"
    }
  }'

# Check restore status
aws s3api head-object \
  --bucket my-audit-archive-bucket \
  --key audit-logs/2023/annual-report.zip \
  --query 'Restore'

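While the restore is in progress, the Restore field in the head-object response reads ongoing-request="true". Once the restore is complete and the object is accessible, it changes to ongoing-request="false" and includes an expiry-date marking when the temporary copy will be removed.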
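
If a script needs to act on that status, the Restore value can be parsed into structured fields. The sketch below assumes the value has the form `ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"`; the function name `parse_restore_header` and the shape of the returned dictionary are illustrative choices for this example.

```python
import re
from email.utils import parsedate_to_datetime


def parse_restore_header(value):
    """Parse the Restore value returned by head-object (hypothetical helper).

    Returns a dict with:
      requested - True if a restore has been requested for the object
      ready     - True if the temporary copy is available for download
      expires   - datetime when the copy expires, or None if not known yet
    """
    # Collect key="value" pairs such as ongoing-request="false".
    fields = dict(re.findall(r'([\w-]+)="([^"]*)"', value or ""))
    if "ongoing-request" not in fields:
        return {"requested": False, "ready": False, "expires": None}

    ongoing = fields["ongoing-request"] == "true"
    expiry = fields.get("expiry-date")
    return {
        "requested": True,
        "ready": not ongoing,
        "expires": parsedate_to_datetime(expiry) if expiry else None,
    }


if __name__ == "__main__":
    sample = 'ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"'
    print(parse_restore_header(sample))
```

A polling job could call head-object on a schedule and start the download as soon as `ready` becomes true.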

Cost Optimization Tips

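  • Aggregate small objects before archiving: Glacier Flexible Retrieval and Deep Archive add roughly 40 KB of billable metadata per object (32 KB at the archive rate plus 8 KB at the S3 Standard rate), and lifecycle transition requests are charged per object. Bundling thousands of small files into a single archive (e.g., a .tar.gz) significantly reduces this per-object overhead.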
  • Respect minimum storage durations: Deleting or transitioning objects before the minimum duration (90 days for Flexible Retrieval, 180 days for Deep Archive) incurs a prorated early deletion charge for the remaining days.
  • Use Bulk retrieval for non-urgent restores: Bulk is the lowest-cost retrieval tier for Glacier Flexible Retrieval. It takes 5–12 hours, so it suits restores that can be scheduled ahead of time.
  • Use S3 Storage Lens or S3 Analytics: Identify objects in Standard that haven't been accessed in 90+ days — these are prime candidates for lifecycle transitions.
  • Tag-based lifecycle rules: Apply lifecycle policies using object tags (e.g., retention=7yr) for fine-grained control without relying solely on prefixes.
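
The prorated early deletion charge is easy to estimate before you delete or transition anything. The sketch below is a rough calculation under stated assumptions: the price passed in is a placeholder you supply (check current AWS pricing for your region), a month is approximated as 30 days, and the function name `early_deletion_charge` is made up for this example.

```python
# Minimum storage durations in days, keyed by S3 API storage class name.
MIN_STORAGE_DAYS = {"GLACIER_IR": 90, "GLACIER": 90, "DEEP_ARCHIVE": 180}


def early_deletion_charge(storage_class, days_stored, size_gb, price_per_gb_month):
    """Estimate the prorated charge for removing an object early.

    price_per_gb_month is a caller-supplied placeholder, not a real AWS price.
    Returns 0.0 once the minimum storage duration has been met.
    """
    remaining_days = max(0, MIN_STORAGE_DAYS[storage_class] - days_stored)
    daily_rate = price_per_gb_month / 30  # simplification: 30-day month
    return remaining_days * size_gb * daily_rate


if __name__ == "__main__":
    # 100 GB deleted after 30 days in Flexible Retrieval, hypothetical price.
    print(early_deletion_charge("GLACIER", 30, 100, price_per_gb_month=0.003))
```

Running this for the objects a cleanup job is about to delete shows whether waiting out the remaining days would cost less than deleting now.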

IAM Permissions — Least Privilege

Grant only the permissions required for archival operations. Avoid broad s3:* policies. Note that IAM has no separate s3:HeadObject action; the head-object status check is authorized by s3:GetObject.

Minimal IAM Policy for Glacier Restore:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowGlacierRestore",
      "Effect": "Allow",
      "Action": [
        "s3:RestoreObject",
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::my-audit-archive-bucket/audit-logs/*"
    },
    {
      "Sid": "AllowLifecycleManagement",
      "Effect": "Allow",
      "Action": [
        "s3:PutLifecycleConfiguration",
        "s3:GetLifecycleConfiguration"
      ],
      "Resource": "arn:aws:s3:::my-audit-archive-bucket"
    }
  ]
}
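
A quick way to keep policies like this one from drifting toward s3:* is to scan them for wildcard actions before deployment. The sketch below is a simple check, not a replacement for IAM Access Analyzer or a full policy linter; the function name `find_broad_actions` is an assumption for this example, and it only inspects the Action field of Allow statements.

```python
def find_broad_actions(policy):
    """Return (Sid, action) pairs for Allow statements that use wildcards.

    Hypothetical helper: it flags any action containing "*", such as
    "s3:*" or "s3:Get*", and ignores Deny statements.
    """
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement is also valid JSON
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            if "*" in action:
                findings.append((statement.get("Sid", "<no Sid>"), action))
    return findings


if __name__ == "__main__":
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "AllowGlacierRestore", "Effect": "Allow",
             "Action": ["s3:RestoreObject", "s3:GetObject"]},
            {"Sid": "TooBroad", "Effect": "Allow", "Action": "s3:*"},
        ],
    }
    print(find_broad_actions(policy))
```

An empty result means no Allow statement in the document uses a wildcard action.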

Decision Flow — Which Glacier Class?

graph TD
  Start(["How often do you access this data?"]) --> Q1{"More than once per quarter?"}
  Q1 -->|"Yes"| GIR["S3 Glacier Instant Retrieval (millisecond access)"]
  Q1 -->|"No - Annual or less"| Q2{"Can you wait 3-5 hours?"}
  Q2 -->|"Yes"| Q3{"Retention period 7+ years?"}
  Q2 -->|"No - Need minutes"| GIR
  Q3 -->|"No"| GFR["S3 Glacier Flexible Retrieval ✅ Best for Annual Audits"]
  Q3 -->|"Yes - Can wait 12-48 hrs"| GDA["S3 Glacier Deep Archive ✅ Lowest TCO"]
  style GIR fill:#F5A623,color:#fff
  style GFR fill:#27AE60,color:#fff
  style GDA fill:#2C3E50,color:#fff
  style Start fill:#4A90D9,color:#fff

  1. Start by evaluating your access frequency: more than once per quarter points to Glacier Instant Retrieval.
  2. If access is annual or less, assess your retrieval urgency tolerance.
  3. If you can wait 3–5 hours, Glacier Flexible Retrieval is the pragmatic choice for annual audits.
  4. If your retention spans 7+ years and you can plan 12–48 hours ahead, Deep Archive delivers the lowest total cost of ownership.
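
The same decision flow can be written as a small function, which is handy when classifying many datasets at once. This is a sketch of the flowchart's logic only: the function name `choose_glacier_class`, its parameters, and the exact thresholds (more than 4 accesses per year, 5-hour and 12-hour waits, 7-year retention) are this example's reading of the chart, not AWS rules.

```python
def choose_glacier_class(accesses_per_year, max_wait_hours, retention_years):
    """Return the S3 API storage class name suggested by the decision flow.

    accesses_per_year - expected number of reads per year
    max_wait_hours    - longest restore delay you can tolerate
    retention_years   - how long the data must be kept
    """
    # More than once per quarter: keep millisecond access.
    if accesses_per_year > 4:
        return "GLACIER_IR"
    # Cannot wait out a 3-5 hour Standard restore: keep millisecond access.
    if max_wait_hours < 5:
        return "GLACIER_IR"
    # Long retention and able to wait for a Deep Archive Standard restore.
    if retention_years >= 7 and max_wait_hours >= 12:
        return "DEEP_ARCHIVE"
    # Annual access with a flexible window.
    return "GLACIER"


if __name__ == "__main__":
    print(choose_glacier_class(accesses_per_year=1, max_wait_hours=6, retention_years=3))
    print(choose_glacier_class(accesses_per_year=1, max_wait_hours=48, retention_years=10))
```

Treat the result as a starting point; actual pricing and retrieval volume should confirm the choice.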

Glossary

| Term | Definition |
|---|---|
| S3 Lifecycle Policy | A set of rules that automate transitioning objects between S3 storage classes or expiring them after a defined period. |
| RestoreObject | The S3 API operation that initiates a temporary retrieval of an archived Glacier object into an accessible state. |
| Minimum Storage Duration | The minimum number of days an object must remain in a storage class before deletion or transition to avoid early deletion charges. |
| Retrieval Tier | The speed/cost option selected when restoring a Glacier object (Expedited, Standard, or Bulk), each with different latency and pricing. |
| S3 Glacier Deep Archive | The lowest-cost S3 storage class, designed for data that is rarely accessed and retained for 7–10 years, with retrieval times up to 48 hours. |
