May 9, 2026

S3 Lifecycle Rules: Automate Storage Cost Optimization

Manually moving objects between storage classes is like sorting mail by hand every day — it works at first, but it doesn’t scale.

In practice, you rarely pick just one storage class and leave it forever. Data tends to be hot when new and cold over time — logs are queried heavily in the first week, user uploads are viewed frequently in the first month, then gradually forgotten.

S3 Lifecycle Rules let you define automatic transitions between storage classes based on object age. You configure rules at the bucket level, and S3 handles the rest — no cron jobs, no scripts, no manual intervention.

If you’re not familiar with the different S3 storage classes and their trade-offs, check out S3 Storage Classes: Choosing the Right One for Your Data first.


How Lifecycle Rules Work

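A lifecycle rule consists of a filter (a key prefix, object tags, or both) that selects which objects it applies to, one or more transitions that move matching objects to another storage class once they reach a given age in days, and an optional expiration that deletes them.

Key constraint: transitions only move objects toward colder, cheaper storage classes. A rule never moves an object back to a warmer class when it is accessed again.

If you need objects to move in both directions automatically (down when idle, back up when accessed), use Intelligent-Tiering instead of lifecycle rules. Lifecycle rules are best when your access pattern is predictable and decays over time.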
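To make the age-based behavior concrete, here is a minimal sketch that models which storage class an object should be in at a given age. The types and the `storageClassAtAge` helper are hypothetical and exist only for illustration. The sketch assumes objects are uploaded as Standard, and it ignores the fact that S3 applies lifecycle actions asynchronously, so real transitions can lag behind the configured day.

```typescript
// Hypothetical types for illustration; they are not part of any AWS SDK.
interface LifecycleTransitionSketch {
  days: number
  storageClass: string
}

interface LifecycleRuleSketch {
  transitions: LifecycleTransitionSketch[]
  expirationDays?: number
}

// Returns the class an object is expected to be in at the given age,
// or 'EXPIRED' once the expiration age has been reached.
function storageClassAtAge(rule: LifecycleRuleSketch, ageDays: number): string {
  if (rule.expirationDays !== undefined && ageDays >= rule.expirationDays) {
    return 'EXPIRED'
  }
  // Assumption: objects start out in Standard.
  let current = 'STANDARD'
  // Walk the transitions from youngest to oldest; the last one reached wins.
  const ordered = rule.transitions.slice().sort((a, b) => a.days - b.days)
  for (const t of ordered) {
    if (ageDays >= t.days) current = t.storageClass
  }
  return current
}

// Example rule with made-up thresholds.
const exampleRule: LifecycleRuleSketch = {
  transitions: [
    { days: 30, storageClass: 'STANDARD_IA' },
    { days: 90, storageClass: 'GLACIER_IR' },
  ],
  expirationDays: 365,
}

console.log(storageClassAtAge(exampleRule, 10))
console.log(storageClassAtAge(exampleRule, 100))
```

Because the model only ever moves forward through the sorted transitions, it reflects the one-way nature of lifecycle rules: nothing in it can send an object back to a warmer class.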


Example: Shopify Analytics Pipeline

Problem: You’re designing the storage layer for an analytics pipeline. To keep things simple, you’re focusing on a single metric — product clicks.

Your system is an analytics platform for Shopify store owners that lets them analyze product click metrics such as conversion rate, click-through rate, and more.

Each store’s storefront streams product click events through Kinesis Data Firehose into an S3 bucket as raw data. AWS Glue runs ETL jobs to clean, deduplicate, and aggregate this raw data into structured tables on S3.

Store owners access a dashboard powered by Amazon Athena that queries the processed data. The dashboard has these constraints:

This produces 3 data types on S3, each with a different lifecycle:

```json
{
  "Rules": [
    {
      "ID": "RawStreamData",
      "Status": "Enabled",
      "Filter": { "Prefix": "raw/" },
      "Transitions": [
        { "Days": 7, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" },
        { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
      ],
      "Expiration": { "Days": 1095 }
    },
    {
      "ID": "ProcessedData",
      "Status": "Enabled",
      "Filter": { "Prefix": "processed/" },
      "Transitions": [
        { "Days": 180, "StorageClass": "STANDARD_IA" },
        { "Days": 365, "StorageClass": "GLACIER_IR" }
      ],
      "Expiration": { "Days": 1095 }
    },
    {
      "ID": "AthenaQueryResults",
      "Status": "Enabled",
      "Filter": { "Prefix": "athena-results/" },
      "Expiration": { "Days": 7 }
    }
  ]
}
```

In the lifecycle API, `GLACIER` is the storage class identifier for Glacier Flexible Retrieval.

Raw stream data (raw/)

Kinesis delivers raw JSON/Parquet files here. Glue reads them within the first few days to run ETL.

Processed data (processed/)

This is what Athena queries for the store owner’s dashboard — storage class directly affects dashboard performance.

Do not move processed data to Glacier Flexible Retrieval or below — Athena cannot query objects in those classes without a manual restore first, which would break the dashboard experience.
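One way to guard against this by accident is a small check that runs before a lifecycle configuration is deployed. The sketch below is a hypothetical helper, not an AWS API. It assumes that only `GLACIER` (the API identifier for Glacier Flexible Retrieval) and `DEEP_ARCHIVE` require a restore before objects can be read.

```typescript
// Hypothetical shape of a planned transition, for illustration only.
interface PlannedTransition {
  days: number
  storageClass: string
}

// Assumption: these are the classes whose objects must be restored before reading.
function needsRestoreBeforeRead(storageClass: string): boolean {
  return storageClass === 'GLACIER' || storageClass === 'DEEP_ARCHIVE'
}

// Returns the transitions that would make a prefix unreadable for Athena
// without a restore step.
function transitionsNeedingRestore(transitions: PlannedTransition[]): PlannedTransition[] {
  return transitions.filter((t) => needsRestoreBeforeRead(t.storageClass))
}

// The processed/ transitions from this post stay queryable.
const processedTransitions: PlannedTransition[] = [
  { days: 180, storageClass: 'STANDARD_IA' },
  { days: 365, storageClass: 'GLACIER_IR' },
]

// A hypothetical archive-style plan that would break dashboard queries.
const archiveTransitions: PlannedTransition[] = [
  { days: 90, storageClass: 'GLACIER' },
  { days: 365, storageClass: 'DEEP_ARCHIVE' },
]

console.log(transitionsNeedingRestore(processedTransitions).length)
console.log(transitionsNeedingRestore(archiveTransitions).map((t) => t.storageClass))
```

A check like this could run in CI against any rule whose prefix is queried by the dashboard, failing the build if the returned list is not empty.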

Athena query results (athena-results/)

Athena saves every query result to S3. These are purely temporary — any query can be re-run. Delete after 7 days — no reason to transition to a cheaper class, just expire them.


Cost Estimation

Let's assume the platform handles ~500 active stores generating a combined 50 GB/day of raw event data, from which Glue produces 10 GB/day of processed data. The tables below show the monthly cost with raw data at its full 3-year (1,095-day) retention and processed data after 2 years (730 days):

S3 storage pricing (us-east-1):

| Storage Class | Price per GB/month |
| --- | --- |
| S3 Standard | $0.023 |
| S3 Standard-IA | $0.0125 |
| Glacier Instant Retrieval | $0.004 |
| Glacier Flexible Retrieval | $0.0036 |
| Deep Archive | $0.00099 |

Storage costs

Raw data (50 GB/day):

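| Period | Class | Volume | Monthly Cost |
| --- | --- | --- | --- |
| Day 0–7 | Standard | 350 GB | $8.05 |
| Day 7–90 | Standard-IA | 4,150 GB | $51.88 |
| Day 90–365 | Glacier Flexible | 13,750 GB | $49.50 |
| Day 365–1095 | Deep Archive | 36,500 GB | $36.14 |
| Total | | 54,750 GB | $145.57 |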

Without lifecycle (all Standard): 54,750 GB × $0.023 = $1,259.25/month — savings of 88%

Processed data (10 GB/day):

| Period | Class | Volume | Monthly Cost |
| --- | --- | --- | --- |
| Day 0–180 | Standard | 1,800 GB | $41.40 |
| Day 180–365 | Standard-IA | 1,850 GB | $23.13 |
| Day 365–730 | Glacier IR | 3,650 GB | $14.60 |
| Total | | 7,300 GB | $79.13 |

Without lifecycle (all Standard): 7,300 GB × $0.023 = $167.90/month — savings of 53%
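The arithmetic behind both storage tables can be reproduced with a short sketch. The `StorageTier` type and `monthlyStorageCost` helper are hypothetical names. The day ranges and prices come from the tables above, and the sketch assumes a constant daily ingest, so each tier holds (days in tier) × (GB per day).

```typescript
// Hypothetical type for illustration: one age band and its storage price.
interface StorageTier {
  fromDay: number
  toDay: number
  pricePerGbMonth: number
}

// Monthly cost = sum over tiers of (days in tier) * (GB per day) * (price per GB-month).
function monthlyStorageCost(dailyGb: number, tiers: StorageTier[]): number {
  return tiers.reduce(
    (sum, t) => sum + (t.toDay - t.fromDay) * dailyGb * t.pricePerGbMonth,
    0
  )
}

const rawTiers: StorageTier[] = [
  { fromDay: 0, toDay: 7, pricePerGbMonth: 0.023 },       // Standard
  { fromDay: 7, toDay: 90, pricePerGbMonth: 0.0125 },     // Standard-IA
  { fromDay: 90, toDay: 365, pricePerGbMonth: 0.0036 },   // Glacier Flexible
  { fromDay: 365, toDay: 1095, pricePerGbMonth: 0.00099 } // Deep Archive
]

const processedTiers: StorageTier[] = [
  { fromDay: 0, toDay: 180, pricePerGbMonth: 0.023 },    // Standard
  { fromDay: 180, toDay: 365, pricePerGbMonth: 0.0125 }, // Standard-IA
  { fromDay: 365, toDay: 730, pricePerGbMonth: 0.004 }   // Glacier IR
]

const rawCost = monthlyStorageCost(50, rawTiers)
const processedCost = monthlyStorageCost(10, processedTiers)

console.log(rawCost.toFixed(2), processedCost.toFixed(3))
```

Computed without intermediate rounding, the raw total comes out about one cent below the table's $145.57, because the table rounds each row before summing. Changing a day boundary or the daily volume shows quickly how sensitive the bill is to each tier.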

Retrieval costs

Standard-IA and the Glacier classes charge a per-GB retrieval fee whenever data is read, whether by an Athena scan or by a restore:

| Storage Class | Retrieval Fee per GB |
| --- | --- |
| S3 Standard | Free |
| S3 Standard-IA | $0.01 |
| Glacier Instant Retrieval | $0.03 |
| Glacier Flexible Retrieval | $0.01 (Standard), $0.03 (Expedited) |

Assumptions based on dashboard usage patterns:

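| Data Type | Class | Retrieved | Cost |
| --- | --- | --- | --- |
| Raw | Standard-IA | 100 GB | $1.00 |
| Raw | Glacier Flexible | 50 GB | $0.50 |
| Processed | Standard-IA | 300 GB | $3.00 |
| Processed | Glacier IR | 100 GB | $3.00 |
| Total | | | $7.50 |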
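The retrieval total is simply volume times fee, summed per row. Here is a minimal sketch with hypothetical names (`RetrievalLine`, `monthlyRetrievalCost`) that uses the assumed monthly volumes from the table, so you can plug in your own usage numbers.

```typescript
// Hypothetical type for illustration: one row of the retrieval table.
interface RetrievalLine {
  dataType: string
  storageClass: string
  retrievedGb: number
  feePerGb: number
}

// Sums retrievedGb * feePerGb over all rows.
function monthlyRetrievalCost(lines: RetrievalLine[]): number {
  return lines.reduce((sum, l) => sum + l.retrievedGb * l.feePerGb, 0)
}

const retrievalLines: RetrievalLine[] = [
  { dataType: 'raw', storageClass: 'STANDARD_IA', retrievedGb: 100, feePerGb: 0.01 },
  { dataType: 'raw', storageClass: 'GLACIER', retrievedGb: 50, feePerGb: 0.01 },
  { dataType: 'processed', storageClass: 'STANDARD_IA', retrievedGb: 300, feePerGb: 0.01 },
  { dataType: 'processed', storageClass: 'GLACIER_IR', retrievedGb: 100, feePerGb: 0.03 },
]

const retrievalCost = monthlyRetrievalCost(retrievalLines)
console.log(retrievalCost.toFixed(2))
```

Retrieval fees scale with how much old data the dashboard actually reads, so this is the number to watch if store owners start running long historical queries.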

Total

| | With Lifecycle | All Standard | Saved |
| --- | --- | --- | --- |
| Storage | $224.86 | $1,427.31 | $1,202.45 |
| Retrieval | $7.50 | $0.00 | -$7.50 |
| Total | $232.36 | $1,427.31 | $1,194.95 (84%) |

That’s roughly $14,340 saved per year — with no impact on the dashboard experience for store owners. The 6-month hot window stays on Standard with zero retrieval fees, while older data gradually moves to cheaper classes that still support instant Athena queries.


Combining Lifecycle Rules with Object Tagging

Lifecycle rules can be filtered not just by prefix, but also by S3 Object Tags. This opens up powerful patterns — like offering different storage tiers based on a customer’s subscription plan.

Use case: Premium plan upgrade

Continuing the Shopify analytics example — suppose you want to upsell a premium plan where store owners get the fastest possible dashboard performance across all their historical data (no retrieval fees, no latency increase for older data).

The approach:

  1. Tag all objects with plan=basic by default
  2. Configure lifecycle rules to only transition objects tagged plan=basic:

```json
{
  "Rules": [
    {
      "ID": "BasicPlanProcessed",
      "Status": "Enabled",
      "Filter": {
        "And": {
          "Prefix": "processed/",
          "Tags": [{ "Key": "plan", "Value": "basic" }]
        }
      },
      "Transitions": [
        { "Days": 180, "StorageClass": "STANDARD_IA" },
        { "Days": 365, "StorageClass": "GLACIER_IR" }
      ]
    }
  ]
}
```

  3. When a store upgrades to premium → tag their objects as plan=premium → objects no longer match the rule → stay in Standard forever
  4. For objects already transitioned to cheaper classes → copy them back to Standard
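The reason retagging works is that an `And` filter requires the prefix and every listed tag to match. The sketch below models that matching logic locally. The types and the `matchesLifecycleFilter` helper are hypothetical and only illustrate the semantics; S3 evaluates the real filter on its side.

```typescript
// Hypothetical types for illustration; they are not part of any AWS SDK.
interface FilterSketch {
  prefix?: string
  tags?: { key: string; value: string }[]
}

interface StoredObjectSketch {
  key: string
  tags: Record<string, string>
}

// An object matches only if its key starts with the prefix AND it carries
// every tag in the filter with exactly the same value.
function matchesLifecycleFilter(obj: StoredObjectSketch, filter: FilterSketch): boolean {
  const prefixOk = filter.prefix === undefined || obj.key.indexOf(filter.prefix) === 0
  const tagsOk = (filter.tags ?? []).every((t) => obj.tags[t.key] === t.value)
  return prefixOk && tagsOk
}

const basicPlanFilter: FilterSketch = {
  prefix: 'processed/',
  tags: [{ key: 'plan', value: 'basic' }],
}

// Example keys are made up for illustration.
const basicObject: StoredObjectSketch = {
  key: 'processed/store_id=42/part-0001.parquet',
  tags: { plan: 'basic' },
}
const premiumObject: StoredObjectSketch = {
  key: 'processed/store_id=42/part-0001.parquet',
  tags: { plan: 'premium' },
}

console.log(matchesLifecycleFilter(basicObject, basicPlanFilter))
console.log(matchesLifecycleFilter(premiumObject, basicPlanFilter))
```

Note that under this model an object with no `plan` tag at all does not match either, which is why step 1 tags everything as plan=basic by default.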

Implementation

When a store owner upgrades, you need to tag all their objects and copy any already-transitioned objects back to Standard:

```typescript
import {
  S3Client,
  ListObjectsV2Command,
  PutObjectTaggingCommand,
  CopyObjectCommand,
} from '@aws-sdk/client-s3'
import pLimit from 'p-limit'

const s3 = new S3Client({ region: 'us-east-1' })
const BUCKET = 'your-analytics-bucket'
const CONCURRENCY = 50 // maximum number of in-flight S3 requests

interface UpgradeResult {
  tagged: number
  copied: number
  errors: string[]
}

async function upgradeStorePlan(storeId: string): Promise<UpgradeResult> {
  const prefix = `processed/store_id=${storeId}/`
  const limit = pLimit(CONCURRENCY)
  const result: UpgradeResult = { tagged: 0, copied: 0, errors: [] }
  let continuationToken: string | undefined

  do {
    // ListObjectsV2 returns at most 1,000 keys per page, so keep paging
    const listResponse = await s3.send(
      new ListObjectsV2Command({
        Bucket: BUCKET,
        Prefix: prefix,
        ContinuationToken: continuationToken,
      })
    )
    const objects = listResponse.Contents ?? []

    await Promise.all(
      objects.map((obj) =>
        limit(async () => {
          const key = obj.Key!
          try {
            // PutObjectTagging replaces the object's entire tag set
            await s3.send(
              new PutObjectTaggingCommand({
                Bucket: BUCKET,
                Key: key,
                Tagging: { TagSet: [{ Key: 'plan', Value: 'premium' }] },
              })
            )
            result.tagged++

            // Copy already-transitioned objects onto themselves to move them back to Standard
            if (obj.StorageClass && obj.StorageClass !== 'STANDARD') {
              // CopySource must be URL-encoded; encode each segment and keep the "/" separators
              const encodedKey = key.split('/').map(encodeURIComponent).join('/')
              await s3.send(
                new CopyObjectCommand({
                  Bucket: BUCKET,
                  CopySource: `${BUCKET}/${encodedKey}`,
                  Key: key,
                  StorageClass: 'STANDARD',
                  MetadataDirective: 'COPY',
                  TaggingDirective: 'COPY', // carries over the plan=premium tag set above
                })
              )
              result.copied++
            }
          } catch (err) {
            // Record the failure and continue with the remaining objects
            result.errors.push(`${key}: ${(err as Error).message}`)
          }
        })
      )
    )

    continuationToken = listResponse.NextContinuationToken
  } while (continuationToken)

  return result
}
```

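Key details: the function pages through the store's prefix with ListObjectsV2, caps concurrency at 50 in-flight requests with p-limit, and collects per-object errors instead of aborting the whole run. PutObjectTagging replaces an object's entire tag set, and only objects whose listed storage class is not STANDARD are copied onto themselves to move them back.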

Cost of upgrading one store

Assuming 1 store with 2 years of data at 20 MB/day (10 GB/day ÷ 500 stores):

| Period | Current Class | Volume | Retrieval Fee |
| --- | --- | --- | --- |
| Day 0–180 | Standard | 3.6 GB | — (already Standard) |
| Day 180–365 | Standard-IA | 3.7 GB | $0.037 |
| Day 365–730 | Glacier IR | 7.3 GB | $0.219 |
| Total | | 14.6 GB | $0.256 |

S3 request costs (ListObjects + CopyObject + PutObjectTagging) for ~730 objects: < $0.02

Total cost to upgrade 1 store: ~$0.28 — a one-time cost that eliminates ongoing retrieval fees for that store’s dashboard.

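For comparison, if you don't copy and let the premium store query data on Standard-IA and Glacier IR, retrieval fees accumulate at ~$0.11/month. Within 3 months, cumulative retrieval fees exceed the one-time copy cost, so on retrieval fees alone copying upfront pays for itself quickly. The copied data is then billed at the Standard storage rate, which the premium plan's price has to cover.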
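The break-even point can be checked with a few lines of arithmetic. The `monthsToBreakEven` helper is a hypothetical name. The volumes and fees come from the upgrade table, the $0.02 request cost is the upper bound quoted above, and the $0.11/month figure is the estimate from the comparison.

```typescript
// One-time retrieval fee for copying transitioned data back to Standard.
const standardIaGb = 3.7
const glacierIrGb = 7.3
const oneTimeRetrievalFee = standardIaGb * 0.01 + glacierIrGb * 0.03

// Upper bound for List + Copy + PutObjectTagging requests, as quoted above.
const requestCost = 0.02
const oneTimeCost = oneTimeRetrievalFee + requestCost

// First whole month in which cumulative monthly fees strictly exceed the one-time cost.
function monthsToBreakEven(oneTime: number, monthlyFee: number): number {
  if (monthlyFee <= 0) return Infinity // no recurring fees means no break-even
  return Math.floor(oneTime / monthlyFee) + 1
}

const breakEvenMonths = monthsToBreakEven(oneTimeCost, 0.11)
console.log(oneTimeCost.toFixed(3), breakEvenMonths)
```

This only compares the one-time copy cost with recurring retrieval fees. It does not include the difference in storage price between Standard and the colder classes.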
