LSI vs GSI in DynamoDB: Choosing the Right Secondary Index
You've modeled your DynamoDB table around one access pattern, and now a new query requirement has emerged — one that doesn't fit your primary key. Secondary indexes are the solution, but choosing between a Local Secondary Index (LSI) and a Global Secondary Index (GSI) has architectural consequences that are difficult to reverse.
TL;DR — Quick Comparison
| Dimension | LSI (Local Secondary Index) | GSI (Global Secondary Index) |
|---|---|---|
| Partition Key | Same as base table | Any attribute (can differ from base table) |
| Sort Key | Different attribute (required) | Any attribute (optional) |
| Creation Time | At table creation only | Any time (added/deleted dynamically) |
| Consistency | Strongly consistent reads supported | Eventually consistent reads only |
| Storage Scope | Shares partition storage with base table | Separate, independent storage |
| Item Collection Limit | 10 GB per partition key value | No item collection size limit |
| Throughput | Shares base table's provisioned capacity | Has its own provisioned or on-demand capacity |
| Max per Table | 5 | 20 (default; can be increased) |
| Use Case | Alternate sort order within a partition | Entirely different access pattern / partition key |
The Mental Model: What Problem Does Each Solve?
DynamoDB's primary key determines where data lives physically. A Query operation can only target items that share a single partition key value. Secondary indexes let you project some or all of your data into a new key structure so DynamoDB can route queries differently.
Analogy: Think of your DynamoDB table as a physical filing cabinet. The primary key is the drawer label — everything in a drawer is co-located. An LSI is like adding a color-coded tab system inside the same drawer to sort files differently. A GSI is like building an entirely separate filing cabinet that reorganizes all files from the original cabinet under a completely different labeling scheme.
Deep Dive: Local Secondary Index (LSI)
An LSI shares the same partition key as the base table but allows a different sort key. Because it is co-located within the same partition, it can offer strongly consistent reads — the same guarantee as the base table itself.
- Must be defined at table creation. You cannot add an LSI to an existing table. This is the single most important operational constraint.
- Item collection limit: All items sharing the same partition key value — across the base table and all its LSIs — must fit within 10 GB. Exceeding this causes write rejections.
- Capacity sharing: LSI reads and writes consume the base table's read/write capacity units (RCUs/WCUs).
LSI Data Flow
[Diagram: the base table partition (partition key: CustomerId) synchronously propagates writes to LSI storage in the same partition (sort key: OrderDate). A query on the LSI is routed to that same partition and can return a strongly consistent result. A 10 GB combined item collection size limit applies.]
- A write to the base table automatically propagates to all LSIs within the same partition — synchronously.
- A query targeting the LSI is routed to the same physical partition as the base table query.
- Strongly consistent reads are available because the LSI data is co-located and updated in the same write operation.
- The 10 GB item collection limit applies to the combined size of the base table items and all LSI projections for a given partition key value.
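The co-location idea can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not how DynamoDB is implemented: the sample orders and the helper names `build_partitions` and `query_partition` are hypothetical. It shows that a base table query and an LSI query read the same partition and differ only in the sort key used.

```python
from collections import defaultdict

# Toy in-memory model (not DynamoDB internals): one item collection per
# partition key value, read in different sort orders.
orders = [
    {"CustomerId": "CUST-001", "OrderId": "A-3", "OrderDate": "2024-02-10"},
    {"CustomerId": "CUST-001", "OrderId": "A-1", "OrderDate": "2024-05-01"},
    {"CustomerId": "CUST-002", "OrderId": "B-7", "OrderDate": "2024-01-15"},
    {"CustomerId": "CUST-001", "OrderId": "A-2", "OrderDate": "2023-12-31"},
]


def build_partitions(items, partition_key):
    """Group items by partition key value, as the base table and an LSI both do."""
    partitions = defaultdict(list)
    for item in items:
        partitions[item[partition_key]].append(item)
    return partitions


def query_partition(partitions, pk_value, sort_key, low=None, high=None):
    """Return one partition's items ordered by sort_key, with an optional range."""
    rows = sorted(partitions.get(pk_value, []), key=lambda i: i[sort_key])
    return [
        r for r in rows
        if (low is None or r[sort_key] >= low)
        and (high is None or r[sort_key] <= high)
    ]


partitions = build_partitions(orders, "CustomerId")

# Base table view: sorted by OrderId within the customer's partition.
base_view = [o["OrderId"] for o in query_partition(partitions, "CUST-001", "OrderId")]

# LSI view: same partition, sorted by OrderDate, with a BETWEEN-style range.
lsi_view = [
    o["OrderId"]
    for o in query_partition(
        partitions, "CUST-001", "OrderDate", "2024-01-01", "2024-06-30"
    )
]
print(base_view)
print(lsi_view)
```

Both views read only the CUST-001 partition, which is why the model never needs to look at other customers to answer the date-range query.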
Deep Dive: Global Secondary Index (GSI)
A GSI is essentially a separate DynamoDB table maintained automatically by the service. It can have a completely different partition key and sort key from the base table, enabling access patterns that cross partition boundaries.
- Can be added or deleted at any time. This flexibility makes GSIs the default choice when requirements evolve post-launch.
- Eventually consistent only. Writes to the base table are asynchronously propagated to GSIs. There is a replication lag, meaning a GSI query may not reflect the most recent write immediately.
- Independent throughput (Provisioned mode): In provisioned capacity mode, a GSI has its own RCU/WCU settings. Under-provisioning a GSI can cause write throttling on the base table — a critical operational pitfall.
- Sparse indexes: If the GSI key attribute is absent from an item, that item is simply not projected into the GSI. This is a powerful pattern for filtering large datasets efficiently.
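The sparse index behavior can be illustrated with a toy model. This is a sketch under assumptions: `build_sparse_gsi` is a hypothetical helper, and `EscalationQueue` is a hypothetical attribute that only some orders carry. Items without the index key attribute simply never appear in the index.

```python
def build_sparse_gsi(items, gsi_partition_key, gsi_sort_key=None):
    """Toy model: copy only items that carry every GSI key attribute."""
    required = [gsi_partition_key] + ([gsi_sort_key] if gsi_sort_key else [])
    index = {}
    for item in items:
        if all(attr in item for attr in required):
            index.setdefault(item[gsi_partition_key], []).append(item)
    return index


# "EscalationQueue" is a hypothetical attribute set only on orders needing review.
orders = [
    {"CustomerId": "CUST-001", "OrderId": "A-1", "OrderStatus": "PENDING"},
    {"CustomerId": "CUST-002", "OrderId": "B-7", "OrderStatus": "PENDING",
     "EscalationQueue": "billing"},
    {"CustomerId": "CUST-003", "OrderId": "C-4", "OrderStatus": "SHIPPED"},
]

escalation_index = build_sparse_gsi(orders, "EscalationQueue")
indexed_count = sum(len(rows) for rows in escalation_index.values())
print(indexed_count, "of", len(orders), "items are in the sparse index")
```

Because only one of the three orders has the key attribute, a query against this index touches one item rather than the whole table.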
GSI Data Flow
[Diagram: the base table partition (partition key: CustomerId) asynchronously replicates writes to GSI storage (partition key: OrderStatus). Queries on the GSI read from GSI storage with eventual consistency only. Items missing the GSI key are excluded (sparse index). GSI write throttling applies backpressure to the base table partition.]
- A write lands on the base table partition determined by the base table's partition key.
- DynamoDB asynchronously replicates the write to the GSI's own storage, partitioned by the GSI's partition key.
- A query against the GSI is routed to the GSI's independent storage — it may not yet reflect the latest write (eventual consistency).
- If the GSI write capacity is throttled, it creates backpressure that can throttle writes on the base table itself.
- Items missing the GSI key attribute are not included in the GSI (sparse index behavior).
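The effect of asynchronous propagation on reads can be sketched with a toy model. This is not DynamoDB's real replication mechanism: the class `ToyTableWithGsi` and its explicit `replicate` step are hypothetical stand-ins for the replication lag described above.

```python
from collections import deque


class ToyTableWithGsi:
    """Toy model of asynchronous GSI propagation; not DynamoDB's real mechanism."""

    def __init__(self, gsi_key):
        self.gsi_key = gsi_key
        self.base = {}           # (CustomerId, OrderId) -> item
        self.gsi = {}            # (CustomerId, OrderId) -> projected item copy
        self.pending = deque()   # writes not yet applied to the GSI

    def put_item(self, item):
        key = (item["CustomerId"], item["OrderId"])
        self.base[key] = dict(item)   # base table is updated immediately
        self.pending.append(key)      # GSI update is queued for later

    def replicate(self):
        """Apply queued writes to the GSI (stands in for the lag ending)."""
        while self.pending:
            key = self.pending.popleft()
            item = self.base[key]
            if self.gsi_key in item:
                self.gsi[key] = dict(item)
            else:
                self.gsi.pop(key, None)   # sparse: drop items lacking the GSI key

    def query_gsi(self, value):
        return sorted(
            (i for i in self.gsi.values() if i[self.gsi_key] == value),
            key=lambda i: i["OrderId"],
        )


table = ToyTableWithGsi(gsi_key="OrderStatus")
table.put_item({"CustomerId": "CUST-001", "OrderId": "A-1", "OrderStatus": "PENDING"})

stale = table.query_gsi("PENDING")   # read before replication catches up
table.replicate()
fresh = table.query_gsi("PENDING")   # read after replication
print(len(stale), len(fresh))
```

The first read misses the new order even though the base table already has it, which is the read-your-own-writes gap that a GSI cannot close.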
Architecture Decision Flow
[Diagram: decision flow. Does the query need a different partition key? Yes: use a GSI. No (same partition key, different sort key): is the table already created? Yes: use a GSI with the same partition key as the base table. No: are strongly consistent reads needed? No: use a GSI. Yes: will the item collection stay under 10 GB? Yes: use an LSI. No or uncertain: use a GSI.]
- Start by determining whether your new query needs a different partition key. If yes, only a GSI can satisfy this requirement.
- If the partition key is the same and you only need a different sort key, an LSI is viable — but only if the table hasn't been created yet.
- If the table already exists and you need a different sort key within the same partition, you must use a GSI (with the same partition key as the base table).
- If you require strongly consistent reads on the index, an LSI is the only option.
- Validate that your item collection size will remain under 10 GB before committing to an LSI.
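The decision flow in this section can be written down as a small function. This is a minimal sketch: the function name `choose_index` and its boolean parameters are hypothetical, and it encodes only the rules stated here, not every design consideration.

```python
def choose_index(needs_different_partition_key, table_exists,
                 needs_strong_consistency, collection_under_10gb):
    """Return "GSI" or "LSI" following the decision flow in this section."""
    if needs_different_partition_key:
        return "GSI"   # only a GSI can use a different partition key
    if table_exists:
        return "GSI"   # LSIs cannot be added after table creation
    if not needs_strong_consistency:
        return "GSI"   # flexibility wins when eventual consistency is enough
    # New table that needs strong consistency: LSI only if the 10 GB cap is safe.
    return "LSI" if collection_under_10gb else "GSI"


print(choose_index(False, False, True, True))    # same PK, new table, strong reads
print(choose_index(True, False, True, True))     # different PK requested
```

Note that when the function falls back to a GSI despite a strong consistency requirement, those reads would have to be served from the base table instead, since the GSI cannot provide them.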
Projection Types
Both LSIs and GSIs require you to specify which attributes are projected into the index. This directly impacts storage cost and read efficiency.
| Projection Type | What's Included | Best For |
|---|---|---|
| KEYS_ONLY | Index keys + base table primary key only | Existence checks; fetch full item from base table separately |
| INCLUDE | Keys + a specified subset of attributes | Queries that need a few specific attributes without fetching the full item |
| ALL | All attributes from the base table item | Queries that need the full item; avoids a second fetch but increases storage cost |
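The three projection types can be compared with a toy function that returns what an index would store for a single item. This is a sketch under assumptions: `project_item` is a hypothetical helper, and `ShippingNotes` is a hypothetical attribute added to show what gets left out.

```python
def project_item(item, table_keys, index_keys, projection_type, non_key_attributes=()):
    """Return the attributes a secondary index would store for one item (toy model)."""
    key_attrs = set(table_keys) | set(index_keys)   # keys are always projected
    if projection_type == "KEYS_ONLY":
        wanted = key_attrs
    elif projection_type == "INCLUDE":
        wanted = key_attrs | set(non_key_attributes)
    elif projection_type == "ALL":
        wanted = set(item)
    else:
        raise ValueError(f"unknown projection type: {projection_type}")
    return {name: value for name, value in item.items() if name in wanted}


order = {
    "CustomerId": "CUST-001", "OrderId": "A-1", "OrderDate": "2024-05-01",
    "OrderStatus": "PENDING", "TotalAmount": "42.50",
    "ShippingNotes": "Leave at door",   # hypothetical attribute
}
table_keys = ["CustomerId", "OrderId"]
index_keys = ["CustomerId", "OrderDate"]

keys_only = project_item(order, table_keys, index_keys, "KEYS_ONLY")
include = project_item(order, table_keys, index_keys, "INCLUDE",
                       ["OrderStatus", "TotalAmount"])
everything = project_item(order, table_keys, index_keys, "ALL")
print(sorted(keys_only))
print(sorted(include))
```

A query that asks an INCLUDE index for an attribute outside the projection, such as ShippingNotes here, needs a follow-up read against the base table.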
Practical Implementation
Creating a Table with an LSI (AWS CLI)
LSIs must be declared at table creation. The following example creates an Orders table keyed by CustomerId (partition key) and OrderId (sort key), with an LSI that lets you query a customer's orders by OrderDate within the same partition.
```bash
aws dynamodb create-table \
  --table-name Orders \
  --attribute-definitions \
    AttributeName=CustomerId,AttributeType=S \
    AttributeName=OrderId,AttributeType=S \
    AttributeName=OrderDate,AttributeType=S \
  --key-schema \
    AttributeName=CustomerId,KeyType=HASH \
    AttributeName=OrderId,KeyType=RANGE \
  --local-secondary-indexes \
    '[{
      "IndexName": "OrderDateIndex",
      "KeySchema": [
        {"AttributeName": "CustomerId", "KeyType": "HASH"},
        {"AttributeName": "OrderDate", "KeyType": "RANGE"}
      ],
      "Projection": {
        "ProjectionType": "INCLUDE",
        "NonKeyAttributes": ["OrderStatus", "TotalAmount"]
      }
    }]' \
  --billing-mode PAY_PER_REQUEST \
  --region us-east-1
```
Adding a GSI to an Existing Table (AWS CLI)
The following command adds a GSI to the existing Orders table, enabling queries by OrderStatus across all customers. The GSI inherits the PAY_PER_REQUEST billing mode from the base table — no separate throughput block is required.
```bash
aws dynamodb update-table \
  --table-name Orders \
  --attribute-definitions \
    AttributeName=OrderStatus,AttributeType=S \
    AttributeName=OrderDate,AttributeType=S \
  --global-secondary-index-updates \
    '[{
      "Create": {
        "IndexName": "OrderStatusDateIndex",
        "KeySchema": [
          {"AttributeName": "OrderStatus", "KeyType": "HASH"},
          {"AttributeName": "OrderDate", "KeyType": "RANGE"}
        ],
        "Projection": {
          "ProjectionType": "ALL"
        }
      }
    }]' \
  --region us-east-1
```
Querying an Index (AWS CLI)
```bash
# Query the LSI: strongly consistent read within a customer's partition
aws dynamodb query \
  --table-name Orders \
  --index-name OrderDateIndex \
  --key-condition-expression "CustomerId = :cid AND OrderDate BETWEEN :start AND :end" \
  --expression-attribute-values '{
    ":cid": {"S": "CUST-001"},
    ":start": {"S": "2024-01-01"},
    ":end": {"S": "2024-06-30"}
  }' \
  --consistent-read \
  --region us-east-1

# Query the GSI: eventually consistent, cross-partition
aws dynamodb query \
  --table-name Orders \
  --index-name OrderStatusDateIndex \
  --key-condition-expression "OrderStatus = :status" \
  --expression-attribute-values '{":status": {"S": "PENDING"}}' \
  --region us-east-1
```
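If you call DynamoDB from application code rather than the CLI, the same two queries can be expressed as request parameters. The sketch below only builds the parameter dictionary and does not call AWS. The helper `build_query_params` and its `index_kind` argument are hypothetical; the dictionary keys follow the Query API's parameter names, and passing them to an SDK client (for example boto3's `client.query(**params)`) is an assumption about your setup.

```python
def build_query_params(table_name, index_name, key_condition, values,
                       index_kind, consistent_read=False):
    """Build keyword arguments for a DynamoDB Query against a secondary index."""
    if index_kind not in ("LSI", "GSI"):
        raise ValueError("index_kind must be 'LSI' or 'GSI'")
    if consistent_read and index_kind == "GSI":
        # Strongly consistent reads are not supported on a GSI; fail early.
        raise ValueError("strongly consistent reads are not supported on a GSI")
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": key_condition,
        "ExpressionAttributeValues": values,
        "ConsistentRead": consistent_read,
    }


lsi_params = build_query_params(
    "Orders",
    "OrderDateIndex",
    "CustomerId = :cid AND OrderDate BETWEEN :start AND :end",
    {
        ":cid": {"S": "CUST-001"},
        ":start": {"S": "2024-01-01"},
        ":end": {"S": "2024-06-30"},
    },
    index_kind="LSI",
    consistent_read=True,
)

gsi_params = build_query_params(
    "Orders",
    "OrderStatusDateIndex",
    "OrderStatus = :status",
    {":status": {"S": "PENDING"}},
    index_kind="GSI",
)
print(lsi_params["ConsistentRead"], gsi_params["ConsistentRead"])
```

Rejecting a consistent read on a GSI in the helper surfaces the mistake at build time instead of as a service error at runtime.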
IAM Least Privilege for Index Access
Index queries require explicit IAM permissions scoped to the index ARN. Grant only the actions your application needs.
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "QueryOrdersIndexes",
      "Effect": "Allow",
      "Action": "dynamodb:Query",
      "Resource": [
        "arn:aws:dynamodb:us-east-1:123456789012:table/Orders/index/OrderDateIndex",
        "arn:aws:dynamodb:us-east-1:123456789012:table/Orders/index/OrderStatusDateIndex"
      ]
    }
  ]
}
```
Key Operational Pitfalls
- LSI item collection limit: ConsumedWriteCapacityUnits reports write volume, not item collection size. Set the ReturnItemCollectionMetrics parameter to SIZE on writes to get a size estimate for the affected item collection, and act well before the 10 GB ceiling.
- GSI write throttling backpressure: In provisioned mode, if a GSI's write capacity is exhausted, DynamoDB will throttle writes to the base table. Always provision GSI WCUs at least as high as the base table's WCUs for write-heavy workloads.
- GSI eventual consistency: Never use a GSI for read-your-own-writes patterns. If a user writes data and immediately queries the GSI, the result may be stale.
- Projection cost: An ALL projection stores a full copy of every indexed item, which roughly doubles storage for those items. Use INCLUDE with only the attributes your query actually needs.
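The item collection check from the first pitfall can be sketched as a small function over a write response. This is a sketch under assumptions: the helper `collection_near_limit` and the 80% warning threshold are hypothetical, and the sample response mirrors the ItemCollectionMetrics shape (an ItemCollectionKey plus a two-element SizeEstimateRangeGB list) that a single-item write returns when ReturnItemCollectionMetrics is set to SIZE on a table with an LSI.

```python
ITEM_COLLECTION_LIMIT_GB = 10.0


def collection_near_limit(write_response, warn_fraction=0.8):
    """Return True when the upper size estimate reaches warn_fraction of 10 GB."""
    metrics = write_response.get("ItemCollectionMetrics")
    if not metrics:
        return False   # no metrics returned (not requested, or table has no LSI)
    lower_gb, upper_gb = metrics["SizeEstimateRangeGB"]
    return upper_gb >= ITEM_COLLECTION_LIMIT_GB * warn_fraction


# Sample response shaped like a PutItem result with item collection metrics.
sample_response = {
    "ItemCollectionMetrics": {
        "ItemCollectionKey": {"CustomerId": {"S": "CUST-001"}},
        "SizeEstimateRangeGB": [8.0, 9.0],
    }
}
print(collection_near_limit(sample_response))
```

The estimate is a range, so the sketch compares the upper bound to stay on the cautious side.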
Glossary
| Term | Definition |
|---|---|
| Item Collection | All items in a table and its LSIs that share the same partition key value. Subject to the 10 GB LSI limit. |
| Projection | The set of attributes copied from the base table into a secondary index. Determines query capability and storage cost. |
| Sparse Index | A GSI where the key attribute is absent from many items, resulting in a smaller, filtered index — useful for querying a subset of items efficiently. |
| Eventual Consistency | A read model where the returned data reflects a recent but not necessarily the most recent write. All GSI reads are eventually consistent. |
| Strong Consistency | A read model guaranteeing the response reflects all prior successful writes. Available on base table reads and LSI reads only. |
Wrap-Up & Next Steps
The decision rule is straightforward: if you need a different partition key or need to add an index post-launch, use a GSI. If you need strongly consistent reads on an alternate sort key and can define the index at table creation, an LSI is the right tool. In practice, GSIs cover the vast majority of secondary access patterns due to their flexibility.
Before finalizing your index design, model all your access patterns upfront using the AWS DynamoDB NoSQL Design Best Practices and validate item collection sizes if you're using LSIs. For the authoritative reference on index limits and behavior, consult the official DynamoDB Secondary Indexes documentation.