Mastering Sorting Techniques in DynamoDB: From Chaos to Clarity

Written by Shahed Chy Suzan | 09/04/2025

Amazon DynamoDB is a scalable, low-latency NoSQL database service, well suited to managing large data volumes and high traffic loads. Its robust architecture, combined with the single-table design principle, delivers strong performance and reliability, which makes DynamoDB a top choice for numerous applications.

With great power comes great responsibility: one of the key challenges developers face when working with DynamoDB is implementing efficient sorting strategies, which can have a major effect on how well data is retrieved.

In this blog I will cover fundamental principles, advanced mechanisms, and practical advice for turning chaotic data into orderly resources.

Sort Key in DynamoDB:

DynamoDB supports two kinds of primary keys: a simple primary key and a composite primary key. A simple primary key consists of a single attribute, the partition key (aka hash key), while a composite primary key combines two attributes: the partition key (‘hash key’) and the ‘sort key’.

 

DynamoDB uses the partition key value as input to a hash function. The output of the hash function determines which partition the item will be placed in. The sort key, also referred to as the range attribute of an item, is used to arrange and organize items within a partition. Many items can share the same partition key value; however, they must then have different sort key values.

 

By default, the items in a DynamoDB partition are ordered by sort key in ascending order. A well-defined sort key makes it possible to query DynamoDB tables using key condition expressions.

 

The sort key cannot be changed after the table is created, so be mindful when defining it. If your data contains hierarchical (one-to-many) relationships and you want to query at any hierarchy level, use a composite primary key.

 

For example, the sort key for a restaurant DynamoDB (DDB) table can be structured as follows.

 

 

Here, ‘chain’ means the company name of the restaurant.

Here, the sort key allows for efficient range queries over locations at various levels of aggregation, from ‘chain’ down to ‘area’, and everything in between. With a sort key defined this way, we can query the table at different levels of granularity: to get data for a specific ‘chain’, we can query with sortkey begins_with chain#; similarly, to get data for a specific ‘country’, we can query with sortkey begins_with chain#country.
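Here is a minimal Python sketch of how such prefix queries could be built. The hierarchy levels used in the example, the ‘PK’ and ‘SK’ attribute names, the table name, and the partition value are all hypothetical, since they are not spelled out above. The functions only build the parameter dictionary that a DynamoDB Query call accepts; nothing is sent to AWS.

```python
def location_prefix(*levels):
    """Join hierarchy levels (chain, country, ...) into a sort key prefix.

    A trailing '#' is added so that a prefix for country 'US' does not
    accidentally match a different country such as 'USA'.
    """
    return "#".join(levels) + "#"


def begins_with_query(table_name, partition_value, prefix):
    """Build Query parameters that select every item under a hierarchy prefix.

    'PK' and 'SK' are assumed attribute names for the partition and sort key.
    """
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk AND begins_with(#sk, :prefix)",
        "ExpressionAttributeNames": {"#pk": "PK", "#sk": "SK"},
        "ExpressionAttributeValues": {
            ":pk": {"S": partition_value},
            ":prefix": {"S": prefix},
        },
    }


# All locations of a (made-up) chain in one country.
params = begins_with_query(
    "Restaurants", "LOCATIONS", location_prefix("BurgerCo", "US")
)
```

With boto3, a dictionary like this could be passed to the low-level client as `client.query(**params)`.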

 

Points to consider for Sort Key:

  1. The sort key attribute must be a scalar type.
  2. Within a partition, the sort key must be unique: no two items can share both the same partition key and the same sort key.
  3. If the sort key’s attribute name is a DynamoDB reserved word, it has to be referenced through an expression attribute name placeholder in expressions, so it is simpler to avoid reserved words. You can find the reserved word list in the DynamoDB documentation.
  4. The sort key attribute name should not exceed 255 characters, and a sort key value can be at most 1024 bytes.
  5. Sort keys should follow DynamoDB’s naming conventions for attributes.
  6. Give your sort key attribute a meaningful, descriptive name that reflects the type of data it holds, so that your queries are easier to read and understand.

 

Diving Deeper: Sorting Mechanism in DynamoDB

The most fundamental sorting strategy in DynamoDB involves using the sort key of your table’s primary key. If you define a composite primary key with partition key + sort key, DynamoDB will store items inside every partition automatically in ascending order of the sort key.

 

Here are the two main rules to follow whenever you need a specific ordering when reading multiple items in DynamoDB.

  • First, you must use a composite primary key.
  • Second, all ordering must be done with the sort key within a particular item collection.

 

DynamoDB uses the partition key to place item collections on different partitions, which supports O(1) lookup time to find the proper node. Items inside an item collection are then stored as a B-tree, which allows O(log n) search time. This B-tree is arranged in lexicographical order according to the sort key, and that ordering is what we use for sorting.

 

In this blog I will delve into the following sorting mechanisms:

  • Basics of sorting:
      1. Lexicographical sorting mechanism.
      2. Sorting with timestamps, and how to handle them.
      3. Unique, sortable IDs (unique identifiers that still need to be sorted chronologically).
  • Sorting on changing attributes
  • Ascending vs. Descending sorting
  • Two relational access patterns in a single item collection
  • Zero-padding with numbers
  • Faking ascending order

 

Let’s get started, there’s a lot to discuss!

 

1. Basics of sorting:

Sorting in DynamoDB is based only on the sort key, which must be a scalar type: string, number, or binary. This means we do not need to consider how a map attribute would be sorted by DynamoDB.

 

In cases where sort keys are numbers, they are directly ordered according to their numerical values. However, for sort keys of type strings or binaries, items are arranged based on the order of UTF-8 bytes. Now let’s examine this more closely.

 

1.1. Lexicographical sorting:

A simplified way to describe sorting on UTF-8 bytes is to say the ordering is lexicographical. Lexicographical sorting, also known as dictionary or alphabetical order, is the default method of sorting strings in DynamoDB. Strings are compared character by character based on their Unicode values.

 

To determine which character comes before which, consider the sequence below:

  1. Symbols (e.g. # or $) and numbers (0-9) come before uppercase letters.
  2. Uppercase letters (A-Z) come before lowercase letters.
  3. Lowercase letters (a-z) come last.

 

Notice that letters are separated by case (capital letters first, lowercase afterwards). This means “Banana” will come before “apple”. This is crucial to be aware of if you want to avoid surprises later.
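To see this ordering in action without a table, here is a small Python sketch that sorts strings by their UTF-8 bytes, which mirrors how string sort keys are ordered. The helper name and sample values are made up for illustration.

```python
def dynamodb_string_order(values):
    """Sort strings by their UTF-8 bytes, mirroring string sort key order."""
    return sorted(values, key=lambda value: value.encode("utf-8"))


ordered = dynamodb_string_order(
    ["apple", "Banana", "cherry", "#tag", "42", "Apple"]
)
print(ordered)
# ['#tag', '42', 'Apple', 'Banana', 'apple', 'cherry']
```

The symbol and the number come first, then the capitalized words, and the lowercase words come last.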

 

Here’s an example table demonstrating the lexicographical ordering of various characters based on their UTF-8 values (decimal):

Refer to a UTF-8 character table for a complete list of characters, including special ones, sorted by byte order.

 

Example of a lexicographical sorting with Uppercase Rule:

A common mistake people make with lexicographical ordering is forgetting the uppercase rule. For instance, consider an item collection with the names “Ellen VanDyke” (with a capital “D” in the middle of VanDyke), ‘Alice Vanbyke’ and ‘Bob Vanhoten’. In an item collection that uses these last names as the SK, the result might look as follows:

 

You might be surprised to see that ‘VanDyke’ came before ‘Vanbyke’! This is because uppercase letters sort before lowercase ones, so the capital ‘D’ sorts before the lowercase ‘b’. To avoid odd behavior like this, standardize your sort keys to either all uppercase or all lowercase values:

 

Here, when all last names are in uppercase, they are sorted as expected.
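A short Python sketch of the same fix, using the three last names from the example. Sorting by UTF-8 bytes stands in for DynamoDB’s ordering, and keeping the original spelling in a separate attribute for display is an assumption of this sketch, not something the table above shows.

```python
surnames = ["Vanhoten", "VanDyke", "Vanbyke"]


def by_bytes(values):
    """Order strings by UTF-8 bytes, as DynamoDB does for string sort keys."""
    return sorted(values, key=lambda value: value.encode("utf-8"))


# Mixed case: the capital 'D' pushes VanDyke ahead of Vanbyke.
as_written = by_bytes(surnames)
# ['VanDyke', 'Vanbyke', 'Vanhoten']

# Normalized to uppercase before being used as the sort key.
normalized = by_bytes(name.upper() for name in surnames)
# ['VANBYKE', 'VANDYKE', 'VANHOTEN']
```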

 

1.2. Sorting with Timestamps:

The second fundamental issue I want to mention is how to handle timestamps. First and foremost, the format you choose must be sortable. What you definitely cannot do is use a non-sortable format like “May 26, 1988”. It will not sort chronologically in DynamoDB, and you will have some painful moments.

Personally, I prefer the ISO 8601 format, “YYYY-MM-DDTHH:MM:SSZ”, because it is human-readable and also sorts chronologically under lexicographical ordering.

ISO 8601 is a universally recognized standard used in many countries across the globe for representing dates and times (e.g., “2023-05-15T10:30:00Z”). It consists of year, month, day, hour, minutes, seconds, and then a Z suffix indicating the UTC timezone.

E.g., “2023-07-10T14:48:00Z” will properly sort before “2023-07-11T15:00:00Z”.
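As a quick illustration, here is a Python sketch that formats timestamps this way and checks that string order matches chronological order. The helper name is made up; it assumes timezone-aware datetimes and second-level precision.

```python
from datetime import datetime, timezone


def iso_sort_key(moment):
    """Format a timezone-aware datetime as a UTC ISO 8601 string."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


first = iso_sort_key(datetime(2023, 7, 10, 14, 48, tzinfo=timezone.utc))
second = iso_sort_key(datetime(2023, 7, 11, 15, 0, tzinfo=timezone.utc))

# ISO 8601 strings compare in chronological order.
iso_is_chronological = first < second  # True

# A human-friendly format does not: 2001 sorts before 1988 here
# simply because 'A' comes before 'M'.
friendly_is_misleading = "April 2, 2001" < "May 26, 1988"  # True
```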

 

1.3. Unique, sortable IDs:

A common requirement is to have unique IDs that can also be sorted. This mostly comes up when you need an item with a unique identifier (preferably URL-friendly) but still want items ordered by the date or time they were created. There are several options in this arena, but the ‘KSUID’ implementation stands out for me.

KSUID stands for K-Sortable Unique Identifier. Simply put, it is an identifier that begins with a timestamp and is followed by enough randomness to make collisions extremely unlikely. The result is a 27-character string that is shorter than the 36-character text form of a UUIDv4, yet is lexicographically sortable by creation time.
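To make the idea concrete, here is a simplified Python sketch of a KSUID-style identifier. It is not the reference implementation: it assumes the commonly described layout of a 4-byte timestamp (seconds since a custom epoch of 1400000000) followed by 16 random bytes, encoded in base62 and left-padded to 27 characters. For real projects, an established KSUID library is the safer choice.

```python
import os
import time

# Base62 alphabet in ASCII order, so string order matches numeric order.
BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
CUSTOM_EPOCH = 1_400_000_000  # assumed KSUID epoch offset, in seconds


def make_sortable_id(now=None):
    """Return a 27-character ID: timestamp first, then random bytes."""
    seconds = int(time.time() if now is None else now) - CUSTOM_EPOCH
    raw = seconds.to_bytes(4, "big") + os.urandom(16)  # 20 bytes in total
    number = int.from_bytes(raw, "big")
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(BASE62[remainder])
    # Fixed width keeps lexicographic order aligned with numeric order.
    return "".join(reversed(chars)).rjust(27, "0")


older = make_sortable_id(now=1_700_000_000)
newer = make_sortable_id(now=1_700_000_001)
```

Because the timestamp occupies the most significant bytes, an ID created in a later second always sorts after one created earlier; IDs created within the same second are ordered only by their random part.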

 

For more, check out Rick’s blog post on the implementation of KSUIDs.

 

2. Sorting on changing attributes:

DynamoDB uses the sort key to sort items within a collection, which is useful for displaying recently stored items or top scores. However, sorting on frequently changing values can be challenging. For example, a ticket tracker app might allow users to view tickets ordered by most recently updated.

 

Let’s imagine you decide to model your table as follows:

 

This design uses the organization name as the partition key for ‘group by’ functionality and the ‘UpdatedAt’ field as the sort key for ordering tickets. However, it has a limitation: the primary key of an item cannot be changed when the item is modified. Here, the ‘UpdatedAt’ field is part of the primary key, and it changes each time a ticket is updated. So whenever you update a ticket, you must first delete the existing item and then create a new one with the updated primary key. This makes the process complicated and risks data loss if it is not handled properly.

 

However, we can do things differently here. For our primary key, let’s use two attributes that never change. We’ll keep ‘OrganizationName’ as the partition key but switch to ‘TicketId’ as the sort key.

 

Here is how our table looks now.

 

At this point, we can add a local secondary index that uses ‘UpdatedAt’ as its sort key. The data in the base table is copied into the secondary index and appears as shown:

 

Note that this is exactly how our original table design looked. Consequently, we can use the Query API against this secondary index to satisfy our ‘Fetch most recently updated tickets’ access pattern. Importantly, we no longer need to worry about complex delete + create logic when updating an item, because DynamoDB takes care of this when replicating data into the secondary index.
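The design above could be sketched in Python as two parameter dictionaries: one for creating the table with its local secondary index, and one for querying that index. The table name, index name, string attribute types, and on-demand billing mode are assumptions for illustration; the functions only build the dictionaries and do not call AWS.

```python
def tickets_table_definition(table_name="Tickets"):
    """CreateTable parameters: stable primary key plus an UpdatedAt LSI."""
    return {
        "TableName": table_name,
        "BillingMode": "PAY_PER_REQUEST",
        "AttributeDefinitions": [
            {"AttributeName": "OrganizationName", "AttributeType": "S"},
            {"AttributeName": "TicketId", "AttributeType": "S"},
            {"AttributeName": "UpdatedAt", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "OrganizationName", "KeyType": "HASH"},
            {"AttributeName": "TicketId", "KeyType": "RANGE"},
        ],
        "LocalSecondaryIndexes": [
            {
                "IndexName": "UpdatedAtIndex",  # hypothetical index name
                "KeySchema": [
                    {"AttributeName": "OrganizationName", "KeyType": "HASH"},
                    {"AttributeName": "UpdatedAt", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
    }


def recently_updated_query(org, limit=10, table_name="Tickets"):
    """Query parameters: newest UpdatedAt first for one organization."""
    return {
        "TableName": table_name,
        "IndexName": "UpdatedAtIndex",
        "KeyConditionExpression": "OrganizationName = :org",
        "ExpressionAttributeValues": {":org": {"S": org}},
        "ScanIndexForward": False,  # descending UpdatedAt
        "Limit": limit,
    }


definition = tickets_table_definition()
query = recently_updated_query("Acme")
```

With boto3, these could be passed to the low-level client as `client.create_table(**definition)` and `client.query(**query)`. Keep in mind that a local secondary index has to be defined when the table is created; it cannot be added later.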

 

3. Ascending vs. Descending sorting:

By default, DynamoDB reads items in ascending order. By setting ‘ScanIndexForward=False’ you read them in descending order instead. This is useful in many cases, such as when you want the most recent timestamps or the highest scores on a leaderboard.

 

But it becomes complex if you combine different entity types into a single item collection and then try to read them back together with a parent object for establishing a one-to-many relationship between parent and child entities. In that situation you need to consider the common sort order you’ll use in this access pattern to know where to place the parent item.

 

For example, let’s assume a library system where users often want to access a book with the latest 10 review items.

 

You could structure your table as shown below:

 

With this table layout, a query returns the Book item followed by the oldest Review items, so if the collection is large, multiple paginated requests may be needed to reach the most recent ones. To avoid this, you can prefix the sort key of our Review items with ‘#’. After doing that, our table will appear as:

So now, if we use ‘ScanIndexForward=False’, our Query API call reads the item collection from the end: it retrieves the Book item first, followed by its most recent Review items.
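The effect of the ‘#’ prefix can be simulated in plain Python. The sort key values below (‘BOOK#...’ for the parent and ‘#REVIEW#’ plus a timestamp for reviews) are hypothetical stand-ins for the table, and sorting by UTF-8 bytes stands in for DynamoDB’s ordering inside one item collection.

```python
def item_collection(sort_keys):
    """Order sort keys the way they would be stored within one partition."""
    return sorted(sort_keys, key=lambda key: key.encode("utf-8"))


def read(collection, limit, scan_index_forward=True):
    """Mimic a Query that reads forward or backward and stops at `limit`."""
    ordered = collection if scan_index_forward else list(reversed(collection))
    return ordered[:limit]


collection = item_collection([
    "BOOK#DUNE",
    "#REVIEW#2023-01-05T09:00:00Z",
    "#REVIEW#2023-03-18T16:30:00Z",
    "#REVIEW#2023-07-10T14:48:00Z",
])

# '#' sorts before 'B', so every review is stored ahead of the Book item.
# Reading backward returns the Book first, then the newest reviews.
latest = read(collection, limit=3, scan_index_forward=False)
# ['BOOK#DUNE', '#REVIEW#2023-07-10T14:48:00Z', '#REVIEW#2023-03-18T16:30:00Z']
```

Against a real table, the limit for ‘a book with its latest 10 reviews’ would be 11: the Book item plus ten Review items.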

 

4. Two relational access patterns in a single item collection:

Building on this: if a one-to-many relationship can place the related items either before or after the parent item, can we do both in a single item collection?

Yeah, sure! To achieve this you need two access patterns: one that fetches related items in ascending order and another that fetches them in descending order.

 

To give an illustration, consider a SaaS application where an organization has two sub-concepts: Employees and Teams, each with a one-to-many relationship with the organization.

 

You could think of each of these relationships as having a relational access pattern ‘Fetch Organization and all {Employees|Teams} for the Organization’. Additionally, you would want Employees returned sorted by their names but not care how Teams are sorted because there shouldn’t be too many of them.

 

You could model your table as follows:

 

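The table above contains three types of items: Team, Org, and Employee. Every item shares the same PK value of ORG#. Team items have an SK beginning with #TEAM#, the Org item has an SK beginning with ORG#, and Employee items have an SK beginning with EMPLOYEE#. The leading ‘#’ ensures that Team items sort before the Org item. For the pattern to work, the Org item must sit between the Team and Employee items, so Employee keys have to sort after the Org key. Be careful here: in byte order ‘E’ comes before ‘O’, so a literal EMPLOYEE# prefix would sort before ORG#. In practice, choose the Employee prefix (or the Org item’s sort key) so that the Org item really lands in the middle.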

 

Now we can query for all team items as well as org:

 

 

This query searches our partition, identifies the items less than or equal to the sort key value of our Org item, and then reads backward to retrieve the Org item and all Team items.

 

Below is an image showing how it works on the table:

Alternatively, you could do the reverse to fetch both the Org item and the Employee items: look for items greater than or equal to the Org item’s sort key and read forward to retrieve all Employee items from your table.
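Both directions can be sketched in Python as Query parameters. This assumes ‘PK’ and ‘SK’ as attribute names, an Org item whose PK and SK hold the same value, Team keys that sort before the Org key, and Employee keys that sort after it; the table name and organization key are made up. The function only builds the parameter dictionary.

```python
def org_with_children_query(table_name, org_key, children_before_parent):
    """Build Query parameters that fetch the Org item plus one child type.

    children_before_parent=True  -> SK <= org key, read backward (Teams).
    children_before_parent=False -> SK >= org key, read forward (Employees).
    """
    operator = "<=" if children_before_parent else ">="
    return {
        "TableName": table_name,
        "KeyConditionExpression": f"#pk = :pk AND #sk {operator} :sk",
        "ExpressionAttributeNames": {"#pk": "PK", "#sk": "SK"},
        "ExpressionAttributeValues": {
            ":pk": {"S": org_key},
            ":sk": {"S": org_key},
        },
        "ScanIndexForward": not children_before_parent,
    }


org_and_teams = org_with_children_query("SaasApp", "ORG#ACME", True)
org_and_employees = org_with_children_query("SaasApp", "ORG#ACME", False)
```

In both cases the Org item is the first item returned, followed by the related items in the chosen direction.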

 

This is a fairly advanced pattern, and you don’t necessarily have to use it, but it can save you from adding extra secondary indexes to your table.

 

5. Zero-padding with numbers:

Be aware of how lexicographic sorting treats numbers that are stored as strings. To demonstrate, let’s revisit our ‘Book Review’ example above, this time using ‘RatingNumber’ instead of timestamps for the sort key.

 

You can have a table that looks like this:

 

Here, the rating order is incorrect: 10 is placed before 2 because of lexicographic sorting, which compares one character at a time from left to right. When it compares ‘10’ and ‘2’, the first digit of 10 (‘1’) comes before the first digit of 2 (‘2’), so 10 sorts before 2.

 

To avoid this, you can zero-pad your numbers: make sure the number in your sort key has a fixed length, such as three digits in this case. When a value has fewer digits than that, prepend ‘0’ characters until it reaches the fixed length. This turns “10” into “010” and “2” into “002”.
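A short Python sketch of the difference. The helper name is made up, and the width of three digits follows the example above.

```python
def padded_rating(rating, width=3):
    """Zero-pad a non-negative integer so string order matches numeric order."""
    if rating < 0 or rating >= 10 ** width:
        raise ValueError(f"rating must fit in {width} digits")
    return str(rating).zfill(width)


ratings = [10, 2, 7]

# Plain strings: '10' sorts before '2' because '1' < '2'.
unpadded = sorted(str(rating) for rating in ratings)
# ['10', '2', '7']

# Zero-padded strings sort in numeric order.
padded = sorted(padded_rating(rating) for rating in ratings)
# ['002', '007', '010']
```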

 

Now, this is how our table looks:

 

This means that RatingNumber 2 comes before RatingNumber 10, as expected. The main thing to watch is that the padding width leaves room for growth. In this case, a fixed width of three digits supports values up to RatingNumber 999. Choose a longer fixed length if your requirements exceed that.

 

6. Faking ascending order:

Let’s revisit our zero-padding example above. Suppose we now want to retrieve a book’s reviews in ascending order of rating, even though the query has to read the item collection backward with ‘ScanIndexForward=False’. To do this, I am going to fake ascending order by transforming the sort key.

 

We’ll store a FakeRating in the SK instead of the actual rating. To fake ascending order, we invert each rating by subtracting it from the highest value the zero-padded width allows (999 for a width of three), then zero-pad the result so that it sorts lexicographically.

That means FakeRating = (999 - ActualRating), converted to a zero-padded string.

For example, imagine we are again using a zero-padded number with a width of 3. An item with a rating of 10 would normally be stored as “010”. To find the zero-padded difference, we subtract our number (10) from the highest possible value (999), which gives “989”. Likewise, a rating of 2 becomes “997”. Read backward, “997” comes before “989”, so rating 2 is returned before rating 10.
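Here is a minimal Python sketch of the 999-based inversion described in the example. The helper name is made up, and reversing a sorted list stands in for a query with ‘ScanIndexForward=False’.

```python
def fake_rating(rating, width=3):
    """Invert a rating against the largest value of the given width."""
    highest = 10 ** width - 1  # 999 for a width of three
    if not 0 <= rating <= highest:
        raise ValueError(f"rating must be between 0 and {highest}")
    return str(highest - rating).zfill(width)


ratings = [10, 2, 7]

# How the inverted keys would be stored (ascending byte order).
stored = sorted(fake_rating(rating) for rating in ratings)
# ['989', '992', '997']

# Reading backward returns the largest inverted key first ...
read_backward = list(reversed(stored))
# ['997', '992', '989']

# ... which corresponds to the actual ratings in ascending order.
recovered = [999 - int(key) for key in read_backward]
# [2, 7, 10]
```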

 

If we put this in our table and query with ‘ScanIndexForward=False’, we get:

 

This strategy may appear unconventional and may not be necessary in practice. The most important lesson to learn from this approach is how versatile DynamoDB can be if you combine multiple strategies. Once you grasp the basics, you may put them together in unique ways to solve your problem.

 

Conclusion:

When managing data in DynamoDB, sorting is one of the core functions that decides how fast information can be retrieved. Understanding the sorting mechanisms discussed above will help you optimize your DynamoDB tables for many difficult use cases, so that you can maintain correct order and clarity in your data.

 

From basic lexicographic ordering to advanced mechanisms like zero-padding and faking ascending order, mastering these techniques will help you take full advantage of DynamoDB’s potential. As your data grows and your application scales, these sorting techniques will ensure your queries remain fast and efficient, keeping your data access organized and performant.