Designing Effective Pagination Schemes for RESTful APIs

The art and science that go into getting this key design decision right.

Pagination is more than a technical detail: it shapes both user experience and system performance. As data and traffic scale, a robust pagination scheme lets your API handle growing demand while still giving users a seamless experience. In this post, we'll explore the fundamentals of pagination design, compare common pagination strategies, and discuss best practices for handling edge cases like record creation and deletion.

Fundamentals of Pagination

At its core, pagination is about presenting users with a subset of data in an ordered and digestible format, typically in the form of "pages." For pagination to be effective, the dataset must be sorted into a fixed order. This is crucial for providing a stable user experience, where the contents of each page remain consistent unless the underlying data changes in a significant way.

A good pagination key should have high cardinality (meaning few duplicates) and minimal nulls to ensure that the order is stable. When the sort key isn't unique, adding a tiebreaker, such as a unique record ID, can help maintain a consistent order.
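As a minimal sketch of the tiebreaker idea, the snippet below uses Python's built-in sqlite3 module. The `book` table and its `id` and `published` columns are assumptions chosen for illustration; two of the rows share a publication date, and the `id` column settles their relative order.

```python
import sqlite3

def ordered_ids(conn):
    """Return book ids sorted by publication date, with id as the tiebreaker."""
    rows = conn.execute(
        "SELECT id FROM book ORDER BY published ASC, id ASC"
    ).fetchall()
    return [row[0] for row in rows]

# Hypothetical in-memory data: ids 1 and 3 share the same publication date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, published TEXT)")
conn.executemany(
    "INSERT INTO book (id, published) VALUES (?, ?)",
    [(3, "2024-01-01"), (1, "2024-01-01"), (2, "2023-06-15")],
)
print(ordered_ids(conn))  # [2, 1, 3]
```

Without the `id ASC` term, the database would be free to return ids 1 and 3 in either order, and that order could differ from one request to the next.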

What Makes a Good Pagination Scheme?

Designing an effective pagination scheme requires balancing several factors.

  • User Experience (UX): Pagination should support desired user interactions, such as infinite scrolling or page navigation (e.g., first, previous, next, last). It should feel responsive and provide stable results that change only when relevant user actions occur.
  • Developer Experience (DX): The pagination parameters should be simple and intuitive, with clear documentation. Ideally, each response should include control flow instructions, such as ready-made links or tokens for the next and previous pages, so that clients don't have to work out for themselves how to navigate.
  • Design Considerations: A good pagination scheme should be future-proof, accommodating changes in filtering, sorting, and other modifications without breaking backward compatibility. Consistency across endpoints is also crucial for a predictable API design.
  • Implementation Efficiency: Pagination needs to perform well, scaling gracefully as both data size (data scalability) and request traffic (traffic scalability) increase.
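One possible way to provide control flow instructions is to wrap each page in an envelope that carries links the client can follow as-is. The sketch below is illustrative only: the `items`, `next`, and `prev` field names and the `build_page` helper are hypothetical, and it borrows the `limit` and `offset` parameters described in the next section.

```python
from urllib.parse import urlencode

def build_page(base_url, items, limit, offset, total):
    """Wrap one page of results with links the client can follow as-is."""
    def link(new_offset):
        query = urlencode({"limit": limit, "offset": new_offset})
        return f"{base_url}?{query}"

    next_offset = offset + limit
    return {
        "items": items,
        # None signals that there is nothing further in that direction.
        "next": link(next_offset) if next_offset < total else None,
        "prev": link(max(offset - limit, 0)) if offset > 0 else None,
    }

page = build_page("/books", ["a", "b"], limit=2, offset=2, total=5)
print(page["next"])  # /books?limit=2&offset=4
print(page["prev"])  # /books?limit=2&offset=0
```

Because the client only follows the links it is given, the server stays free to change how those links are built later on.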

Offset-Based Pagination

Offset-based pagination is the simplest and most widely supported method. It works by specifying a `limit` (the number of items per page) and an `offset` (the index of the first item on the page).

Example

A resource like this:

/books?limit=100&offset=20

Might map to a SQL query like this:

SELECT * FROM book ORDER BY published ASC LIMIT 100 OFFSET 20
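A handler for this endpoint could be sketched roughly as follows, using Python's built-in sqlite3 module. The table layout and the `fetch_books` helper are assumptions made for illustration, and the query adds `id` as a tiebreaker, which the SQL above leaves out.

```python
import sqlite3

def fetch_books(conn, limit, offset):
    """Fetch one page of book ids using LIMIT/OFFSET with bound parameters."""
    rows = conn.execute(
        "SELECT id FROM book ORDER BY published ASC, id ASC LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    return [row[0] for row in rows]

# Hypothetical data: ten books published on consecutive days.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, published TEXT)")
conn.executemany(
    "INSERT INTO book (id, published) VALUES (?, ?)",
    [(i, f"2024-01-{i:02d}") for i in range(1, 11)],
)
print(fetch_books(conn, limit=3, offset=6))  # [7, 8, 9]
```

A production handler would normally also validate the two parameters and cap `limit` at some maximum.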

Advantages

  • Transparency: It's immediately clear what's happening—you're skipping a number of records and then retrieving the next set.
  • Widespread Support: Most databases and data stores natively support this approach.
  • Absolute Navigation: Users can jump directly to any page by calculating the offset.
  • Simplicity: It's easy to implement, often requiring just a pass-through to the underlying data store.
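The arithmetic behind absolute navigation is short enough to show in full. In this sketch the helper names and the 1-based page numbering are assumptions, not part of any particular API.

```python
import math

def page_to_offset(page, limit):
    """Convert a 1-based page number into a row offset."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive")
    return (page - 1) * limit

def page_count(total, limit):
    """Number of pages needed to show `total` rows at `limit` rows per page."""
    return math.ceil(total / limit)

print(page_to_offset(3, 100))  # 200
print(page_count(250, 100))    # 3
```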

Disadvantages

  • Inefficiency: Retrieving the Nth page requires the data store to scan and discard every row on the preceding pages, so the cost of a request grows with the offset. This can be costly with large datasets.
  • Scalability Issues: It's difficult to scale horizontally since all the data needs to be available locally for sorting.
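  • Inflexibility: Because clients compute offsets themselves, the paging mechanism becomes part of the public contract, which makes backward-compatible changes challenging.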

Use Cases

Offset-based pagination is well-suited for applications with relatively small datasets and predictable traffic patterns. It also supports UI interactions like infinite scrolling and traditional page navigation.

Cursor-Based Pagination

Cursor-based pagination uses an opaque cursor rather than an explicit offset to navigate through the data. The cursor typically contains the key value(s) from which to start the next page, and it can be as simple or complex as needed.

Example

A resource like this:

/books?limit=100&cursor=eyJwdWJsaXNoZWRhdCI6IjIwMjQtMDgtMjZUMTQ6MDE6MjNaIn0=

Might map to a SQL query like this:

SELECT * FROM book WHERE publishedat > '2024-08-26T14:01:23Z' ORDER BY publishedat ASC LIMIT 100

In this example, the cursor is simply the base64-encoded form of the following JSON, serialized without whitespace:

{ "publishedat": "2024-08-26T14:01:23Z" }
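The encoding step can be sketched with Python's standard library. The `encode_cursor` and `decode_cursor` names are hypothetical; the compact JSON separators are what reproduce the exact cursor string used in the example URL.

```python
import base64
import json

def encode_cursor(values):
    """Serialize key values to compact JSON, then base64-encode them."""
    raw = json.dumps(values, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")

def decode_cursor(cursor):
    """Reverse encode_cursor; raises ValueError on malformed input."""
    try:
        return json.loads(base64.b64decode(cursor, validate=True))
    except ValueError as exc:  # covers bad base64, bad UTF-8, and bad JSON
        raise ValueError("invalid cursor") from exc

cursor = encode_cursor({"publishedat": "2024-08-26T14:01:23Z"})
print(cursor)  # eyJwdWJsaXNoZWRhdCI6IjIwMjQtMDgtMjZUMTQ6MDE6MjNaIn0=
print(decode_cursor(cursor))  # {'publishedat': '2024-08-26T14:01:23Z'}
```

Since the cursor comes back from the client, it should be treated as untrusted input. Standard base64 can also produce `+` and `/` characters, so a real API might prefer `base64.urlsafe_b64encode` or percent-encode the value in the query string.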

Advantages

  • Efficiency: Cursor-based pagination can be extremely efficient, especially with large datasets or distributed databases. For example, the above SQL could take advantage of a clustered index on the publishedat column.
  • Flexibility: The cursor's opaque nature allows it to evolve without changing the API interface, making it easier to maintain backward compatibility.
  • Horizontal Scalability: This approach is well-suited for sharded data stores, as the cursor can encode shard information.
  • Reversibility: While more complex, cursor-based pagination can also support going backward by including additional cursor data.
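The example query above compares on `publishedat` alone, which can skip rows when several books share a timestamp at a page boundary. Following the earlier advice about tiebreakers, the sketch below pages on the pair (`publishedat`, `id`). It uses Python's built-in sqlite3 module; the table layout and the `fetch_after` helper are assumptions for illustration.

```python
import sqlite3

def fetch_after(conn, limit, after=None):
    """Return (rows, next_key); `after` is the (publishedat, id) of the last row seen."""
    if after is None:
        rows = conn.execute(
            "SELECT publishedat, id FROM book "
            "ORDER BY publishedat ASC, id ASC LIMIT ?",
            (limit,),
        ).fetchall()
    else:
        ts, last_id = after
        rows = conn.execute(
            "SELECT publishedat, id FROM book "
            "WHERE publishedat > ? OR (publishedat = ? AND id > ?) "
            "ORDER BY publishedat ASC, id ASC LIMIT ?",
            (ts, ts, last_id, limit),
        ).fetchall()
    # A full page may still be the last one; fetching limit + 1 rows
    # would let the server know for sure.
    next_key = rows[-1] if len(rows) == limit else None
    return rows, next_key

# Hypothetical data: three books share one timestamp.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, publishedat TEXT)")
conn.executemany(
    "INSERT INTO book (id, publishedat) VALUES (?, ?)",
    [(1, "2024-08-26T14:01:23Z"), (2, "2024-08-26T14:01:23Z"),
     (3, "2024-08-26T14:01:23Z"), (4, "2024-08-27T09:00:00Z")],
)

first, key = fetch_after(conn, limit=2)
second, _ = fetch_after(conn, limit=2, after=key)
print([r[1] for r in first])   # [1, 2]
print([r[1] for r in second])  # [3, 4]
```

In a real API, `next_key` would be encoded into the opaque cursor rather than handed to the client as raw values.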

Disadvantages

  • Complexity: Implementing cursor-based pagination can be more involved, requiring careful design and thorough testing to ensure that cursors are generated and interpreted correctly.
  • Relative Navigation: Users can't jump to a specific page, as navigation is relative rather than absolute.

Use Cases

Cursor-based pagination shines in applications with large datasets or distributed systems, where efficiency and scalability are paramount. It supports UI patterns like infinite scrolling and next/previous navigation but is less suited to scenarios requiring direct page access.

Handling Edge Cases

Record Deletion

According to the HTTP spec, the DELETE method doesn’t guarantee that a resource is physically removed, only that its association with the URL is severed. This allows you to mark records as “deleted” without actually removing them from the database, preserving their position in the pagination order; the marked records are simply left out of each page when it is returned. While this approach prevents items from shifting unnecessarily between pages, it can result in pages with fewer items than expected. It’s also important to avoid returning an empty page followed by more data, as this can confuse users.
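A rough sketch of this behavior, using an in-memory list in place of a database: offsets count every record, deleted or not, and a slice made up entirely of deleted records is skipped instead of being returned as an empty page. The `deleted` flag and the `visible_page` helper are hypothetical names.

```python
def visible_page(records, offset, limit):
    """Return (items, next_offset) over a list that includes soft-deleted records.

    Offsets count every record, deleted or not, so positions never shift.
    Slices that contain only deleted records are skipped, so an empty page
    is only ever returned at the very end of the data.
    """
    while offset < len(records):
        window = records[offset:offset + limit]
        offset += limit
        items = [r for r in window if not r["deleted"]]
        if items:
            next_offset = offset if offset < len(records) else None
            return items, next_offset
    return [], None

records = [
    {"id": 1, "deleted": False},
    {"id": 2, "deleted": True},
    {"id": 3, "deleted": True},
    {"id": 4, "deleted": True},
    {"id": 5, "deleted": False},
]
items, nxt = visible_page(records, offset=0, limit=2)
print([r["id"] for r in items], nxt)  # [1] 2
items, nxt = visible_page(records, offset=nxt, limit=2)
print([r["id"] for r in items], nxt)  # [5] None
```

The first page comes back short (one item instead of two), and the all-deleted slice covering ids 3 and 4 is never shown to the client.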

Record Creation

When records are ordered by creation timestamp, especially newest first, new records can disrupt pagination by shifting the position of existing items. To mitigate this, consider using a "point in time" parameter in the pagination request. This parameter ensures that users see the dataset as it existed at a specific moment, ignoring records created after that point.
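The point-in-time idea can be sketched as an extra filter on the creation timestamp. In this example, which uses Python's built-in sqlite3 module, the `as_of` parameter name, the `created_at` column, and the `fetch_as_of` helper are all assumptions; timestamps are stored as ISO 8601 strings in a single format so that they compare correctly as text.

```python
import sqlite3

def fetch_as_of(conn, as_of, limit, offset):
    """Page through records as they existed at `as_of`, ignoring later inserts."""
    rows = conn.execute(
        "SELECT id FROM book WHERE created_at <= ? "
        "ORDER BY created_at DESC, id DESC LIMIT ? OFFSET ?",
        (as_of, limit, offset),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO book (id, created_at) VALUES (?, ?)",
    [(1, "2024-08-26T10:00:00Z"), (2, "2024-08-26T11:00:00Z"),
     (3, "2024-08-26T12:00:00Z")],
)

as_of = "2024-08-26T12:00:00Z"  # captured when the client requests page one
page_one = fetch_as_of(conn, as_of, limit=2, offset=0)

# A record created after the snapshot does not shift later pages.
conn.execute(
    "INSERT INTO book (id, created_at) VALUES (4, '2024-08-26T13:00:00Z')"
)
page_two = fetch_as_of(conn, as_of, limit=2, offset=2)
print(page_one, page_two)  # [3, 2] [1]
```

Without the `created_at <= ?` filter, the second request would return id 2 again, because the new record pushes every older one down by a position.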

Random Sorts

Randomized or haphazard sorts can be valuable for data sampling. A common approach is to sort by a cryptographic hash of a unique ID, which can be computed on the fly for simplicity or stored as a computed field for efficiency. This method ensures stable sampling, reducing the chance of users “random mining” for additional data while effectively minimizing bias.
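A minimal sketch of the hash-based sort, computed on the fly with Python's hashlib. SHA-256 is one reasonable choice of cryptographic hash here, not the only one, and `sample_order` is a hypothetical helper name.

```python
import hashlib

def sample_order(ids):
    """Sort ids by the SHA-256 digest of each id.

    The order looks arbitrary but is fully determined by the ids themselves,
    so repeated requests see the same sequence.
    """
    def digest(record_id):
        return hashlib.sha256(str(record_id).encode("utf-8")).hexdigest()
    return sorted(ids, key=digest)

ids = list(range(1, 11))
first = sample_order(ids)
print(first == sample_order(list(reversed(ids))))  # True: input order is irrelevant
print(sorted(first) == ids)                        # True: nothing lost or duplicated
```

Once the order is fixed this way, either of the pagination schemes above can be applied to it like any other sort key.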

Choosing the Right Pagination Approach for Your API

Choosing the right pagination scheme depends on your application’s specific needs, including data size, traffic patterns, and user experience. While offset-based pagination is simple and effective for many cases, cursor-based pagination offers greater performance and flexibility, especially for larger or distributed datasets. By carefully weighing the pros and cons of each approach and addressing edge cases like record deletion and creation, you can design a robust pagination system that scales gracefully and provides a smooth user experience.
