Pagination

Why?

Our APIs may expose large quantities of content, possibly all records held in our databases. This often means tens of thousands of entities. Since REST is transported over HTTPS, there are practical limits on the size of requests and responses and on the time they may take to exchange. On our technical platforms, responses are typically limited to 10 MB of metadata.

Most of the time, you will specify request parameters that substantially limit the number of entities returned. However, there is no rule that can guarantee your request parameters will produce responses that fit within these technical limits.

All API providers do this. Facebook, for example, limits its page size to 50 records.

The pagination feature

Therefore, all "search" endpoints use the common notion of pagination. Within the complete result set of entities matching the request parameters, only one "page" is returned per response. The page is specified with the request by:

  * a number of entities (the "size", typically defaulting to 10; it should not exceed about 500)
  * a position (the "page", starting with 0 as the first page)

How to use it

"size" and "page" are two extra request parameters.

With size=100 and page=0 you will fetch the first 100 entities; with page=1 the next 100 entities, and so on.

To process the complete result set, you may have to iterate over several requests, incrementing the "page" parameter until the response contains no entities. If a response contains fewer entities than the requested "size", you can assume it is the last page and stop iterating.
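
A minimal sketch of this loop in Python, assuming a hypothetical endpoint https://api.example.com/things and a response body that is a plain JSON array of entities:

import requests

BASE_URL = "https://api.example.com/things"  # hypothetical endpoint, for illustration only
PAGE_SIZE = 100

def fetch_all():
    """Collect the complete result set, page by page."""
    results = []
    page = 0
    while True:
        resp = requests.get(BASE_URL, params={"size": PAGE_SIZE, "page": page})
        resp.raise_for_status()
        entities = resp.json()  # assumed: the body is a JSON array of entities
        results.extend(entities)
        if len(entities) < PAGE_SIZE:  # short or empty page: this was the last one
            break
        page += 1
    return results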

Examples of request parameters, for the second page of 100:

...?size=100&page=1

If you request with the parameters above, the response will contain a "Link" header describing further requests you can invoke to fetch the complete result set (line breaks have been added for readability):

<https://api(...your original request here ...)&page=0&size=100>; rel="first",
<https://api(...your original request here ...)&page=0&size=100>; rel="prev",
<https://api(...your original request here ...)&page=2&size=100>; rel="next",
<https://api(...your original request here ...)&page=672&size=100>; rel="last"

This follows the specification of RFC 5988.

The Link header indicates the complete request parameters you should specify to reach the next, the previous, the first or the last page respectively.

In the example above, notice that the last page is numbered 672. This means that the request parameters match an exceptionally large result set of up to 673 * 100 entities (pages 0 through 672, of 100 entities each). This was on purpose.

The format of the Link header is a little cumbersome, but it is a standard. For further reading, see RFC 5988.
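
Some HTTP clients parse this header for you. As an illustrative sketch with Python's requests library, which exposes the parsed Link header as response.links (a dict keyed by rel), following the chain of rel="next" links could look like this (the endpoint is hypothetical):

import requests

url = "https://api.example.com/things?size=100&page=0"  # hypothetical endpoint
while url:
    resp = requests.get(url)
    resp.raise_for_status()
    handle(resp.json())  # handle() stands in for your own processing
    # requests parses the RFC 5988 Link header into resp.links, keyed by rel
    url = resp.links.get("next", {}).get("url")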

Limitations

If you ask for more records than the maximum page size, the server will return an HTTP 400 code and the payload will indicate the cause, as in:

{
  "message": "Request parameter 'size' must be between 1 and 500, you have specified 501"
}
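
A client can surface that cause directly. A small sketch, reusing the hypothetical endpoint from above:

import requests

resp = requests.get("https://api.example.com/things", params={"size": 501, "page": 0})
if resp.status_code == 400:
    # the payload explains which constraint was violated
    print(resp.json()["message"])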

Algorithms

There are three equivalent ways to traverse the complete result set.

Using the last page number:

  1. Start requesting with page=0
  2. Read the Link header and extract the number of the last page
  3. Iterate by incrementing the "page" parameter and requesting again, until you reach the last page number

Using the presence of rel="last" (sketched in code after this list):

  1. Start requesting with page=0
  2. If the Link header does not contain rel="last", you are on the last page: stop iterating
  3. Iterate by incrementing the "page" parameter and requesting again

Using the page contents:

  1. Start requesting with page=0
  2. If the response contains fewer than "size" entities, stop iterating
  3. If the response contains ZERO entities, REALLY stop iterating please!
  4. Iterate by incrementing the "page" parameter and requesting again
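
A sketch of the second algorithm, again against the hypothetical endpoint used above:

import requests

def fetch_all_by_last_link(base_url, size=100):
    """Iterate pages until the Link header no longer advertises rel="last"."""
    results = []
    page = 0
    while True:
        resp = requests.get(base_url, params={"size": size, "page": page})
        resp.raise_for_status()
        results.extend(resp.json())  # assumed: the body is a JSON array of entities
        if "last" not in resp.links:  # no rel="last" link: this is the last page
            break
        page += 1
    return results

For example: fetch_all_by_last_link("https://api.example.com/things")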

The geek's corner

Traversing the complete result set through pages takes several requests, hence there is NO GUARANTEE that the data is not modified while you iterate! This matters especially if you look for recently modified entities: an entity due to be returned on the 10th page may be modified again while you iterate, and would then belong on the first page. Murphy's law helping, you may miss it. Therefore you might want to switch the sorting order with the request parameter "sortBy=urn" (see the OpenAPI spec), to guarantee a stable result set.
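
For example, to combine the stable ordering with the pagination parameters:

...?sortBy=urn&size=100&page=0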

Thanks to Standards Australia for pointing this out.
