Building Fast and Robust REST APIs using Conditional HTTP Requests
Conditional HTTP requests are one of HTTP's lesser-known yet widely used features. An intelligent client can find out whether its copy of a resource is still current without the body being transmitted over the wire. By checking whether the data the client already holds is still valid, conditional requests reduce the work done by your server and save bandwidth.
In conditional HTTP requests, the result of an HTTP request can be changed by comparing the affected resources with the value of a validator.
REST uses HTTP extensively: HTTP verbs combined with URLs carry implied meanings, and status codes signal errors. Conditional requests can be used to validate the contents of a cache or to determine whether the document currently being edited is still up to date.
Making a conditional HTTP request
Properly configured web servers and CDNs instruct browsers to cache most of the static content to avoid re-downloading big chunks of assets, improving page loading speed.
The easiest way to cache static content is to send the HTTP Expires header. It directs browsers, or an intermediate proxy (CDN edge), to treat the downloaded content as fresh until the given time and to reuse it when a subsequent identical request arrives.
This approach has some drawbacks, such as the difficulty of changing content once browsers have cached it. These can be mitigated with well-known cache-busting techniques, and the benefits generally outweigh the problems.
Last-Modified, ETags and 304 Not Modified
One way to compromise between unconditional caching and no caching at all is to validate cached content. When you supply a timestamp in the Last-Modified header, browsers send an If-Modified-Since header carrying that timestamp on later requests for the same resource. If the content has not changed since then, the server responds with 304 Not Modified and an empty body, and the browser uses the content from its cache. If the content has changed, the server responds with 200 OK, the new body, and a new Last-Modified value.
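The exchange above can be sketched on the server side with nothing but the Python standard library. This is a minimal illustration, not a full implementation: the function name conditional_get and its (status, headers, body) return shape are assumptions made for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(last_modified, body, if_modified_since=None):
    """Return (status, headers, body) for a GET with an optional If-Modified-Since value.

    `last_modified` is a timezone-aware datetime taken from your data store.
    """
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # An unparseable date is treated as if the header were absent.
        if since is not None and since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # Not modified means: the stored timestamp is not newer than the client's copy.
        if since is not None and last_modified <= since:
            return 304, headers, b""

    return 200, headers, body
```

Calling it once without the header yields a 200 and a Last-Modified value; passing that value back as if_modified_since yields a 304 with an empty body.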
Sometimes you don’t want to (or cannot) use timestamps to track the state of a resource. Instead, you can send a state identifier, such as a checksum or any other opaque value, in a header called ETag; the browser sends it back in the If-None-Match header. If the value sent by the browser matches the current ETag on the server, the server responds with 304 Not Modified as before.
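A checksum-based ETag could look roughly like the sketch below. The helper names (make_etag, if_none_match_matches, get_with_etag) and the choice of a truncated SHA-256 digest are assumptions for illustration; splitting the header on commas is a simplification that ignores commas inside quoted tags.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build an ETag from a checksum of the response body (quoted, as HTTP requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _opaque(tag: str) -> str:
    """Strip whitespace and the weak-validator prefix so tags compare weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def if_none_match_matches(header_value: str, current_etag: str) -> bool:
    """Return True if any tag listed in If-None-Match matches the current ETag."""
    if header_value.strip() == "*":
        return True  # "*" matches any existing representation.
    candidates = {_opaque(t) for t in header_value.split(",")}
    return _opaque(current_etag) in candidates


def get_with_etag(body: bytes, if_none_match=None):
    """Return (status, headers, body), answering 304 when the client's tag is current."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None and if_none_match_matches(if_none_match, etag):
        return 304, headers, b""
    return 200, headers, body
```

Hashing the body means the server still has to produce the response before it can compare; a stored version number or checksum avoids that cost.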
Conditional requests in APIs
Most API calls, however, are not cached by default. This is true for calls originating from the browser and server-to-server calls. Every time you repeat a GET request, the server fetches the resource from its data store and transfers the body over the network.
Using the date headers: Expires, Last-Modified and If-Modified-Since
A good example is an API, GET /recommendations, that returns new article recommendations. Your client polls the endpoint to check for new items. On every call, your API server fetches the content from the database, serializes it to JSON, and returns the result. The client then compares the response with the previous one to detect changed items.
The recommendations are generated daily by a batch job at midnight, which means they remain valid until the next midnight. You can attach an Expires header to indicate that the content is fresh until then.
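Computing that header value might look like the following sketch. It assumes the batch job runs at midnight UTC (the time zone is not stated above), and the function name expires_at_next_midnight is made up for this example.

```python
from datetime import datetime, time, timedelta, timezone
from email.utils import format_datetime


def expires_at_next_midnight(now: datetime) -> str:
    """Return an Expires header value for the first midnight (UTC) after `now`.

    `now` must be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    next_midnight = datetime.combine(
        now.date() + timedelta(days=1), time.min, tzinfo=timezone.utc
    )
    # usegmt=True produces the "GMT" suffix that HTTP dates use.
    return format_datetime(next_midnight, usegmt=True)
```

For a request handled at 18:30 UTC on 15 March 2024, this returns "Sat, 16 Mar 2024 00:00:00 GMT".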
In some cases, you may not be able to determine the expiry in advance, such as with the API GET /articles/<id>, which returns the content of an article. In this case, you can return a Last-Modified header with the last-updated timestamp from your database. On subsequent requests for the same article, the client sends an If-Modified-Since header with the timestamp it received earlier. The server responds with 200 OK and the new content if the article has been modified, or with 304 Not Modified if it has not.
Optimistic Locking with ETags/If-Match or If-Unmodified-Since
Let’s continue with our example. We have an update API, PATCH /articles/<id>, which can be called simultaneously when multiple editors are working on the same article. Without any optimistic locking, we run into what is known as the lost-update problem.
A and B started editing the same article at 9:00 am. Both see identical contents.
A makes some changes and decides to save them at 9:15 am.
B takes a little longer and saves at 9:30 am.
Without optimistic locking, the content of B will overwrite the changes of A, and the updates made by A will be lost.
To prevent this, you can use a version number as the ETag or serve the Last-Modified timestamp. When making the PATCH request to update the article, the client also sends an If-Match or If-Unmodified-Since header with the value it received earlier. The server then checks whether the article has been modified since the state identified by that header. If there has been an update since 9:00 am, as in B's case, the server returns 412 Precondition Failed; otherwise it applies the change. The client can then fetch the latest version and attempt to merge its changes into it.
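The A and B scenario can be replayed with a small in-memory sketch that uses a version number as the ETag. Article, ArticleStore and the (status, headers, body) return shape are hypothetical names for this illustration; a real server would perform the check and the write atomically, for example in a single conditional database update.

```python
from dataclasses import dataclass


@dataclass
class Article:
    content: str
    version: int = 1

    @property
    def etag(self) -> str:
        # The version number doubles as the ETag (quoted, as HTTP requires).
        return f'"{self.version}"'


class ArticleStore:
    """In-memory stand-in for the database behind /articles/<id>."""

    def __init__(self):
        self._articles = {}

    def create(self, article_id, content):
        self._articles[article_id] = Article(content)

    def get(self, article_id):
        article = self._articles[article_id]
        return 200, {"ETag": article.etag}, article.content

    def patch(self, article_id, new_content, if_match):
        article = self._articles[article_id]
        accepted = [tag.strip() for tag in if_match.split(",")]
        # Reject the update if the client's version is no longer the current one.
        if if_match.strip() != "*" and article.etag not in accepted:
            return 412, {"ETag": article.etag}, "Precondition Failed"
        article.content = new_content
        article.version += 1
        return 200, {"ETag": article.etag}, article.content


store = ArticleStore()
store.create("42", "draft")
_, seen_by_a, _ = store.get("42")  # 9:00 am: A and B read the same version
_, seen_by_b, _ = store.get("42")
status_a, _, _ = store.patch("42", "A's edit", seen_by_a["ETag"])  # 9:15 am
status_b, _, _ = store.patch("42", "B's edit", seen_by_b["ETag"])  # 9:30 am
```

Here status_a is 200 and status_b is 412, so A's changes survive and B is told to fetch the latest version before retrying.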
Implementing conditional requests
The client
If you are calling APIs from a browser, you do not have to set up anything special. As long as the server sets the appropriate headers, the browser does the heavy lifting of caching and revalidation. If you need finer control, you can adjust the cache behavior through the Fetch API.
For server-side clients, it is a bit more complex. You need to configure a store, such as Redis, to hold the response and its associated metadata. Before sending an HTTP request, you fetch the stored values for the given URL and add them to the request headers. Libraries such as OkHttp contain pluggable providers to easily integrate such a mechanism.
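A rough sketch of such a server-side client follows. A plain dict stands in for the external store, and send is a hypothetical transport callable that takes (url, headers) and returns (status, headers, body), so the example stays independent of any particular HTTP library. Header names are assumed to arrive in canonical case.

```python
class ConditionalClient:
    """Caches GET responses and revalidates them with conditional headers."""

    def __init__(self, send, store=None):
        # `send(url, headers)` is an assumed transport returning (status, headers, body).
        self._send = send
        # Any mapping works here; a dict stands in for a store such as Redis.
        self._store = {} if store is None else store

    def get(self, url):
        cached = self._store.get(url)
        request_headers = {}
        if cached:
            # Replay the validators the server gave us on the previous response.
            if "ETag" in cached["headers"]:
                request_headers["If-None-Match"] = cached["headers"]["ETag"]
            if "Last-Modified" in cached["headers"]:
                request_headers["If-Modified-Since"] = cached["headers"]["Last-Modified"]

        status, headers, body = self._send(url, request_headers)

        if status == 304 and cached:
            return cached["body"]  # Server confirmed that our copy is still valid.
        if status == 200 and ("ETag" in headers or "Last-Modified" in headers):
            self._store[url] = {"headers": dict(headers), "body": body}
        return body
```

The first call to get stores the body together with its validators; the second call sends If-None-Match or If-Modified-Since and, on a 304, returns the stored body without downloading it again.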
The server
The server must understand these headers and perform the appropriate action. For example, in the GET /articles/<id> API described above, your server can return 304 Not Modified if the last-modified timestamp in the database is not newer than the timestamp sent in the If-Modified-Since header.
To prevent “lost updates”, the server can use a version number as the ETag and compare it with the If-Match header, or use the If-Unmodified-Since header and assert that its value is not older than the last-modified timestamp in your database. If the condition is violated, send a 412 Precondition Failed status. The client can then initiate appropriate conflict-resolution action such as displaying a diff and letting the user merge the differing versions of the article.
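For the timestamp variant, the precondition check might be sketched as below. The function names are made up for this example, and the sketch assumes a well-formed header value and a timezone-aware timestamp from the database.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def unmodified_since_holds(last_modified: datetime, if_unmodified_since: str) -> bool:
    """Return True if the resource has not changed after the date in the header."""
    since = parsedate_to_datetime(if_unmodified_since)
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates carry whole seconds only, so compare at that resolution.
    stored = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    return stored <= since


def update_status(last_modified: datetime, if_unmodified_since: str) -> int:
    """Return 200 if the update may proceed, 412 if the precondition fails."""
    return 200 if unmodified_since_holds(last_modified, if_unmodified_since) else 412
```

Because the comparison works at one-second resolution, two updates within the same second cannot be told apart, which is one reason a version-number ETag is often the more robust choice.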
Conclusion
Conditional HTTP requests are an often-ignored aspect of REST APIs. A well-designed API can improve performance by reducing unnecessary network traffic, and with proper locking semantics you can ensure that updates are applied safely. For a moderate-scale service, this can translate into meaningful cost savings.