Showing posts with label etag. Show all posts

Why you should not use ETag

In the Last-modified time stamp validation entry, I mentioned the following problems with using the Last-Modified date:


  • A file's time stamp might get updated without any change in the actual content of the file. In that case any conditional GET request will result in a 200 response that sends the full body of the resource.

  • HTTP server clocks are commonly out of sync. If your environment has multiple HTTP servers, they might not have the same time. So if you copy the same file to different servers at the same time, it might end up with a different last-modified time on each. If your request then goes to a different HTTP server, the last-modified time won't match and the server will return 200 for a file that has not changed.

  • If-Modified-Since values cannot be used for objects that may be updated more frequently than once per second, because the value of Last-Modified is specified in seconds.



So you might think that you could use ETag to solve these problems. The catch is that the HTTP server generates the Last-Modified date by default, and the browser always adds an If-Modified-Since header to subsequent requests. As per the HTTP 1.1 specification, a client must use an entity tag validator if the server sends back an entity tag. If the server sends only a Last-Modified value, the client can use the If-Modified-Since validator. If both an entity tag and a last-modified date are available, the client should use both revalidation schemes. If an HTTP 1.1 cache or server receives a request with both If-Modified-Since and entity tag conditional headers, it must not return a 304 Not Modified response unless doing so is consistent with all of the conditional header fields in the request.

That means that for the server to return 304 (which is the desirable behavior), both the Last-Modified date and the ETag sent by the client must match the copy on the server. If the Last-Modified date does not match for any reason, the server will still return a 200 status code with the full response, even when the ETag matches. And if you are using the default ETag format (INode MTime Size), there is a chance that different servers will generate different ETags for the same file, which results in a full refresh even if the client has the latest version.
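The rule above can be sketched roughly as follows. This is only an illustration of the "all conditional headers must agree" behavior described in this post, not how any particular server implements it. The function name `validation_status` is made up for this example, and the date check is simplified to an exact string match.

```python
def validation_status(request_headers, current_etag, current_last_modified):
    """Return 304 only if every conditional header present agrees with the server copy."""
    checks = []
    if "If-None-Match" in request_headers:
        checks.append(request_headers["If-None-Match"] == current_etag)
    if "If-Modified-Since" in request_headers:
        # Simplification (assumption): exact string match instead of a date comparison.
        checks.append(request_headers["If-Modified-Since"] == current_last_modified)
    # A request without conditional headers is an ordinary request: always 200.
    return 304 if checks and all(checks) else 200


etag = '"9c334-9933-74b9cec0"'
modified = "Thu, 22 Jul 2010 23:41:23 GMT"

both_match = {"If-None-Match": etag, "If-Modified-Since": modified}
date_differs = {"If-None-Match": etag,
                "If-Modified-Since": "Wed, 21 Jul 2010 10:00:00 GMT"}

print(validation_status(both_match, etag, modified))    # 304
print(validation_status(date_differs, etag, modified))  # 200
```

The second request carries a matching ETag but still gets a full 200 response, which is the situation this post warns about.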

The Configure ETag entry on the YSlow blog has details on why you should disable ETag.

How to enable ETag in Apache Http Server

If you are using Apache HTTP Server, you can use it to generate and return an ETag for static resources that are served from disk. The Apache HTTP Server also takes care of comparing the value of the If-None-Match header with the ETag of the resource and returning either 304 Not Modified or 200 OK.

Support for ETag is part of the Apache core and is enabled by default. For example, these are the response headers that I get when I access cachesample.gif, a static image file served from disk by the Apache server.



The value of the ETag header in this case is a combination of three things: INode, MTime, and Size.

You can configure the behavior of the ETag using the FileETag directive, which configures the file attributes that are used to create the ETag (entity tag) response header field when the document is based on a file. (The ETag value is used in cache management to save network bandwidth.) In Apache 1.3.22 and earlier, the ETag value was always formed from the file's inode, size, and last-modified time (mtime). The FileETag directive allows you to choose which of these, if any, should be used. The recognized keywords are:


  1. INode: The file's i-node number will be included in the calculation. The i-node number is the identifier that the OS file system assigns to the file's i-node, the structure that holds file metadata such as permissions and time stamps. You can configure Apache to use only INode by adding this line to httpd.conf

    FileETag INode

    [Screenshot: response headers with an INode-only ETag]


  2. MTime: The date and time the file was last modified will be included. You can configure Apache to use only the last-modified date of the file to generate the ETag by adding this line to httpd.conf

    FileETag MTime

    [Screenshot: response headers for cachecontrol.gif when MTime is used for ETag generation]


  3. Size: The number of bytes in the file will be included. You can configure Apache to use only the size of the file for generating the ETag by adding this line to httpd.conf

    FileETag Size

    [Screenshot: response headers when Apache is configured to use only the file size for calculating the ETag]


  4. All: All available fields will be used. This is equivalent to: FileETag INode MTime Size

  5. None: You can disable generation of the ETag by Apache HTTP Server by adding this line to httpd.conf

    FileETag None

    [Screenshot: response headers after ETag is disabled]
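To see why the choice of attributes matters, here is a rough Python sketch that builds an ETag-like string from selected file attributes. It is only an illustration under stated assumptions: `file_etag` is a made-up helper, the attribute values are invented, and the order and encoding of the parts are not claimed to match what Apache actually emits.

```python
from types import SimpleNamespace


def file_etag(stat_result, fields=("INode", "MTime", "Size")):
    """Build an ETag-like string (hex parts joined by dashes) from file attributes."""
    if not fields:
        return None  # plays the role of "FileETag None": no ETag at all
    values = {
        "INode": stat_result.st_ino,
        "MTime": int(stat_result.st_mtime),
        "Size": stat_result.st_size,
    }
    return '"' + "-".join(format(values[name], "x") for name in fields) + '"'


# Hypothetical stat data: the same file copied to two servers. Content, size and
# modification time are identical, but each file system assigned its own i-node.
server_a = SimpleNamespace(st_ino=639796, st_mtime=1279842083, st_size=39219)
server_b = SimpleNamespace(st_ino=811302, st_mtime=1279842083, st_size=39219)

print(file_etag(server_a) == file_etag(server_b))  # False: i-node differs
print(file_etag(server_a, ("MTime", "Size")) == file_etag(server_b, ("MTime", "Size")))  # True
```

For a real file you could pass `os.stat(path)` instead of the stand-in objects. The comparison shows why an ETag that includes the i-node can differ between servers even though the file is the same.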


Using Entity Tag (ETag) for validation

In the Last-modified time stamp validation entry, I talked about how you can use the Last-Modified date for making a conditional request, and about the problems with that approach.

The HTTP 1.1 specification provides another kind of validator known as an entity tag (ETag). An entity tag is nothing but a string that is used to identify a specific instance of an object.

When you request a resource, the server can calculate a string representing the version of the resource and return it to the client using the ETag header, like this


ETag: "9c334-9933-74b9cec0"


After that, whenever the browser wants to check whether it has the correct version of the resource, it will add the following header to the conditional request


If-None-Match: "9c334-9933-74b9cec0"


The server will check the version in If-None-Match against the version of the resource that it has, and will return either 304 (Not Modified) if the version is the same or 200 with the full response body if the resource has changed.
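The server-side check described above might look roughly like this in Python. It is a minimal sketch, assuming a made-up function `respond_to_if_none_match`; it splits the header naively on commas and does not handle weak (`W/`) entity tags.

```python
def respond_to_if_none_match(if_none_match, current_etag, body):
    """Return (status, body) for a GET that may carry an If-None-Match header."""
    if if_none_match is not None:
        # The header may list several entity tags, or "*" to match any version.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or current_etag in candidates:
            return 304, b""  # client copy is current: headers only, no body
    return 200, body  # no validator sent, or the resource has changed


current = '"9c334-9933-74b9cec0"'
print(respond_to_if_none_match(current, current, b"GIF89a..."))      # (304, b'')
print(respond_to_if_none_match('"old-tag"', current, b"GIF89a..."))  # (200, b'GIF89a...')
```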

Important note: As per the HTTP 1.1 specification, a client must use an entity tag validator if the server sends back an entity tag. If the server sends only a Last-Modified value, the client can use the If-Modified-Since validator. If both an entity tag and a last-modified date are available, the client should use both revalidation schemes. If an HTTP 1.1 cache or server receives a request with both If-Modified-Since and entity tag conditional headers, it must not return a 304 Not Modified response unless doing so is consistent with all of the conditional header fields in the request.

Last-modified time stamp validation

In the What is conditional/ validation request entry, I explained the concept of a conditional request, which means the server will execute the request only if a particular condition is met.

When a browser requests a resource, the server response sometimes includes a Last-Modified header that specifies the time when the resource was last changed on the origin server, like this


Last-Modified: Thu, 22 Jul 2010 23:41:23 GMT


This header tells the client that the requested resource was last modified on the 22nd of July 2010. If the requested resource is a static file served by the HTTP server, this value is equal to the file system modification time. The Last-Modified time stamp is given in Greenwich Mean Time (GMT) with one-second resolution.

When the browser/cache wants to validate whether the resource that it has has changed, it will take the value of the Last-Modified header from the response that it already has and make a conditional GET request by adding an If-Modified-Since header to the request, like this


If-Modified-Since: Thu, 22 Jul 2010 23:41:23 GMT
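On the server side, evaluating this header for a static file could be sketched as below. This is an illustration only: `not_modified_since` is a made-up helper, and the modification time is passed in as a Unix timestamp (the kind of value `os.stat(path).st_mtime` returns).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def not_modified_since(if_modified_since, file_mtime):
    """True if the file has not changed after the date given in If-Modified-Since."""
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False  # unparseable date: fall back to a full 200 response
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop the fractional part of mtime.
    modified = datetime.fromtimestamp(int(file_mtime), tz=timezone.utc)
    return modified <= since


header = "Thu, 22 Jul 2010 23:41:23 GMT"
print(not_modified_since(header, 1279842083))  # True  -> server can answer 304
print(not_modified_since(header, 1279842084))  # False -> server answers 200
```

Here 1279842083 is the Unix timestamp for 22 Jul 2010 23:41:23 GMT, so a file modified one second later no longer qualifies for a 304.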


These are the disadvantages of using the If-Modified-Since header for making a conditional request:

  • A file's time stamp might get updated without any change in the actual content of the file. In that case any conditional GET request will result in a 200 response that sends the full body of the resource.

  • HTTP server clocks are commonly out of sync. If your environment has multiple HTTP servers, they might not have the same time. So if you copy the same file to different servers at the same time, it might end up with a different last-modified time on each. If your request then goes to a different HTTP server, the last-modified time won't match and the server will return 200 for a file that has not changed.

  • If-Modified-Since values cannot be used for objects that may be updated more frequently than once per second, because the value of Last-Modified is specified in seconds.
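The one-second limitation in the last point is easy to demonstrate. In this small sketch the two timestamps are invented: they stand for two writes to the same file that happen within the same second.

```python
from email.utils import formatdate

# Hypothetical modification times as Unix timestamps, 0.7 seconds apart.
first_write = 1279842083.2
second_write = 1279842083.9

# Format both the way a Last-Modified header is written.
first_header = formatdate(first_write, usegmt=True)
second_header = formatdate(second_write, usegmt=True)

print(first_header)
print(second_header)
print(first_header == second_header)  # True: the second write is invisible
```

Both writes produce the same Last-Modified value, so a client that cached the result of the first write would get a 304 even though the file changed again.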

What is conditional/ validation request

The HTTP specification has the concept of a conditional/validation request, which a client makes to check whether the cached copy that it has is still valid. The client will make a conditional GET request in two cases:

  • Resource is not cacheable: The client made a request and got a response, but the response has neither an Expires nor a Cache-Control header, or you explicitly marked the resource as non-cacheable by setting Expires to 0 or to a date in the past, or by setting Cache-Control: no-cache.

  • Cached resource is expired: The resource is cacheable, but only for a limited time. Say it was cacheable until 1 PM on the 1st of August 2010; if you ask for the resource at 1:30 PM, the browser/cache will make a request.
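The two cases can be expressed as a small decision function. This is a simplified sketch that follows only the cases listed above: `needs_revalidation` is a made-up name, other directives such as max-age are ignored, and the 1 PM expiry is assumed to be in GMT.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def needs_revalidation(cached_headers, now):
    """Decide whether a cached response must be revalidated before it is reused."""
    if "no-cache" in cached_headers.get("Cache-Control", ""):
        return True
    expires = cached_headers.get("Expires")
    if expires is None:
        return True  # no freshness information at all
    try:
        expiry = parsedate_to_datetime(expires)
    except (TypeError, ValueError):
        return True  # "Expires: 0" is not a valid date, so treat it as expired
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)
    return now >= expiry


cached = {"Expires": "Sun, 01 Aug 2010 13:00:00 GMT"}
half_past_noon = datetime(2010, 8, 1, 12, 30, tzinfo=timezone.utc)
half_past_one = datetime(2010, 8, 1, 13, 30, tzinfo=timezone.utc)

print(needs_revalidation(cached, half_past_noon))  # False: still within its lifetime
print(needs_revalidation(cached, half_past_one))   # True: expired at 1 PM
```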



When a browser makes a conditional request, one of two things will happen:

  • Client copy is fresh: The copy that the browser has is the same as the copy on the server, that is, the resource has not been modified. In that case the server sends HTTP status code 304 (Not Modified) with headers only and no body. The server can send a new Expires date for the resource so that the resource can be cached in the browser again.

  • Client copy is stale: The server has a new copy of the resource, that is, the resource has changed since the last time the user requested it. In that case the server will return HTTP status code 200 OK, with the full resource in the body.
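From the client's point of view, handling the two outcomes could look like the sketch below. The cache entry is modeled as a plain dict with `headers` and `body` keys, which is an assumption made for this example and not how any particular browser stores its cache.

```python
def apply_validation_response(cache_entry, status, headers, body=b""):
    """Update a cached entry from the server's answer to a conditional request."""
    if status == 304:
        # Body is unchanged: keep it, but take over new metadata such as Expires.
        cache_entry["headers"].update(headers)
    elif status == 200:
        # Resource has changed: replace both headers and body.
        cache_entry["headers"] = dict(headers)
        cache_entry["body"] = body
    return cache_entry


entry = {"headers": {"Expires": "Sun, 01 Aug 2010 13:00:00 GMT"}, "body": b"old image"}

apply_validation_response(entry, 304, {"Expires": "Mon, 02 Aug 2010 13:00:00 GMT"})
print(entry["body"], entry["headers"]["Expires"])

apply_validation_response(entry, 200, {"Expires": "Tue, 03 Aug 2010 13:00:00 GMT"}, b"new image")
print(entry["body"], entry["headers"]["Expires"])
```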



Conditional requests are implemented with conditional headers, whose names start with If. A conditional header allows the method to execute only if a particular condition is met

GET /perf/images/cachesample.gif HTTP/1.0
If-Modified-Since: Thu, 22 Jul 2010 23:41:23 GMT


This means: return /perf/images/cachesample.gif only if it was modified since the 22nd of July.

There are two different attributes that you can use for creating a conditional request:

  • Last-Modified date: Checks whether the document has changed since the last-modified date

  • ETag: Checks whether the entity tag (ETag) of the document has changed
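As a rough sketch, a client could attach either or both validators to a request with Python's standard `urllib.request`. The helper name `build_conditional_get` and the `localhost` host are assumptions for this example; only the path comes from the request shown earlier. The request is built but not sent.

```python
from urllib.request import Request


def build_conditional_get(url, last_modified=None, etag=None):
    """Build a GET request that carries whichever validators the client has cached."""
    headers = {}
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    if etag:
        headers["If-None-Match"] = etag
    return Request(url, headers=headers, method="GET")


request = build_conditional_get(
    "http://localhost/perf/images/cachesample.gif",  # hypothetical host
    last_modified="Thu, 22 Jul 2010 23:41:23 GMT",
    etag='"9c334-9933-74b9cec0"',
)

for name, value in request.header_items():
    print(name, value)
```

If you actually send such a request with `urllib.request.urlopen`, a 304 answer is raised as an `HTTPError` with code 304, so the caller has to catch it and reuse the cached copy.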