What is cached?
The Mozilla cache holds all documents downloaded via HTTP by the user. At first this may seem odd; however, this is done to make visited documents available for back/forward, saving, viewing-as-source, etc. without requiring an additional trip to the server. It likewise improves offline browsing of cached content.
What about documents sent with a Cache-control: no-cache header?
Yup, we even store "no-cache" documents in our cache for the reasons enumerated above.
But don't you end up serving stale content?
Nope, each document stored in the Mozilla cache is given an expiration time. If Mozilla tries to load the document for normal viewing, this expiration time is honored. A cached document will be validated with the server if necessary before being shown to the user.
How are expiration times calculated (since not every response includes an Expires header)?
Mozilla strives to implement RFC 2616 (see in particular section 13). The following response headers generate an expiration time that is always in the past (given the value "00:00:00 UTC, January 1, 1970"):
Cache-control: no-cache
Cache-control: no-store
Pragma: no-cache
Expires: 0
It is interesting to note that "Expires: 0" and "Pragma: no-cache" are technically invalid response headers. If none of these headers are present, then the expiration calculation is essentially the algorithm described in RFC 2616 section 13.2. We estimate the current age of the response and the freshness lifetime based on available information.
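As a rough illustration, the check for the four headers that force an already-expired entry could be sketched in Python as follows. The function name and the plain-dict representation of the response headers are assumptions made for this example; this is not Mozilla's actual code.

```python
def expires_in_the_past(headers):
    """Return True if any of the four headers above forces an expired entry.

    `headers` is a hypothetical dict mapping response header names to values.
    """
    h = {name.lower(): value.strip().lower() for name, value in headers.items()}

    # Cache-control may carry several comma-separated directives.
    directives = {
        part.split("=", 1)[0].strip()
        for part in h.get("cache-control", "").split(",")
    }
    if "no-cache" in directives or "no-store" in directives:
        return True

    pragma = {part.strip() for part in h.get("pragma", "").split(",")}
    if "no-cache" in pragma:
        return True

    # "Expires: 0" is not a valid HTTP date, but it is treated as already expired.
    return h.get("expires") == "0"
```

A response for which this returns True would be stored with the expiration time "00:00:00 UTC, January 1, 1970"; any other response goes through the age and freshness calculation.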
The current age is usually close to zero, but is influenced by the presence of an "Age" header, which proxy caches may add to indicate the length of time a document has been sitting in their cache. The precise algorithm, which attempts to avoid errors resulting from clock skew, is described in RFC 2616 section 13.2.3.
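The age calculation from RFC 2616 section 13.2.3 can be sketched in Python roughly as below. The variable names follow the RFC; the function signature, and the assumption that all times are integer seconds on the same clock scale, are choices made for this example only.

```python
def current_age(date_value, age_value, request_time, response_time, now):
    """Estimate the age of a response, following RFC 2616 section 13.2.3.

    date_value    -- value of the Date header, in seconds
    age_value     -- value of the Age header in seconds (0 if absent)
    request_time  -- local time when the request was sent
    response_time -- local time when the response was received
    now           -- current local time
    """
    # Guard against clock skew: the apparent age is never negative.
    apparent_age = max(0, response_time - date_value)
    # Trust whichever estimate is larger (more conservative).
    corrected_received_age = max(apparent_age, age_value)
    # Account for the time the response spent in transit.
    response_delay = response_time - request_time
    corrected_initial_age = corrected_received_age + response_delay
    # Add the time the entry has been sitting in the local cache.
    resident_time = now - response_time
    return corrected_initial_age + resident_time
```

With no Age header and synchronized clocks the result stays near zero at the moment the response arrives, which matches the "usually close to zero" observation above.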
The freshness lifetime is calculated based on several headers. If a "Cache-control: max-age=N" header is specified, then the freshness lifetime is equal to N seconds. If this header is not present, which is very often the case, then we look for an "Expires" header. If an "Expires" header exists, then its value minus the value of the "Date" header determines the freshness lifetime. Finally, if neither header is present, then we look for a "Last-Modified" header. If this header is present, then the cache's freshness lifetime is equal to the value of the "Date" header minus the value of the "Last-Modified" header, with that difference divided by 10. This is the simplified heuristic algorithm suggested in RFC 2616 section 13.2.4.
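The header precedence just described might look like this in Python. This is a minimal sketch under stated assumptions: the function names and the dict-based header representation are hypothetical, and returning 0 when no usable "Date" header is available is a choice made for the example, not something the text above specifies.

```python
import re
from datetime import timezone
from email.utils import parsedate_to_datetime


def _parse_http_date(value):
    """Parse an HTTP date string; return None if it is missing or malformed."""
    if not value:
        return None
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    # Treat dates without zone information as UTC so they can be subtracted.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def freshness_lifetime(headers):
    """Return the freshness lifetime in seconds for a dict of response headers."""
    h = {name.lower(): value for name, value in headers.items()}

    # 1. Cache-control: max-age=N takes precedence.
    match = re.search(r"max-age\s*=\s*(\d+)", h.get("cache-control", ""), re.IGNORECASE)
    if match:
        return int(match.group(1))

    date = _parse_http_date(h.get("date"))
    if date is None:
        return 0  # assumption: without a usable Date header, nothing is fresh

    # 2. Expires minus Date.
    if "expires" in h:
        expires = _parse_http_date(h["expires"])
        if expires is None:
            return 0
        return max(0, int((expires - date).total_seconds()))

    # 3. Heuristic: (Date - Last-Modified) / 10.
    last_modified = _parse_http_date(h.get("last-modified"))
    if last_modified is not None:
        return max(0, int((date - last_modified).total_seconds() / 10))

    return 0
```

For instance, a document last modified 100 seconds before its "Date" value, with neither "max-age" nor "Expires" present, would get a freshness lifetime of 10 seconds under this sketch.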
The expiration time is computed as follows:
expirationTime = responseTime + freshnessLifetime - currentAge
responseTime is the time at which the response was received according to the browser.
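A small worked example of this formula in Python, with all values in seconds; the function names are hypothetical, and treating an entry as fresh only while the current time is strictly before the expiration time is an assumption of the sketch.

```python
def expiration_time(response_time, freshness_lifetime, current_age):
    """expirationTime = responseTime + freshnessLifetime - currentAge"""
    return response_time + freshness_lifetime - current_age


def is_fresh(now, expiration):
    """An entry is fresh until its expiration time has been reached."""
    return now < expiration


# Response received at t=1000 with a 600-second lifetime, already 30 seconds old.
expires_at = expiration_time(response_time=1000, freshness_lifetime=600, current_age=30)
print(expires_at)                  # 1570
print(is_fresh(1500, expires_at))  # True
print(is_fresh(1570, expires_at))  # False
```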
What other factors influence revalidation?
Revalidation is triggered when the user presses the reload button. It is also triggered under normal browsing if the cached response includes the "Cache-control: must-revalidate" header. Another factor is the cache validation preference in the Advanced->Cache preferences panel. There is an option to force a validation each time a document is loaded.
How does cache validation work?
When a cached document's expiration time has been reached, it is either validated or refetched. Validation can only occur if the server provided either a strong validator or a weak validator. Cache validators are described in RFC 2616 section 13.3.2.
The "ETag" response header is an opaque-to-the-useragent value that can be used as a strong validator. If the "ETag" header is present in a response, then the client can issue an "If-None-Match" request header to validate the cached document.
The "Last-Modified" response header can be used as a weak validator. It is considered weak because it only has 1-second resolution. If the "Last-Modified" header is present in a response, then the client can issue an "If-Modified-Since" request header to validate the cached document.
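To make the mapping from validators to conditional request headers concrete, here is a hedged Python sketch. The function name and the dict-based header representation are assumptions for illustration; sending both conditional headers when both validators are cached is a choice of this sketch rather than something stated above.

```python
def validation_headers(cached_headers):
    """Build conditional request headers from a cached response's validators.

    Returns an empty dict when the cached response carries no validator,
    in which case the document can only be refetched, not validated.
    """
    h = {name.lower(): value for name, value in cached_headers.items()}
    request = {}
    if "etag" in h:
        # Strong validator: echo the opaque value back unchanged.
        request["If-None-Match"] = h["etag"]
    if "last-modified" in h:
        # Weak validator: echo the date back unchanged.
        request["If-Modified-Since"] = h["last-modified"]
    return request
```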
When a validation request is made, the server can either ignore the validation request and respond with a normal 200 OK, or it can return 304 Not Modified to instruct the browser to use its cached copy. The latter response can also include headers that update the expiration time of the cached document.
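The two possible outcomes of a validation request could be handled along these lines. This is an illustrative sketch only: the cache-entry layout (a dict with "headers" and "body" keys, header names already normalized to one spelling) and the function name are hypothetical.

```python
def apply_validation_response(cached_entry, status, headers, body=None):
    """Return the cache entry to use after a validation request.

    cached_entry -- hypothetical dict with "headers" and "body" keys
    status       -- HTTP status code of the validation response
    headers      -- headers of the validation response
    body         -- body of the validation response (ignored for 304)
    """
    if status == 304:
        # Keep the cached body; let the new headers override the stored ones
        # so that the expiration time can be recomputed from them.
        merged = dict(cached_entry["headers"])
        merged.update(headers)
        return {"headers": merged, "body": cached_entry["body"]}
    if status == 200:
        # The server ignored the validator and sent a full response.
        return {"headers": dict(headers), "body": body}
    raise ValueError(f"status {status} is not handled by this sketch")
```

After a 304, the expiration time would be recalculated from the merged headers in the same way as for a freshly received response.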
I plan to extend this document in the future. Feel free to email me your questions or comments.