Intermittent Caching problem
Created | Updated Apr 20, 2007
Posted 6 Hours Ago by studioj
Post: 91
Hi Jim, I hope you are well - and also, of course, that you're still reading this thread :)
I'm a little worried that unless I word this post carefully then I might:
a: look like I'm trying to make a big thing of pointing out a mistake you /might/ have made
b: look like I'm teaching my grandmother to suck eggs
c: make a complete arse of myself
Of course, I want none of these things. I make this post purely with the intent of trying to help solve the problem.
Having seen increasing numbers of people complaining about this problem - and a few of these from people in the UK - I decided to do a little investigation.
I thoroughly re-read this thread and also compared the response headers from a DNA request with those from a request to news.bbc.co.uk (which people don't seem to be having any problems with).
Response headers from news.bbc.co.uk:
Date: Mon, 16 Apr 2007 19:24:35 GMT
Server: Apache/2.0.54 (Unix)
Vary: Cookie
Accept-Ranges: bytes
Cache-Control: max-age=0
Expires: Mon, 16 Apr 2007 19:24:35 GMT
Pragma: nocache
Keep-Alive: timeout=5, max=300
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
Response headers from a www.bbc.co.uk/dna/ messageboard:
Date: Mon, 16 Apr 2007 19:23:40 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Cache-Control: Private
Content-Type: text/html
Connection: close
A fair few differences.. but which are the crucial ones?
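To narrow it down, here is a rough Python sketch that compares only the cache-related headers from the two responses above. The choice of which headers count as "cache-related" (CACHE_HEADERS) and the helper name compare_cache_headers are my own assumptions for illustration, not anything the BBC servers run.

```python
# Cache-related response headers copied from the two listings above.
news_headers = {
    "Vary": "Cookie",
    "Cache-Control": "max-age=0",
    "Expires": "Mon, 16 Apr 2007 19:24:35 GMT",
    "Pragma": "nocache",
}
dna_headers = {
    "Cache-Control": "Private",
}

# Assumption: these are the headers that matter for caching decisions here.
CACHE_HEADERS = ("Cache-Control", "Expires", "Pragma", "Vary")


def compare_cache_headers(a, b):
    """Return {header: (value_in_a, value_in_b)} for cache headers that differ."""
    diff = {}
    for name in CACHE_HEADERS:
        value_a, value_b = a.get(name), b.get(name)
        if value_a != value_b:
            diff[name] = (value_a, value_b)
    return diff


for name, (news, dna) in compare_cache_headers(news_headers, dna_headers).items():
    print(f"{name}: news={news!r} dna={dna!r}")
```

All four headers differ: the DNA response sends only Cache-Control, and with a different value.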
I had a dig around the web and have come to the conclusion that the clue to the problem probably lies in this sentence from your post 21:
"...it still has the Cache-control: Private header which tells all web caches not to cache the page"
I'm no expert in HTTP caching methods or their control, but my interpretation of the header "Cache-control: Private" is that it doesn't mean "do not cache" but rather "you may cache this - but only in a private (non-shared) cache".
Some googling suggests that there seems to be conflicting information regarding this setting, for instance the following (.asp / IIS related) page: http://www.asp-dev.com/main.asp?page=94 starts off by saying:
"Cache Control Header: Private - A cache mechanism may cache this page in a Private cache and resend it only to a single client. This is the default value. Most proxy servers will not cache pages with this setting."
then later goes on to say:
"Between your Web server and a user requesting your page, there may be proxy servers configured to cache Web pages for faster response times. Usually ASP pages are developed to be unique for each user, or may contain secure information.
For this reason, IIS sets this property to "Private" so that proxy servers or other cache mechanisms will not cache pages."
I did find other resources that inferred/assumed "Cache-control: Private" == "Do not cache" but I am reasonably convinced that this is a misinterpretation (and one that maybe just gets propagated 'cos it seemed to work in the past).
A brief read of the relevant sections here: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9 would seem to confirm that the "Private" setting strictly specifies that a page MAY be cached, but if so that cache must not be shared with other clients. It does not mean that a page must not be cached.
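To make that reading concrete, here is a minimal Python sketch of how I understand the storage rule. It is heavily simplified: it only looks at "private" and "no-store", ignores the field-name form of "private", and doesn't handle quoted commas. The function names parse_cache_control and may_store are made up for illustration and say nothing about how any real browser or proxy is implemented.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into {directive: argument or None}.

    Directive names are case-insensitive, so "Private" and "private" match.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_store(value, shared):
    """Simplified: may a cache keep a copy of a response with this header?

    shared=True  -> a proxy cache serving many users (e.g. at an ISP)
    shared=False -> a single user's own cache (e.g. the browser)
    """
    directives = parse_cache_control(value)
    if "no-store" in directives:
        return False  # nobody may keep a copy
    if shared and "private" in directives:
        return False  # shared caches must not store it
    return True       # otherwise storing is allowed


print(may_store("Private", shared=True))   # proxy cache
print(may_store("Private", shared=False))  # browser cache
```

Under this reading a browser is free to keep a copy of a "Private" page; only shared caches are told to keep their hands off.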
More caching info here: http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.4 (btw: I don't pretend that I've read all this stuff.. just enough to come to the conclusions that I have).
Some other googling /seemed/ to suggest that not all caching implementations necessarily take notice of all cache-related headers, so it /may/ not be as simple as changing "Cache-Control: Private" to "Cache-Control: no-cache" or "Cache-Control: max-age=0". Maybe you need to go belt and braces and add the "Expires" and "Pragma" headers too (as per BBC News).
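For what it's worth, the belt-and-braces idea might look something like this. It's a Python sketch only, since I have no idea what the DNA code actually looks like; no_cache_headers is a hypothetical helper, and the exact combination of directives is my assumption rather than a tested fix.

```python
from email.utils import formatdate


def no_cache_headers(now=None):
    """Build response headers asking caches not to reuse the page unchecked.

    `now` is a Unix timestamp; None means the current time.
    """
    date = formatdate(now, usegmt=True)  # RFC 1123 date, e.g. "... GMT"
    return [
        ("Date", date),
        # HTTP/1.1 caches: revalidate before reuse, and treat as stale at once.
        ("Cache-Control", "no-cache, max-age=0"),
        # Older caches: Expires equal to Date means "already expired".
        ("Expires", date),
        # HTTP/1.0 caches that only understand Pragma.
        ("Pragma", "no-cache"),
    ]


for name, value in no_cache_headers():
    print(f"{name}: {value}")
```

Setting Expires to the same value as Date mirrors what the news.bbc.co.uk response above does.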
Ok, I seem to have blathered a lot.. and I do hope it came across the way I meant it (and it is understandable).
Also hope it is of some help.
jont {;¬· >···{
PS: I also learnt this week that apparently ISPs are increasingly resorting to transparent (proxy) caching in order to reduce their bandwidth usage - which may explain why we are seeing an increase in the incidence of this problem.