nanog mailing list archives

Re: decreased caching efficiency?


From: Scott Gifford <sgifford () tir com>
Date: 19 Oct 2000 15:02:09 -0400


All of these problems are solvable, using common and well-known
techniques:

Daniel Senie <dts () senie com> writes:

It might be worth thinking about the problem from the other end. From a
web site owner's perspective, caching is a major annoyance. Here are the
arguments you may encounter from a web site owner or web developer:

1. It interferes with content in many cases (web site visitors may see
cached pages instead of current content). I know cache products claim
this doesn't happen, but it has, and often.

Most (all?) reasonable caching products will honor whatever expiration
information you send with the page, such as the Cache-Control and
Expires headers.  Where I've made careful use of these, I've never had
problems with stale content, even from browser caches.
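
As a rough illustration of the expiration headers mentioned above, here
is a minimal Python sketch.  The function name expiration_headers and
its parameters are made up for this example, and the choice of
"no-cache" plus "Expires: 0" for never-stale content is just one
reasonable assumption, not the only way to do it.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expiration_headers(max_age_seconds, now=None):
    """Build Cache-Control and Expires headers for one response.

    Hypothetical helper: a real server would set these in its own
    configuration or response-building code.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    if max_age_seconds <= 0:
        # Content that must not be served stale: caches have to check
        # with the origin server before reusing it.
        return {"Cache-Control": "no-cache", "Expires": "0"}
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        "Cache-Control": f"max-age={max_age_seconds}",
        # HTTP dates are always expressed in GMT.
        "Expires": format_datetime(expires, usegmt=True),
    }


if __name__ == "__main__":
    for name, value in expiration_headers(3600).items():
        print(f"{name}: {value}")
```

Passing a positive lifetime lets caches keep the object that long;
passing 0 asks them to revalidate every time.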

2. The website owner loses information on how many visitors are coming
to the site.

A common technique for just counting Web page hits is to <img src> a
small uncacheable image on the page and count requests for that image.
Another is to make the page itself uncacheable while leaving the
images (which are most of the load time) cacheable.  Having the page
itself be dynamic and uncacheable, while the images can always be
cached, can be a big win all around; dynamic images are fairly rare
(except from MRTG. :) )
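
The split between uncacheable pages and cacheable images could be
sketched like this in Python.  The names respond and IMAGE_SUFFIXES,
the list of suffixes, and the one-day image lifetime are all
assumptions chosen for illustration.

```python
# File suffixes treated as static images (an assumed list).
IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")


def respond(path, page_hits):
    """Return the Cache-Control value for `path`, counting page views.

    `page_hits` is a plain dict mapping page path -> request count.
    """
    if path.lower().endswith(IMAGE_SUFFIXES):
        # Images: any cache may keep them for a day (assumed lifetime).
        return "public, max-age=86400"
    # Pages: caches must revalidate with the origin before reuse, so
    # every view reaches the origin server and can be counted here.
    page_hits[path] = page_hits.get(path, 0) + 1
    return "no-cache"


if __name__ == "__main__":
    hits = {}
    for requested in ("/index.html", "/logo.gif", "/index.html"):
        print(requested, "->", respond(requested, hits))
    print(hits)
```

The page counter only sees page requests; the images, which make up
most of the bytes, can be served entirely from caches.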

3. The website owner loses the demographics on where visitors are coming
from, and especially the number of unique visitors. (It's not helpful to
know that one cache engine visited, if that cache engine equated to
10,000 visits in an hour).

You can use the X-Forwarded-For header that many caches provide to
gather this same information.  In the future, you may be able to use
the protocol described in RFC 2227 to get more detailed information.
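
A small Python sketch of using X-Forwarded-For to estimate unique
visitors follows.  It assumes the usual comma-separated form in which
the leftmost entry is the original client; the function names are
hypothetical, and since the header is supplied by the cache (or by
anyone who cares to forge it), the result is an estimate at best.

```python
def client_address(remote_addr, x_forwarded_for=None):
    """Pick the most likely original client address for one request.

    Falls back to the connecting address (often a cache) when the
    X-Forwarded-For header is missing or empty.
    """
    if x_forwarded_for:
        # The leftmost entry is conventionally the original client;
        # later entries are the proxies the request passed through.
        first = x_forwarded_for.split(",")[0].strip()
        if first:
            return first
    return remote_addr


def unique_visitors(requests):
    """Count distinct client addresses.

    `requests` is an iterable of (remote_addr, x_forwarded_for) pairs.
    """
    return len({client_address(addr, xff) for addr, xff in requests})


if __name__ == "__main__":
    # Three requests arriving from the same cache at 192.0.2.1.
    sample = [
        ("192.0.2.1", "10.0.0.5"),
        ("192.0.2.1", "10.0.0.6, 198.51.100.7"),
        ("192.0.2.1", None),
    ]
    print(unique_visitors(sample))
```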

4. Banner advertising may or may not display properly when caching is
involved, thereby costing the website money.

I've never experienced this; I've been viewing the Web through a cache
or a hierarchy of caches for 2 years now, and I've never noticed
anything weird with banner ads.  At least nothing an "Expires: 0"
wouldn't solve.

5. There's NOTHING in it for the website owner, other than the
possibility that SOME pages might display faster for SOME users.

If folks running networks really think website designers and owners
should care about caching, then there needs to be some sort of benefit
(perhaps paid in dollars) to those affected. Otherwise, there's little
reason for them to care.

I don't understand this; having Web pages which are effectively cached
around the world reduces the load on your servers significantly
(especially as more and more ISPs start to cache), and saves you
significant bandwidth.  This lets you buy fewer servers for your farm,
and buy less upstream bandwidth.

Right now, having a site which is cache friendly can save you money in
a big way, while also saving ISPs money, making your page display
how you want it (since the ISPs are already deploying caching, whether
your pages are friendly to it or not), and making the page load faster
for quite a few users.

How is that not a benefit?  How is that not paid in dollars?

In the future, if Web server operators took cache performance (along
with correct display) into account when configuring their servers,
and made sure that page designers did the same, caches could become
more ubiquitous, and people would be pushed to set up large-scale
cache hierarchies.  It could get to the
point where all of the non-dynamic content from an infinitely large
Website could be served by an old desktop computer over a 28.8 modem,
since it would just have to send its content once to the UUNet cache,
once to the MCI/Worldcom cache, once to the Sprint cache, etc.  Of
course, that's still a ways off.  :)
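
To put rough numbers on that idea, here is a back-of-the-envelope
Python sketch.  The object size and the cache count are invented for
illustration, and the model assumes each top-level cache fetches the
object exactly once and ignores protocol overhead.

```python
def origin_seconds(object_bytes, cache_count, link_bits_per_second):
    """Seconds the origin link is busy sending one object to each cache.

    Simplified model: one fetch per cache, no headers, no retries.
    """
    total_bits = object_bytes * 8 * cache_count
    return total_bits / link_bits_per_second


if __name__ == "__main__":
    # A 3600-byte page sent once each to 3 caches over a 28.8 kbit/s
    # modem.  The total does not depend on how many end users later
    # read the page from those caches.
    print(origin_seconds(3600, 3, 28800))
```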


Just my 2 cents,

------ScottG.


