In this section we present reference studies of the 25-day DEC trace to motivate the collective architectures explored in this paper. All our studies except those using real web proxies use a basic LRU replacement scheme; the operation of real proxies may differ somewhat when objects are invalidated due to age, HTTP expiry headers (not recorded in the trace), etc.
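The basic LRU replacement scheme used in these studies can be sketched as follows. This is a minimal illustration, not the simulator used for the trace studies; the object-count capacity and the class interface are assumptions for the example.

```python
from collections import OrderedDict

class LRUCache:
    """Least-recently-used replacement: on overflow, evict the object
    that has gone longest without a reference."""

    def __init__(self, capacity):
        self.capacity = capacity       # hypothetical limit in object count
        self.store = OrderedDict()     # insertion order tracks recency

    def access(self, key):
        """Return True on a hit, False on a miss (the object is then cached)."""
        if key in self.store:
            self.store.move_to_end(key)      # mark as most recently used
            return True
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)   # evict least recently used
        self.store[key] = None
        return False
```

Real proxies would also bound the cache by byte size rather than object count and would honor invalidation, but the recency-based eviction order is the same.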
Figure 1 shows that almost 89% of the references in this trace are to static documents that can be cached. About 70% of all references are potential hits; the goal of any Web caching architecture is to serve as many of these accesses from the cache as possible, at low cost.
The trace data provides further evidence to support a central premise of our work, that the highest hit ratios are achieved by shared caches serving large user communities:
Although the 25-day trace shows a clear benefit to large, shared caches, experience has shown that the scalability of individual proxy servers is limited. While we can expand proxy server capacity by scaling the hardware and streamlining the software, a simpler and ultimately lower-cost solution is to install additional proxy servers for portions of the reference stream. The following subsections consider three basic choices for structuring distributed Web caches.
It is easy to build a scalable Web cache by distributing users across a collection of isolated caching servers. However, this structure sacrifices most of the benefits of a shared cache. Figure 4 shows the best hit ratios achievable by such an ``uncooperative'' collective cache as a function of the number of independent caching servers, with balanced round-robin client assignment. As in Figure 2, these shared proxies supplement ideal per-user caches.
Adding isolated proxies to serve a given client population causes hit ratios to decline because it decreases the probability that a client accessing a shared object for the first time can benefit from another client's previous reference to the object. A collective cache of 32 isolated proxies (540 clients per proxy) can yield hits for less than half of the shared references; the absolute achievable hit ratio for the trace drops to 21%, down from 53% for a monolithic shared cache.
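The first-reference effect described above can be reproduced with a small simulation. The sketch below is hypothetical (synthetic trace, unbounded caches, round-robin assignment by client id modulo proxy count): each proxy remembers only what its own clients have fetched, so a shared object incurs one compulsory miss per partition rather than one overall.

```python
def collective_hit_ratio(trace, n_proxies):
    """trace: list of (client_id, object_id) pairs.
    Clients are assigned round-robin to proxies; each proxy caches
    everything its clients have referenced (no capacity limit), which
    isolates the effect of splitting the client population."""
    seen = [set() for _ in range(n_proxies)]
    hits = 0
    for client, obj in trace:
        proxy = client % n_proxies        # round-robin client assignment
        if obj in seen[proxy]:
            hits += 1
        else:
            seen[proxy].add(obj)          # compulsory miss at this proxy
    return hits / len(trace)
```

With four clients all referencing one shared object, a single shared cache hits on three of the four references, while four isolated proxies hit on none of them.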
Pure PPC suffers from several weaknesses that motivate the more sophisticated schemes in this paper. First, it relies on a hash function that produces balanced distributions of objects and references among the proxies. This is difficult to achieve in practice, and it may be necessary to adjust the proxy sizes to match observed reference patterns, or to run the cache below capacity to minimize queuing delays at the most heavily loaded caching servers. Second, all participants must be made aware of a new hash function if proxies join or leave a PPC cache, and performance will degrade temporarily while the cache adapts to the new function by flushing, reorganizing, or rebuilding its contents.
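The reconfiguration cost of the second weakness is easy to quantify for the simplest partitioning function. The sketch below is an assumption-laden illustration (integer object ids, a plain hash-mod-N mapping): when the proxy count changes from 3 to 4, most objects become the responsibility of a different proxy, so most of the cached contents are stranded on the wrong server.

```python
def remap_fraction(objects, n_old, n_new):
    """Fraction of objects whose responsible proxy changes when the
    proxy count changes under hash(obj) mod N partitioning.
    Integer ids are used so hash(obj) is stable across runs."""
    moved = sum(1 for o in objects
                if hash(o) % n_old != hash(o) % n_new)
    return moved / len(objects)
```

Under hash-mod-N, growing a 3-proxy cache to 4 proxies remaps roughly three quarters of the objects; schemes that remap only ~1/N of the objects on a membership change exist, but simple PPC is not one of them.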
Most importantly, PPC may impose high latency and network overhead for both hits and misses, since the proxy handling an access is predetermined by the referenced document, independent of the user issuing the reference or that user's location in the network. This is a problem primarily if the individual proxies are dispersed in the network, as we envision for caches serving large user communities.
The third option is to partition the clients into groups, each bound to an autonomous caching server as in option 1, but also to supplement the collective with a remote querying mechanism. The purpose of the query mechanism is to bridge the gap in hit ratios between monolithic shared caches (Figure 2) and uncooperative collective caches (Figure 4), balancing the benefits of a shared cache with the scalability of isolated proxy servers.
On a local cache miss, a participating server probes the collective cache for the missing document before contacting a Web server to service the miss. This structure presents an opportunity to reduce latency and network traffic by (1) placing each proxy along the network path between the users it serves and the wide-area network link, and (2) placing the objects most likely to yield hits into caching servers closest to the users most likely to reference them. Moreover, the auto-configuration mechanism can be used to implement a dynamic binding of clients to proxies in order to balance load or mask proxy failures.
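The miss path just described can be sketched as follows. This is a schematic of the control flow only, with hypothetical interfaces (dictionaries standing in for caches, a callable standing in for the origin Web server); a real query mechanism must also bound the cost and latency of the probe step, which is the subject of the comparison below.

```python
def fetch(obj, local, peers, origin):
    """Miss path for a querying collective cache:
    1. serve a local hit;  2. probe the collective for a remote hit;
    3. otherwise contact the origin Web server."""
    if obj in local:
        return local[obj], "local-hit"
    for peer in peers:                 # remote query on local miss
        if obj in peer:
            local[obj] = peer[obj]     # optionally cache the copy locally
            return peer[obj], "remote-hit"
    doc = origin(obj)                  # wide-area fetch
    local[obj] = doc
    return doc, "miss"
```

The interesting design question, taken up next, is how step 2 is implemented: probing every peer is simple but expensive, so practical query mechanisms summarize or direct the search.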
All of the collective cache architectures considered in the remainder of this paper use this option. At issue is the question of which query mechanism delivers the best hit ratios with the lowest cost and network overhead. The next section compares two fundamental implementation choices for queries in collective caches.