Replicated Partial Directory seeks to further reduce the cost of replicating the cache directory without significantly compromising hit ratios. The idea is to identify the subset of the cache contents that is most likely to yield productive hits, and to replicate only those entries (called the submap) at each mapping server. One approach we have investigated is to restrict the submap to ``shared documents'', those that have already been accessed by more than one client. On typical Web traces from DEC WRL this configuration reduced the size of the map replicas by 70% while sacrificing potential cache hits on less than 6% of all references. The replicas shrink because only about 30% of documents are shared; few hits are lost because many shared documents are widely shared.
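The shared-document restriction can be illustrated with a minimal sketch. The `Directory` class, its method names, and the sample URLs and server names are hypothetical, chosen only for illustration; the sketch assumes the directory can see which client issued each request.

```python
from collections import defaultdict


class Directory:
    """Hypothetical full cache directory: URL -> caching server holding the object."""

    def __init__(self):
        self.location = {}               # url -> server that caches the object
        self.clients = defaultdict(set)  # url -> distinct clients that requested it

    def record_access(self, url, client, server):
        """Note that `client` requested `url`, now cached at `server`."""
        self.location[url] = server
        self.clients[url].add(client)

    def shared_submap(self, min_clients=2):
        """Return only the entries requested by at least `min_clients` clients."""
        return {
            url: server
            for url, server in self.location.items()
            if len(self.clients[url]) >= min_clients
        }


directory = Directory()
directory.record_access("/index.html", client="c1", server="cacheA")
directory.record_access("/index.html", client="c2", server="cacheA")
directory.record_access("/private.html", client="c1", server="cacheB")

# Only /index.html has been requested by more than one client.
submap = directory.shared_submap()
```

Only the entries in `submap` would be replicated to each mapping server; the entry for `/private.html` stays out of the replicas until a second client requests it.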
Note that partial directory replicas can be used to supplement rather than replace central directories. One approach is to check the submap replica first to speed up the majority of hit cases, but fall back to querying a central mapping server for objects that miss in the submap. This approach delivers hit ratios and miss latencies comparable to the synchronous directory configurations, but with lower average hit latencies.
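The two-level lookup described above might look like the following sketch. The function `locate`, the `query_central` callback, and the sample maps are assumptions made for illustration; a real central query would be a network round trip rather than a dictionary lookup.

```python
def locate(url, submap, query_central):
    """Check the local submap replica first, then fall back to the central map.

    Returns (server, source), where source records which level answered.
    """
    server = submap.get(url)
    if server is not None:
        return server, "submap"      # fast path: answered from the local replica
    server = query_central(url)      # slow path: ask the central mapping server
    if server is not None:
        return server, "central"
    return None, "miss"              # not cached anywhere in the collective cache


# Hypothetical contents: the central map is complete, the submap is partial.
central = {"/index.html": "cacheA", "/private.html": "cacheB"}
submap = {"/index.html": "cacheA"}

results = [
    locate(url, submap, central.get)
    for url in ("/index.html", "/private.html", "/new.html")
]
```

In this sketch only requests that miss in the submap pay for the central query, so the hit ratio matches that of the central directory while hits on submap entries avoid the extra round trip.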
Recently we have begun to evaluate a variant of Replicated Partial Directory called neighbor cache, in which each submap contains entries for only those objects that are closer than some proximity threshold. Neighbor cache does not yield hits for objects resident in caching servers that are ``too far away'' from the requesting server. However, it is the most scalable of all configurations, and widely shared objects eventually propagate through the entire cache. This is our preferred approach for constructing collective caches that are highly dispersed geographically.
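As a rough sketch of the neighbor cache filter, the submap for one server can be derived by dropping entries whose holder lies at or beyond the threshold. The function name, the `distance` table, and its units are assumptions; the text does not specify how proximity is measured.

```python
def neighbor_submap(full_map, distance, threshold):
    """Keep only entries whose caching server is closer than `threshold`.

    `distance` maps each caching server to its distance from the requesting
    server, in whatever metric is chosen (assumed here, e.g. round-trip time).
    """
    return {
        url: server
        for url, server in full_map.items()
        if distance[server] < threshold
    }


# Hypothetical directory and distances as seen from one requesting server.
full_map = {"/a": "near1", "/b": "far1", "/c": "near2"}
distance = {"near1": 5, "near2": 12, "far1": 80}

nearby = neighbor_submap(full_map, distance, threshold=20)
```

Here `/b` is excluded because `far1` lies beyond the threshold, so a request for it would be treated as a miss at this server even though the object is cached elsewhere.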