Our premise is that there are three key needs common to a large set of DCAs, independent of the task being performed by the application:
Figure 1.1: Abstract diagram of a shared data space
The simple shared data space implemented as part of this thesis addresses the specific caching needs of Web-based DCAs. We believe that shared data spaces for the Web can be made efficient and scalable through cache policies and protocols tailored to their environment. Cache issues such as size, prefetch policy, replacement policy, name space, and consistency protocol must be addressed, because the choice made in each area affects the performance and scalability of the cache. In addition, Java and the Web allow caching policies and behavior to be downloaded to the client when the server is first accessed.
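To make the idea of a tailored, swappable cache policy concrete, the following is a minimal sketch of a client-side cache whose size and replacement policy are parameters. It is written in Python for brevity and is not the Java implementation described in this thesis; the names `ClientCache`, `evict_lru`, and `evict_mru` are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict
from typing import Callable, Hashable

# Hypothetical policy type: given the cache entries (ordered from least to
# most recently used), return the key that should be evicted.
Policy = Callable[[OrderedDict], Hashable]


def evict_lru(entries: OrderedDict) -> Hashable:
    """Pick the least recently used key (front of the ordering)."""
    return next(iter(entries))


def evict_mru(entries: OrderedDict) -> Hashable:
    """Pick the most recently used key (back of the ordering)."""
    return next(reversed(entries))


class ClientCache:
    """Fixed-size cache with a pluggable replacement policy (illustrative)."""

    def __init__(self, capacity: int, policy: Policy = evict_lru):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.policy = policy
        self.entries = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # record the access as most recent
        return self.entries[key]

    def put(self, key, value):
        """Insert or update a value, evicting one entry if the cache is full."""
        if key not in self.entries and len(self.entries) >= self.capacity:
            del self.entries[self.policy(self.entries)]
        self.entries[key] = value
        self.entries.move_to_end(key)
```

Because the policy is an ordinary function passed to the constructor, a server could in principle ship a different policy to each client, which is the kind of flexibility the downloadable-policy argument above relies on.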
Web-based DCAs are characterized by human interaction. In this environment, data accesses are driven entirely by human actions, so they are often unordered and unpredictable. This makes it difficult to synchronize access to shared state. In many distributed applications, accesses to shared state are synchronized with locks, which guarantee that only one user accesses a given piece of shared data at a time. However, the following problems arise with the use of locks in a Web-based DCA:
As discussed in [3], locking schemes are not always necessary for access to shared state in distributed applications. Applications that require locks could implement them on top of the system we propose, while applications that do not require them let the shared data space be more scalable and efficient. Without locks, we are left with unrestricted, unsynchronized access to cached data.
Unsynchronized access introduces the possibility of retrieving stale data from the cache. Without synchronization, it is possible for one user to read the value of some data while another user is changing that value. In other words, one cache's data might not reflect an update made to another cache. This leads us to the central question: how do we maintain cache consistency in a shared data space for DCAs?
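The stale-read scenario described above can be reproduced in a few lines. This is a deliberately simplified sketch, assuming a single server store and two clients that each keep a private cache with no invalidation; the classes `Server` and `Client` are hypothetical and do not correspond to components of the system described in this thesis.

```python
class Server:
    """Holds the authoritative copy of the shared data (illustrative)."""

    def __init__(self):
        self.store = {}


class Client:
    """A client with a private cache and no consistency protocol."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, key):
        # Only contact the server on a cache miss.
        if key not in self.cache:
            self.cache[key] = self.server.store.get(key)
        return self.cache[key]

    def write(self, key, value):
        # Update the local cache and write through to the server.
        # Other clients' caches are NOT notified.
        self.cache[key] = value
        self.server.store[key] = value


server = Server()
server.store["title"] = "draft"
alice, bob = Client(server), Client(server)

alice.read("title")            # both clients cache "draft"
bob.read("title")
alice.write("title", "final")  # Alice updates the shared value

stale = bob.read("title")      # Bob's cache still holds "draft"
fresh = server.store["title"]  # the server holds "final"
```

Here `stale` and `fresh` differ: Bob's cache does not reflect Alice's update until some consistency mechanism propagates it.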
We believe that short-lived inconsistencies are acceptable for a large class of DCAs. Applications that deal with noncritical data can work with stale data, as long as the client caches eventually converge to the same state. We hypothesize that a weakly ordered variant of eventual consistency [15] will be a strong enough policy to support useful collaborative applications.
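As an illustration of what eventual convergence under weak ordering can look like, the sketch below applies the same set of updates to a cache in every possible delivery order and checks that the final state is always the same. It assumes a last-writer-wins rule over `(counter, client_id)` stamps, which is one common way to obtain this property; it is an assumption made for the example and is not necessarily the protocol of [15] or the one developed in this thesis. The names `apply_update` and `replay` are hypothetical.

```python
from itertools import permutations


def apply_update(cache, update):
    """Apply one update using last-writer-wins on the stamp.

    An update is (key, value, stamp), where stamp is (counter, client_id).
    Tuples compare lexicographically, so ties on the counter are broken
    by client_id, giving a total order on stamps.
    """
    key, value, stamp = update
    current = cache.get(key)
    if current is None or stamp > current[1]:
        cache[key] = (value, stamp)


def replay(order):
    """Build a cache by applying updates in the given delivery order."""
    cache = {}
    for update in order:
        apply_update(cache, update)
    return cache


# Three concurrent updates to the same key (illustrative data).
updates = [
    ("color", "red", (1, "alice")),
    ("color", "blue", (2, "bob")),
    ("color", "green", (2, "alice")),
]

# Deliver the updates in all 6 possible orders.
results = [replay(order) for order in permutations(updates)]
converged = all(result == results[0] for result in results)
```

Intermediate states may differ from client to client depending on delivery order, which corresponds to the short-lived inconsistencies discussed above, but once every update has been delivered, all caches hold the value with the highest stamp.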