
Shared Data Spaces

Our premise is that there are three key needs common to a large set of DCAs, independent of the task being performed by the application:

  1. maintain shared state;
  2. process client-side updates;
  3. provide a consistent view of the shared state to all clients.

Conceptually, this trio of responsibilities can be grouped together and handled by a single data substrate called a shared data space. A shared data space combines data and operations to provide a common facility for these basic DCA needs. It is made up of the client caches, an API for updating cache data, a cache consistency protocol, and one or more servers for reliable storage of updates. Figure 1.1 depicts an abstract view of the components in a shared data space.

Figure 1.1: Abstract diagram of a shared data space
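
To make the four components concrete, the following is a minimal sketch in Python, not the implementation developed in this thesis. The names UpdateServer, ClientCache, update, read, and refresh are hypothetical, and the pull-based refresh is only a stand-in for whatever cache consistency protocol a real shared data space would use.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class UpdateServer:
    """Hypothetical server: stores every update in arrival order."""
    log: List[Tuple[str, Any]] = field(default_factory=list)

    def store(self, key: str, value: Any) -> None:
        self.log.append((key, value))

    def updates_since(self, position: int) -> List[Tuple[str, Any]]:
        return self.log[position:]


@dataclass
class ClientCache:
    """Hypothetical client cache with an update API and a pull-based refresh."""
    server: UpdateServer
    data: Dict[str, Any] = field(default_factory=dict)
    seen: int = 0  # number of server log entries this cache has applied

    def update(self, key: str, value: Any) -> None:
        # Update API: change the local copy and forward the update to the server.
        self.data[key] = value
        self.server.store(key, value)

    def read(self, key: str) -> Any:
        # Reads are served locally and may be stale until the next refresh.
        return self.data.get(key)

    def refresh(self) -> None:
        # Stand-in consistency protocol (an assumption): replay, in server
        # order, every logged update this cache has not yet applied.
        for key, value in self.server.updates_since(self.seen):
            self.data[key] = value
        self.seen = len(self.server.log)


# Usage: two clients sharing one server.
server = UpdateServer()
alice, bob = ClientCache(server), ClientCache(server)
alice.update("title", "Draft 1")
stale = bob.read("title")   # None: bob's cache has not refreshed yet
bob.refresh()
fresh = bob.read("title")   # "Draft 1"
```

In this sketch the shared state lives in the client caches, updates flow through a single call, and the server only records them; the window between alice's update and bob's refresh is where a stale read can occur.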

The simple shared data space implemented as part of this thesis work is aimed at addressing the specific needs of caching for Web-based DCAs. We believe that shared data spaces for the Web can be made efficient and scalable through cache policies and protocols tailored for their environment. Cache issues such as size, prefetch policy, replacement policy, name space, and consistency protocol must be addressed. Choices in each area will affect the performance and scalability of the cache. Also, Java and the Web let us download caching policies and behavior onto the client side when the server is first accessed.

Web-based DCAs are characterized by human interaction. In this environment, data accesses are driven entirely by human actions. They are often unordered and unpredictable. This creates a problem with synchronizing access to shared state. In many distributed systems applications, data accesses to shared state are synchronized with locks, guaranteeing that only one user accesses shared data at a time. However, the following problems arise with the use of locks in a Web-based DCA:

As discussed in [3], locking schemes are not always necessary for access to shared state in distributed applications. Applications that require locks could implement them on top of the system we propose; for applications that do not, leaving locks out allows the shared data space to be more scalable and efficient. Without locks, we are left with unrestricted, unsynchronized access to cache data.

Unsynchronized access introduces the possibility of retrieving stale data from the cache. Without synchronization, it is possible for one user to read the value of some data while another user is changing that value. In other words, one cache's data might not reflect an update made to another cache. This leads us to the central question: how do we maintain cache consistency in a shared data space for DCAs?

We believe that short-lived inconsistencies are acceptable for a large class of DCAs. Applications that deal with noncritical data can work with stale data, as long as the client caches eventually converge to the same state. We hypothesize that a weakly ordered variant of eventual consistency [15] will be a strong enough policy to support useful collaborative applications.
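
The idea that caches may disagree briefly yet still converge can be illustrated with a small sketch. It uses a last-writer-wins rule over a hypothetical logical timestamp, which is an assumption chosen for illustration and not necessarily the weakly ordered policy proposed here; the names Update and apply_update are likewise hypothetical.

```python
from typing import Any, Dict, NamedTuple


class Update(NamedTuple):
    key: str
    value: Any
    stamp: int    # hypothetical logical timestamp
    client: str   # tie-breaker when two stamps are equal


def apply_update(cache: Dict[str, Update], update: Update) -> None:
    """Keep the update with the larger (stamp, client) pair for each key."""
    current = cache.get(update.key)
    if current is None or (update.stamp, update.client) > (current.stamp, current.client):
        cache[update.key] = update


u1 = Update("color", "red", 1, "a")
u2 = Update("color", "blue", 2, "b")

cache_a: Dict[str, Update] = {}
cache_b: Dict[str, Update] = {}

# The two caches receive the same updates in different orders.
apply_update(cache_a, u1)
apply_update(cache_b, u2)
# Short-lived inconsistency: the caches currently disagree.
disagree = cache_a["color"].value != cache_b["color"].value

apply_update(cache_a, u2)
apply_update(cache_b, u1)
# Once both caches have seen both updates, they hold the same state.
converged = cache_a == cache_b
```

Because the outcome for each key depends only on which updates have been seen, not on their arrival order, no locks are needed for the caches to reach the same state.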



Carmine F. Greco
Wed Mar 26 23:44:38 EST 1997