Internet Systems and Storage Group
Software architectures for scalable wide-area systems
Duke Computer Science
Traditionally, naming in computer systems involves mapping human-understandable strings to machine-understandable identifiers. Such mappings are typically static and one-to-one. For example, when accessing a remote Internet machine, the host's name (e.g., www.cs.duke.edu) must first be translated into an IP address (e.g., 22.214.171.124). Once the address is retrieved, programs typically use it to request a service from the remote host (such as FTP, HTTP, or SSH).
Today, we see mobility and replication changing the naming paradigm for distributed applications. Clients use names to access resources and service providers. These services take the form of stock quotes, Internet search, online commerce, news services, audio, and video. Furthermore, a name for such a resource no longer maps to a single identifier (i.e., a hostname). Rather, a named resource can be retrieved from one of many replicated service providers whose exact membership changes quickly, or from a competing set of service providers, each able to deliver the resource at a different quality and cost. Similarly, mobility and variable network performance mean that a client may implicitly require different versions of a named resource depending on its current conditions. For example, a client on a PDA using a wireless connection may use the same name for a resource (e.g., cnn.com) as a client on a workstation using a T3 link, yet the two clients implicitly require different versions of that resource.
Our work on Active Names is motivated by the observation that end users and programs are no longer interested in an address to contact for a resource, but in the resource itself. An Active Name is a location-independent program responsible for both locating and retrieving a named resource on behalf of a client. Consider the simple example of a user naming a web service through a URL. Traditionally, the client's web browser would transmit the URL to an organizational proxy cache. The proxy would extract the hostname from the URL and query DNS for the IP address of the server holding the resource. The proxy would then connect to the server, retrieve the named resource, and transmit it back to the client. In this scheme, it is difficult to account for replication, fault tolerance, or load balancing. With Active Names, the client requests that a program be run in a local name resolver to both locate and retrieve the resource. This program can be customized by both the client and the service. Client customizations can, for example, "clip" a web page into a format more appropriate for a PDA. Service customizations can similarly perform resource discovery, cache coherence, failover, and load balancing among service replicas.
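The division of labor between client and service customizations can be illustrated with a minimal sketch. This is not the Active Names implementation; every name in it (Replica, NameResolver, fetch, service_pick_replica, client_clip_for_pda, and the example hosts) is a hypothetical stand-in chosen for illustration, and the network fetch is simulated with a local function.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Replica:
    """Hypothetical service replica; 'load' is an assumed metric (lower is better)."""
    host: str
    load: float
    alive: bool = True


def fetch(replica: Replica, name: str) -> str:
    """Simulated retrieval: stands in for contacting the replica over the network."""
    return f"<html><body><h1>{name}</h1><p>served by {replica.host}</p></body></html>"


def service_pick_replica(replicas: List[Replica]) -> Replica:
    """Service customization: failover (skip dead replicas) plus load balancing."""
    live = [r for r in replicas if r.alive]
    if not live:
        raise LookupError("no live replica available")
    return min(live, key=lambda r: r.load)


def client_clip_for_pda(page: str, limit: int = 40) -> str:
    """Client customization: crude 'clipping' that strips tags and truncates."""
    text = re.sub(r"<[^>]+>", " ", page)
    return " ".join(text.split())[:limit]


class NameResolver:
    """Hypothetical local resolver that runs the locate-and-retrieve program."""

    def __init__(self) -> None:
        self._services: Dict[str, Tuple[List[Replica], Callable[[List[Replica]], Replica]]] = {}

    def register(self, name: str, replicas: List[Replica],
                 locate: Callable[[List[Replica]], Replica]) -> None:
        self._services[name] = (replicas, locate)

    def resolve(self, name: str,
                client_transform: Callable[[str], str] = lambda page: page) -> str:
        replicas, locate = self._services[name]
        replica = locate(replicas)            # service-side policy locates a provider
        page = fetch(replica, name)           # retrieve the resource itself
        return client_transform(page)         # client-side policy adapts the result


resolver = NameResolver()
resolver.register(
    "cnn.com",
    [Replica("a.example", 0.9), Replica("b.example", 0.2),
     Replica("c.example", 0.1, alive=False)],
    service_pick_replica,
)
print(resolver.resolve("cnn.com"))                       # workstation: full page
print(resolver.resolve("cnn.com", client_clip_for_pda))  # PDA: clipped text
```

Both clients use the same name, but the resolver returns a different version to each, and the choice of replica is hidden from the client entirely.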
Active Name programs are routed at the application level, rather than the packet level, from name resolver to name resolver until the target resource is located. In the limit, the final destination for an Active Name program is one of a canonical set of servers responsible for maintaining a given resource. Once the resource is located, it is transmitted directly to the client, short-circuiting the program's previous path through the network and minimizing client-perceived latency. The following architectural features support the goals of our system design: