More synchronization examples

Here are some examples of concurrency problems that are relevant to real operating systems and their abstractions.

Process manager

This problem deals with a Process class to serve as the core of a process manager for a multiprogrammed kernel. The Process class maintains parent/child/sibling relationships among processes, and coordinates the Exec (or fork), Exit, and Join (or wait) system calls. The problem is to implement three key methods of process that encapsulate the synchronization aspects of implementing process trees:

Each Process object P has an associated thread. The thread bound to P calls Birth and Death on P as part of the implementation of the Exec and Exit system calls respectively. Join on P may be called only by the thread bound to the parent of P. You do not need to create these threads or enforce these restrictions; they are intended to simplify the problem.

Death on P waits until two conditions are satisfied: (1) the children of P have exited, and (2) the parent of P has exited or has executed a Join on P. The intent is that the Process object for P may be safely deleted after Death returns.

Your solution should represent the process tree with the following state for each Process object P: (1) a list of the children of P, (2) a pointer to the parent of P, and (3) P’s exit status if P has exited. Be sure that your solution is free of deadlock and dangling references.


EventPair and LPC

Microsoft Windows has kernel-supported threads, and it provides an EventPair object to help support fast request/response message passing between client and server processes (this is sometimes called cross-domain procedure call or local/remote procedure call, LPC or RPC). EventPair synchronizes a pair of threads (a server and a client) as follows. The server thread waits for a request by calling Event-Pair::Wait(). The client issues a request by placing a request message on a queue in shared memory, then calling EventPair::Handoff(). Handoff wakes up the server thread and simultaneously blocks the client to wait for the reply. The server thread eventually responds by placing the reply message in another shared queue and calling Handoff again; this wakes up the client to accept the response, and simultaneously blocks the server to wait for the next request.

a) Show how to implement EventPair using a mutex and condition variable.

b) Show how to implement EventPair using semaphores.


Events and Signaling

Event. This problem asks you to implement an Event class similar to the fundamental coordination primitives used throughout Windows. Initially, an Event object is in an unsignaled state. Event::Wait() blocks the calling thread if the event object is in the unsignaled state. Event::Signal() transitions the event to the signaled state, waking up all threads waiting on the event. If Wait is called on an event in the signaled state, it returns without blocking the caller. Once an event is signaled, it remains in the signaled state until it is destroyed.

(a) Implement Event using a single semaphore with no additional synchronization.

(b) Implement Event using a mutex and condition variable.

(c) Show how to use Event to implement the synchronization for a Join primitive for processes or threads. Your solution should show how Event could be used by the code for thread/process Exit as well as by Join. For this problem you need not be concerned with deleting the objects involved, or with passing a status result from the Join.


WorkCrew

Most implementations of Sun’s Network File Service (NFS) use the following Work Crew scheme on the server side. The server node’s incoming network packet handler places incoming requests on a shared work queue serviced by a pool of server threads. When a server thread is idle (ready to handle a new request), it examines the shared queue. If the queue is empty, the server thread goes to sleep. The incoming packet handler is responsible for waking up the sleeping server threads as needed when new requests are available.

a) Show how to implement the NFS server-side synchronization using a mutex and condition variable. For this problem you may assume that the incoming packet handler is a thread rather than an interrupt handler.

c) What will happen to the queue if the server machine is not powerful enough to process requests at the arrival rate on the network? What should the incoming packet handler do in this case? How is this different from the standard producer-consumer BoundedBuffer problem?

d) Early NFS server implementations used Broadcast as a wakeup primitive for the server threads, because no Signal primitive was available in Unix kernels at that time (around 1985). This was a common performance problem for early NFS servers. Why is Signal more efficient than Broadcast here?

e) This and other examples led Carl Hauser and co-authors from Xerox PARC to claim (in a case study of thread usage in SOSP93) that “Signal is just a performance hint”. What they meant was that Signal is not necessary to implement correct programs using condition variables, and that (more generally) any use of Signal can be replaced by Broadcast without affecting the program’s correctness, although the Signal might be more efficient. Do you believe this statement? What assumptions does it make about condition variables?


Spinlocks with Load-Locked and Store-Conditional

The Alpha and MIPS 4000 processor architectures have no atomic read-modify-write instructions, i.e., no test-and-set-lock instruction (TS). Atomic update is supported by pairs of load_locked (LDL) and store-conditional (STC) instructions.

The semantics of the Alpha architecture’s LDL and STC instructions are as follows. Executing an LDL Rx, y instruction loads the memory at the specified address (y) into the specified general register (Rx), and holds y in a special per-processor lock register. STC Rx, y stores the contents of the specified general register (Rx) to memory at the specified address (y), but only if y matches the address in the CPU’s lock register. If STC succeeds, it places a one in Rx; if it fails, it places a zero in Rx. Several kinds of events can cause the machine to clear the CPU lock register, including traps and interrupts. Moreover, if any CPU in a multiprocessor system successfully completes a STC to address y, then every other processor’s lock register is atomically cleared if it contains the value y.

Show how to use LDL and STC to implement safe busy-waiting, i.e., spinlock Acquire and Release primitives. Explain why your solution is correct.


Sequencers and Eventcounts

Cold coffee. The espresso franchise in the strip mall near my house serves customers FIFO in the following way. Each customer entering the shop takes a single “ticket” with a number from a “sequencer” on the counter. The ticket numbers dispensed by the sequencer are guaranteed to be unique and sequentially increasing. When a barrista is ready to serve the next customer, it calls out the “eventcount”, the next highest unserved number previously dispensed by the sequencer. Each customer waits until the eventcount reaches the number on its ticket. Each barrista waits until the customer with the ticket it called places an order.

Show how to implement sequencers, tickets, and eventcounts using mutexes and condition variables. Your solution should also include code for the barrista and customer threads.


Bounded Buffer

A producer/consumer Bounded Buffer is the basis for pipes. BoundedBuffer has two primitives: Read(n) and Write(n). (Ordinarily these calls would transfer n bytes of data out of or into the buffer, but ignore that for purposes of this problem. The point here is to focus on the synchronization.) The Write procedure produces n bytes at the tail of the buffer; Read consumes n bytes from the front of the buffer. The BoundedBuffer has a maximum size of N bytes. Read blocks if the buffer is empty, and Write blocks if the buffer is full. A Read or Write with n > N is legal, but it will always block the caller at least once.

Implement BoundedBuffer using mutexes and condition variables. Your solution should guarantee that a Read never consumes interleaved bytes from two or more Writes.