next up previous contents
Next: Mechanics of Thread Switching Up: A Road Map Through Previous: Disk Device

Nachos Threads

In Nachos (and many systems) a process consists of:

  1. An address space. The address space includes all the memory the process is allowed to reference. In some systems, two or more processes may share part of an address space, but in traditional systems the contents of an address space is private to that process. The address space is further broken down into 1) Executable code (e.g., the program's instructions), 2) Stack space for local variables and 3) Heap space for global variables and dynamically allocated memory (e.g., such as obtained by the Unix malloc or C++ new operator). In Unix, heap space is further broken down into BSS (contains variables initialized to 0) and DATA sections (initialized variables and other constants).
  2. A single thread of control, e.g., the CPU executes instructions sequentially within the process.
  3. Other objects, such as open file descriptors.

That is, a process consists of a program, its data and all the state information (memory, registers, program counter, open files, etc.) associated with it.

It is sometimes useful to allow multiple threads of control to execute concurrently within a single process. These individual threads of control are called threads. By default, processes have only a single thread associated with them, though it may be useful to have several. All the threads of a particular process share the same address space. In contrast, one generally thinks of processes as not sharing any of their address space with other processes. Specifically, threads (like processes) have code, memory and other resources associated with them. Although threads share many objects with other threads of that process, threads have their own private local stackgif.

One big difference between threads and processes is that global variables are shared among all threads. Because threads execute concurrently with other threads, they must worry about synchronization and mutual exclusion when accessing shared memory.

Nachos provides threads. Nachos threads execute and share the same code (the Nachos source code) and share the same global variables.

The Nachos scheduler maintains a data structure called a ready list, which keeps track of the threads that are ready to execute. Threads on the ready list are ready to execute and can be selected for executing by the scheduler at any time. Each thread has an associated state describing what the thread is currently doing. Nachos' threads are in one of four states:

READY:
The thread is eligible to use the CPU (e.g, it's on the ready list), but another thread happens to be running. When the scheduler selects a thread for execution, it removes it from the ready list and changes its state from READY to RUNNING. Only threads in the READY state should be found on the ready list.

RUNNING:
The thread is currently running. Only one thread can be in the RUNNING state at a time. In Nachos, the global variable currentThread always points to the currently running thread.

BLOCKED:
The thread is blocked waiting for some external event; it cannot execute until that event takes place. Specifically, the thread has put itself to sleep via Thread::Sleep(). It may be waiting on a condition variable, semaphore, etc. By definition, a blocked thread does not reside on the ready list.

JUST_CREATED:
The thread exists, but has no stack yet. This state is a temporary state used during thread creation. The Thread constructor creates a thread, whereas Thread::Fork() actually turns the thread into one that the CPU can execute (e.g., by placing it on the ready list).

In non-object oriented systems, operating systems maintain a data structure called a process table. Process (thread) table entries contain all the information associated with a process (e.g., saved register contents, current state, etc.). The process table information is frequently called a context block.

In contrast to other systems, Nachos does not maintain an explicit process table. Instead, information associated with thread is maintained as (usually) private data of a Thread object instance. Thus, where a conventional operating system keeps thread information centralized in a single table, Nachos scatters its ``thread table entries'' all around memory; to get at a specific thread's information, a pointer to the thread instance is needed.

The Nachos Thread object supports the following operations:

Thread *Thread(char *debugName)
The Thread constructor does only minimal initialization. The thread's status is set to JUST_CREATED, its stack is initialized to NULL, its given the name debugName, etc.

Fork(VoidFunctionPtr func, int arg)
does the interesting work of thread creation, turning a thread into one that the CPU can schedule and execute.

Argument func is the address of a procedure where execution is to begin when the thread starts executing. Argument arg is a an argument that should be passed to the new thread. (Of course, procedure func must expect a single argument to be passed to it if it is to access the supplied argument.)

Fork allocates stack space for the new thread, initializes the registers (by saving the initial value's in the thread's context block), etc.

One important detail must be considered. What should happen when procedure func returns? Since func was not called as a regular procedure call, there is no place for it to return to. Indeed, rather than returning, the thread running func should terminate. Fork takes care of this detail by building an initial activation record that makes this happen (described in detail below).

void StackAllocate(VoidFunctionPtr func, int arg)
This routine does the dirty work of allocating the stack and creating an initial activation record that causes execution to appear to begin in func. The details are a somewhat complicated. Specifically, StackAllocate does the following:

  1. Allocate memory for the stack. The default stack size is StackSize (4096) 4-byte integers.
  2. Place a sentinel value at the top of the allocated stack. Whenever it switches to a new thread, the scheduler verifies that the sentinel value of the thread being switched out has not changed, as might happen if a thread overflows its stack during execution.
  3. Initialize the program counter PC to point to the routine ThreadRoot. Instead of beginning execution at the user-supplied routine, execution actually begins in routine ThreadRoot. ThreadRoot does the following:
    1. Calls an initialization routine that simply enables interrupts.
    2. Calls the user-supplied function, passing it the supplied argument.
    3. Calls thread::Finish(), to terminate the thread.
    Having thread execution begin in ThreadRoot rather than in the user-supplied routine makes it straightforward to terminate the thread when it finishes. The code for ThreadRoot is written in assembly language and is found in switch.s Note: ThreadRoot isn't run by the thread that calls Fork(). The newly created thread executes the instructions in ThreadRoot when it is scheduled and starts execution.

void Yield()
Suspend the calling thread and select a new one for execution (by calling Scheduler::FindNextToRun()). If no other threads are ready to execute, continue running the current thread.

void Sleep()
Suspend the current thread, change its state to BLOCKED, and remove it from the ready list. If the ready list is empty, invoke interrupt->Idle() to wait for the next interrupt. Sleep is called when the current thread needs to be blocked until some future event takes place. It is the responsibility of this ``future event'' to wake up the blocked thread and move it back on to the ready list.

void Finish()
Terminate the currently running thread. In particular, return such data structures as the stack to the system, etc. Note that it is not physically possible for a currently running thread to terminate itself properly. As the current thread executes, it makes use of its stack. How can we free the stack while the thread is still using it? The solution is to have a different thread actually deallocate the data structures of a terminated thread. Thus, Finish sets the global variable threadToBeDestroyed to point to the current thread, but does not actually terminate it. Finish then calls Sleep, which effectively terminates the thread (e.g., it will never run again). Later, when the scheduler starts running another thread, the newly scheduled thread examines the threadToBeDestroyed variable and finishes the job.  




next up previous contents
Next: Mechanics of Thread Switching Up: A Road Map Through Previous: Disk Device

Thomas Narten
Mon Feb 3 15:00:27 EST 1997