Systems and Complex Systems
The course introduction touches on the classical
operating systems
view of isolated processes and a protected kernel that virtualizes
system resources. We also discuss how the core concepts
of operating systems reflect in a broader set of contexts (clusters, virtual computing systems, clouds, large-scale services, browsers, mobiles and MIDs).
This semester we will start with some examples of the breadth of operating systems and systems design and research. This discussion goes well with the material in the PCSD text up through Section 2.3.1. That material focuses on drivers of system complexity and how to control it. Along the way we discuss a couple of research papers:
Naming in File Systems
PCSD has a fairly comprehensive treatment of elemental abstractions and naming, using file systems as an example to illustrate. This unit explores the file system example in more depth, based on an early paper on NetApp's WAFL technology. We focus on how its naming structures created the foundation for a powerful file server appliance that has been the core of NetApp's product set for 15 years. Along the way we touch on some other topics of interest: cloning/snapshots, volume virtualization, storage atomicity and shadowing, server performance (e.g., SPECSFS benchmark), and RAID. (1/22-29)
The Classical OS View and the Unix Programming Environment
This week is a quick tour through classical operating systems, kernel structure, and internals, focusing on the history and philosophy of the Unix programming environment and how that changed through time. (2/3-5)
Programs, Trust, and Integrity
How do we know if we can trust the programs we run? This week covers how programs are built in a system like Unix, and vulnerabilities to trojan horses and other forms of attacks. This is also an opportunity to introduce basic cryptosystems for integrity (hashes and digital signatures). We will talk about how these ideas apply in a rather different kind of "operating system": modern cloud systems like EC2 and Eucalyptus.
- [Thompson84] Reflections on Trusting Trust. ACM Turing Award lecture. Communications of the ACM, Vol. 27, No. 8, August 1984.
- Cryptosystems: PCSD chapter 11 through 11.3. This material is available in the online Part 2 of the textbook (this PDF). Like much of PCSD, some parts of this can be skimmed, but you should understand digital signatures.
- Optional short paper on virtual clouds by Werner Vogels of Amazon [PDF]: no paper note required on this one.
- Optional short paper on virtual appliances from rPath, a local company [PDF]. This paper is from the Workshop on Hot Topics in Cloud Computing (HotCloud), June 2009. No paper note required on this one.
- Slides on this topic [PPT]
Enforced Modularity, RPC, and Services
This topic introduces a view of kernels and servers as protected modules with controlled interactions through inter-process communication (IPC) and protected procedure call. (PPC or RPC). We discuss mechanisms needed for cross-module (cross-domain) calls: transformation/copying of arguments/results, language integration with stubs, differences from ordinary procedure call, transports for network Remote Procedure Call, and object naming for IPC (handles, object IDs, capabilities). We illustrate some issues for RPC and distributed services in general, using NFS and DNS as examples. This unit also raises some basic issues of trust, consistency, and recovery, and introduces approaches secure network communication and secure binding to services (SSL and DNSSEC).
Server Structure and Performance
Threads and concurrency
We won't spend a lot of class time on concurrency races and synchronization, but you should be familiar with the problem and the programming tools to address it, and related topics such as deadlock. You can refer to these notes on concurrency for background readings and sample problems. The class discussion will focus on two research papers on dynamic race detection, published in the flagship SOSP conference a decade apart. (Friday 3/19 and Wednesday 3/24)
- PCSD 5.1-5.2, and 5.6
-
Eraser: a dynamic data race detector for multithreaded programs.
In the 16th Symposium on Operating Systems Principles (Saint-Malo, France), October 1997.
Stefan Savage (University of Washington), Mike Burrows (DEC SRC), Greg Nelson (DEC SRC), Patrick Sobalvarro (DEC SRC), Tom Anderson (UC-Berkeley).
- MUVI: automatically inferring multi-variable access correlations and detecting related semantic and concurrency bugs. In the 21st Symposium on Operating Systems Principles (Stevenson, WA), October 2007.
Shan Lu, Soyeon Park, Chongfeng Hu, Xiao Ma, Weihang Jiang, Zhenmin Li, Raluca A. Popa, Yuanyuan Zhou (UIUC).
- Some basic concurrency slides [ppt]
- Some slides on LockSet/Eraser
- Shan Lu's slides on MUVI [ppt]
Resource Management and Kernel Structure
We discuss thread implementation and kernel concurrency control, and use threads as an example to explore system-level structuring issues: the centrality of the kernel for resource allocation, the costs of kernel interaction, separation of policy and mechanism, the role of runtime-level support, the integration of kernel and runtime, and the minimalist view of Exokernel and microkernel systems.
- PCSD 5.3-5.5. We have mostly covered 5.3 (memory protection) in the OS overview, and we'll discuss 5.4 (virtual memory mapping) soon. Section 5.5 (virtualizing processors using threads) is most relevant to the discussion for Friday 3/26 and Wednesday 3/31.
-
Scheduler activations: effective kernel support for the user-level management of parallelism.
In the Thirteenth ACM Symposium on Operating Systems Principles (SOSP),
Pacific Grove, December 1991.
Tom Anderson, Brian Berhad, Ed Lazowska, Hank Levy (University of Washington)
- Slides on this topic [PPT]
- We also spent one class discussing discrete event simulation, virtual time, and the lab.
Virtual Memory and Virtual Machines
Virtual memory mapping mechanisms, page replacement policies, and extension of virtualized resource management to hypervisors. (Friday April 9, Wednesday April 14)
Changing Hardware
-
The Multikernel: A new OS architecture for scalable multicore systems.
In the 22nd ACM Symposium on Operating Systems Principles (SOSP), Big Sky, MT, October 2009. Andrew Baumann (ETH Zurich), Paul Barham (MSR Cambridge), Pierre-Evariste Dagand (ENS Cachan Bretagne), Tim Harris (MSR Cambridge), Rebecca Isaacs (MSR Cambridge), Simon Peter (ETH Zurich), Timothy Roscoe (ETH Zurich), Adrian Schüpbach (ETH Zurich), Akhilesh Singhania (ETH Zurich). Barrelfish project. Here are the authors' slides.
- Better I/O Through Byte-Addressable, Persistent Memory. In the 22nd ACM Symposium on Operating Systems Principles (SOSP), Big Sky, MT, October 2009. Jeremy Condit (Microsoft Research), Edmund B. Nightingale (Microsoft Research), Christopher Frost (UCLA), Engin Ipek (Microsoft Research), Doug Burger (Microsoft Research), Benjamin C. Lee (Microsoft Research), Derrick Coetzee (Microsoft Research). Here are the authors' slides.