CPS 196 Systems and Networks
This assignment is not very big, but you should get started early: it takes time to get everything working, including compiling and booting your own Linux kernel on Xen and getting used to a version control system (we use Subversion). Take a look here to see the survival guide we've put together for you.
The OS command interpreter is the program that people interact with in order to launch and control programs. On UNIX systems, the command interpreter is usually called the shell: it is a user-level program that gives people a command-line interface for launching, suspending, and killing other programs. sh, ksh, csh, tcsh, bash, ... are all examples of UNIX shells. (It might be useful to look at the manual pages of these shells; for example, type "man bash".)
Every shell is structured as the following loop:
- print out a prompt
- read a line of input from the user
- parse the line into the program name, and an array of parameters
- use the fork() system call to spawn a new child process
- the child process then uses the exec() system call to launch the specified program
- the parent process (the shell) uses the wait() system call to wait for the child to terminate
- once the child (i.e. the launched program) finishes, the shell repeats the loop by returning to the first step.
Although most of the commands people type at the prompt are the names of other UNIX programs (such as ls or cat), shells recognize some special commands (called internal commands) which are not program names. For example, the exit command terminates the shell, and the cd command changes the current working directory. Shells directly make system calls to execute these commands, instead of forking a child process to handle them.
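The fork, exec, and wait steps of the loop can be sketched as follows. This is only a minimal illustration, not a complete shell: run_program is a hypothetical helper name, it uses execv and waitpid (a variant of wait that waits for one specific child), and it assumes argv[0] is already a full path name.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program argv[0] with the NULL-terminated
 * argument array argv. Returns the child's exit status, or -1 if the
 * child could not be created or did not exit normally. */
int run_program(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execv(argv[0], argv);
        /* execv returns only if it failed. */
        perror(argv[0]);
        _exit(127);
    }

    /* Parent (the shell): block until this child terminates. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit rather than exit after a failed execv so that it does not flush stdio buffers it inherited from the parent.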
This assignment consists of two parts. In the first, you will design and implement an extremely simple shell that knows how to launch new programs, and also recognizes three internal commands (exit, cd, and execcounts), which we will describe below. The first two internal commands will work by calling existing system calls (exit and chdir); the third internal command will work by calling a new system call that you will design and implement. So, in the second part of this assignment, you will design and implement the execcounts system call. This will involve making changes to the Linux kernel source code. The semantics of the execcounts system call, and some hints on how to go about implementing it are also described below.
Write a shell program in C which has the following features:
- It should recognize two internal commands, exit and cd. exit terminates the shell, i.e., the shell calls the exit() system call or returns from main. cd uses the chdir system call to change to the new directory.
- If the command line is not an exit or cd, it can be assumed to be of the form
<executable name> <arg0> <arg1> .... <argN>
Your shell should invoke the executable, passing it the command line.
You can assume that executable and file names are full path names, e.g., /bin/ls or /etc/motd.
For extra credit, include the ability to search for the executable in all directories in the shell's path. For example, typing ls should automatically execute /bin/ls, if /bin is in the path. Your shell should use the path set in the environment of the program that invoked it (Bash, for example).
CPS196Shell% /bin/date
Wed Jan 25 16:03:51 EST 2006
CPS196Shell% /bin/cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
CPS196Shell% cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
Note: CPS196Shell% is the prompt printed by the shell; the text after it on the same line is typed in by the user, and the remaining lines are program output.
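For the extra-credit path search, one possible approach is to try each directory of the path in turn. The sketch below is an assumption about how this could be done, not a required design: find_in_path is a hypothetical helper, and the caller is assumed to pass in the colon-separated path string (for example the value of getenv("PATH")).

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: look for an executable called name in each
 * directory of the colon-separated list path. On success, writes the
 * full path into out and returns 0; otherwise returns -1.
 * Note: access() only checks execute permission, so this sketch does
 * not verify that the match is a regular file. */
int find_in_path(const char *name, const char *path, char *out, size_t outlen)
{
    if (name == NULL || path == NULL)
        return -1;

    const char *p = path;
    for (;;) {
        size_t dirlen = strcspn(p, ":");   /* length of this directory entry */
        int n;

        if (dirlen == 0)                   /* an empty entry means the current directory */
            n = snprintf(out, outlen, "./%s", name);
        else
            n = snprintf(out, outlen, "%.*s/%s", (int)dirlen, p, name);

        /* Accept the candidate only if it fit in out and is executable. */
        if (n > 0 && (size_t)n < outlen && access(out, X_OK) == 0)
            return 0;

        if (p[dirlen] == '\0')             /* that was the last entry */
            break;
        p += dirlen + 1;                   /* skip past the ':' */
    }
    return -1;
}
```

A shell would call this only when the command name contains no '/' character. The exec family also includes execvp, which performs the same search itself.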
Please take a look at the manual pages of execv, fork, and wait (e.g., "man fork", "man execv", "man wait"). You might find something useful to help you avoid duplicating already existing functionality.
To allow users to pass arguments to programs you will have to parse the input line into words separated by white space (spaces and '\t' tab characters) and place these words into an array of strings. You might try using strtok() for this ("man strtok" for a very good example of how to solve exactly this problem with strtok).
Hint: if you have never used system calls before, you might try starting very slowly. (In general, prototyping your design a little piece at a time is an EXCELLENT software development strategy.) So, we might start by just trying to use the fork() and wait() calls. We would have a program fork a child process and then wait on it. The child might just print out a line like "Hi, Parent!" and then exit. The parent might wait until the child is done, then print out a line "Hi Kid!" and end. Then, we might try to pass arguments from parent to child and just have the child print out the arguments. Then, we could go on to execv ...
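The strtok-based parsing described above might look like the following minimal sketch. The helper name split_line and its max_args parameter are hypothetical, and treating '\n' as a separator is an added assumption (input read with fgets keeps its trailing newline).

```c
#include <string.h>

/* Hypothetical helper: split line in place into words separated by
 * spaces, tabs, or newlines. Stores at most max_args - 1 word pointers
 * in argv, followed by a NULL terminator, and returns the word count.
 * The pointers refer into line, so line must outlive argv. */
int split_line(char *line, char *argv[], int max_args)
{
    int argc = 0;

    if (max_args < 1)
        return 0;

    char *word = strtok(line, " \t\n");
    while (word != NULL && argc < max_args - 1) {
        argv[argc++] = word;
        word = strtok(NULL, " \t\n");   /* NULL continues with the same line */
    }
    argv[argc] = NULL;                  /* execv expects a NULL-terminated array */
    return argc;
}
```

Because strtok overwrites separators with '\0', the line buffer must be writable; a string literal cannot be passed directly.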
There are four system calls in Linux related to creating new processes: fork, vfork, execve, and clone. (The man pages will describe for you the differences among them.) Instrument the kernel so that we can write a user-level program that will print counts of the number of times each of these four system calls has been invoked (by any process on the system); that is, we want to be able to write a garden-variety C program that prints out the total number of invocations of each of these four system calls.
This requires three things:
- Modify the kernel to keep track of this information.
- Design and implement a new system call that will get this data back to the user application.
- Write the user application.
We'd also like to be able to reset these statistics periodically. So we need a way to clear the request information we've tracked so far. This requires either parameterizing the above system call to add a clear option, or adding another system call. This is up to you.
Warning 1: Remember that the Linux kernel should be allowed to access any memory location, while the calling application should be prevented from causing the kernel to unwittingly read/write addresses other than those in its own address space.
Warning 2 (Hint 0): Remember that it's inconceivable that this problem (warning 1) has never before been confronted in the existing kernel.
Warning 3: Remember that the kernel must never, ever trust the application to know what it's talking about when it makes a request, particularly with respect to parameters passed in from the application to the kernel.
Warning 4: Remember that you must be sure not to create security holes in the kernel with your code. This is a generalization of warning 3 above.
Warning 5: Remember that the kernel should not leak memory.
SOME HINTS
You should be using the C language whenever you alter or add to the Linux kernel. If you use C++, the compiler will likely complain, and your kernel will not build.
To actually add the system call, look at arch/i386/kernel/syscall_table.S. The following explanation might be helpful. Also take a look at this page for more detail, although it is dated.
To call your new system call from a user-level program, you probably want to do something like this:
#include <sys/syscall.h>
#include <unistd.h>
#define MY_SYSCALL syscall_number
....
int ret = syscall(MY_SYSCALL, ...);
Try "man syscall." It should give you some useful information.
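To see syscall() working before your own system call exists, it can be tried with an existing call number. The sketch below is only an illustration: raw_getpid is a hypothetical helper name, and SYS_getpid stands in for the number you will assign to your own call.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Invoke getpid by system call number instead of through its libc
 * wrapper. SYS_getpid is defined by <sys/syscall.h>; a new system call
 * would use the number chosen for it in the kernel's table. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

On failure, syscall() returns -1 and sets errno, so a call using a number the running kernel does not implement can be detected by checking for errno == ENOSYS.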
Recommended Procedure
We suggest you wade, rather than dive, into this. In particular, while you can work however you want, here's a suggested set of incremental steps:
Don't change any Linux code. Figure out how to make a plain vanilla kernel, what file to move where so that you can boot the image you just created, how to tell Xen where your image exists, and then how to boot your image. In other words, read the guide. Then read it again.
Now put a "printk()" somewhere in the code, and figure out how to find its output. (Hints: /var/log and "man dmesg").
Now implement a parameterless system call, whose body is just a printk() call. Write a user-level routine that invokes it. Check to make sure it was invoked.
Now figure out how to instrument the kernel and write the full implementation. Ask for help if you need it.
Now that you have a working shell and an implementation of your new system call, it's time to integrate them; this should be very simple. Add a new internal command to your shell, called execcounts. The execcounts command should invoke the system call that you build in Part 2, and print out:
- the number of invocations of each of the four "process creation" Linux system calls that have occurred since the last invocation of execcounts.
- the fraction of all such calls that each of the four types represents.
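The report itself is simple arithmetic once the four counts are in user space. The following sketch assumes the counts arrive in an array ordered fork, vfork, execve, clone; that ordering, the helper names, and the output layout are all illustrative assumptions, since the system call's interface is yours to design.

```c
#include <stdio.h>

#define NUM_CALLS 4

/* Fraction of total represented by count. Returns 0.0 when nothing has
 * been counted, to avoid dividing by zero right after a reset. */
double call_fraction(unsigned long count, unsigned long total)
{
    return total == 0 ? 0.0 : (double)count / (double)total;
}

/* Print one line per system call: name, count, and fraction of the total.
 * Assumed order of counts: fork, vfork, execve, clone. */
void print_execcounts(const unsigned long counts[NUM_CALLS])
{
    static const char *names[NUM_CALLS] = { "fork", "vfork", "execve", "clone" };
    unsigned long total = 0;

    for (int i = 0; i < NUM_CALLS; i++)
        total += counts[i];

    for (int i = 0; i < NUM_CALLS; i++)
        printf("%-6s %10lu  %.3f\n", names[i], counts[i],
               call_fraction(counts[i], total));
}
```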
Hand in the following:
- A README file with your name and netid, documenting the approach you have taken, along with instructions to the TA as to where everything is and what to do to test your assignment. Be warned that TAs tend to get very upset if they need to spend time figuring this out on their own, especially if you did something differently from the procedure set out here.
- A subdirectory under user_space (create this directory!) that contains the source for your shell. There must be a Makefile to make the executable, or else clear instructions on what needs to be done to compile your code. You should also include a document explaining how you tested your shell. This file might include actual output from running your shell during testing. ("man script" for a useful tool.)
- A file containing the names of all the Linux kernel files (with their paths: several files in the kernel have identical names) that you modified or added. This file should also contain an explanation of your changes and why you needed them. It should also include the interface to the new system call (i.e., a miniature man page for it).
- To attempt to achieve some uniformity in results for the new system call, hand in the results obtained from the following procedure (whose intent is to count the four system calls that occur due to a make of the kernel):
  - Get the kernel source on the machine running your new kernel.
  - Do a make clean to remove any traces of previous compilations.
  - Invoke your shell.
  - cd to the kernel source directory.
  - Use your shell to reset the system call counts to zero.
  - Do a make.
  - Invoke execcounts and report these results.
- A brief (less than a page) discussion of any important design decisions that you made while implementing your system call and/or shell. This is also a good place to let us know how you overcame any difficulties.