CPS 06, Fall 1995
Lab 8: Arrays, Vectors, Histograms


Change into your cps6 directory and create a lab8 subdirectory (use mkdir lab8 ). Then change into your lab8 subdirectory and copy files by typing what's below (don't forget the trailing dot).
      cp ~ola/cps6/lab8/*  .
If you type ls you should see the following files: Makefile, histo.h, histo.cc , rollem.cc , words.cc and letters.cc . Use the menu-bar to invoke emacs or type emacs & to get an emacs window. You want to use the ampersand & which leaves you with control of your xterm (so you can run programs from it). Be sure to leave the emacs window up, don't quit it until you log off.

Getting Started

First make a link to several text data files. To do this type
    ln -s ~ola/data data
Then you'll be able to use the directory data to access several big text files.

Histograms

The histogram class requires two things:

Counting Letters

Use the emacs commands C-x C-f to find/load the file letters.cc . Then compile this command using C-x c and type make letters in the minibuffer. You'll be prompted for the name of a file. Enter data/poe.txt and the program will print a histogram of all the letters in Edgar Allan Poe's ``The Cask of Amontillado''. How do you know if the data is correct? Run the program again and type letters.cc for the name of the file, what is being counted now? ____________________________?

The counts in parentheses are correct, but the bars are NOT scaled properly. You need to add some code to the histo.cc file for the function Display . In its current version, there is a for loop that is missing code. This loop is supposed to find the maximal (largest) entry in myCounts and set the variable max to this value. You must write code in this loop to compare all entries of myCounts to max and reset max when necessary.

When you've done this, you should run the program using as an input file data/shakespeare/romeo . How many times does the letter 'e' occur in the play ________________? How many times does the letter 'z' occur and why is the bar for the letter 'z' as long as it is _______________________?


Counting Dice Rolls

You should now load the file rollem.cc into emacs using C-x C-f . Do NOT start another emacs session. The program is designed to track how many times each combination of dice rolls occurs. There are two arrays defined: one to track the dice rolls (counts) and one for the histogram labels (labels).

The loop to roll the dice and update the appopriate array entry is missing. You are to add a loop that that ``rolls'' two ``dice'' the specified number of times and updates the appropriate entry in the array counts . Then compile and execute the program for 6-sided dice and 12-sided dice, rolling the dice 20,000 times. If you think the output is ``wrong'' make some noise.

How many times does the sum 23 occur when you roll two 12-sided dice 20,000 times ______________________?


Walk

You are to write a complete program called walk.cc that simulates a two-dimensional random walk. In this simulation a ``frog'' (or other random walker) starts at the origin (0) and goes left or right one step, at random (use a 2-sided dice). The logic for this is shown below.
   for(k=0; k < numSteps; k++)
   {
      if (die.Roll() == 1) 
      {
          position += 1;
      }
      else 
      {
          position -= 1;
      }
   }
Here the variable position keeps track of where the frog is. You are to write a program that does this, but keeps track of how many times the frog visits each location in the range -100 to 100. Don't worry about keeping counts for positions outside this range. To do this, you'll need a vector of size (at least) 201, and you'll need to think about the expression you use to index the array.

Use this program to track a random-walk of 1,000 steps and 10,000 steps. In each case, display a histogram of how many times each position in the range -30 to 30 is visited. You can use the code in rollem.cc as a model, particularly to set up the vector of labels for drawing the histogram.


Words

This is extra. You do NOT need to do this part of the lab, but if you do you can earn 3 extra lab points (half a lab credit).

The program words.cc is designed to keep track of how many times each word in an input file occurs (not letters, complete words). The function Store is NOT currently implemented. You should write this function, search the entire vector wordList for the string word . If the word is already in the vector, you don't need to do anything. If the word is NOT in the vector, it should be added as the last entry and the count of the number of words updated.

If you do this correctly, the number of unique words in ``The Cask of Amontillado'' should be 1,000. How many unique words are there in ``Macbeth'' (this will take a long time to determine).

For extra extra credit (3 more lab points), modify the program words.cc so that it keeps track of how many times each word occurs (not just the unique words, but how many times each unique word occurs). The easiest way to do this is to use a struct:

    struct WordInfo
    {
        string word;
        int count;
    };
and then define Vector<WordInfo> wordList(10000) , but you'll need to modify the code in the Store function as well.

Submission

When your modified programs works, you can submit them, along with a README describing how long it took you and what your impressions of the lab are. To submit, type the command below (Replace N in lab8secN with your section number: 1, 2, 3, 4, or 5).

      submit6 lab8secN README rollem.cc walk.cc histo.cc

The README file should include your name, how long you worked on the program, and anyone you received significant help from. You should also include any comments you have about the lab.

You should receive a message telling you that the programs were submitted correctly. If you get a message indicating that submit6 isn't found, you can always type the full path name:

      ~ola/bin/submit6 lab8secN README rollem.cc walk.cc histo.cc