Introduction
This lab will provide practice using vectors, classes, and structs. It will also provide a good lead-in to part 2 of the second assignment. Be sure that you have created a "cps100e" subdirectory. If you haven't and don't know how, be sure to ask a TA for help. It's a good idea to make the directory readable by the professors teaching the course and by your lab TA. To do this you can type what's shown below (and substitute any login-id for "ola").
fs setacl cps100e ola read
Change into your "cps100e" directory and create a "lab2" subdirectory (use "mkdir lab2"). Then change into your lab2 subdirectory and copy files by typing what's below (don't forget the trailing dot).
cp ~ola/cps100e/lab2/* .
From the "lab2" subdirectory, make a link to several text data files. To do this type
ln -s ~ola/data data
Then you'll be able to use the directory "data" to access several big text files. You can type "ls data" to see what's in it.
If you type "ls" you should see the following files: Makefile, words.cc, logsearch.cc, logins.h, logins.cc (and data which is a link).
Use the menu-bar to invoke emacs or type "emacs &" to get an emacs window. You want to use the ampersand & which leaves you with control of your xterm (so you can run programs from it). Be sure to leave the emacs window up, don't quit it until you log off.
Counting Words
Use the emacs commands "C-x C-f" to find/load the file words.cc". Then compile this command using "C-x c" and type "make words" in the minibuffer. If this doesn't invoke the compile command you can always type "M-x compile" (that's ESCAPE-x compile).
When the program runs, you'll be prompted for the name of a file. Enter "data/poe.txt" and the program will keep track of all the unique words in Edgar Allan Poe's ``The Cask of Amontillado''. The program stores each word in a vector, and checks to see if a word is already present in the array before adding it again. The program prints how long reading takes, and how long sorting the words read in takes.
How long does it take to sort the words in Poe's ``cask'' __________________? How long to read them in ___________________? You should show these answers to a TA.
In its current version, the program prints all the words to the screen. You are to change this so that the words are printed to a file whose name you enter.
Replace the call "list.Print(cout)" with the code sequence below.
ofstream output;
cout << "output file: ";
cin >> filename;
output.open(filename);
list.Print(output);
Now run the program again. Find the alphabetically fifth word in Poe's ``cask'', what is it ______________________ ?
Word Counting
You are to modify the word.cc program to print the number of times each unique word appears in the text.
The struct "WordList" contains a vector of strings. You're to change it so that it uses a vector of type "Info" defined below. (Do NOT forget the semi-colon after the } !!).
Define this struct before the class "WordList" and change the vector
definition to "Vector
Try to compile the program. You'll get errors due to the change in the
definition. For example, in "Search" you'll need to change
the line:
to
so that the "word" field is compared to key. In the
"Update" function, if the word is already in
"myList" then the count of how many times it occurs should be
incremented. Note that "Search" is called in
"Update". This is an example of one member function calling
another member function.
If "key" is NOT in "myList", the new word should be
added to the end of "myList" as is done in the original
version of "Update".
You'll need to make some changes in "Sort" and
Print" as well, but these changes are syntactic and based on
the change to using "Info" vectors rather than
"string" vectors.
How do you know if your program works correctly? You should create a small
data file with about ten words, and test your program on it.
After your program works, change the "Sort" function so that it
compares "count" fields of each item in the list rather than using
an alphabetic comparison. Doing this requires changing ONLY the line
that compares "list[k].word" and "list[min].word".
Then rerun the program with Poe's cask and determine what
the top 5 most frequently occurring words are. Put these
words and their counts, suitably labelled, in your README.
You should run the program on "data/melville.txt" as well, this
should take longer. Put the top 5 most frequently occurring words
and their counts from this file (Bartelby the Scrivener) into your README.
Login Search
Now load the file "logsearch.cc", compile it, and run it. When
prompted for the name of an input file, enter "data/acpub.logs", a file
that contains all names and logins of users on the acpub system.
Search (using the 'l' command) for yourself by entering your login.
Then search using the 'n' command for anyone with your name. Search
again using the 'n' command for anyone with 'owen' as part of their
name. How many such users are there ________________________ ?
You should study the programs "logsearch.cc" and "logins.cc"
carefully since they can help with the CD data base
assignment. You should probably print these files.
Re-reading
In the current program, to read another file of logins
(e.g., one from computer science machines) you quit the
program and run it again. Add an option for the user
specified by the character 'r' that will read
logins from a file specified by the user. To do this,
move the code that prompts for a file name and calls
"ReadLogs" into a new function called "DoRead". You
should do this using the mouse to cut-and-paste (see the
Edit menu). Drag over some text with the mouse while
pressing
the left button to highlight it. Then choose cut (or copy)
followed by paste from the Edit menu. If you don't have an
Edit menu you can cut a chunk of text by putting the cursor
at the beginning of the text and marking the start by typing
"C-' '", that's control-spacebar. Then move the cursor
to the end of the block of text you want to copy. You can
cut this block of text by typing "C-w" (control-W) and
you can copy it by typing "M-w" (escape-w). To paste it
move the cursor and type "C-y". You can undo without a
menu by typing "C-x u" (control-x followed by u).
Before opening a new file, you'll need to close the current stream.
To close the stream called open, you would use:
Printing the program
Practice printing. To print the file logins.cc by typing
If this doesn't
work ask a TA. (You can substitute another printer for
teerlp1).
Submission
To turn in programs: When your modified programs works, you can
submit them, along with a README describing how long it took you and what
your impressions of the lab are. If you edit the Makefile, you can type
"make submit" to submit the appropriate
files, or you can type the command below (Replace N in lab2secN with
your section number: 1, 2, or 3).
submit100e lab2secN README words.cc logsearch.cc
The README file should include your name, how long you worked on the
program, and anyone you received significant help from. You should also
include any comments you have about the lab.
You should receive a message telling you that the programs were submitted
correctly. If you get a message indicating that
"submit100e" isn't found, you can always type the full path
name:
~ola/bin/submit100e lab2secN README words.cc logsearch.cc
struct Info
{
string word;
int count;
};
if (myList[k] == key)
if (myList[k].word == key)
input.close();
print -Pteerlp1 logins.cc