Due Date: Early bonus: Friday, February 14
Final Due Date: Monday, February 17
For this assignment you'll read words from text files and keep track
of how many times each word in the file occurs. The program
allword7.cc (also shown as word7.cc) in the "Tapestry" text is an
example of such a program. It is available in
~ola/cps100/linkcount on the acpub system or linked from here.
You will use three different methods for organizing the linked lists in
this assignment. You must time each method on three different files and
include these comparisons in your README file.
Time each version of the program on the three input files discussed at
the end of the assignment.
In your README you must report on the time taken for each part. You
should offer reasons for why the timings are different. You should try
to do each run on the same machine, and perhaps do a run more than once
and average the times.
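One way to collect and average the timings is sketched below. This is only an illustration: it uses the standard <chrono> header rather than any timer class from the course library, and the names TimeSeconds and AverageSeconds are hypothetical helpers, not part of allword7.cc.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: run work() once and return the elapsed
// wall-clock time in seconds.
template <typename Fn>
double TimeSeconds(Fn work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical helper: run work() the given number of times and
// return the mean elapsed time in seconds.
template <typename Fn>
double AverageSeconds(Fn work, int runs) {
    assert(runs > 0);
    double total = 0.0;
    for (int i = 0; i < runs; ++i) {
        total += TimeSeconds(work);
    }
    return total / runs;
}
```

The function passed as work would build a fresh list from the input files on each run, so that every run measures the same amount of work.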
Files
Simple Counting
The method shown in allword7.cc stores each word of a file in
a node of a linked list together with a count of how many times the word
occurs. The declaration for the struct is reproduced below.
Add To Back
For Part 2, you must modify the function Update so that new
words are added to the end of the linked list. In the current version
of Update, words are kept in alphabetical order. You must
change this so that a pointer to the last node of the list is maintained
(e.g., myLastNode). Adding a new node to the linked list
should then be an O(1) operation; in the current program it is
an O(n) operation for an n-node list.
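A minimal sketch of adding to the back with a last-node pointer follows. The Node layout, the WordList wrapper, and the AddToBack helper are assumptions made for illustration; the actual declarations in allword7.cc may differ, and only the name myLastNode comes from the text above.

```cpp
#include <cassert>
#include <string>

// Hypothetical node layout; the struct in allword7.cc may differ.
struct Node {
    std::string word;
    int count;
    Node* next;
};

// Hypothetical wrapper that keeps pointers to both ends of the list.
struct WordList {
    Node* myFirst = nullptr;     // first node, or nullptr when empty
    Node* myLastNode = nullptr;  // last node, or nullptr when empty
};

// Append a new word with count 1 in O(1) time via the tail pointer.
void AddToBack(WordList& list, const std::string& word) {
    Node* node = new Node{word, 1, nullptr};
    if (list.myLastNode == nullptr) {
        // Empty list: the new node is both the first and the last node.
        assert(list.myFirst == nullptr);
        list.myFirst = node;
    } else {
        // Link the new node after the old last node.
        list.myLastNode->next = node;
    }
    list.myLastNode = node;
}

// Linear search: bump the count on a hit, append on a miss.
void Update(WordList& list, const std::string& word) {
    for (Node* p = list.myFirst; p != nullptr; p = p->next) {
        if (p->word == word) {
            p->count++;
            return;
        }
    }
    AddToBack(list, word);
}
```

In this sketch the search is still O(n); only the insertion of a new word becomes O(1), because no traversal is needed to find the end of the list.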
When your program works you should time it on the three input files
discussed below.
Move To Front
For Part 3, you must modify the function Search so that
whenever a word is found, it is moved one position closer to the front
of the list. You should use the same Update from Part 2 so that new
words are added to the end of the linked list. You must continue
to use a singly linked list for this part of the assignment.
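Moving a node one position forward in a singly linked list requires remembering the two nodes before it during the search. The sketch below shows one possible approach; the Node layout and the function name SearchAndMoveCloser are hypothetical, and the real Search in allword7.cc may have a different signature.

```cpp
#include <cassert>
#include <string>

// Hypothetical node layout; the struct in allword7.cc may differ.
struct Node {
    std::string word;
    int count;
    Node* next;
};

// Search for word. On a hit, increment its count and swap the node with
// its predecessor by relinking pointers. first and last are references
// so they can be updated when the ends of the list change.
// Returns true if the word was found.
bool SearchAndMoveCloser(Node*& first, Node*& last, const std::string& word) {
    Node* beforePrev = nullptr;  // node two positions before cur
    Node* prev = nullptr;        // node just before cur
    Node* cur = first;
    while (cur != nullptr && cur->word != word) {
        beforePrev = prev;
        prev = cur;
        cur = cur->next;
    }
    if (cur == nullptr) {
        return false;            // not found
    }
    cur->count++;
    if (prev == nullptr) {
        return true;             // already first, nothing to move
    }
    assert(prev->next == cur);
    // Relink from beforePrev -> prev -> cur to beforePrev -> cur -> prev.
    prev->next = cur->next;
    cur->next = prev;
    if (beforePrev == nullptr) {
        first = cur;             // cur is the new first node
    } else {
        beforePrev->next = cur;
    }
    if (last == cur) {
        last = prev;             // prev is the new last node
    }
    return true;
}
```

Note that the last-node pointer from Part 2 has to be updated when the found node was the last one, otherwise later additions to the back would be linked in the wrong place.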
Optional: Jump to Front
This part is optional; you do not need to do it.
Instead of moving a found node one position closer to the front, move
each found node all the way to the front. This means that when a node
is found, it is unlinked and made the first node.
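The unlink-and-relink step might look like the following sketch. As before, the Node layout and the function name SearchAndJumpToFront are assumptions for illustration only.

```cpp
#include <cassert>
#include <string>

// Hypothetical node layout; the struct in allword7.cc may differ.
struct Node {
    std::string word;
    int count;
    Node* next;
};

// Search for word. On a hit, increment its count, unlink the node, and
// make it the first node of the list. Returns true if the word was found.
bool SearchAndJumpToFront(Node*& first, Node*& last, const std::string& word) {
    Node* prev = nullptr;
    Node* cur = first;
    while (cur != nullptr && cur->word != word) {
        prev = cur;
        cur = cur->next;
    }
    if (cur == nullptr) {
        return false;            // not found
    }
    cur->count++;
    if (prev == nullptr) {
        return true;             // already the first node
    }
    assert(prev->next == cur);
    prev->next = cur->next;      // unlink cur from its current position
    if (last == cur) {
        last = prev;             // cur was last, so prev becomes last
    }
    cur->next = first;           // relink cur in front of the old first node
    first = cur;
    return true;
}
```

Only one predecessor pointer is needed here, since the found node is spliced in at the front rather than swapped with its neighbor.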
Input Files
You must time the program on three input files from ~ola/data
on the acpub system: romeo.txt, macbeth.txt, and
caesar.txt. These are three plays by Shakespeare.
All three files should be run at once!
This means that when you time the final version of each
part, you will have all the words from all three plays stored in
memory. This may take a while, so be sure you've tested each part before
you time the part.
Submit
To submit, use
submit100 timelink README allword7.cc (other files)
The final version of allword7.cc that you submit should be the
one that uses move-to-front.
Owen L. Astrachan
Last modified: Fri Feb 7 13:43:52 EST