Introduction
Today you'll explore reading and writing from streams, use of the
apstring and apvector member functions, pointers,
structs, and classes. You'll also use a class that helps measure
execution times of different code segments.
Classes and libraries used include:
- SysTimer a class designed to measure execution
times of code segments, declared in systimer.h and defined in systimer.cpp.
- ClockTime a class with overloaded operators
for manipulating times expressed as hours:minutes:seconds,
declared in clockt.h and defined
in clockt.cpp
- Functions for formatting output, some declared in
<iostream.h> and others in <iomanip.h>
Streams, strings, timing
We'll discuss two programs as a springboard to the syntax and semantics
of C++. These two programs are:
- words.cpp, a program
for tracking how many times each word
in a text file occurs.
- cdsum.cpp a program for
determining the total time of all the tracks on a CD.
Sample runs of each program are shown below, user-entered input is in
italics.
Only the last part of the run of
words.cpp is shown below, the
output is lengthy (all the words in a data file).
prompt> enter name of file: data\poe.txt
...
1 rampart
1 half
1 century
1 mortal
1 disturbed
1 pace
1 requiescat
# of words = 809
total time for update = 1.3
Here is a run of cdsum.cpp
on a file of Van Morrison's The Best of Van Morrison. The
first three lines of the file vanmor.dat are
3:46 Bright Side Of The Road
2:36 Gloria
4:31 Moondance
A run of the program is shown below:
enter name of data file: data\vanmor.dat
0:03:46 Bright Side Of The Road
0:02:36 Gloria
0:04:31 Moondance
0:02:40 Baby Please Don't Go
0:04:19 Have I Told You Lately
0:03:04 Brown Eyed Girl
0:04:21 Sweet Thing
0:03:22 Warm Love
0:03:57 Wonderful Remark
0:02:57 Jackie Wilson Said (I'm In Heaven When You Smile)
0:03:14 Full Force gail
0:04:28 And It Stoned Me
0:02:46 Here Comes The Night
0:03:04 Domino
0:04:05 Did Ye Get Healed
0:03:32 Wild Night
0:04:40 Cleaning Windows
0:04:54 Whenever God Shines His Light
0:04:54 Queen Of The Slipstream
0:04:44 Dweller On The Threshold
------------------------------
total = 1:15:54
Group Activity
- Create two constructors for the WordStat struct
in words.cpp:
a default constructor, and a constructor that takes
one string as an argument, initializes the info field
with the string, and sets the count field to 1 ---
you can use default parameters for the count field if
you want to, but it should be possible to define:
WordStat ws("hello");
WordStat empty;
Indicate what changes this engenders in the program
- What changes are needed to use binary search instead of
sequential search in words.cpp
- show-me Design a struct named
CDinfo to hold both the time of a CD-track, the number
of the track (1, 2, ...), and the title of the track. Then read
information from a file (as in cdsum.cpp), but store it in a
array/apvector of structs, and print the elements sorted by
track time, shortest track first. Write a prototype for a
function to sort a vector of CDinfo structs, but don't write the
function itself, just call it. Do, however, show the definition
of the struct, an array of structs, and the modified loop from
cdsum.cpp to read
information into the array.
- Write a member function Less with prototype below,
and use it to implement operator < for
ClockTime objects.
bool ClockTime::Less(const ClockTime & rhs)
// postcondition: returns true if *this < rhs, otherwise returns false
- Implement operator == for ClockTime
objects by creating a member function Equal similar
to the function Less in the previous exercise.
Lab Activity
The goal in the morning lab is to get more practice with
structs, vectors, and using classes. There are several parts of the lab,
you probably won't finish all of them in the morning, but the afternoon
session continues with activities related to this one.
- Modify the program cdsum.cpp by adding
some or all of the features below.
- Read information into an array of CDinfo
structs (see the group exercise). Print the information in
the array after reading.
- Sort the CDs read by length of track. Do this by
overloading operator < for CDinfo
objects (implement member function Less as
discussed in the group exercises).
- Implement a Shuffle function that randomly
permutes the CDinfo tracks stored in an array. Print the
contents before and after shuffling.
- Modify words.cpp
to use binary search and shift elements as needed to add words
in sorted order rather than at the end of the vector. Time how
long this method of word-search takes and compare it to the
sequential search version. Move the calls of StripPunc
and ToLower into the Update function.
Lunch
Group Activity
The goal of the afternoon session is to practice with class design,
operator overloading, and using classes and streams. The program wordify.cpp will be used as the
basis around which a new class will be built and an existing class
modified --- it's a class-version of the program
words.cpp used in the morning
session.
Struct to Class
The first activity will be to turn the struct WordStat into a
class. At first, add member functions to the struct, but leave the data
public. Eventually you'll move the struct to a full class with public
functions, and private data.
The WordStat struct is used in the functions
WordList::Update and WordList::Search. In both
functions different fields of the struct are accessed directly, either
to compare info to a string or to set/increment the
count field.
- Create two constructors for the WordStat struct.
(see the morning exercise --- this is a repeat).
- Create three WordStat member functions:
- Print takes on ostream parameter and writes
the WordStat object to the stream.
- Equal takes a WordStat object as a
(const reference) parameter named rhs and returns
true iff rhs and *this are equal, where
only info fields are checked to determine
equality.
- show-me: Less is the analog of
of equal but returns true iff *this is less than
parameter rhs.
- Now overload all comparison operators: <, >, ==, <=
>=, and != using the member functions
Equal and Less. You should implement the
four operators below as show-me operators,
and think about the others.
-
bool operator == (const WordStat & lhs, const WordStat & rhs)
-
bool operator < (const WordStat & lhs, const WordStat & rhs)
-
bool operator > (const WordStat & lhs, const WordStat & rhs)
You should also implement (or think about) an overloaded insertion
operator <<.
- Write a parameterless member function Increment
that bumps the value of count by 1.
Explain how to modify the code in WordList::Search to take
advantage of the overloaded operators so as not to
access any info fields directly --- you should think about
overloading operator == and operator < (and
others if you see fit).
Similarly, what modifications are needed in WordList::Update
and WordList::Print to use overloaded member functions
and not make any direct access to fields info and count.
Lab Activity
In the lab you'll make modifications to the program wordify.cpp based on the group
exercises. There may not be time to do all the modification.
The first exercise is to implement the changes discussed in the group
activities so that WordStat is a class with member functions
instead of a struct. You'll need to implement two constructors,
overloaded comparison operators (using member functions Less
and Equal), printing functions/operators, and an
Increment function.
Make sure that data is private, this will ensure that you can't access
it in any of the WordList member functions --- you'll be forced
to use WordStat member functions to access/alter the values
stored in WordList::myList.
When you've made these changes, time the program on
data/poe.txt and (if machines have enough memory) on
data/hamlet.txt. Write the times down since you'll be making
changes to the program and re-timing.
Make a copy of the working program so that you can go back to it if you
need to, then begin converting wordify.cpp to use pointers as
described below.
Pointers to Structs
Much of the time in the WordList::Update function is spent
shifting elements. Each shift requires a string and an integer to be
copied from one array cell to another. You'll make shifting faster by
using pointers to WordStat objects rather than the objects
themselves. Change the declaration of myList in the class
WordList to
apvector myList;
You should initialize all values in myList to NULL. You can do
this with a
a loop in the body of the constructor or (at least on some compilers)
by using the definition myList(size,NULL) in the initializer list.
Now you'll need to search for uses of myList[...] and change
them as appropriate. For example, in the original version of
wordify.cpp you'll find (in WordList::Search)
if (myList[mid].info == key)
You've probably changed this to what's shown below to take advantage of
overloaded operator ==.
if (myList[mid] == key)
Now you'll need to write
if (*myList[mid] == key)
to dereference the pointer in myList[mid] and get to the object
it points to. You'll make similar changes elsewhere, but do
not change the assignment in the loop of
WordList::Update
myList[loc+1] = myList[loc];
this stays the same to copy pointers rather than objects. When your
program works time it again on the same data files you used originally.
Tracking Line Numbers
Instead of tracking just the number of occurrences, you'll modify the
program (either the original wordify.cpp or the
pointer-modified version) to track the line numbers on which each word
occurs. This will require modifying the WordStat class and
using string streams to read. Note: string streams are NOT part
of the APCS C++ subset. However, it's useful to know how to
read line oriented data.
To read lines at a time, you'll replace the reading loop test
while (input >> word) // reading word succeeded
with the test
while (getline(input,word)) // reading word succeeded
When the getline function succeeds, one line of text is read
into the string word. To read strings from the word, you could
use apstring::find to search for spaces, but it's easier to use
a string stream.
You access the class istrstream (input string stream) by
including the header file <strstream.h>. Then you make
the modifications shown below to the reading loop from
wordify.cpp.
while (getline(input,word))
{
istrstream sstream(word.c_str());
string s;
while (sstream >> s)
{
// process s, i.e., tolower, strippunc, call UpDate
}
}
You must define the variable sstream inside the loop
so that it's constructed each time the outer loop iterates.
You'll also need to modify the WordStat class to track line
numbers. Add an array/apvector of integers (and a count of the number
of values stored in it) and update the Print function to print
the array and the Increment function to take a line number as a
parameter. Don't store the same line number more than once --- if the
word "the" occurs twice on line 82, then 82 should appear only once in
the vector of line numbers associated with "the".
Linked Lists
This part of the lab is aimed at the AB course. If you're not
comfortable with linked lists you should work on another problem. One
is suggested at the end of the lab.
Instead of using an array of pointers, you'll make modifications so that
a linked list of WordStat objects is used. You can do this in
two ways: make the WordStat class back into a struct (public
data) and add a new field WordStat * next. However, it would
be better to create a new struct Node as shown below.
struct Node
{
WordStat info;
Node * next;
Node (const WordStat & ws, Node * follow = NULL)
: info(ws),
next(follow)
{}
};
This struct can be declared within the private section of the
WordList class, or outside the class. Then you should use
Node * myFirst; // points to first node
Node * myLast; // points to last node
in place of the private WordList::myList variable. Create
a dummy/header node in the WordList constructor.
You'll need to make many changes in the code to deal with linked lists.
Searching for values requires changing WordList::Search. In
WordList::Update you'll need to add new nodes (at the end of
the list) using statements similar to:
myLast->next = new Node(WordStat(word));
myLast = myLast->next;
When you've got this working, try the experiments below.
- Each time a word is found (e.g., in WordList::Update),
move the node to the front of the list. The idea here is that
frequently accessed words will be moved to the front of the list.
You'll need to think about the best way to unlink a node i.e., you could
alter WordList::Search to return two values: a pointer to a
node and a pointer to the node before, or you could find the node before
with a loop if the search is successful.
Time this program --- if it's faster why? if not, why not?
- Instead of moving a node all the way to the front, move it one
place closer to the front. Time again.
- Instead of adding new nodes at the end of the list, add new nodes
at the front of the list. Time again.
Extra Activities
Key Word In Context
This problem is (extensively) described
in a document about key-word-in-context (KWIC).
A large portion of the design of this problem is provided for you. The
idea is to get used to multi-class programs so you get a feel for object
oriented (class-based) design. You won't finish this program in one
day, we'll discuss the program and the design on the last day of the
workshop.
Owen L. Astrachan
Last modified: Thu Jun 12 09:51:18 EDT