This assignment will provide practice with pointers, classes, sorting,
overloaded operators, and reasoning about alternative implementations.
(A Makefile and sample input files are accessible in
~ola/cps100/anagram on the acpub system. Be sure to
create a subdirectory anagram for this problem and to
set the permissions for access by prof/uta/ta by typing
fs setacl anagram ola:cps100 read.)
For users outside the acpub system:
- Makefile (site specific)
- anaword.h
- anaword.cc (skeleton)
- anafind.cc (skeleton)
- words5 (4,176 five-letter words from Linux /usr/dict/words)
- words (45,402 words from Linux /usr/dict/words)
Two words are anagrams if they are composed of the same letters. For
example, "bagel" and "gable" are anagrams as are "drainage" and
"gardenia". In this assignment you'll write a program that reads a
dictionary (a sorted list of words) and finds all the words that are
anagrams of each other.
Your program should read a file of words separated by whitespace. You
can assume the words are unique (one occurrence in the file) and
sorted. Two example files are provided; you may want to create smaller
examples to test your program.
The output should be a sequence of lines, each containing words that
are anagrams of each other. For example:
begin being binge
caret carte cater crate trace
argon groan organ
You must use the class Anaword whose declaration is shown
below and in the file anaword.h; you will need
to write the implementation of this class in the file
anaword.cc although this has been started for you.
#ifndef _ANAWORD_H
#define _ANAWORD_H
#include
#include "CPstring.h"
// class designed to facilitate finding Anagrams
// written for CPS 100, 1/16/1997
//
// an Anaword object prints as a regular string, but
// compares as a sorted string
//
// Example: the Anaword version of the string "bagel"
// prints as bagel, but will be compared with
// other Anawords as the string "abegl", the sorted
// version of "bagel". This means that the Anaword
// version of "gable" is equal to the Anaword version
// of "bagel"
//
// operations:
//
// Anaword(const string & word) -- construct from a string
//
// bool Equal(const Anaword & rhs) -- compare rhs for equality
// bool operator == (lhs, rhs) -- compare Anawords lhs == rhs
//
// bool Less(const Anaword & rhs) -- compare rhs for inequality <
// bool operator < (lhs,rhs) -- compare Anawords lhs < rhs
//
// void Print(ostream & out) -- print anaword (unsorted)
// ostream & operator << (ostream, -- print using <<
// Anaword)
class Anaword
{
public:
Anaword(const string & word); // construct from string
bool Equal(const Anaword & rhs) const; // compare for ==
bool Less(const Anaword & rhs) const; // compare for <
void Print(ostream & out) const; // print (regular, unsorted form)
private:
void Normalize(); // helper function, sorts
string myWord; // regular string: "bagel"
string mySortedWord; // sorted form: "abegl"
};
bool operator == (const Anaword & lhs, const Anaword & rhs);
bool operator < (const Anaword & lhs, const Anaword & rhs);
ostream & operator << (ostream & out, const Anaword & a);
#endif
An Anaword object is constructed from a string, and prints as
the string, but is compared using a normalized or
canonical form created by sorting the string. For example, the
code fragment below prints the two lines of output shown.
Anaword a("bagel");
Anaword b("gable");
cout << a << " " << b << endl;
if (a == b) cout << "they're anagrams!" << endl;
Output as shown:
bagel gable
they're anagrams!
The objects a and b are equal because the operator ==
is overloaded for Anaword objects and uses a sorted form of a
word for comparison. The normalized form of "bagel" and "gable" is
"abegl", the sorted version of each word.
You must implement the member functions described in anaword.h
so that the real word (e.g., "bagel") is used for printing, but the
normalized or sorted word is used for comparison using == and <.
Algorithm
You should read all the words in a file whose name is entered by the
user. Each word should be used to construct an Anaword object
using new; you'll create a vector of pointers to
Anaword objects to use in your program. You must use a vector
of pointers because the Anaword class doesn't have a default
constructor, so it's not possible to create
Vector<Anaword> a(100), for example.
The three lines below define a vector of pointers and store an
Anaword object representing the string "bagel" in the first
vector entry. You'll need to do something similar for every word in the
file (the vector you use should grow as necessary).
Vector<Anaword *> list(100);
string s = "bagel";
list[0] = new Anaword(s);
After reading all the words and creating a vector you should sort the
vector. You'll need to compare Anawords using < and ==; this
will require dereferencing pointers to get at the Anawords,
e.g., if (*(list[0]) == *(list[1])).
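The sorting step could be sketched as an insertion sort that moves pointers but compares the objects they point to. The function name SortByTarget is hypothetical, std::vector is assumed in place of the course's Vector class, and the template parameter T stands for any type with operator < (Anaword in the assignment).

```cpp
#include <cstddef>
#include <vector>

// Insertion sort on a vector of pointers. Only pointers are moved;
// every comparison dereferences them, so the order is determined by
// operator < on T, not by the pointer values.
template <class T>
void SortByTarget(std::vector<T *> & list)
{
    for (std::size_t k = 1; k < list.size(); k++)
    {
        T * current = list[k];
        std::size_t j = k;
        // shift larger elements right to open a slot for current
        while (j > 0 && *current < *list[j-1])
        {
            list[j] = list[j-1];
            j--;
        }
        list[j] = current;
    }
}
```

Because elements are shifted only when strictly smaller, this sort is stable: words that compare equal keep their original relative order, so anagrams read from a sorted dictionary stay in dictionary order within their group.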
After sorting, all anagrams will be adjacent to each other, but there
will be lots of singleton words that aren't anagrams of anything. You
should remove all singleton words leaving only anagrams. If the
original sorted vector has N elements, your code should remove
all singletons, leaving only anagrams, in O(N) or linear time.
Once the vector has only anagrams, you can print the anagrams, one set
per line.
Expectations are that you will implement the Anaword class and
write code that prints all anagrams as described above. In addition,
your program, anafind.cc, should use functions so that the body
of main is small. To sort strings and Anaword objects
you should use selection or insertion sort.
To exceed expectations you can do several things, two of which are outlined below (be
creative).
This assignment is worth 20 points; it is a minor assignment. You will
receive 16/20 for a program that meets all expectations reasonably well.
Style of code will count for 5/20 of the points.
You should create a README file for this and all assignments.
All README files should include your name as well as the name(s)
of anyone with whom you collaborated on the assignment and the amount of
time you spent. In addition, you should write any comments you have
about the assignment, what you liked and disliked. For this assignment
you should also include your favorite anagrams in the README file.
To submit your assignment, type:
submit100 anagram README *.cc *.h Makefile
Be sure to submit all source files as shown and your Makefile.
For extra credit you should implement Anaword using another
method described here. You should time both implementations and write
up your findings in your README file. You should take care to
have enough data from running the code to back up claims you make about
the two methods.
In the class Anaword, instead of normalizing
by sorting each word, you should create a histogram of the
number of occurrences of each letter in the word. You'll need a vector
of 26 ints. Initialize each element to 0, then keep track of how many
a's, b's, ... z's there are in a word. For example, the word "cabbage"
has two a's, two b's, and one each of c, e, and g. If the histograms of two words
are equal (requires 26 comparisons to determine) the words are anagrams.
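The histogram idea can be sketched as follows. The function names LetterCounts and SameLetters are hypothetical, std::vector and std::string are assumed in place of the course classes, and the code assumes lowercase letters in a character set where 'a' through 'z' are contiguous (as in ASCII); any other character is ignored.

```cpp
#include <string>
#include <vector>

// Returns a 26-entry histogram: counts[0] is the number of a's,
// counts[1] the number of b's, and so on. Characters outside
// 'a'..'z' are skipped.
std::vector<int> LetterCounts(const std::string & word)
{
    std::vector<int> counts(26, 0);
    for (std::string::size_type k = 0; k < word.length(); k++)
    {
        char c = word[k];
        if ('a' <= c && c <= 'z')
        {
            counts[c - 'a']++;
        }
    }
    return counts;
}

// Two words are anagrams when their histograms match entry by entry.
bool SameLetters(const std::string & a, const std::string & b)
{
    return LetterCounts(a) == LetterCounts(b);
}
```

An Anaword built this way would store the histogram in place of the sorted string. Sorting still needs a < comparison; comparing the two histograms entry by entry from index 0 (which is what operator < on std::vector does) is one consistent choice.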
To submit your assignment, type:
submit100 anagram.xtra README *.cc *.h Makefile
Owen L. Astrachan
Last modified: Fri Jan 17 11:21:36 EST