Word Ladder, CPS 100, Spring 1996

Due Date: Early bonus: February 22, 8:00 am. Final due date February 28, 8:00 am.

Introduction

The files for this assignment can be found in ~ola/cps100/ladder. These files are itemized below, and listings of some of them are attached as an appendix to this document.

You should create a directory cps100 in which all work for this class will be done (see the initial Unix/Emacs/C++ writeup). Each assignment should be done in its own subdirectory; for this assignment, create a directory ladder. All work for this assignment will then be done in the directory ~/cps100/ladder. You should change the acl (access control list) on your cps100 directory to give the professor (and TA/UTA) permission to read your files. The command below gives Owen Astrachan (login id 'ola') read permission on the directory cps100.

   fs setacl cps100 ola read
Change acl permissions before creating the ladder subdirectory or you'll need to change the acl on the subdirectory too.

For this assignment, you'll be using a database of English five-letter words from the Stanford GraphBase (a list compiled by Don Knuth). This list has about 6,000 words in it. There is a smaller file of data to work with also. In your ladder directory, create a link to these data files by typing:

   ln -s ~ola/data/knuth.dat knuth.dat
   ln -s ~ola/data/words5.dat words5.dat

Then you can access these files without specifying a long path name.

Word Ladder: Turning Stone into Money

The input to the program is the name of a word file. The user should then be prompted for two words of the same length (5 characters). The output is a sequence of words in which consecutive words share all but one letter, starting with the first word and ending with the last. One letter can be changed to another letter only if the resulting symbols form a valid word. For example, to turn stone into money, one possible ladder is (replace 't' by 'h', replace 'o' by 'i', etc.):

stone shone shine chine chins coins corns cores cones coney money

All of these words can be found in the Knuth file and in a dictionary. The user will continue to enter words, and word ladders will be searched for, until either of the words entered is NOT 5 letters in length.


Assignment: Part I

You are to write a program ladderq.cc that uses a file of 5-letter words to find the shortest ladder from one word to another using the process outlined below. You must develop a class to do this. The class Ladder has been started for you, but you will need to add more member functions (both public and private).

Your program should:

A sample run:

   > ladderq
   Enter two 5-letter words (length != 5 to end): smart brain
   smart start stark stack slack black blank bland brand braid brain
   Enter two 5-letter words (length != 5 to end): angel devil
   There is no path from angel to devil
   Enter two 5-letter words (length != 5 to end): no more

The file knuth.dat has extraneous information in it. Ignore lines that begin with *, and only process the first 5 characters on other lines. Knuth asks that the file not be altered, hence these restrictions. Code to read this file is included as the member function LoadWords, already written for you to use.


Algorithm

To find the shortest ladder, you should use the templated Queue class provided (from the Weiss book). First, store all of the words from the file in a vector of type Wnode * (this is done in LoadWords).

   struct Wnode
   {
       string word;
       Wnode * prev;
   };

(This assumes the use of the class string from CPstring.h. You can use some other kind of string if you want.)

A ladder is found by putting the starting word (or a pointer to it) on the queue, then putting all words one letter away from this word on the queue, then all words two steps away (one letter away from a word already enqueued), then all words three steps away, etc. As each word is taken off the queue, if the last (target) word is found the process can stop (there may be other words on the queue, but they'll be ignored).

A word w isn't actually stored on the queue; a pointer to a struct containing w is stored. The other field of the struct is a pointer to the word that is one letter away from w and that caused w to be put on the queue (the word's predecessor). For example, if w is house, then pointers to structs containing mouse, louse, douse, horse (and so on) are enqueued, with each struct pointing to house since this word preceded the others and caused them to be enqueued. The first word doesn't have a predecessor. Its prev field cannot be 0/NULL since this value is used for another purpose. An easy fix is to make the pointer self-referential: it points to the struct itself (and this will need to be checked when printing ladders).

More Details

The first word (entered by the user) is looked up in the list of words, and a pointer to the struct containing the word is enqueued. For extra credit your program should be able to handle a first word even if the word is NOT in the list of words (all other words in the ladder, except perhaps for the last, must be in the list of words).

Put a pointer to the struct containing the word onto the queue (it's a queue of Wnode pointers). Then repeat the dequeue/enqueue process below.

Dequeue an element (it's a pointer). Find all words one letter apart from the dequeued word. If one of these is the target word, you're done (you can also stop early if one of these words is one apart from the target word). Otherwise, enqueue each of the words found if it hasn't been queued up before. You can use the prev pointer fields in a Wnode to determine this: initially all prev fields should be set to 0/NULL, and a word's prev field is set when the word is enqueued. This means each word is enqueued at most once.
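The dequeue/enqueue process can be sketched as below. This is one possible shape under stated assumptions, not the required design: std::queue, std::string, and std::vector stand in for the Weiss Queue and the course string class, FindLadder is a hypothetical free function rather than a Ladder member, and the target is tested when a word is dequeued (as in the Algorithm section) rather than when its neighbors are scanned.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

struct Wnode {
    std::string word;
    Wnode* prev;      // 0 until the word has been enqueued
};

// True when a and b have equal length and differ in exactly one position.
bool IsOneApart(const std::string& a, const std::string& b)
{
    if (a.length() != b.length()) return false;
    int diff = 0;
    for (std::string::size_type k = 0; k < a.length(); k++) {
        if (a[k] != b[k]) diff++;
    }
    return diff == 1;
}

// Breadth-first search from first toward target over the loaded words.
// Returns the node holding the target word, or 0 if no ladder exists.
// Assumes first is one of the pointers stored in words.
Wnode* FindLadder(std::vector<Wnode*>& words, Wnode* first,
                  const std::string& target)
{
    for (std::size_t k = 0; k < words.size(); k++) {
        words[k]->prev = 0;           // nothing has been enqueued yet
    }
    first->prev = first;              // self-reference marks the first word

    std::queue<Wnode*> q;
    q.push(first);
    while (!q.empty()) {
        Wnode* current = q.front();
        q.pop();
        if (current->word == target) {
            return current;           // words still on the queue are ignored
        }
        for (std::size_t k = 0; k < words.size(); k++) {
            Wnode* next = words[k];
            if (next->prev == 0 && IsOneApart(current->word, next->word)) {
                next->prev = current; // remember the predecessor
                q.push(next);         // each word is enqueued at most once
            }
        }
    }
    return 0;                         // queue emptied without reaching target
}
```

Resetting every prev field at the start of each search matters because the user can ask for several ladders in one run.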

When the target word is reached, you'll need to print out the ladder from the first word to the target word. The prev pointer in each Wnode stores the information needed to recreate the ladder, but following prev pointers yields the ladder backwards, and it should be printed from first word to last. You can use recursion to do this. Alternatively, you can store the words in an array/vector and print them in reverse with a loop, without recursion.
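The vector approach might look like the sketch below. It assumes the search has already filled in the prev pointers and that the first word's prev points to itself, as described in the Algorithm section. BuildLadder is a hypothetical name, and std::string and std::vector stand in for the course classes.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Wnode {
    std::string word;
    Wnode* prev;
};

// Follows prev pointers from the target node back to the first word
// (recognized because its prev points to itself), then reverses the
// collected words so the ladder reads from first word to target.
// Assumes last is not 0 and that the chain of prev pointers ends at
// a self-referential node.
std::vector<std::string> BuildLadder(const Wnode* last)
{
    std::vector<std::string> ladder;
    const Wnode* current = last;
    while (current->prev != current) {
        ladder.push_back(current->word);
        current = current->prev;
    }
    ladder.push_back(current->word);             // the first word itself
    std::reverse(ladder.begin(), ladder.end());  // backwards -> forwards
    return ladder;
}
```

Printing is then a simple loop over the returned vector. A recursive version would instead print the predecessor's ladder before printing the current word.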


Ladder Member Functions

You must implement the functions described below. You'll find it useful to implement other member functions. Sometimes the functions should be private. This is the case when a member function is a helper function for other member functions, but shouldn't be called by the user. Making a helper function private ensures that only other member functions can access the helper function, but client programs cannot.

You'll probably find it useful to write a function IsOneApart() that determines if two strings are one letter apart. To do this, count the letters that are equal. If this count is one less than the total number of letters in the words, the words are one apart. This function does NOT need to be a member function. It has two strings as parameters (const reference) and returns true if the strings are one letter apart. You can just define this function in ladder.cc and use it there.
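A minimal sketch of the counting approach just described is shown here. It uses std::string in place of the course string class, and the check that the lengths match is an added assumption for safety (the program only ever compares 5-letter words).

```cpp
#include <string>

// Returns true when a and b are one letter apart: the number of
// positions holding equal letters is one less than the word length.
// Assumption: strings of different lengths are never one apart.
bool IsOneApart(const std::string& a, const std::string& b)
{
    if (a.length() != b.length()) {
        return false;
    }
    std::string::size_type same = 0;
    for (std::string::size_type k = 0; k < a.length(); k++) {
        if (a[k] == b[k]) {
            same++;
        }
    }
    return same + 1 == a.length();
}
```

For example, IsOneApart("stone", "shone") is true, while comparing a word with itself gives false because all five letters match.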

You'll probably want debugging code/member functions to verify what's going on. If you build helping/debugging member functions into your class you'll save time in the long run since the member functions can be used to help debug code.

You may want to write a separate function to find a word in the vector of words (pointers to words) read in. You can write this code inline (rather than making a function out of it), but the function can be useful in debugging and developing.
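Such a lookup could be as simple as the linear search sketched below. FindWord is a hypothetical name, and std::string and std::vector are stand-ins for the course classes. Returning 0 for a missing word is an assumption chosen to match the 0/NULL convention used elsewhere in this writeup.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Wnode {
    std::string word;
    Wnode* prev;
};

// Linear search through the loaded words.
// Returns the node holding word, or 0 if the word is not in the list.
Wnode* FindWord(const std::vector<Wnode*>& words, const std::string& word)
{
    for (std::size_t k = 0; k < words.size(); k++) {
        if (words[k]->word == word) {
            return words[k];
        }
    }
    return 0;
}
```

A 0 result is also a convenient way to detect that the user's first word is not in the list.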

You may find it useful to write a function that gives back all the words that are one letter apart from a given word. This list of words could be stored in a vector or in a templated sequence (random access isn't needed). A templated version of the sequence class is provided in case you want to use a sequence somewhere in your code. It's not at all necessary to do this, and there is no grade improvement, but it may be helpful.

Using Templates

To use the templated Queue class you'll need to use a file called template.cc. Template code needs to be seen by the compiler. To this end, all .h and .cc files are #included in a separate file template.cc. This file is illustrated below.

   #include "sequence.h"
   #include "sequence.cc"
   #include "seqiterator.h"
   #include "seqiterator.cc"
   #include "QueueAr.h"
   #include "QueueAr.cc"
   #include "ladder.h"

   template class Queue<Wnode *>;
   template class Sequence<Wnode *>;
   template class SequenceIterator<Wnode *>;

(You may not need the sequence code. You can remove it if you don't want it; remove references to it from the Makefile too.)

If you want several kinds of queue, just put another definition in the template.cc file. Once the template.cc file is compiled to template.o, a change in ladder.cc requires recompiling only ladder.cc and relinking; the template code is not recompiled. This will make your rebuilds much faster.


Submitting

You should create a README file for this and all assignments. All README files should include your name as well as the name(s) of anyone with whom you collaborated on the assignment and the amount of time you spent.

To submit your assignment, type:

   submit100 ladder README ladderq.cc ladderq.h template.cc .....

You can submit by typing make submit if the correct README file is in the directory from which you submit. You can always edit the Makefile to add or change filenames.

Extra Credit

Write a new version of ladderq.cc called ladxtra.cc. This program should process the words so that only "good" matches are tried when ladders are found. The preprocessing step will take a long time, but word ladders will be found very quickly.

The idea is that for each word, all words one letter away are determined (and stored somehow) when the words are loaded. When looking for candidate words to enqueue, only words that are one letter away (these are already known) are checked for previous use. This saves searching through the entire list of words and checking whether each is one letter away.
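One way to store the precomputed information is a vector of index lists, sketched below. This is only an illustration of the idea, not a required design: BuildNeighbors is a hypothetical name, words are held as plain std::string values rather than Wnode pointers, and std::vector stands in for the course classes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True when a and b have equal length and differ in exactly one position.
bool IsOneApart(const std::string& a, const std::string& b)
{
    if (a.length() != b.length()) return false;
    int diff = 0;
    for (std::string::size_type k = 0; k < a.length(); k++) {
        if (a[k] != b[k]) diff++;
    }
    return diff == 1;
}

// Builds, for every word, the list of indices of the words that are
// one letter away from it. neighbors[k] describes words[k].
// Every pair of words is compared once, which is why this step is slow.
std::vector<std::vector<std::size_t> >
BuildNeighbors(const std::vector<std::string>& words)
{
    std::vector<std::vector<std::size_t> > neighbors(words.size());
    for (std::size_t j = 0; j < words.size(); j++) {
        for (std::size_t k = j + 1; k < words.size(); k++) {
            if (IsOneApart(words[j], words[k])) {
                neighbors[j].push_back(k);   // the relation is symmetric,
                neighbors[k].push_back(j);   // so record both directions
            }
        }
    }
    return neighbors;
}
```

With about 6,000 words this makes roughly 18 million comparisons once, at load time. Each search afterwards only visits the stored neighbors of a dequeued word instead of scanning the whole list.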

Submitting Extra Credit

To submit the extra credit assignment, type:

   submit100 ladderXtra README ladxtra.cc