Word Ladder, CPS 100, Spring 1996
Due Date: Early bonus: February 22, 8:00 am. Final due date February 28, 8:00 am.
You should create a directory cps100 in which all work for this class will be done (see the initial Unix/Emacs/C++ writeup). Each assignment should be done in its own subdirectory; for this assignment, create a directory ladder. All work for your assignment will then be done in the directory ~/cps100/ladder. You should change the acl (access control list) on your cps100 directory to give the professor (and TA/UTA) permission to read your files. The command below will give Owen Astrachan (login id 'ola') read permission on the directory cps100.
fs setacl cps100 ola read
Change acl permissions before creating the ladder subdirectory or you'll need to change the acl on the subdirectory too.
For this assignment, you'll be using a database of English five-letter words from the Stanford GraphBase (a list compiled by Don Knuth). This list has about 6,000 words in it. There is a smaller file of data to work with also. In your ladder directory, create a link to these data files by typing:
All of these words can be found in the Knuth file and in a dictionary. The user continues to enter pairs of words, and the program searches for a word ladder between them, until either of the words entered is NOT 5 letters in length.
Your program should:
The file knuth.dat has extraneous information in it. Ignore lines that begin with *, and only process the first 5 characters on other lines. Knuth asks that the file not be altered, hence these restrictions. Code to read this file is included as the member function LoadWords, already written for you to use.
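The filtering rule above can be sketched for a single line of input. This is only an illustration and is not the provided LoadWords member function; the name ExtractWord and its signature are assumptions made for this example.

```cpp
#include <string>

// Hypothetical helper, not the provided LoadWords member function.
// Returns true and stores the first five characters of `line` in `word`
// when the line holds a word. Returns false for lines that begin with '*'
// and for lines too short to hold a five-letter word.
bool ExtractWord(const std::string& line, std::string& word)
{
    const std::string::size_type kWordLength = 5;
    if (line.empty() || line[0] == '*' || line.size() < kWordLength) {
        return false;
    }
    word = line.substr(0, kWordLength);  // ignore anything after the word
    return true;
}
```

A reading loop would call this once per line and keep only the words for which it returns true, so the data file itself never needs to be altered.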
A ladder is found by putting the starting word (or a pointer to it) on the queue, then putting all words 1 letter away from this word on the queue, then putting all words 2 letters away on the queue, then all words 3 letters away, etc. As each word is taken off the queue, if the last (target) word is found the process can stop (there may be other words on the queue, but they'll be ignored).
A Word w isn't actually stored on the queue; a pointer to a struct containing w is stored. The other field of the struct is a pointer to the word that is one letter away from w and that caused w to be put on the queue (the word's predecessor). For example, if w is house, then pointers to structs containing mouse, louse, douse, horse (and so on) are enqueued, with each struct pointing to house since this word preceded the others and caused them to be enqueued. The first word doesn't have a predecessor. Its field cannot be 0/NULL since this value is used for another purpose. An easy fix is to make the pointer self-referential: it points to the struct itself (and this will need to be checked when printing ladders).
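One way to picture the struct and the self-referential first word is the sketch below. The field name word and the three helper functions are assumptions made for illustration; only the name Wnode and the prev field come from this writeup.

```cpp
#include <string>

// Sketch of the queue element. The field name `word` is an assumption;
// `prev` is the predecessor pointer described in the text.
struct Wnode
{
    std::string word;  // the five-letter word
    Wnode* prev;       // predecessor; 0 means "never enqueued"
};

// Hypothetical helper: mark `start` as the first word of a ladder by
// making its prev pointer refer to the node itself.
void MarkAsStart(Wnode* start) { start->prev = start; }

// Hypothetical helper: true if `node` is the first word of a ladder.
bool IsStart(const Wnode* node) { return node->prev == node; }

// Hypothetical helper: true if `node` has already been put on the queue.
bool WasEnqueued(const Wnode* node) { return node->prev != 0; }
```

With this convention a single pointer field answers two questions: whether a word has been seen, and which word led to it.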
More Details
The first word (entered by the user) is looked up in the list of words, and a pointer to the struct containing the word is enqueued. For extra credit your program should be able to handle a first word even if the word is NOT in the list of words (all other words in the ladder, except perhaps for the last, must be in the list of words).
Put a pointer to the struct containing the word onto the queue (it's a queue of Wnode pointers). Then repeat the dequeue/enqueue process below.
Dequeue an element (it's a pointer). Find all words one letter apart from the dequeued word. If one of these is the target word, you're done (you can also stop early if one of the words is one letter apart from the target word). Otherwise, enqueue each of the words found if it hasn't been queued up before. You can use the prev pointer fields in a Wnode to determine this: initially all prev fields should be set to 0/NULL, and a non-NULL prev means the word has already been enqueued. This means each word is enqueued at most once.
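The dequeue/enqueue process can be sketched as a breadth-first search. This is a minimal illustration under several assumptions, not the required class design: it uses std::queue rather than the course's templated queue, stores the Wnode structs directly in a vector, and the names FindLadder and the field word are invented for the example.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

struct Wnode
{
    std::string word;  // field name is an assumption
    Wnode* prev;       // 0 until the word has been enqueued
};

// True if a and b have the same length and differ in exactly one position.
bool IsOneApart(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    std::size_t same = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] == b[i]) ++same;
    }
    return same + 1 == a.size();
}

// Hypothetical search function. Returns the node holding `target`
// (follow prev pointers to rebuild the ladder), or 0 if no ladder exists.
// Assumes every prev field in `words` is 0 on entry and that `start`
// points at an element of `words`.
Wnode* FindLadder(std::vector<Wnode>& words, Wnode* start,
                  const std::string& target)
{
    start->prev = start;  // self-reference marks the first word
    if (start->word == target) return start;

    std::queue<Wnode*> pending;
    pending.push(start);
    while (!pending.empty()) {
        Wnode* current = pending.front();
        pending.pop();
        for (std::size_t i = 0; i < words.size(); ++i) {
            Wnode* next = &words[i];
            if (next->prev != 0) continue;  // enqueued earlier
            if (!IsOneApart(current->word, next->word)) continue;
            next->prev = current;           // remember the predecessor
            if (next->word == target) return next;
            pending.push(next);
        }
    }
    return 0;  // queue emptied without reaching the target
}
```

Because words are enqueued in order of distance from the first word, the ladder found this way uses as few steps as possible. The prev fields must be reset to 0 before searching for the next pair of words.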
When the target word is derived, you'll need to print out the ladder from the first word to the target word. The prev pointer in the Wnode stores information that will allow the ladder to be recreated. Following prev pointers from the target yields the ladder backwards, so you may need recursion to print it in the proper order. Alternatively, you can store the words in an array/vector and print them in reverse with a loop over the vector instead of recursion.
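The vector-based alternative might look like the sketch below. The names BuildLadder and PrintLadder and the field word are assumptions for illustration; the code also assumes the first word's prev pointer refers to its own struct, as described earlier.

```cpp
#include <cstddef>
#include <ostream>
#include <string>
#include <vector>

struct Wnode
{
    std::string word;  // field name is an assumption
    Wnode* prev;       // the first word points to itself
};

// Hypothetical helper: returns the ladder ending at `last`, ordered from
// the first word to the target. Assumes `last` was reached by the search,
// so following prev pointers ends at a self-referential node.
std::vector<std::string> BuildLadder(const Wnode* last)
{
    std::vector<std::string> backwards;
    const Wnode* node = last;
    while (true) {
        backwards.push_back(node->word);
        if (node->prev == node) break;  // reached the first word
        node = node->prev;
    }
    // Copy in reverse so the first word comes first.
    return std::vector<std::string>(backwards.rbegin(), backwards.rend());
}

// Hypothetical helper: prints one word per line, first word at the top.
void PrintLadder(std::ostream& out, const Wnode* last)
{
    const std::vector<std::string> ladder = BuildLadder(last);
    for (std::size_t i = 0; i < ladder.size(); ++i) {
        out << ladder[i] << '\n';
    }
}
```

A recursive version would instead print the predecessor's ladder first and then the current word, stopping when a node's prev pointer refers to the node itself.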
You'll probably find it useful to write a function IsOneApart() that is used to determine if two strings are one letter apart. To do this, count the letters that are equal. If this is one less than the total number of letters in the words, the words are one apart. This function does NOT need to be a member function, it has two strings as parameters (const reference) and returns true if the strings are one letter apart. You can just define this function in ladder.cc and use it there.
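A minimal sketch of such a function follows. It counts matching letters as described above; treating strings of different lengths as not one apart is an added assumption.

```cpp
#include <cstddef>
#include <string>

// Returns true if a and b differ in exactly one letter position.
// Strings of different lengths are treated as not one apart (an assumption;
// the program only compares five-letter words).
bool IsOneApart(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    std::size_t same = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] == b[i]) ++same;  // count positions with equal letters
    }
    return same + 1 == a.size();   // one less than the total matches
}
```

For example, house and mouse are one apart, while house and moose are not, since they differ in two positions.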
You'll probably want debugging code/member functions to verify what's going on. If you build helping/debugging member functions into your class you'll save time in the long run since the member functions can be used to help debug code.
You may want to write a separate function to find a word in the vector of words (pointers to words) read in. You can write this code inline (rather than making a function out of it), but the function can be useful in debugging and developing.
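A lookup function of this kind could be as simple as the linear search sketched here. The name FindWord, the field word, and the use of std::vector are assumptions for illustration; your class may store the words differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Wnode
{
    std::string word;  // field name is an assumption
    Wnode* prev;
};

// Hypothetical helper: returns the pointer in `words` whose struct holds
// `key`, or 0 if the word is not in the list.
Wnode* FindWord(const std::vector<Wnode*>& words, const std::string& key)
{
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (words[i]->word == key) return words[i];
    }
    return 0;
}
```

A result of 0 tells the caller that the word entered by the user is not in the list, which is the case the extra credit asks you to handle for the first word.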
You may find it useful to write a function that gives back all the words that are one letter apart from a given word. This list of words could be stored in a vector or in a templated sequence (random access isn't needed). A templated version of the sequence class is provided in case you want to use a sequence somewhere in your code. It's not at all necessary to do this, and there is no grade improvement, but it may be helpful.
(You may not need the sequence code. You can remove it if you don't want it; remove references to it from the Makefile too.)
If you want several kinds of queue, just put another definition in the template.cc file. Once the template.cc file is compiled to template.o, you only need to relink, not recompile, every time you make a change in ladder.cc. This will make your recompiles much faster.
To submit your assignment, type:
The idea is that for each word, all words one letter away are determined (and stored somehow) when the words are loaded. When looking for candidate words to enqueue, only words that are one letter away (these are already known) are checked for previous use. This saves searching through the entire list of words and checking whether each is one letter away.
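One possible way to store this information is a list of neighbor indices per word, built once after loading. The sketch below is an assumption about how "stored somehow" might be realized; the name BuildNeighbors and the index-based representation are invented for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True if a and b have the same length and differ in exactly one position.
bool IsOneApart(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    std::size_t same = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] == b[i]) ++same;
    }
    return same + 1 == a.size();
}

// Hypothetical helper: neighbors[i] holds the indices of every word that
// is one letter away from words[i]. Each pair is compared once.
std::vector<std::vector<std::size_t> >
BuildNeighbors(const std::vector<std::string>& words)
{
    std::vector<std::vector<std::size_t> > neighbors(words.size());
    for (std::size_t i = 0; i < words.size(); ++i) {
        for (std::size_t j = i + 1; j < words.size(); ++j) {
            if (IsOneApart(words[i], words[j])) {
                neighbors[i].push_back(j);  // the relation is symmetric,
                neighbors[j].push_back(i);  // so record both directions
            }
        }
    }
    return neighbors;
}
```

Building the lists compares every pair of words once, which is a noticeable cost up front for the full word list, but each later search then only looks at a word's stored neighbors rather than scanning the whole list.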