Boggle® Assignment
You should snarf assignment boggle from
http://www.cs.duke.edu/courses/spring09/cps100e/snarf or browse the
code directory for code files provided. See the Boggle® howto file for help and more instructions.
After you snarf the assignment, move the .txt files (bogwords.txt and
ospd3.txt) into the bin directory -- otherwise the code won't find
them when it loads the lexicons. To do this use the
Window>Showview>Navigator
menu sequence to get to the Navigator view. Then grab the .txt files
with your mouse to move them into the bin directory.
Genesis of Assignment
Prof. Astrachan fondly
recalls neighborhood Boggle®
games from his now long-ago childhood
and was re-introduced to it on Sun Unix workstations as a grad student
in the mid 80's.
We used Boggle® for a
Compsci 108 assignment at Duke in 1996.
Boggle is in the Nifty
Assignments Archive and the current instantiation of the assignment
is a combination of efforts from Compsci educators at Duke, Stanford,
and UCSD. This version emphasizes empirical analyses of tradeoffs in
different implementations.
| Boggle
| Wordgames and Background
|
|
The Wikipedia Boggle
entry explains the game in detail including popular culture references
to Boggle. Reviewing the rules on the Wikipedia site will help in
understanding the play-of-the-game, but you'll be looking at
implementation trade-offs that are independent of game-playing.
You can play games based on words that are similar to Boggle online,
e.g.,
Facebook's Scramble and
Yahoo! Word Racer
both of which is definitely related to Boggle.
Theoretical and practical approaches to word games have formed the basis
for The
World's Fastest Scrabble Program (Appel and Jacobson) for
recognizing Genomic
Signatures in DeBruijn Chains (Heath and Pati) and for An
efficient interface between continuous speech recognition and language
understanding (Oerder and Ney).
Analyzing alternative implementations and algorithms will help inform a
better understanding of both work and play, what a great way to build a
foundation in understanding how to turn data into information.
In particular you'll be examining alternate implementations of a
lexicon or word source. You'll work on a Trie implementation and a
compressed Trie that's similar to a Radix Trie/Patricia
Trie, but simpler to implement.
Tries are the data structure behind predictive text used in
cell-phone texting, e.g., in T9 Text
Input and have formed the basis for IP-routing, e.g., see Keith Sklower's
Tree-Based Packet Routing Table for Berkeley Unix although
changes to IPv6 may change the utility of tries as described here: Scalable
high speed IP routing lookups (Waldvogel, Varghese, Turner,
Plattner).
You'll use different word-sources or lexicons. Apparently there's
a
controversy as to what the "best"
lexicon is for playing word games
like Scrabble® and Boggle®
|
What You Will Do
You'll code a class to find a specific word on a Boggle® board and
you'll implement two hierarchies of classes. Each hierarchy will be a
group of classes that implement a common interface and allow you to
reason empirically about trade-offs in implementation
techniques. You'll run code/experiments to reason about the
classes you write.
- Find Words On Board
You'll design and
code one class that implements the IWordOnBoardFinder
interface to find a word on a Boggle board. You'll use this class in
implementing one of the auto-player classes and in determining where a
word appears on a board when you or another (human) player plays the
game.
- Implement and benchmark lexicons
You'll write classes that implement
the ILexicon interface. Each lexicon implementation
facilitates looking up a word or a possible
prefix of a word. The classes differ in how efficiently they support
these operations and in how much memory they use.
- Implement two AutoPlayers that find all words quickly
You'll write classes that implement the IAutoPlayer
interface. Each class facilitates playing a game of Boggle by finding
every possible word on a given Boggle board. These classes use the
lexicons from the previous hierarchy.
- Empirical Analyses
You'll make conclusions based on empirical, runtime analyses of the
classes you write to determine which of several methods is
"best", and to reason in general about the trade-offs in
different implementations.
Find Words on the Board
You're given a class BadWordOnBoardFinder
that implements the IWordOnBoardFinder
interface. The bad word finder returns an empty list of BoardCell objects for any
query so no words will be found.
You'll implement the method cellsForWord in a class
GoodWordOnBoardFinder using
a standard, backtracking search to find a word on a Boggle board.
Details and more specifics for this part of the assignment are provided
in the howto pages in the section
on the IWordOnBoardFinder interface. Those pages
also include hints on how to write the class described here.
Implementing Lexicons
Details about the classes you write for this part of the assignment and
help in writing them are provided in the howto pages for implementing lexicons.
You'll design and code three classes for this part of the assignment and you'll
analyze their performance empirically. You're given a JUnit testing class, TestLexicon, to test your
implementations --- see the howto pages for
help/reminders on using JUnit. In the load methods you write you
can assume no duplicate words will be inserted via either the Scanner or
ArrayList parameters to the load methods. You're also given a class LexiconBenchmark you can use
to analyze the empirical performance of your implementations.
In Boggle®, legal words are those found in the
lexicon associated with a game. A lexicon is simply
a list of words, it's not a dictionary because it doesn't have
associated definitions for each word.
The ILexicon interface specifies the
methods exported by the interface which implementations must
provide. You're given an implementation SimpleLexicon
with an O(n) implementation of the method
wordStatus since the implementation simply does a linear
search over the list of words it stores. Read the
documentation for details of the interface.
You must write two new, more efficient implementations and modify
SimpleLexicon so that it's wordStatus method
executes more efficiently.
Implementing AutoPlayers
Details about the classes you write for this part of the assignment and
help in writing them are provided in the
howto pages for implementing AutoPlayers.
You'll implement two classes that find all the words on a Boggle board.
Each class uses a different technique and you'll analyze the runtime
tradeoffs in these techniques. You'll also reason empirically about the
performance of these two classes when they're configured with different
implementations of ILexicon lexicons.
One class, BoardFirstAutoPlayer looks at the board and
tries to form words by trying all the paths on the board. The code will
be very similar to the backtracking code you wrote for the
GoodWordOnBoardFinder class, but you'll prune searches
early based on prefixes as described in the howto pages. Another class
LexiconFirstAutoPlayer will be relatively simple to
implement since you'll have working lexicons and a working
WordOnBoardFinder from earlier in this assignment. In
implementing the LexiconFirstAutoPlayer you'll write code
to look up every word in the lexicon to see if it's on the board. This
method is surprisingly fast enough for a game of Boggle , but it's
probably not fast enough to run 10,000 times without waiting for a
while.
Statistical and Empirical Analyses of Boggle
You need to report in your README file information about which Lexicon
implementation is fastest. You should compare all the lexicons and
report on their relative times --- you shouldn't simply say "this
one is fastest". You should have at least three lexicons to test
and four if you do the extra credit.
You should explain the methods you used
to determine this and report on experiments you run, giving
times.
You must write code to play lots of auto-games, not games in which
humans play, but games in which all words are found on thousands of
boards --- see BoggleStats for a starting
point.
You must provide a board that scores the highest of
all the 4x4 and 5x5 boards you test in running at least 10,000
auto-games.
and preferably 50,000 games. Report on how many seconds it takes your
code
to run for 1,000 games; for 10,000 games (or predict that if
you can't run that many); and predict/justify on
how much time it would take your computer and implementation to
simulate both 100,000 games and one million games. When doing the
experiments
be sure to set the random-number generator's seed, currently done
already
in BoggleStats and described in
the howto pages. If you can't run 10,000 auto-games,
indicate the maximum number you can run.
Grading
This assignment is worth 55 points with 10 extra points available; the
breakdown is as follows:
| functionality | breakdown |
| GoodWordOnBoardFinder | 20% |
| SimpleLexicon | 10% |
| BinarySearchLexicon | 10% |
| CompressedTrieLexicon | 20% (extra) |
| LexiconFirstAuto | 10% |
| BoardFirstAuto | 20% |
| README/Stats 10000 game | 15% |
| README/Lexicon analysis | 15% |
Your README file should list your testing results, all the people with
whom you collaborated, and the TAs/UTAs you consulted with. You should
include an estimate of how long you spent on the program and what your
thoughts are about the assignment.
Submit your README and all of your source code using Eclipse with
assignment name boggle.
|