name: ___________________________________ name: ___________________________________ name: ___________________________________
IMapper provides a
basic interface to a Map. We have given you an implementation UtilMapHash based on the
java.util.HashMap. Create
a class ArrayListHash where there is an ArrayList
of buckets and each bucket is an ArrayList<Combo> where
Combo will be an inner class that stores and integer value and
a string key. How does the performance of ArrayListHash
compare to UtilMapHash? Test using HashMain
For example, some occurrences of love and hate from Shakespeare's Romeo and Juliet are shown below, the formatting isn't pretty but you can see several occurrences of each of the [love,hate] in the context in which the word occurs in the play.
talk of peace? I hate the word As I the word As I hate hell, all Montagues, and better ended by their hate Than death prorogued, wanting lives, By doing damned hate upon thyself? Why railest But thankful even for hate that is meant love.
not. She whom I love now Doth grace for grace for grace and love for love allow. The grace and love for love allow. The other did she knew well Thy love did read by rote,
You'll be looking at code in a new class ContextModel that finds key words in context as follows.
Every word from a file is stored in an array in the order in which
the word appears. Because of the miracle of the
String.split method it's simple to get this array of
every word.
We'll call the array myWords, it's state in the
ContextModel class.
Keep track of every unique word (uword) and where uword occurs in
myWords by storing the index of each occurrence of uword in
a list of indexes associated with the uword. This information is stored
in a map in which each uword is a key and a list of integer indices are
a key's associated value: the locations of the uword in the array
myWords
Given a word, find all its indices, then use those indices to get the contexts, show these contexts.
How the View communicate with the Model
When the user loads a file, all the words are stored. The model's
initialize method takes care of this.
When the user enters a word and presses go, the word is sent to the
model's process method as a String. In the code you're
given, the map of key words to their indexes is constructed as
needed. Then the map is used to generate contexts. Currently the
map-filling code is missing (in what you get before class).
You'll complete method fillMap and the code to build the
context strings.
process method. Why is the map's size used to
determine if the map should be filled? When does the map need to
be filled and when can the "old" map be used?
process here is removed what will
happen when the user types a word that doesn't appear in the file (or has too
few letters)?
continue statement in
building the context string? When will it be executed?
fillMap (shown below as well)
ContextModel.process. What line is added and why would
using a TreeMap instead of a HashMap be better (changing the
constructor of ContextModel)?
doFreqsA
and doFreqsB from that code.
When run using Melville's Bartleby The Scrivener with 4,256
unique
words. The output is
as shown below (it's not side by side, but vertical when
actually run-- below)
Here's the output from A Scarlet Letter with 14,124 unique words:
If the running time for
What is the purpose of the code guarded by doFreqsA on the left and
doFreqsB on the right.
total # words read: 14353
1 &c., 1 &c.,
1 'Tis 1 'Tis
1 'em-can't 1 'em-can't
1 ( 1 (
1 (Nippers's) 1 (Nippers's)
1 (The 1 (The
1 (and 1 (and
1 (as 1 (as
1 (except 1 (except
2 (for 2 (for
1 (he 1 (he
1 (one 1 (one
2 (the 2 (the
1 6 1 6
6 A 6 A
1 According 1 According
1 Accordingly 1 Accordingly
1 Accordingly, 1 Accordingly,
1 Acting 1 Acting
1 Adam 1 Adam
time to complete: 1.859000 time to completed 0.020000
Here's the output from Hamlet with 7,806 unique words
total # words read: 31955
1 &c.' 1 &c.'
1 ''Tis 1 ''Tis
5 'A 5 'A
1 'A-down 1 'A-down
1 'Adieu, 1 'Adieu,
1 'And 1 'And
1 'Anon 1 'Anon
1 'As 1 'As
1 'Before 1 'Before
1 'But 1 'But
1 'By-and-by' 1 'By-and-by'
1 'Choose 1 'Choose
1 'Doubt 1 'Doubt
1 'For 1 'For
1 'Forgive 1 'Forgive
1 'Gainst 1 'Gainst
2 'Good 2 'Good
1 'HAMLET.' 1 'HAMLET.'
1 'He 1 'He
1 'Horatio, 1 'Horatio,
time to complete: 7.534000 time to completed 0.037000
total # words read: 85754
1 1
98 " 98 "
9 "A 9 "A
1 "Adorn 1 "Adorn
1 "Advise 1 "Advise
1 "Ah! 1 "Ah!
2 "Ah, 2 "Ah,
1 "Ah," 1 "Ah,"
1 "Aha! 1 "Aha!
1 "Alas! 1 "Alas!
1 "All 1 "All
1 "Alone, 1 "Alone,
1 "Am 1 "Am
1 "An 1 "An
21 "And 21 "And
2 "And, 2 "And,
1 "Annals") 1 "Annals")
2 "Art 2 "Art
2 "Arthur 2 "Arthur
1 "As 1 "As
time to complete: 36.026000 time to completed 0.097000
doFreqsA works, e.g., how it results in a list
of each different word read and the number of times each
word occurs.
Collections.frequency(list,s) in the code?
break statement in the loop
that prints words and frequencies?
doFreqsA changes depending on
both N the total number of words
read, and U the number of unique/different words
read. Suppose three files contain
the same value for U, e.g., each contains
5,000 different values; but each file has a different number of total
words N -- suppose this number is 20,000;
30,000; and 40,000 for the three files.
doFreqsA on the 20,000 word file is
12 seconds what do you expect the running time to be for
the other two files and why? Explain your reasoning.
doFreqsA in
terms
of both
N and U and a justification
for your expression.
doFreqsB is to use a map which is
a collection of keys and values. In this case the key is a
string or word and the value associated with each key is the
number of times the key occurs. You can put values into a
map and get values out of a map, typically you do this by
reference to a key.
if !
map.containsKey(s))... as shown? Explain in
words.
doFreqsA and
doFreqsB display words in the same order?
TreeSet or TreeMap
is proportional to the (base 2) logarithm of the number of
elements in the set or map. Explain why the running time of
doFreqsB is so much faster than the running
time of the other method. Make reference to
N and U from above
if possible.
WordCount object as in an earlier classwork. Consider the FrequenciesSorted
class that
uses a WordCount object to store a word and its
frequency.
ArrayList of WordCount
objects from the map.
doFreqsA by
frequency of occurrence. Why do
you need to use Collection.sort instead of
Arrays.sort?