Name: _____________________ Netid: _________________ Name: _____________________ Netid: _________________ Name: _____________________ Netid: _________________
doFreqsA
and doFreqsB
from that code.
When run using Melville's Bartleby The Scrivener with 4,256
unique
words. The output is
as shown below (it's not side by side, but vertical when
actually run-- below)
Here's the output from A Scarlet Letter with 14,124 unique words:
If the running time for
What is the purpose of the code guarded by doFreqsA
on the left and
doFreqsB
on the right.
total # words read: 14353
1 &c., 1 &c.,
1 'Tis 1 'Tis
1 'em-can't 1 'em-can't
1 ( 1 (
1 (Nippers's) 1 (Nippers's)
1 (The 1 (The
1 (and 1 (and
1 (as 1 (as
1 (except 1 (except
2 (for 2 (for
1 (he 1 (he
1 (one 1 (one
2 (the 2 (the
1 6 1 6
6 A 6 A
1 According 1 According
1 Accordingly 1 Accordingly
1 Accordingly, 1 Accordingly,
1 Acting 1 Acting
1 Adam 1 Adam
time to complete: 1.859000 time to completed 0.020000
Here's the output from Hamlet with 7,806 unique words
total # words read: 31955
1 &c.' 1 &c.'
1 ''Tis 1 ''Tis
5 'A 5 'A
1 'A-down 1 'A-down
1 'Adieu, 1 'Adieu,
1 'And 1 'And
1 'Anon 1 'Anon
1 'As 1 'As
1 'Before 1 'Before
1 'But 1 'But
1 'By-and-by' 1 'By-and-by'
1 'Choose 1 'Choose
1 'Doubt 1 'Doubt
1 'For 1 'For
1 'Forgive 1 'Forgive
1 'Gainst 1 'Gainst
2 'Good 2 'Good
1 'HAMLET.' 1 'HAMLET.'
1 'He 1 'He
1 'Horatio, 1 'Horatio,
time to complete: 7.534000 time to completed 0.037000
total # words read: 85754
1 1
98 " 98 "
9 "A 9 "A
1 "Adorn 1 "Adorn
1 "Advise 1 "Advise
1 "Ah! 1 "Ah!
2 "Ah, 2 "Ah,
1 "Ah," 1 "Ah,"
1 "Aha! 1 "Aha!
1 "Alas! 1 "Alas!
1 "All 1 "All
1 "Alone, 1 "Alone,
1 "Am 1 "Am
1 "An 1 "An
21 "And 21 "And
2 "And, 2 "And,
1 "Annals") 1 "Annals")
2 "Art 2 "Art
2 "Arthur 2 "Arthur
1 "As 1 "As
time to complete: 36.026000 time to completed 0.097000
doFreqsA
works, e.g., how it results in a list
of each different word read and the number of times each
word occurs.
Collections.frequency(list,s)
in the code?
break
statement in the loop
that prints words and frequencies?
doFreqsA
changes depending on
both N the total number of words
read, and U the number of unique/different words
read. Suppose three files contain
the same value for U, e.g., each contains
5,000 different values; but each file has a different number of total
words N -- suppose this number is 20,000;
30,000; and 40,000 for the three files.
doFreqsA
on the 20,000 word file is
12 seconds what do you expect the running time to be for
the other two files and why? Explain your reasoning.
doFreqsA
in
terms
of both
N and U and a justification
for your expression.
doFreqsB
is to use a map which is
a collection of keys and values. In this case the key is a
string or word and the value associated with each key is the
number of times the key occurs. You can put values into a
map and get values out of a map, typically you do this by
reference to a key.
if !
map.containsKey(s))...
as shown? Explain in
words.
doFreqsA
and
doFreqsB
display words in the same order?
TreeSet
or TreeMap
is proportional to the (base 2) logarithm of the number of
elements in the map. Explain why the running time of
doFreqsB
is so much faster than the running
time of the other method. Make reference to
N and U from above
if possible.