A Simple Guide to tvector
In C++ a vector is a homogeneous collection supporting
random access. This means:
- Homogeneous: all elements in the vector
are the same type: string, int, double, Date, etc.
- The time it takes to access the 500th element is
the same as the time to access the first element.
You can't store int values and string values in the same vector, that
wouldn't be a homogeneous collection. A vector is more like a DVD than
a video cassette: you can access the 20th track of a DVD without
fast-forwarding past all the tracks 1-19, but you can't do this with a
cassette. Accessing tracks or values takes time that is independent of
which track/value is being accessed.
First Example: Rolling Dice
You're given the task of simulating rolling two dice 50,000 times and
recording how many times each sum 2-12 appears. You should expect seven
to occur more often than two, for example since there's only one way to
roll a two (a pair of ones), but six ways to roll a seven (1-6, 2-5,
3-4, 4-3, 5-2, 6-1).
Here's the code from diceroll.cpp to
roll the dice, count how many times each sum occurs, and print the
results.
#include
using namespace std;
#include "dice.h"
#include "tvector.h"
tvector rollDice(int rollCount)
// pre: rolls > 0
// post: returns vector of counts with v[k] = # times k rolled
// when 2 six-sided dice are rolled rollCount times
{
tvector counts(13,0); // make room for rolls 2-12, all zero
Dice d(6);
for(int k=0; k < rollCount; k++){
int roll = d.Roll() + d.Roll();
counts[roll] += 1;
}
return counts;
}
int main()
{
int rollCount = 50000;
tvector counts = rollDice(rollCount);
for(int k=2; k <= 12; k++){
cout << k << "\t" << counts[k] << endl;
}
return 0;
}
Here's a run of this program, an explanation of the code follows
the run.
2 1392
3 2810
4 4201
5 5391
6 7070
7 8372
8 6910
9 5648
10 4079
11 2728
12 1399
Anatomy of a vector variable
Standard C++ supports a class named vector. We'll use a class
that's compatible, named tvector (tapestry vector). This class
can be used just like the standard vector class, but is more forgiving
when mistakes are made than the standard class. Since students
learning to program sometimes make mistakes, we'll use
the class tvector.
When you define a tvector object you must specify what type of element
it will hold, e.g., string, int, double, Date, etc., In the code from diceroll.cpp the vector holds int values
--- basically 11 counters for the rolls 2-12. The type, included
between angle-brackets, is called the tvector template
parameter and tvector is referred to as a templated class
or a generic container since it can store any type of element,
where the type of element is specified when the tvector is constructed.
In our first uses of vector, we'll also specify how many elements
a vector can hold and what the initial values of these elements
are. If you don't specify the initial value, the default
constructor is called for classes, but for built in types
the values aren't known if no initial value is provided
(note: in the standard vector class the initial value will
be zero if one isn't provided, but this isn't the case in tvector).
tvector counts(100,0); // 100 ints, all zero
tvector words(50,"hello"); // 50 strings, all "hello"
tvector diffs(25); // 25 doubles, values unknown
tvector days(31); // 31 Dates, all today's date
tvector guesses(5); // 5 strings, all "" or empty
Accessing a tvector
To access an element, sometimes called a slot or
cell of a vector, brackets are used. The code
below creates the string vector diagrammed. The first
element of a vector has index 0/zero.
|
"elephant"
| "kangaroo"
| "monkey"
| "lion"
| "crocodile"
|
|
0
|
1
|
2
|
3
|
4
|
tvector animals(5); // room for 5 strings
animals[0] = "elephant";
animals[1] = "kangaroo";
animals[2] = "monkey";
animals[3] = "lion";
animals[4] = "crocodile";
The bracket operator [..] used with an int value accesses
an element of a vector. In the code
from diceroll.cpp above,
the vector is indexed in rollDice to update
the counter and again in main to print the values
after they've been calculated.
Dice Statistics: A Closer Look
The vector created and
returned in the function rollDice contains
13 elements, indexed from 0 to 12. The first two elements,
those with indexes zero and one, are not used in the program
since the program uses the sum of two rolls to index the vector
and increment the value stored in the indexed location.
These eleven values (indexed from 2 to 12) are then printed
in main.
Dice Stats Revisited
Just looking at numbers isn't as useful as looking
at a bar-graph of the rolls. The code
in dicerollhisto.cpp
is very similar, but prints a histogram showing the rolls.
In looking at the code, look carefully at the
function maxValue which returns the largest
value stored in the vector parameter to the function.
#include
using namespace std;
#include "dice.h"
#include "tvector.h"
tvector rollDice(int rollCount)
// pre: rolls > 0
// post: returns vector of counts with v[k] = # times k rolled
// when 2 six-sided dice are rolled rollCount times
{
tvector counts(13,0); // make room for rolls 2-12, all zero
Dice d(6);
for(int k=0; k < rollCount; k++){
int roll = d.Roll() + d.Roll();
counts[roll] += 1;
}
return counts;
}
int maxValue(tvector counts)
// pre: 12 < counts.size(), so counts[12] is legal index/entry
// post: returns largest value in counts[2] .. counts[12]
{
int max = counts[2];
for(int k=3; k <= 12; k++){
if (counts[k] > max){
max = counts[k];
}
}
return max;
}
void makeStars(int count)
// post: prints count '*' asterisks on current line
{
for(int k=0; k < count; k++){
cout << "*";
}
}
int main()
{
int rollCount = 50000;
tvector counts = rollDice(rollCount);
int max = maxValue(counts);
for(int k=2; k <= 12; k++){
cout << k << "\t" << counts[k] << "\t";
makeStars(40*counts[k]/max);
cout << endl;
}
return 0;
}
Here's a run of the program.
2 1406 ******
3 2842 *************
4 4147 ********************
5 5598 ***************************
6 7089 **********************************
7 8173 ****************************************
8 6976 **********************************
9 5422 **************************
10 4185 ********************
11 2806 *************
12 1356 ******
The difference between this version and the previous is
the asterisks printed according to the value of each vector
element. The maximum number of asterisks printed on a line
is 40, and the number printed is scaled by the largest
vector element, computed when the function
maxValue is called.
Finding Maximal Values
The code to find the maximal value in a vector requires looping
over all elements in the vector. The largest one found is "remembered"
as the vector elements are examined. The largest value seen so far
is stored in the variable max which is updated when a new,
larger value is found.
If the loop in maxValue started at 2, rather than 3,
initialization of max to zero would work in this function,
since all vector elements are known to be no smaller than zero (why?)
Adding Values to a vector
In many situations the number of slots/elements needed in a vector
can't be determined before the program runs. For example, if every word
from a file is stored in a vector, how many slots (space) should be
allocated when the vector is constructed?
The vector method push_back adds an element to the end
a vector (a new last element) and makes sure there's always room to add
more. Given this, there must be a method to determine how many values
are actually stored in a vector --- this method is named
size. The code below shows the methods.
tvector words;
words.push_back("cherry");
words.push_back("grape");
words.push_back("melon");
words.push_back("orange");
cout << "# elements: " << words.size() << endl;
for(int k=words.size()-1; k >= 0; k--){
cout << k << "\t" << words[k] << endl;
}
If executed the output of this code is:
# elements: 4
3 orange
2 melon
1 grape
0 cherry
Calling push_back is efficient. If more space is needed in
the vector to add a new element, more space is created in an efficient
manner (see Chapter 8 in Tapestry for details).
Owen L. Astrachan
Last modified: Wed Sep 24 01:24:11 EDT 2003