A Simple Guide to tvector

In C++ a vector is a homogeneous collection supporting random access. This means:

  1. Homogeneous: all elements in the vector are the same type: string, int, double, Date, etc.

  2. The time it takes to access the 500th element is the same as the time to access the first element.

You can't store int values and string values in the same vector, that wouldn't be a homogeneous collection. A vector is more like a DVD than a video cassette: you can access the 20th track of a DVD without fast-forwarding past all the tracks 1-19, but you can't do this with a cassette. Accessing tracks or values takes time that is independent of which track/value is being accessed.

First Example: Rolling Dice

You're given the task of simulating rolling two dice 50,000 times and recording how many times each sum 2-12 appears. You should expect seven to occur more often than two, for example since there's only one way to roll a two (a pair of ones), but six ways to roll a seven (1-6, 2-5, 3-4, 4-3, 5-2, 6-1).

Here's the code from diceroll.cpp to roll the dice, count how many times each sum occurs, and print the results.

#include <iostream> using namespace std; #include "dice.h" #include "tvector.h" tvector<int> rollDice(int rollCount) // pre: rolls > 0 // post: returns vector of counts with v[k] = # times k rolled // when 2 six-sided dice are rolled rollCount times { tvector<int> counts(13,0); // make room for rolls 2-12, all zero Dice d(6); for(int k=0; k < rollCount; k++){ int roll = d.Roll() + d.Roll(); counts[roll] += 1; } return counts; } int main() { int rollCount = 50000; tvector<int> counts = rollDice(rollCount); for(int k=2; k <= 12; k++){ cout << k << "\t" << counts[k] << endl; } return 0; }

Here's a run of this program, an explanation of the code follows the run.


  2       1392
  3       2810
  4       4201
  5       5391
  6       7070
  7       8372
  8       6910
  9       5648
  10      4079
  11      2728
  12      1399

Anatomy of a vector variable

Standard C++ supports a class named vector. We'll use a class that's compatible, named tvector (tapestry vector). This class can be used just like the standard vector class, but is more forgiving when mistakes are made than the standard class. Since students learning to program sometimes make mistakes, we'll use the class tvector.

When you define a tvector object you must specify what type of element it will hold, e.g., string, int, double, Date, etc., In the code from diceroll.cpp the vector holds int values --- basically 11 counters for the rolls 2-12. The type, included between angle-brackets, is called the tvector template parameter and tvector is referred to as a templated class or a generic container since it can store any type of element, where the type of element is specified when the tvector is constructed.

In our first uses of vector, we'll also specify how many elements a vector can hold and what the initial values of these elements are. If you don't specify the initial value, the default constructor is called for classes, but for built in types the values aren't known if no initial value is provided (note: in the standard vector class the initial value will be zero if one isn't provided, but this isn't the case in tvector).

tvector<int> counts(100,0); // 100 ints, all zero tvector<string> words(50,"hello"); // 50 strings, all "hello" tvector<double> diffs(25); // 25 doubles, values unknown tvector<Date> days(31); // 31 Dates, all today's date tvector<string> guesses(5); // 5 strings, all "" or empty

Accessing a tvector

To access an element, sometimes called a slot or cell of a vector, brackets are used. The code below creates the string vector diagrammed. The first element of a vector has index 0/zero.

"elephant" "kangaroo" "monkey" "lion" "crocodile"
0 1 2 3 4

tvector<string> animals(5); // room for 5 strings animals[0] = "elephant"; animals[1] = "kangaroo"; animals[2] = "monkey"; animals[3] = "lion"; animals[4] = "crocodile";

The bracket operator [..] used with an int value accesses an element of a vector. In the code from diceroll.cpp above, the vector is indexed in rollDice to update the counter and again in main to print the values after they've been calculated.

Dice Statistics: A Closer Look

The vector created and returned in the function rollDice contains 13 elements, indexed from 0 to 12. The first two elements, those with indexes zero and one, are not used in the program since the program uses the sum of two rolls to index the vector and increment the value stored in the indexed location.

These eleven values (indexed from 2 to 12) are then printed in main.

Dice Stats Revisited

Just looking at numbers isn't as useful as looking at a bar-graph of the rolls. The code in dicerollhisto.cpp is very similar, but prints a histogram showing the rolls.

In looking at the code, look carefully at the function maxValue which returns the largest value stored in the vector parameter to the function.

#include <iostream> using namespace std; #include "dice.h" #include "tvector.h" tvector<int> rollDice(int rollCount) // pre: rolls > 0 // post: returns vector of counts with v[k] = # times k rolled // when 2 six-sided dice are rolled rollCount times { tvector<int> counts(13,0); // make room for rolls 2-12, all zero Dice d(6); for(int k=0; k < rollCount; k++){ int roll = d.Roll() + d.Roll(); counts[roll] += 1; } return counts; } int maxValue(tvector<int> counts) // pre: 12 < counts.size(), so counts[12] is legal index/entry // post: returns largest value in counts[2] .. counts[12] { int max = counts[2]; for(int k=3; k <= 12; k++){ if (counts[k] > max){ max = counts[k]; } } return max; } void makeStars(int count) // post: prints count '*' asterisks on current line { for(int k=0; k < count; k++){ cout << "*"; } } int main() { int rollCount = 50000; tvector<int> counts = rollDice(rollCount); int max = maxValue(counts); for(int k=2; k <= 12; k++){ cout << k << "\t" << counts[k] << "\t"; makeStars(40*counts[k]/max); cout << endl; } return 0; }

Here's a run of the program.


  2       1406    ******
  3       2842    *************
  4       4147    ********************
  5       5598    ***************************
  6       7089    **********************************
  7       8173    ****************************************
  8       6976    **********************************
  9       5422    **************************
  10      4185    ********************
  11      2806    *************
  12      1356    ******

The difference between this version and the previous is the asterisks printed according to the value of each vector element. The maximum number of asterisks printed on a line is 40, and the number printed is scaled by the largest vector element, computed when the function maxValue is called.

Finding Maximal Values

The code to find the maximal value in a vector requires looping over all elements in the vector. The largest one found is "remembered" as the vector elements are examined. The largest value seen so far is stored in the variable max which is updated when a new, larger value is found.

If the loop in maxValue started at 2, rather than 3, initialization of max to zero would work in this function, since all vector elements are known to be no smaller than zero (why?)


Adding Values to a vector

In many situations the number of slots/elements needed in a vector can't be determined before the program runs. For example, if every word from a file is stored in a vector, how many slots (space) should be allocated when the vector is constructed?

The vector method push_back adds an element to the end a vector (a new last element) and makes sure there's always room to add more. Given this, there must be a method to determine how many values are actually stored in a vector --- this method is named size. The code below shows the methods.

tvector<string> words; words.push_back("cherry"); words.push_back("grape"); words.push_back("melon"); words.push_back("orange"); cout << "# elements: " << words.size() << endl; for(int k=words.size()-1; k >= 0; k--){ cout << k << "\t" << words[k] << endl; } If executed the output of this code is:
  # elements: 4
  3	orange
  2	melon
  1	grape
  0	cherry

Calling push_back is efficient. If more space is needed in the vector to add a new element, more space is created in an efficient manner (see Chapter 8 in Tapestry for details).
Owen L. Astrachan
Last modified: Wed Sep 24 01:24:11 EDT 2003