L2:  Strings, Arrays, Memory, and Pointers

 

C variables are (usually) stored in the computer’s “memory”, which is a very large sequence of “pigeon-holes”, into each of which a small integer can be placed.  That integer (actually, a pattern of 8 bits called a “byte”) will remain there until changed, or until power goes off.  Each pigeon-hole has a numeric “address”, also called a “location”.  The computer uses the numeric address to locate the specific pigeon-hole, to load information from it, or store information into it.

 

Single bytes are not very useful on their own: a byte can hold only one of 256 (2**8) different values.  C (and computers) allow sequences of bytes with sequential addresses to be treated together.  A sequence of 4 bytes is common, and is treated as a 32-bit “word”.  On most current machines, such a word is used to store a C integer.  Single bytes are still useful for holding printable characters: one printable character fits into one byte.

 

C requires each variable you use to be “declared”.  A declaration announces to C how many bytes must be reserved for that variable, as well as defining the variable’s “type”.  The type of the variable defines how operators applied to that variable behave.

 

Declarations:

               int x;  char c;

declares x as a variable which can hold a C integer, and c a variable which can hold a printable character.

On a machine with 32-bit integers, x will require 4 bytes of memory, and c will require 1.
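As a rough illustration of these sizes, the sketch below asks the compiler how many bytes an int and a char occupy.  The function names bytes_in_int, bytes_in_char and print_sizes are made up for this example.  The 4-byte figure for int is typical of current machines, but the language does not guarantee it.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helpers (not part of C): report the storage each type needs. */
size_t bytes_in_int(void)  { return sizeof(int); }
size_t bytes_in_char(void) { return sizeof(char); }

/* Print both sizes for the machine the program is compiled on. */
void print_sizes(void) {
    printf("an int  needs %zu byte(s)\n", bytes_in_int());
    printf("a  char needs %zu byte(s)\n", bytes_in_char());
}
```

Calling print_sizes() from main shows the values for whatever machine the program is built on; only the size of char is fixed (at 1) by the language.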

 

Arrays

C allows “arrays” of variables, each of the same type, to be declared.  An array allows related variables to be logically grouped together, and the language allows any one of them to be selected for use, by using a computed index (an integer).

 

A C array is declared as <type> <variable>[<size>];

An example:  int QQQ[100];

Such a declaration allows the program to use the notation QQQ[i], where i can be any integer expression, so long as its value is between 0 and 99 inclusive.  That is, the array has <size> elements, each of which is automatically given a numeric “name” (its index) between 0 and <size>-1.  C itself does not check that an index is in range, so the program must.
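A minimal sketch of indexing with a computed value, assuming a hypothetical function sum_of_squares that is not part of these notes.  It stores i*i into QQQ[i] for every valid index, then adds the elements back up.  Every index used stays between 0 and 99.

```c
#define SIZE 100

/* Hypothetical example: fill an array by index, then total its elements. */
long sum_of_squares(void) {
    int QQQ[SIZE];          /* 100 ints, indexed 0 .. 99 */
    long total = 0;
    int i;

    for (i = 0; i < SIZE; i++) {
        QQQ[i] = i * i;     /* the index is a computed integer */
    }
    for (i = 0; i < SIZE; i++) {
        total += QQQ[i];
    }
    return total;
}
```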

 

Pointers

Any object (constant, variable, subroutine) that is stored in memory must be stored at some location or address in memory.  The object’s address is, by convention, the address of the first (lowest numbered) byte it occupies.  C provides a way of obtaining and operating on the address of an object.  Such an address is called a pointer.  To obtain the address of the variable x, write “&x”.  Variables which hold pointers can also be declared.

To declare y to be a pointer to an object of type T, write: “T *y;”  For instance, int *y; declares y to point to an integer.  Such a declaration does NOT specify which integer y points to.  An assignment to y must be used to tell the program what location in memory y points to.

 

If y points to some object, then *y is a reference to that object, and can be used either as an assignment target, or in an expression which needs the object’s value.

Example:

    int x, *y;

    x = 50;
    y = &x;              /* y now holds the address of x */
    *y += 3;             /* adds 3 to x, by way of y */
    printf("%d\n", x);   /* prints 53 */

 

Pointers have many uses in C.  One common reason for their existence is to allow C functions to return more than one value.  If a C function needs to compute and return more than one value, a pointer argument is passed to the function, and the function “stores indirectly” through this argument to return the second value.
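One way this might look, as a sketch: the hypothetical function divide below returns the quotient in the usual way and stores the remainder indirectly through its pointer argument.  The function and parameter names are assumptions made for this example, and the caller is assumed to pass a non-zero divisor.

```c
/* Hypothetical example: return two results from one call.
   The quotient is the return value; the remainder is stored
   indirectly through the pointer argument.  b must not be 0. */
int divide(int a, int b, int *remainder) {
    *remainder = a % b;
    return a / b;
}
```

A caller declares an ordinary int, say r, and passes its address: q = divide(17, 5, &r); leaves 3 in q and 2 in r.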

 

Arrays and Pointers

C pointers and arrays are somewhat interchangeable.  If y is a pointer to int, declared int *y;, then

y[10] and *(y+10)

refer to the SAME element of the array y points to.  (Note that no space is reserved for an array by this declaration: y is actually a pointer-sized variable, and the array it points to must be constructed and space reserved for it in some other statement.)  The notation (y+10), where y is a pointer, needs some explanation.  In C, an integer which is added to a pointer is first “scaled” (multiplied by the size, in bytes, of the type the pointer points to) before being added to the pointer.  The reason:  Each element of an array may occupy several bytes.  To get the location of the i-th element of the array, the computer must use L+(sizeof(element))*i, where L is the address of the array’s first element, element 0.  sizeof(x) is a C operator (not a function) that yields the number of bytes x occupies in memory.
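The scaling can be observed directly.  In the sketch below, the hypothetical helpers same_element and byte_distance (names invented for this example) check that &y[i] and y+i are the same address, and measure how many bytes separate y+i from y.  The caller is assumed to pass a pointer into a real array with more than i elements.

```c
#include <stddef.h>

/* Hypothetical helper: non-zero when y[i] and *(y+i) name the same element. */
int same_element(int *y, int i) {
    return &y[i] == y + i;
}

/* Hypothetical helper: distance in bytes between y+i and y.
   Converting to char * makes the subtraction count single bytes. */
ptrdiff_t byte_distance(int *y, int i) {
    return (char *)(y + i) - (char *)y;
}
```

With int a[20]; declared by the caller, byte_distance(a, 10) yields 10*sizeof(int), which is 40 on a machine with 4-byte integers.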

 

This allows a C function to accept as an argument a parameter which points to an array of unknown length.

The function can read or alter any element of the passed-in argument, using either pointer or array notation.

Example:

    /* The caller's array must have at least 100 elements. */
    int compute(int *result) {
        int i, s;

        s = 0;
        for (i=0; i<100; i++) {
            result[i] = …;
            s += result[i];
        }
        return s;
    }

This ability is used heavily in C.  For instance, the scanf() function uses it to return the values of the items it reads from stdin.
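A small sketch of the same idea, using sscanf(), the standard-library relative of scanf() that reads from a string instead of stdin, so that the example needs no keyboard input.  The wrapper parse_pair and its parameter names are assumptions made for this illustration.

```c
#include <stdio.h>

/* Hypothetical wrapper: read two integers from text.
   sscanf stores each value through the pointer it is given and
   returns the number of items it managed to convert. */
int parse_pair(const char *text, int *first, int *second) {
    return sscanf(text, "%d %d", first, second);
}
```

The caller passes the addresses of two ordinary int variables, for example parse_pair("12 34", &a, &b), and should check that the return value is 2 before using a and b.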

 

C strings:

A C string is a sequence of printable (and special) characters, surrounded by double-quotes:

"This is a string of characters"

This notation creates a constant somewhere in memory, consisting of an array of “char” values.  The i-th element of the array is initialized to the i-th character of the string constant, and an extra char beyond the last character of the constant is reserved, and initialized to the special char value ‘\0’ – a byte whose value is 0.

C strings are thus sequences of bytes in memory, ended with a byte whose contents is 0.

The C string constant has, as its value, a “pointer” to the actual constant.  This pointer is the address (a number) of the 0-th element of the char array.  C string constants should not be changed by the program: the C standard leaves the result undefined, and although the change appears to work on some machines, on others it crashes the program.
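Because the end of a string is marked only by the 0 byte, code that needs a string’s length has to walk the array until it finds that byte.  The sketch below does so in a hypothetical function count_chars (a name invented here; the standard library’s strlen() does the same job).

```c
#include <stddef.h>

/* Hypothetical example: count the chars before the terminating '\0'. */
size_t count_chars(const char *s) {
    size_t n = 0;

    while (s[n] != '\0') {   /* stop at the byte whose value is 0 */
        n++;
    }
    return n;
}
```

For the constant "abc", count_chars returns 3, while sizeof("abc") is 4, because the array also holds the extra ‘\0’ byte.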