L8 The Preprocessor;  enum

enum

Programs often need to indicate that element “i” has some particular “status”, or “state”.  This is usually done by placing an integer code in the i-th entry of some array, Code.  Code[i]==1 might mean that the character with code i is a letter;  Code[i]==0, that i represents a digit, and other Code[i] values might indicate arbitrary other information about what i represents.

Using integers in this way has advantages:  Checking them is quick, whether you do it by using a switch or an “if-test”, because the machine can compare two integers in one or two instructions.  For other types of data, several extra instructions might be needed (for example, if strings are used, each character of the string may have to be tested individually).  However, the method has a serious drawback – when you look at the value of the code, you have no idea of WHAT that integer means.  The entire program has to be examined, to see what all parts of the code do when that value is observed in that particular array.  Of course, the meaning of the codes changes completely if they appear in a different array.

The scheme usually adopted to solve this “documentation of codes” problem is to NAME each code value.  The language should allow the programmer to pick the name, and the programmer should pick names that help people remember what the code’s meaning is.  C provides two mechanisms for associating names with integers.  One is the #define directive of the preprocessor.  The other is the “enum”.
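As a rough sketch of the first mechanism, the fragment below names the codes from the example above with #define.  The names DIGIT, LETTER, OTHER and NCODES, and the helper functions, are assumptions made for illustration; only the values 1 (letter) and 0 (digit) come from the text.

```c
#include <ctype.h>

/* Names for the codes used in the text: 1 = letter, 0 = digit.
   OTHER and NCODES are assumptions added for this sketch. */
#define DIGIT  0
#define LETTER 1
#define OTHER  2
#define NCODES 256

int Code[NCODES];              /* Code[i] is the status of character code i */

/* Fill the table once, using the standard <ctype.h> classifiers. */
void init_codes(void)
{
    int i;
    for (i = 0; i < NCODES; i++) {
        if (isalpha(i))
            Code[i] = LETTER;
        else if (isdigit(i))
            Code[i] = DIGIT;
        else
            Code[i] = OTHER;
    }
}

/* Reads more clearly than "Code[i] == 1". */
int is_letter(int i)
{
    return i >= 0 && i < NCODES && Code[i] == LETTER;
}
```

After calling init_codes() once, a test such as is_letter('a') or Code['7'] == DIGIT says what it means without a search through the rest of the program.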

An enum consists of an enum tag, followed by a list of names, representing possible values that this enum can take on.  The simple case associates the first name in the list with the constant 0, the second with constant 1, and so on.  The declaration:

enum class {Letter, Digit, White, Other} V;

defines “class” to be the “tag” for this enum (so that you can later write

enum class W;

to declare variable W to hold enum “class”).  The first line also declares variable V to hold values of this enum.

C itself does not promise to complain if you assign V or W a value which is NOT in this enum class:  the enumeration names are really integer constants, and many compilers accept any integer here, although some will warn.  C will allow you to use the enum value names just as if they were the integer constants each represents, in other statements.  These enum values can be assigned to other integer variables, and compared with variables, freely.  GDB can look at a variable declared to be of a specific enum type, and print its value symbolically.  Use of enum allows the compiler to generate successive values for your codes automatically.  (You can take control of these values by following an enumeration name in the list with “= value” to set that name to the specified value.  Subsequent names in the list continue getting successive values, starting from this value.)
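The points above can be sketched as follows.  The enum “class” is the one from the text;  the enum “level”, and the functions class_name and next_code, are hypothetical names added only to show explicit values and the mixing of enum values with integers.

```c
/* "class" is an ordinary identifier in C (it is a keyword only in C++). */
enum class { Letter, Digit, White, Other };        /* 0, 1, 2, 3 */

/* Hypothetical enum with explicit values; names after an "= value"
   continue counting from that value. */
enum level { Low = 10, Medium, High = 20, Top };   /* 10, 11, 20, 21 */

/* Map an enum value back to its name, e.g. for printing. */
const char *class_name(enum class c)
{
    switch (c) {
    case Letter: return "Letter";
    case Digit:  return "Digit";
    case White:  return "White";
    case Other:  return "Other";
    }
    return "unknown";          /* c held a value outside the list */
}

/* Enum values mix freely with ordinary integers. */
int next_code(enum class c)
{
    int n = c;                 /* no cast needed */
    return n + 1;
}
```

Here Medium gets the value 11 and Top the value 21, because each unnumbered name is one more than the name before it.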

Enum tags should be different within each naming context, unless you are trying to refer to a previously-defined enum.  The enumeration constants (names in enum lists) must be distinct across all enums in a given naming context.  They must also differ from variable and function names.

The C Preprocessor

In several situations (array bounds being the most common) C requires “constant expressions” – expressions which involve no variables.  For array bounds in particular, the array’s size doesn’t appear just once in the program source – it appears in the array’s declaration, AND as the “upper bound” in any loops which scan the array from element 0 to the last element.  If the expression changes in one place, it must change in all, for the program to work correctly.  However, the expression in the array declaration must not involve any variables.

This is difficult to achieve without the C preprocessor.  The preprocessor is a program which runs BEFORE actual compilation starts.  It transforms the source text, by performing operations like “expanding” #define’d names into source text.  The expanded text is fed to the C compiler proper.
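A minimal sketch of the array-bound situation, assuming a hypothetical size of 8 and made-up function names:  the size is written once, as a #define’d name, and the preprocessor copies it into the declaration and into every loop bound.

```c
#define N 8                    /* hypothetical size; change it here only */

int A[N];                      /* the declaration uses N ... */

void fill_squares(void)
{
    int i;
    for (i = 0; i < N; i++)    /* ... and so does every loop bound */
        A[i] = i * i;
}

int sum_all(void)
{
    int i, total = 0;
    for (i = 0; i < N; i++)
        total += A[i];
    return total;
}
```

By the time the compiler proper sees this text, every N has already been replaced by 8, so the array bound is a constant expression.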

The preprocessor provides several operations for program transformation.  The primary ones are:

Form:  #define name expansion
Meaning:  Whenever name appears in the source text, replace it with expansion.  The expansion ends at the first newline.  You can “hide” a newline with the “\” character, if you want multi-line expansions.
Example:
    #define N 5000
    int A[N];
    for (i=0; i<N; i++) …

Form:  #define name(p1, p2) E
Meaning:  Whenever name(E1,E2) appears, replace it with E, but with every occurrence of p1 in E replaced with E1, and similarly for all other parameters.
Example:
    #define abs(x) \
       (((x)>0) ? (x) : -(x))

Form:  #include <filename>
Meaning:  Find filename in the system include library, and replace this #include with the entire file.
Example:
    #include <stdio.h>

Form:  #include "filename"
Meaning:  Find filename in the current directory (most compilers look first in the directory of the file containing the #include, then fall back to the system include library), and replace this #include with the entire file.
Example:
    #include "stack.h"

Form:
    #if constant expression
    text1
    #elif constant expression
    text2
    #else
    text3
    #endif
Meaning:  Any number of #elif (else if) lines may appear, and the #else is optional.  This evaluates each constant expression until the first TRUE (non-zero) value is computed.  Then the entire statement is replaced with the expansion of the text which follows the TRUE #if or #elif.  If none is found, the text which follows the #else is used.  #defined names may be used in the constant expressions, but no other names.  (The #’s must appear at the beginning of lines.)
Example:
    #if __ASCII__
    int f(i) int i; {
    #else
    int f(int i) {
    #endif

Form:  #undef name
Meaning:  Undefine name.

Form:  #ifdef name
Meaning:  Evaluates TRUE if name is currently preprocessor-defined.

#defines can appear in the text inserted by #if’s, but #if’s cannot appear in the expansions of #defines.
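A small sketch of #defines placed inside conditional text, which also shows #ifdef and #undef at work.  All the names here (VERBOSE, LOG_LEVEL, STILL_VERBOSE and the two functions) are hypothetical.

```c
#define VERBOSE                /* defined, with an empty expansion */

#ifdef VERBOSE
#define LOG_LEVEL 2            /* chosen, because VERBOSE is defined */
#else
#define LOG_LEVEL 0
#endif

#undef VERBOSE                 /* from here on, VERBOSE is not defined */

#ifdef VERBOSE
#define STILL_VERBOSE 1
#else
#define STILL_VERBOSE 0        /* chosen, because of the #undef above */
#endif

int log_level(void)     { return LOG_LEVEL; }
int still_verbose(void) { return STILL_VERBOSE; }
```

LOG_LEVEL keeps the value 2 after the #undef:  removing VERBOSE affects only the tests that come later in the source text.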

#includes are used to include complicated header files, which can contain #defines, and which declare enums, structs, unions, functions and variables for later use.

#defines can introduce abbreviations for simple functions, and for symbolic constants.
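Because the preprocessor substitutes text, not values, a macro which abbreviates a function usually needs parentheses around each parameter and around the whole expansion.  The sketch below contrasts careless and careful versions;  all of the macro and function names are made up for the illustration.

```c
#define BAD_SQUARE(x) x * x              /* no parentheses */
#define SQUARE(x)     ((x) * (x))

#define BAD_NEG(x)    (-x)               /* parameter not parenthesized */
#define NEG(x)        (-(x))

#define ABS(x)        (((x) > 0) ? (x) : -(x))

/* BAD_SQUARE(1 + 2) expands to 1 + 2 * 1 + 2, which is 5. */
int bad_square_demo(void) { return BAD_SQUARE(1 + 2); }

/* SQUARE(1 + 2) expands to ((1 + 2) * (1 + 2)), which is 9. */
int square_demo(void)     { return SQUARE(1 + 2); }

/* BAD_NEG(3 - 7) expands to (-3 - 7), which is -10. */
int bad_neg_demo(void)    { return BAD_NEG(3 - 7); }

/* NEG(3 - 7) expands to (-(3 - 7)), which is 4. */
int neg_demo(void)        { return NEG(3 - 7); }

/* ABS(3 - 7) expands to (((3 - 7) > 0) ? (3 - 7) : -(3 - 7)), which is 4. */
int abs_demo(void)        { return ABS(3 - 7); }
```

Even the careful versions evaluate their argument more than once, so an argument with a side effect, such as i++, is still unsafe in a macro of this kind.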

#if’s are used to tailor C programs for different operating systems, and for different compilers, so one source text holds the code for all the versions.  Production C programs intended as source for more than 2 compilers use such complex #if’s that their source code is likely to be hard to read.
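As a hedged illustration of such tailoring, the sketch below picks a path separator by testing names that common compilers predefine (_WIN32 on Windows, __linux__ on Linux).  PATH_SEP, OS_NAME and the two functions are hypothetical names, and a real program would normally test many more cases.

```c
#if defined(_WIN32)
#define PATH_SEP '\\'
#define OS_NAME  "Windows"
#elif defined(__linux__)
#define PATH_SEP '/'
#define OS_NAME  "Linux"
#else
#define PATH_SEP '/'           /* assumed default for other systems */
#define OS_NAME  "unknown"
#endif

/* Only one of the three branches above reaches the compiler proper. */
const char *os_name(void) { return OS_NAME; }
char path_sep(void)       { return PATH_SEP; }
```

The same source file compiles everywhere, but each added system or compiler adds another branch, which is how such files become hard to read.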