L7 Multi-File C Programs

The procedures and global variables of a C program can be split across multiple files.  Each of the files can be compiled separately, into a *.o file.  Later, all the *.o files can be linked together (still using gcc) into a running program, an executable object program.  Each file can have access to a set of names that are private to that file, and to other names that are shared across all the files of the complete program.  This allows the program to be built and modified in pieces, often called “modules”, which can be tested separately, and later combined into the final product.

The programmer determines which names are local to each file, and which are shared, and indicates his/her decisions by constructing declarations appropriately.  The storage class specifiers extern and static play a role in this, as does the position of each declaration, and whether a declaration is in fact a “definition”.

A declaration announces to the compiler that a certain name is “known”, and usable in some area of the code.  The declaration also gives the name’s “type”.  A “definition” is a declaration that also creates the thing declared: for a variable, it reserves storage (a declaration accompanied by an initialization is always a definition); for a function, it supplies the function body.

A declaration can appear inside a function or other set of “{}”.  Such a declaration declares a name usable only within the innermost enclosing “{}”.  Such a declaration is termed “local”.  If a local declaration begins with the keyword “static”, space is reserved for the name in “global memory”, and preserved across calls to the function: the next time this function is called, the name’s previous value will still be there.  Note that this name is still usable only within this function; if another function declares the same name, the two functions each get their own version of the variable.  If “static” is omitted from a local declaration, space for the name is reserved in the function’s Activation Record (on the stack).  Each time the function is called, a new area of memory (completely uninitialized) is reserved for this variable.
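The difference between a static local and an ordinary local can be sketched as follows.  The function names are made up for illustration; they do not come from any library.

```c
/* Hypothetical helper: reports how many times it has been called.
   "calls" is a static local, so it lives in global memory and keeps
   its value between calls, yet it is usable only inside this function. */
int next_call_number(void) {
    static int calls = 0;   /* initialized once, not on every call */
    calls = calls + 1;
    return calls;
}

/* Same body, but "calls" is an ordinary local: a fresh copy is made in
   the activation record on every call, so the result is always 1. */
int always_one(void) {
    int calls = 0;          /* without "= 0" the value would be garbage */
    calls = calls + 1;
    return calls;
}
```

Calling next_call_number() three times in a row yields 1, 2, 3, while always_one() yields 1 every time.  The two variables named “calls” are unrelated, even though they share a name.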

Declarations appearing outside functions are termed “external”.  The names they declare are known within the current file, in any program text which FOLLOWS the declaration.  In addition, the name will be shared across files UNLESS the declaration includes the word “static”.  A static, external declaration defines names which are local to one file.
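As a small sketch of these rules, consider a file (called counter.c here purely for illustration) that keeps one variable and one helper function private, and shares a single function with the rest of the program.  All names are hypothetical.

```c
/* counter.c (hypothetical file name) */

static int count = 0;          /* static + external: local to this file */

static int clamp(int v) {      /* private helper; other files cannot call it */
    if (v > 100) return 100;
    if (v < 0) return 0;
    return v;
}

int counter_add(int n) {       /* no "static": shared across all files */
    count = clamp(count + n);  /* count and clamp were declared above */
    return count;
}
```

Another file could declare and call counter_add, but any attempt there to refer to count or clamp would fail at link time, since those names never leave this file.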

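The word “extern” can appear in an external declaration.  An “extern” declaration with no initialization allocates no storage; it links the declared name to the storage allocated for the same name by a definition elsewhere, possibly in another file.  Older versions of gcc (before GCC 10) appeared to ignore the keyword, because by default they merged uninitialized global variables of the same name from different files into a single copy.  GCC 10 and later no longer do this by default (the old behavior is available with the -fcommon option), so the keyword matters.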

A general rule states that all declarations for a given name must have matching types.  However, it is legal for some of these declarations to be “incomplete” – the most specific declaration is used.  Further, at most one DEFINITION for a given name can occur.  These rules apply to each “name scope”, that is: within a file, for names local to the file, within a function for names local to the function, and within the whole program for global names.

A special rule applies to global variables which are declared without “extern” and never initialized.  Such declarations are called “tentative definitions”.  If a file contains no actual definition for the variable, a single copy of the variable is made, and it is initialized to 0.  Note that the “single definition” rule applies to both functions and variables, so a function can have many prototypes (headers), if they all agree, but the function body can only appear once.
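The rules of the last two paragraphs can be seen together in one file.  This is only a sketch, and every name in it is invented for the illustration.

```c
/* Several declarations of the same names; all agree on type. */
int square(int n);          /* prototype */
int square(int n);          /* repeated prototype: legal, since it matches */

extern int limit;           /* declaration only: reserves no storage */
int limit;                  /* no initialization given anywhere ... */
int limit;                  /* ... so one copy is made, holding 0 */

int table[];                /* incomplete declaration: size not yet known */
int table[4];               /* the most specific declaration is used */

int square(int n) {         /* the one and only definition of square */
    return n * n;
}

/* Number of elements in table, taken from the completed declaration. */
int table_size(void) {
    return (int)(sizeof table / sizeof table[0]);
}
```

Adding a second body for square, or a second initialized declaration such as “int limit = 3;” alongside “int limit = 4;”, would violate the single definition rule and be rejected by the compiler.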

Chapter 4 of Kernighan and Ritchie discusses these issues.  In particular, K+R state that if one file wants to reference a variable defined in another, the first file MUST place the word “extern” on its external declaration for the shared variable.  The “reference manual” (Appendix A of K+R) seems to read as somewhat more relaxed, and versions of gcc before GCC 10 accept a shared variable declared without “extern” in every file.  GCC 10 and later reject that at link time with a “multiple definition” error unless the -fcommon option is given, so the K+R rule is the one to follow: put “extern” on the declaration in the header, and define the variable in exactly one *.c file.  The following example shares a variable x between two files:

Example

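File set.h:

extern int x;           /* declaration only: x is defined in set.c */

void set(int i);

File set.c:

#include "set.h"

int x;                  /* the single definition of x (starts out as 0) */

void set(int i) {
        x = i;
        }

File main.c:

#include <stdio.h>
#include "set.h"

int main(void) {
        set(1);
        printf("x=%d\n", x);    /* prints x=1 */
        set(5);
        printf("x=%d\n", x);    /* prints x=5 */
        return 0;
        }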

The command:

gcc set.c main.c

compiles the *.c files, and links them together into file a.out, which then can be run.  (This program is intended as illustration only.  Avoid use of global variables unless using them simplifies the program significantly.  See the example in Chap 4 of a global “stack” for evaluating arithmetic expressions in Polish Postfix notation.)

Using the rules

Large C programs are usually built by grouping the important functions into files, with “related” functions placed into the same file.  The functions of one file might serve to define a “package” of some kind – say, one for manipulating a specific kind of list.  In addition, a header file is built for each program file.  Traditionally, the header file for program file P.c is called P.h.  The header includes definitions of structures, macros and enums that are important for the file, along with prototypes for each global function in the file, and declarations for each global variable in the file.  The header file is #include’ed in the program file, and in each file that needs to use functions from that program file.

The header file technique allows the program file’s functions to call any function in that file, even functions that are defined late in the file (without the header, that wouldn’t work).  It allows any file that includes the header to access the global variables declared in the header, and to call functions defined in the program file.  The program file can contain additional declarations for some of the global variables which initialize those variables.  (These declarations become definitions.)
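The point about functions defined late in the file can be sketched with two functions that call each other.  The prototypes at the top stand in for what a header (say parity.h, a made-up name) would supply; the function names are likewise hypothetical.

```c
/* Prototypes, as a header such as parity.h (hypothetical) would supply. */
int is_even(unsigned n);
int is_odd(unsigned n);

/* is_even calls is_odd, whose body appears only further down the file.
   The call compiles because the prototype above already declared it. */
int is_even(unsigned n) {
    return n == 0 ? 1 : is_odd(n - 1);
}

int is_odd(unsigned n) {
    return n == 0 ? 0 : is_even(n - 1);
}
```

Without the prototypes, the call to is_odd inside is_even would refer to a name the compiler has not yet seen.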

It is common for header files to include other header files.  If this is done extensively, the same header declarations can end up being included multiple times in one file.  This is OK for function prototypes, and for variable declarations, but it produces illegal C programs if enum, struct, or union definitions are repeated, since the tags and enum constants defined by these declarations become multiply defined.  The C preprocessor is usually used to avoid this problem: each header wraps its contents in an “include guard” (#ifndef, #define, #endif), so that its text is processed only the first time it is included.
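A common way to use the preprocessor for this is sketched below.  To keep the sketch in one file, the text of a made-up header point.h is written out twice, which is what the compiler would see if two other headers both included it.  The names point.h, POINT_H, struct point and point_sum are all assumptions for the illustration.

```c
/* ---- text of point.h (hypothetical header), first inclusion ---- */
#ifndef POINT_H
#define POINT_H

struct point { int x; int y; };
enum color { RED, GREEN, BLUE };
int point_sum(struct point p);

#endif /* POINT_H */

/* ---- the same text again, as a second inclusion would supply it ---- */
#ifndef POINT_H                     /* false now: POINT_H is defined */
#define POINT_H

struct point { int x; int y; };     /* skipped by the preprocessor */
enum color { RED, GREEN, BLUE };    /* skipped by the preprocessor */
int point_sum(struct point p);

#endif /* POINT_H */

/* ---- the program file's own code ---- */
int point_sum(struct point p) {
    return p.x + p.y;
}
```

If the #ifndef lines are deleted, the compiler sees struct point and the enum constants defined twice and rejects the file.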