L7 Multi-File C Programs
The procedures and global variables of a C program can be split across multiple files. Each of the files can be compiled separately (gcc -c) into a *.o file. Later, all the *.o files can be linked together (still using gcc) into an executable program that can be run.
Each file can have access to a set of names that are private to that file, and to other names that are shared across all the files of the complete program. This allows the program to be built and modified in pieces, often called “modules”, which can be tested separately and later combined into the final product.
The programmer determines which names are local to each file and which are shared, and indicates these decisions by constructing declarations appropriately. The storage class specifiers extern and static play a role in this, as does the position of each declaration, and whether a declaration is in fact a “definition”.
A declaration announces to the compiler that a certain name is “known”, and usable in some area of the code. The declaration also gives the name’s “type”. A “definition” is a declaration accompanied by an initialization (for variables), or a function body (for functions).
A declaration can appear inside a function or other set of “{}”. Such a declaration declares a name usable only within the innermost enclosing “{}”, and is termed “local”. If a local declaration begins with the keyword “static”, space is reserved for the name in “global memory” and preserved across calls to the function: the next time the function is called, the name still holds its previous value. Note that this name is still usable only within this function; if another function declares the same name, the two functions each get their own version of the variable. If “static” is omitted from a local declaration, space for the name is reserved in the function’s Activation Record (on the stack). Each time the function is called, a new area of memory (completely uninitialized) is reserved for this variable.
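The difference between the two kinds of local variable can be seen in a small sketch. The function names count_calls and count_calls_auto are made up for this illustration.

```c
/* Hypothetical helper: reports how many times it has been called.
   "calls" lives in global memory and keeps its value between calls. */
int count_calls(void) {
    static int calls = 0;   /* initialized once, not on every call */
    calls++;
    return calls;
}

/* Same body, but "calls" is an ordinary local on the stack: a fresh
   copy is made for every call, so the old value is never seen again. */
int count_calls_auto(void) {
    int calls = 0;          /* without "= 0" the value would be indeterminate */
    calls++;
    return calls;
}
```

Three successive calls to count_calls() return 1, 2 and 3, while count_calls_auto() returns 1 every time.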
Declarations appearing outside functions are termed “external”. The names they declare are known within the current file, in any program text which FOLLOWS the declaration. In addition, the name will be shared across files UNLESS the declaration includes the word “static”. A static external declaration defines names which are local to one file.
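As a sketch of how this looks in practice, here is the top of an imaginary program file. The file name list.c and all the names in it are assumptions made for this illustration.

```c
/* list.c (hypothetical file) */

static int node_count = 0;   /* static: known only inside this file     */
int list_errors = 0;         /* no static: shared with all other files  */

/* static function: a private helper, callable only from this file */
static void note_new_node(void) {
    node_count++;
}

/* global function: any other file may call it */
int list_add(int value) {
    if (value < 0) {         /* negative values are rejected in this sketch */
        list_errors++;
        return -1;
    }
    note_new_node();
    return node_count;       /* number of values accepted so far */
}
```

Another file could declare its own static node_count or note_new_node without any conflict, but a second non-static list_errors or list_add definition would clash with these at link time.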
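The word “extern” can appear in an external declaration. An “extern” declaration without an initializer allocates no storage; it links the declared name to the storage allocated for the same name elsewhere, possibly in another file. Under older versions of gcc, uninitialized external declarations written without “extern” are merged across files in the same way, which makes “extern” seem to be ignored, but that merging is a compiler and linker convention rather than a rule of the language.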
A general rule states that all declarations for a given name must have matching types. However, it is legal for some of these declarations to be “incomplete”; the most specific declaration is used. Further, at most one DEFINITION for a given name can occur. These rules apply to each “name scope”, that is: within a file for names local to the file, within a function for names local to the function, and within the whole program for global names.
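A small sketch of an “incomplete” declaration followed by the single definition that completes it. The names primes and sum_first_three are invented for this example.

```c
extern int primes[];           /* incomplete: the array size is not given */

int sum_first_three(void) {
    /* the incomplete declaration is enough to index the array */
    return primes[0] + primes[1] + primes[2];
}

int primes[4] = {2, 3, 5, 7};  /* the one definition; the type becomes int[4] */

/* A further declaration such as "extern double primes[];" would be
   rejected, because its type does not match the ones above. */
```

With these values sum_first_three() returns 10.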
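A special rule applies to global variables which are declared without “extern” but have NO initializing definition. In this case, a single copy of the variable is made, and it is initialized to 0. (Standard C guarantees this within one file; whether such declarations in several files are merged into one copy depends on the compiler and linker.) Note that the “single definition” rule applies to both functions and variables, so a function can have many prototypes (headers), if they all agree, but the function body can only appear once.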
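The following single-file sketch shows both points: a variable that is declared but never initialized, and a function with repeated prototypes. The names total and twice are made up for illustration.

```c
int total;              /* no initializer, no extern */
int total;              /* repeating it within one file is legal */

int twice(int);         /* prototypes may be repeated ...  */
int twice(int n);       /* ... as long as they all agree   */

int twice(int n) {      /* the body may appear only once   */
    return 2 * n;
}
```

Here total starts out as 0, and only one copy of it exists in the file.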
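Chapter 4 of Kernighan and Ritchie discusses these issues. However, their discussion seems to be based on the ORIGINAL C. Many C compilers have applied somewhat more relaxed rules. In particular, K+R state that if one file wants to reference a variable declared in another, the first file MUST place the word “extern” on the external declaration for the shared variable. That does NOT seem to be what the “reference manual” (Appendix A of K+R) says. I’ve tried the following example in gcc, and it works when the linker merges uninitialized definitions (the default before GCC 10, or with -fcommon); with the -fno-common default of GCC 10 and later it fails with a “multiple definition” link error, so the K+R advice remains the portable choice: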
Example
File set.h:
    int x;
    void set(int i);
File set.c:
    #include "set.h"
    void set(int i) {
        x = i;
    }
File main.c:
    #include <stdio.h>
    #include "set.h"
    int main() {
        set(1);
        printf("x=%d\n", x);
        set(5);
        printf("x=%d\n", x);
    }
The command:
    gcc set.c main.c
compiles the *.c files and links them together into file a.out, which then can be run. (This program is intended as illustration only. Avoid use of global variables unless using them simplifies the program significantly. See the example in Chap 4 of K+R of a global “stack” for evaluating arithmetic expressions in Polish Postfix notation.)
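For comparison, here is a sketch of a variant of set.h and set.c that does not depend on the linker merging two copies of x. It is not the version tested above, only the arrangement K+R recommend, shown as one listing with comments marking where each file would begin; main.c stays unchanged.

```c
/* ---- set.h ---- */
extern int x;        /* declaration only: no storage is reserved here */
void set(int i);

/* ---- set.c (would begin with #include "set.h") ---- */
int x;               /* the single definition of x; it starts out as 0 */

void set(int i) {
    x = i;
}
```

Every file that includes set.h now sees only a declaration of x, and exactly one file (set.c) supplies the storage.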
Using the rules
Large C programs are usually built by grouping the important functions into files, with “related” functions placed into the same file. The functions of one file might serve to define a “package” of some kind, say, one for manipulating a specific kind of list. In addition, a header file is built for each program file. Traditionally, the header file for program file P.c is called P.h. The header includes definitions of structures, macros and enums that are important for the file, along with prototypes for each global function in the file, and declarations for each global variable in the file. The header file is #included in the program file, and in each file that needs to use functions from that program file.
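A minimal sketch of such a header and its program file, for an imaginary fixed-size list package. The file names intlist.h and intlist.c and every name inside them are assumptions made for this illustration; the two files are shown as one listing with comments marking the boundary.

```c
/* ---- intlist.h (hypothetical header for intlist.c) ---- */
#define INTLIST_MAX 8                                /* macro */

enum intlist_status { INTLIST_OK, INTLIST_FULL };    /* enum */

struct intlist {                                     /* structure */
    int items[INTLIST_MAX];
    int length;
};

extern int intlist_appends;      /* declaration of a global variable */

/* prototype of a global function */
enum intlist_status intlist_append(struct intlist *list, int value);

/* ---- intlist.c (would begin with #include "intlist.h") ---- */
int intlist_appends = 0;         /* the definition, with its initializer */

enum intlist_status intlist_append(struct intlist *list, int value) {
    if (list->length >= INTLIST_MAX)
        return INTLIST_FULL;     /* no room left */
    list->items[list->length++] = value;
    intlist_appends++;           /* count successful appends */
    return INTLIST_OK;
}
```

A file that wants to use the package only needs #include "intlist.h"; it never sees the body of intlist_append.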
The header file technique allows the program file’s functions to call any global function in that file, even functions that are defined later in the file (without the header or some other prototype, that wouldn’t work). It allows any file that includes the header to access the global variables declared in the header, and to call functions defined in the program file. The program file can contain additional declarations for some of the global variables which initialize those variables. (These declarations become definitions.)
It is common for header files to include other header files. If this is done extensively, it could lead to a situation in which the same header declarations are included multiple times. This is OK for function prototypes and for variable declarations, but it produces illegal C programs (in every C standard before C23) if enums, structs, or unions are repeated, since the tags and enum constants defined by these declarations become multiply defined. The C preprocessor is usually used to avoid this problem.
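One common way of using the preprocessor for this is an “include guard”; the notes do not say which technique is intended, so take this as one plausible sketch. The header name point.h, the macro POINT_H and the other names are invented for the example. The header text appears twice in the listing to imitate a double #include.

```c
/* ---- point.h (hypothetical header), first inclusion ---- */
#ifndef POINT_H
#define POINT_H

struct point {
    int x;
    int y;
};

enum axis { AXIS_X, AXIS_Y };

int point_get(struct point p, enum axis a);

#endif /* POINT_H */

/* ---- point.h again, as if included a second time ----
   POINT_H is already defined, so everything up to #endif is skipped. */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
enum axis { AXIS_X, AXIS_Y };
int point_get(struct point p, enum axis a);
#endif /* POINT_H */

/* ---- point.c ---- */
int point_get(struct point p, enum axis a) {
    return a == AXIS_X ? p.x : p.y;
}
```

Without the #ifndef/#define/#endif lines, the second copy of struct point and enum axis would be reported as redefinitions.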