Wolfgang Tichy


About compiling programs

Compiler basics

Most computer programs are spread out over many files. The compiler therefore has to translate every file into machine language and later link the results together into a single executable program that the user can launch. In the case of a C compiler like gcc, the translation is done by a command like
gcc -c file1.c
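that has to be run for each C-file. The actual translation happens in 2 stages: first the preprocessor handles directives such as #include and #define, then the compiler proper translates the result into machine code. The output is an object file whose name ends in .o, here file1.o.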

After this has been done for every file, all the .o-files have to be linked together into a complete program, using e.g.:

gcc -o program_name file1.o file2.o file3.o ...
The latter creates a new executable program called program_name from all listed object files and the standard C-libraries. The standard C-libraries are collections of object files that contain standard C-functions such as printf.
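
To make the split between translating and linking concrete, here is a minimal sketch. The function names square and sum_of_squares and their placement in file1.c and file2.c are made up for illustration. Both parts are shown in one listing so that it compiles as is with gcc -c.

```c
/* Sketch only: the names and the file split are hypothetical. */

/* ---- part that would live in file1.c ---- */

/* Declaration only. This line is all the compiler needs in order to
   translate the call below. The body of square may sit in another
   file; the linker connects the call to it later. */
double square(double x);

/* Returns 1^2 + 2^2 + ... + n^2 (0 if n < 1). */
double sum_of_squares(int n)
{
  double sum = 0.0;
  int i;

  for (i = 1; i <= n; i++)
    sum += square((double) i);
  return sum;
}

/* ---- part that would live in file2.c ---- */

/* Definition: the actual machine code for square ends up in file2.o. */
double square(double x)
{
  return x * x;
}
```

If the two parts are saved as separate files, each can be translated on its own with gcc -c. A complete program additionally needs a file with a main function that calls sum_of_squares.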

Libraries

As there are usually many object files, they are often grouped together and stored in library files. This is usually done with the object file archiver ar. E.g. if we want to put file1.o, file2.o and file3.o in a library file named libf123.a, we can use:
ar rcs libf123.a file1.o file2.o file3.o
Once this is done we can create the above program also with:
gcc -o program_name -L. -lf123 ...
The -L argument tells the compiler where to look for libraries. The . in -L. means look in the current directory. The argument -lf123 means link in library f123. The compiler is smart enough to expect the f123 library to be in the file libf123.a.

These kinds of libraries (which conventionally end in .a) are called static libraries. The intent is that they are added to the final program (here program_name). If the library is big, the program will be big as well. If many programs use the same big library they all will contain the same big library, thus taking up space multiple times.

Besides static libraries there are also so-called shared libraries. Shared libraries are not included in the program, but are instead loaded into memory at program start by the operating system's dynamic loader. This loader is sophisticated enough to load each library into memory only once, even if several programs need it. For this reason most widely used libraries, such as the GSL (GNU Scientific Library), exist in the form of shared libraries. Their file names end in .so, e.g. the GSL library is usually in the file libgsl.so.

As far as linking the final program is concerned, shared libraries can be used in just the same way as static libraries. E.g. if the program also needs libgsl.so we would add -lgsl, as in:

gcc -o program_name -L. -lf123 ... -lgsl
The only caveat is that the compiler will look for libgsl.so only in the directories specified by -L and in some operating-system-dependent standard places. So if libgsl.so is not in a standard directory, you will likely have to add another -L argument to tell it where to look.

Also, some libraries in turn need other libraries to function. In such cases these other libraries need to be added as well. E.g. libgsl.so also needs libgslcblas.so, so that we really need:

gcc -o program_name -L. -lf123 ... -lgsl -lgslcblas

Include files from libraries

In order to use most libraries we also need to add certain include files, e.g.
#include <gsl/gsl_odeiv2.h>
to our C-files. Again the compiler will look only in certain standard places on its own. Hence you often have to add a -I argument to the translation stage (where include files are processed) to tell the compiler the directory in which it should look for include files. If e.g. file1.c includes gsl/gsl_odeiv2.h and the gsl directory containing gsl_odeiv2.h is in /usr/include, we need:
gcc -I/usr/include -c file1.c

Compiling Sgrid, Nmesh or BAM

Sgrid, Nmesh and BAM all use GNU Make to issue the necessary compiler commands. Thus in principle it suffices to just run make to compile these programs, e.g. for Sgrid we can simply type:
cd sgrid
make
However, since Sgrid needs certain libraries, the compiler often needs to be given various arguments to find these libraries and their include files, i.e. the -L and -I arguments. And of course it also needs the various -l arguments to tell it which libraries we want to link in. To keep this reasonably simple, all 3 programs contain a MyConfig file that sets a couple of variables that are passed on to the compiler. The main variables for using external libraries are SPECIALINCS and SPECIALLIBS. E.g.
SPECIALINCS += -I/usr/include
SPECIALLIBS += -L/usr/lib/x86_64-linux-gnu -lgsl -lgslcblas
would tell the compiler to look for include files in /usr/include, to link in the libraries gsl and gslcblas, and to look for them in /usr/lib/x86_64-linux-gnu.

In Sgrid the above 2 lines are usually accompanied by

DFLAGS += -DGSL
This line adds -DGSL to the variable DFLAGS, which is also passed on to the compiler. It causes the compiler to behave as if every C-file started with "#define GSL 1". This is useful as some parts of Sgrid check (with "#ifdef GSL") whether GSL function calls should actually be used.
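
The "#ifdef GSL" pattern can be sketched as follows. This is not code from Sgrid: the function names my_square and built_with_gsl are hypothetical, and gsl_pow_2 is just one simple GSL function chosen for illustration. Without -DGSL the file compiles and works without any GSL headers or libraries being installed.

```c
/* Sketch of conditional compilation with -DGSL.
   The function names are hypothetical, not taken from Sgrid. */

#ifdef GSL
#include <gsl/gsl_math.h>  /* only read if compiled with -DGSL */
#endif

/* Returns x*x, computed by GSL if GSL was enabled at compile time. */
double my_square(double x)
{
#ifdef GSL
  return gsl_pow_2(x);     /* GSL version, needs -lgsl -lgslcblas when linking */
#else
  return x * x;            /* fallback that needs no library */
#endif
}

/* Returns 1 if this file was compiled with -DGSL, 0 otherwise. */
int built_with_gsl(void)
{
#ifdef GSL
  return 1;
#else
  return 0;
#endif
}
```

Compiling this with gcc -c selects the fallback branch, while gcc -DGSL -c selects the GSL branch and then also requires the -I, -L and -l arguments discussed above.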

See also the Library Notes to find out more about compiling on supercomputers such as FAU's Athene.