Finding C Libraries

One of the most irksome problems in C or C++ programming is finding and linking with the right libraries.

Really, it’s a family of problems, and there is no simple solution. The following is a collection of techniques, and a basic description of the ideas and tools that may be brought to bear.

Some of the following only makes sense in a Unix or Linux environment. To save yourself a great deal of pain, I recommend that you learn C development in one of those environments… C grew up with Unix.

typical error messages

can’t find a library header file when compiling C files

means the header file for the library isn’t where the build system expects it. Either the file needs to be installed, or the build system needs to be told how to look for it.

On modern systems, this often means a dev package needs to be installed.

For reasons of efficiency, often only the binary library is installed by default with the system distribution's packages. The library provides the functionality, but building a project that uses the library also requires header files and other tools, which may be split off into a dev package. This package usually has a name similar to that of the main library package.

If the library was not installed as part of a distribution package, it could be located anywhere. You will have to search for it, and then determine how to inform the build system of its whereabouts.

library function declaration doesn’t match use in source code

This usually means the version of the library expected by the source code is different from that of the installed library (or of its installed headers).

symbol not found at link-time

Usually fixed by adjusting the library options (such as -L and -l) on the link line.

library not found at run time

Usually fixed by adjusting LD_LIBRARY_PATH so that the dynamic linker can find the shared library.

symbol found but not correct

Usually a library version problem: the library that was found is not the version the program was built against.

crashes or other wrong behavior at run time

Sometimes caused by a mismatch between the headers used at compile time and the libraries used at link time or run time.

Often resolved by fixing the search paths.

preprocessing, compiling and linking

Building a C program or library involves three stages: preprocessing, compiling, then linking.

  1. Preprocessing consists of textual changes to each source file—in particular, header files are included textually by the preprocessor.
  2. Compiling is the process of translating the C source code files into machine code, and results in an object file (ending with .o for each .c source file).
  3. Linking is the process of combining these object files together with system libraries to form an executable (or library).

In C and C++, libraries are often associated with header files (.h or .hh), that describe the public data structures and functions in the library. The library files contain the machine code that implements the library’s functionality.

To build a program using a library, one must both include the proper header files during compilation, and link with the proper library files during linking (and later perhaps at run time).

dev packages

Many modern Unix-based OS distributions use some kind of package management system to install system software. In many of these systems, many libraries are installed without the header files required to build software using the libraries. Typically, a separate, corresponding package, the dev package, needs to be installed in order to build such software.

These days, the most common cause of a missing header file at compile time is the absence of a dev package, and the solution is simply to install it.
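To check whether a header or library file that is already on the system came from a distribution package, the package manager can be asked directly. The following is only a sketch, assuming a Debian-style system (dpkg) or a Red Hat-style system (rpm); owning_package is a helper name invented here, and other distributions have their own equivalents.

```shell
# owning_package FILE: print the installed package that owns FILE, if any.
# Returns non-zero when no package owns the file or no known tool is present.
owning_package() {
    if command -v dpkg >/dev/null 2>&1; then
        dpkg -S "$1" 2>/dev/null          # Debian, Ubuntu and relatives
    elif command -v rpm >/dev/null 2>&1; then
        rpm -qf "$1" 2>/dev/null          # Fedora, RHEL, openSUSE and relatives
    else
        echo "neither dpkg nor rpm found" >&2
        return 2
    fi
}

owning_package /usr/include/stdio.h \
    || echo "stdio.h is not owned by an installed package"
```

On Debian-style systems the package reported for a header usually ends in -dev; on Red Hat-style systems it usually ends in -devel.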

static vs. shared libraries

There are two main sorts of libraries, static and shared. They differ mainly in how the code they contain is associated with programs that use them.

When a program is linked, the code needed from a static library is copied into the program executable file. With a shared library, however, only a reference to the library file is copied into the executable. The code from a shared library is found at run time (by a program called the dynamic linker) using this reference. (And it is in this sense “shared” by all the programs that use it.)

Therefore, problems with linking static libraries appear during the build of a program after compilation has finished, during linking, whereas it often happens that problems linking a shared library appear at run time.

In Unix-like systems, static library files have names ending with .a (for “archive”), and shared library files have names ending with .so (for “shared object”). In DOS/Windows, shared library files have names ending with .dll (for “dynamic-link library”).

where to find header files

In Unix-like systems, the traditional place for the C headers of system libraries is /usr/include/, and sub-directories thereof. Locally built software traditionally puts headers in /usr/local/include/.

But the header files could be anywhere. Your friends are the utilities locate and find.

where to find libraries

In Unix-like systems, the traditional place for the basic system libraries is /lib/, and many additional libraries are found in /usr/lib/, sometimes in sub-directories.

Furthermore, locally built libraries are traditionally placed in /usr/local/lib/. Specific application libraries are nowadays often stored in various sub-directories of /opt/.

If you don’t find your library in these places, you can try using locate and find.
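A search with find might be wrapped up as follows. This is a sketch only: find_file is a helper name invented here, zlib is used purely as an example of something to look for, and the directory lists are the traditional locations named in the text.

```shell
# find_file NAME DIR...: print every path under the given directories whose
# base name matches NAME. Missing or unreadable directories are skipped quietly.
find_file() {
    name=$1
    shift
    find "$@" -name "$name" 2>/dev/null || true
}

# Look for a header, then for any flavour (.a, .so, .so.1, ...) of a library.
find_file 'zlib.h' /usr/include /usr/local/include /opt
find_file 'libz.*' /lib /usr/lib /usr/local/lib /opt
```

Where a locate database is installed and up to date, locate zlib.h answers the same question much faster, but it only knows about files that existed when the database was last rebuilt.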

If the library simply isn’t present in the system you’ll need to install it. First try to determine if a package from your OS distribution contains the library. Failing that, you will usually need to build the software from source. Unfortunately, this source will often require yet further libraries!

preprocessor search path for headers

C distinguishes between two forms of the #include directive: those with the file path in angle brackets (#include <header.h>), and those with it in double quotes (#include "header.h").

By default, the preprocessor searches for those in angle brackets in the system include directories (such as /usr/include/), while it searches for those in double quotes first in the directory containing the current file, and then in the same places as for angle brackets.

The search path is extended on the compiler command line with the -I switch, which adds a directory to be searched for headers.

linker search path for static libraries

The static linker is usually included as a functionality of the compiler program, and the compiler accepts various switches that control how it links and where to search for libraries. The most important are

-L directory-name
After this switch, the linker also searches that directory for library files.
-lname
This is shorthand for naming libname.so or libname.a on the link line; the linker looks for that file in its search directories.

Your compiler will search for libraries in /lib/ and /usr/lib/ without being instructed to.

Often, build systems will also add /usr/local/lib/. They usually also search for various libraries, and test that the correct version is present. This process often goes wrong, however.

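Most compilers automatically link in the most basic libraries by default. For C, this is the C library, libc (libc.so or libc.a); for C++, the compiler driver additionally links the C++ standard library, for example libstdc++.so with GCC.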

static linking order

The order in which files are linked makes a difference.

Many libraries require functions found in other libraries. But the linker reads the link line from left to right, and from a static library it only includes code that it already knows it needs at that point. So if library A needs functions found in library B, library A must be specified before library B on the link line.

The order for library inclusion is: from most specific libraries to most general. That is, a library for some high-level code, such as for calculations for a specific product, should come first, and if it depends on the standard math library, the flag -lm should come last.

dynamic (run-time) linking

Shared libraries are linked by a completely different mechanism, at a different time, than static libraries. The problems that arise with them are therefore different, and occur at different times, than those of static libraries. It is helpful to understand the process to some degree.

The linking of shared libraries happens when a program is executed. This is done by a system program called a dynamic linker. It is usually pre-configured to look in some places for shared library files, usually /lib/ and /usr/lib/.

You can add directories to the dynamic linker’s search path by setting the environment variable LD_LIBRARY_PATH, e.g. in the Bash shell:

export LD_LIBRARY_PATH=/my/path1:/my/path2/:$LD_LIBRARY_PATH

Putting this line in your ~/.bash_profile file will result in the variable being set each time you log in.

Finally the system administrator can change globally which directories are searched by the dynamic linker by editing /etc/ld.so.conf and the files it includes, then running the program ldconfig.
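One detail worth knowing when prepending to LD_LIBRARY_PATH: if the variable was empty beforehand, a plain :$LD_LIBRARY_PATH leaves a trailing colon, and with the glibc dynamic linker an empty entry means the current working directory. A slightly more careful form is sketched below; /my/path1 and /my/path2 are the placeholder paths from the text and /opt/demo/lib is likewise invented.

```shell
# Sketch: prepend directories without leaving an empty entry behind.
unset LD_LIBRARY_PATH          # start from "not set", for the demonstration only

# ${VAR:+:$VAR} expands to ":$VAR" only when VAR is set and non-empty,
# so no trailing colon appears when the variable had no value.
LD_LIBRARY_PATH=/my/path1:/my/path2${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
export LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"

# The same line also works when the variable already has a value:
LD_LIBRARY_PATH=/opt/demo/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
echo "$LD_LIBRARY_PATH"
```

The first echo prints /my/path1:/my/path2 and the second prints /opt/demo/lib:/my/path1:/my/path2.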

build systems

GNU configure
often has switches that allow you to specify where a library is located. See
./configure --help

inspecting libraries

Archive files (static .a libraries) can be inspected using the

nm

command (probably originally an abbreviation of “names”). This lists all symbols referenced in the library (both those with code in the library, and those with code elsewhere). Symbols listed as type T have their code in the library (T stands for the text, that is code, section); those listed as type U are undefined there and must be supplied from elsewhere. Data symbols such as global variables are listed with types such as D, B or R, and also live in the library.

The symbols within shared libraries can be viewed with

nm -D

(the GNU nm option for listing the dynamic symbol table), or with objdump -T.

For shared libraries, it is important to know what library is being referenced. The main tool is

ldd

(for list dynamic dependencies). Run on an executable or shared library, it lists the references to shared libraries and where the system thinks it should look for them. (Note this depends on the current environment variable LD_LIBRARY_PATH.)

To list all the strings found among the code in a library or binary executable, run

strings

on it. Often names of libraries and other useful hints will appear.

find a symbol among many libraries

Nowadays, the first resource is Web search engines, but that is a pretty inexact way to find a symbol on your specific system.

Some combination of Unix commands will often turn up a symbol you can’t otherwise find. For example, to search all the .a libraries for the symbol strchr, you could try:

find /usr/lib -name "*.a" -print | xargs nm -o --defined-only 2>/dev/null | grep strchr

build your own library

If the option of installing the required library packages isn't open to you, it is always possible to make your own copy from the library sources. The usual procedure is: download the library source "tarball" (.tar.gz or .tgz archive), unpack—build—install.

The main ideas are described here. Often a README or INSTALL file in the library source directory will provide further information.

The commands often are just these:


tar -xf thelibrary.tgz
cd thelibrary
./configure
make
make install

but there are many variants.

installation directories

It is strongly discouraged to install non-distribution libraries in the system directories: that is asking for trouble. You are likely to break your system, the changes you make are likely to disappear in an update, and it becomes hard for anybody to sort out which files came from the distribution and which are custom-built.

If you have permissions to write to /usr/local/, the first option is to install the library there; this is the default for responsibly-written Unix-like software. It is the preferred option if the computer is your own, or if multiple users all need to access the same library.

home directory

If you can't get access to /usr/local/, you can always install binaries and libraries in your home directory, in ~/bin and ~/lib. Often this is just a matter of configuring the build to do so: ./configure --prefix=your-home-dir. Otherwise, you can always copy files there by hand. You may also need to configure PATH and LD_LIBRARY_PATH.
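The environment for a home-directory installation might be set up along these lines, for example in ~/.bash_profile. This is a sketch assuming the prefix is the home directory itself; CPPFLAGS and LDFLAGS are the conventional variables that GNU configure scripts and make's built-in rules hand to the preprocessor and the linker, though a given project may ignore them.

```shell
# Sketch: make programs, shared libraries and headers under a private prefix
# visible to the shell, the dynamic linker and later builds.
prefix=$HOME      # matches ./configure --prefix="$HOME": ~/bin, ~/lib, ~/include

# Programs installed under the prefix
PATH=$prefix/bin${PATH:+:$PATH}
# Shared libraries installed under the prefix, found at run time
LD_LIBRARY_PATH=$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
# Headers and libraries under the prefix, found when building other software
CPPFLAGS="-I$prefix/include${CPPFLAGS:+ $CPPFLAGS}"
LDFLAGS="-L$prefix/lib${LDFLAGS:+ $LDFLAGS}"

export PATH LD_LIBRARY_PATH CPPFLAGS LDFLAGS
```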

C++ issues

templates

In C++, much of the functionality that in C is encapsulated in the standard (ANSI) C library is to be found in the far more elaborate Standard Template Library (STL). But the STL is largely not a binary library: most of it lives in header files, as templates that are compiled into your own code.

name mangling

Symbols in binaries compiled from C++ may bear little resemblance to the names in the source code, due to a process called name mangling. The process is deliberately compiler-dependent, so the symbols produced by two different compilers may be different.

The effects are: the symbols may be harder to find in libraries, and libraries compiled by one C++ compiler will typically not link with binaries compiled by another C++ compiler.

The GNU version of the nm command has an option --demangle to make mangled names easier to read. Likewise, the objdump command has the option -C.

FORTRAN issues

Linking C applications to FORTRAN libraries (and vice-versa) is often done, but it is a bit tricky. Whereas there is a strict convention for the symbols produced by C, there isn’t one for FORTRAN. Each compiler has its own conventions for translating function names to compiled symbols, and switches that control that behavior. It is usually some combination of handling capitalization and putting underscore characters before or after function names.

Furthermore, whereas C strings are very simple structures with a rigid definition, FORTRAN strings are complex structures that are compiler dependent. When passing strings between C and FORTRAN, one must consult the compiler manuals.

Finally, some libraries make use of global data. Once again, while the convention for how global data is represented in libraries produced by C is well-defined, it is not in FORTRAN—you have to look in the compiler manuals.