Library Design
Through this section, the term client code refers to applications and other libraries using the library.
State Management
Global State
Global state should be avoided.
If this is impossible, the global state must be protected with
a lock. For C/C++, you can use the
pthread_mutex_lock
and pthread_mutex_unlock
functions without linking against -lpthread
because the system provides stubs for non-threaded processes.
For compatibility with fork
, these locks
should be acquired and released in helpers registered with
pthread_atfork
. This function is not
available without -lpthread
, so you need to
use dlsym
or a weak symbol to obtain its
address.
If you need fork
protection for other
reasons, you should store the process ID and compare it to the
value returned by getpid
each time you
access the global state. (getpid
is not
implemented as a system call and is fast.) If the value
changes, you know that you have to re-create the state object.
(This needs to be combined with locking, of course.)
Handles
Library state should be kept behind a curtain. Client code
should receive only a handle. In C, the handle can be a
pointer to an incomplete struct
. In C++,
the handle can be a pointer to an abstract base class, or it
can be hidden using the pointer-to-implementation idiom.
The library should provide functions for creating and destroying handles. (In C++, it is possible to use virtual destructors for the latter.) Consistency between creation and destruction of handles is strongly recommended: If the client code created a handle, it is the responsibility of the client code to destroy it. (This is not always possible or convenient, so sometimes, a transfer of ownership has to happen.)
Using handles ensures that it is possible to change the way the library represents state in a way that is transparent to client code. This is important to facilitate security updates and many other code changes.
It is not always necessary to protect state behind a handle with a lock. This depends on the level of thread safety the library provides.
Object Orientation
Classes should be either designed as base classes, or it should
be impossible to use them as base classes (like
final
classes in Java). Classes which are
not designed for inheritance and are used as base classes
nevertheless create potential maintenance hazards because it is
difficult to predict how client code will react when calls to
virtual methods are added, reordered or removed.
Virtual member functions can be used as callbacks. See Callbacks for some of the challenges involved.
Callbacks
Higher-order code is difficult to analyze for humans and computers alike, so it should be avoided. Often, an iterator-based interface (a library function which is called repeatedly by client code and returns a stream of events) leads to a better design which is easier to document and use.
If callbacks are unavoidable, some guidelines for them follow.
In modern C++ code, std::function
objects
should be used for callbacks.
In older C++ code and in C code, all callbacks must have an
additional closure parameter of type void *
,
the value of which can be specified by client code. If
possible, the value of the closure parameter should be provided
by client code at the same time a specific callback is
registered (or specified as a function argument). If a single
closure parameter is shared by multiple callbacks, flexibility
is greatly reduced, and conflicts between different pieces of
client code using the same library object could be unresolvable.
In some cases, it makes sense to provide a de-registration
callback which can be used to destroy the closure parameter when
the callback is no longer used.
Callbacks can throw exceptions or call
longjmp
. If possible, all library objects
should remain in a valid state. (All further operations on them
can fail, but it should be possible to deallocate them without
causing resource leaks.)
The presence of callbacks raises the question if functions provided by the library are reentrant. Unless a library was designed for such use, bad things will happen if a callback function uses functions in the same library (particularly if they are invoked on the same objects and manipulate the same state). When the callback is invoked, the library can be in an inconsistent state. Reentrant functions are more difficult to write than thread-safe functions (by definition, simple locking would immediately lead to deadlocks). It is also difficult to decide what to do when destruction of an object which is currently processing a callback is requested.
Process Attributes
Several attributes are global and affect all code in the process, not just the library that manipulates them.
-
environment variables (see Accessing Environment Variables)
-
umask
-
user IDs, group IDs and capabilities
-
current working directory
-
signal handlers, signal masks and signal delivery
-
file locks (especially
fcntl
locks behave in surprising ways, not just in a multi-threaded environment)
Library code should avoid manipulating these global process attributes. It should not rely on environment variables, umask, the current working directory and signal masks because these attributes can be inherited from an untrusted source.
In addition, there are obvious process-wide aspects such as the virtual memory layout, the set of open files and dynamic shared objects, but with the exception of shared objects, these can be manipulated in a relatively isolated way.