Dido / Home

General information about shared libraries in D

What is shared library

Please, skip this chapter if you are already familiar with shared/dynamic libraries.

While developing applications, programmers often face the same challenges. They may need I/O operations, math functions, container implementations, regular expressions, a GUI toolkit, access to devices and more. Since developers don't like to repeat themselves, they solve a given task once or use ready-made solutions written by other developers. Such code bases are often combined into so-called modules or libraries. A good library should provide a correct implementation of the required functionality and a convenient interface that makes it suitable for use in many projects. Sometimes a library is written during the development of an application. A good example is GTK+, which was originally created for the GNU Image Manipulation Program. Now many applications use GTK+ as their GUI toolkit. There are several kinds of libraries:

  • A header library is just a bunch of source files that you can include in or add to your project and use as if you had written them yourself. The advantage of this kind of library is that you don't need to do anything specific to use its code - just include the header files and you get all the functionality of the library. The disadvantage, on the other hand, is increased compilation time (at least when you compile your project for the first time) and a larger resulting binary. It is also impossible to hide the source code from the programmer. Header libraries are usually template libraries (in the sense of C++ and D templates), since templates can't be compiled to object files before they are instantiated. Most Boost libraries are header libraries because of their extensive use of templates.
  • A static library is just an archive of object files (i.e. precompiled code). To include these objects in your executable, you link your program against the static library or simply pass the static library file to the compiler along with the other object files. A special utility, the linker, is used for that. If you develop your project in an IDE or only use functions provided by the language's standard library, you have probably never used the linker explicitly, but building an application takes two steps: compiling (producing object files from sources) and linking (producing an executable from object files). Many compilers call the linker internally, so you may not notice it. In the case of static linking (i.e. linking against static libraries), the linker resolves the needed symbols and then copies the required parts of the binary code into the resulting executable image. This makes the image easier to distribute, because it contains everything it needs and has no external dependencies. Also, the programmer doesn't need the whole source code of the static library, only the interface files (which contain function signatures, class interfaces and definitions of public global data), so the library developer can keep the source code closed if they want. Static libraries have the .a extension on most Unix systems and .lib on Windows.
  • Shared libraries are the most interesting ones. In contrast to static libraries, the binary code of a shared library is not included in the resulting executable image. The application loads all needed shared libraries at load time or at runtime, depending on the developer's decision. In the first case the application is linked against the shared libraries (this is called dynamic linking) just as it would be against static ones, but the semantics are different: the linker adds special information to the executable image so that it can find and load the shared libraries it depends on at load time. As with static libraries, the developer does not need the whole source code, only the interface files. The second approach requires more work, but it can be more flexible. The developer writes code that loads the shared library and the needed symbols (addresses of functions) at runtime. The good thing about this approach is that symbols can be loaded at any point in the code, so they are not loaded until the application needs them. The actual behavior depends on the design chosen by the developer. You should understand that this approach has some constraints and can be error-prone: the developer has to check all types themselves, because type information about loaded symbols is not available to the compiler. It's runtime already, not build time. Also, the developer cannot use structs and classes directly, because the compiler knows nothing about their interfaces and definitions. However, runtime loading makes so-called plugins possible. Although plugin interfaces are constrained by the developer, the implementation has no constraints, so other developers can add new functionality to an application without rebuilding it. The good thing about shared libraries (in both cases) is that many applications can use the same library simultaneously (that's why it is called shared). This reduces the size of binaries.
Don't worry about data - variables are not shared across applications, so one application can't harm another through data changes in a shared library. The other good thing about shared libraries is that when they are updated, the applications that depend on them do not need to be updated or rebuilt (provided, of course, that the shared libraries still offer all the functions the application needs and that these functions have the same signatures as before). That's very useful for shipping fixes without touching the executable image. Shared libraries have the .so extension (which stands for shared object) on most Unix systems and .dll (which stands for dynamic-link library) on Windows. Actually, .dlls are not shared, but the idea is the same, so we call them shared for simplicity. The disadvantage of shared libraries is that newer versions of a library may break compatibility (for example, old functions may be removed or their semantics changed). However, many libraries come in both static and shared versions, so the developer has a choice.

Building shared library in D

Well, shared libraries seem good. So how do we make a shared library using our favorite programming language (D, of course)? First of all we should write a source file (or files).

#!d
module some_module;
export int some_function(int a, int b)
{
    return a+b;
}
export extern(C) int some_cfunction(int a, int b)
{
    return a+b;
}

As you can see, we don't write a main function since a library does not need an entry point. Note that both functions have the export keyword before their signatures, which means we want to export them to library clients (an executable image or another library). Just as an example, we also include a second function that has the extern(C) attribute before its signature. It means that D name mangling is not applied to this function and that it uses the C calling convention (mangling will be explained later). The next steps depend on the platform.
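The effect of extern(C) on symbol names can be checked at compile time with the .mangleof property. The snippet below is only a minimal sketch and not part of the library being built: it redeclares the two functions without export so that it stays self-contained.

```d
import std.algorithm.searching : canFind, startsWith;

int some_function(int a, int b) { return a + b; }
extern (C) int some_cfunction(int a, int b) { return a + b; }

// extern(C): the symbol name is the plain identifier, as in C.
static assert(some_cfunction.mangleof == "some_cfunction");

// D linkage: the symbol name starts with "_D" and embeds the
// length-prefixed function name together with the encoded signature.
static assert(some_function.mangleof.startsWith("_D"));
static assert(some_function.mangleof.canFind("13some_function"));
```

Both checks run during compilation, so a successful build is already the confirmation.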

Building shared library on Unix

For example, suppose the name of your source file is some_module.d and you want to use the dmd compiler. Then the command to build a shared library from it is:

dmd -shared -fPIC some_module.d -H -oflibSomeModule.so (dmd should be in $PATH)

This line needs some comments.

  • First of all, -shared means that we want to generate a shared library.
  • -fPIC generates position-independent code. It's the modern approach to building shared libraries. See more here.
  • some_module.d is the name of your source file. You can specify more files if you want to build the library from many source files.
  • -H generates 'header' files for each module. D header files have the .di extension (D interface).
  • The last option specifies the file name of the shared library. It will be libSomeModule.so. It's a convention to prefix library names with 'lib'.

If you want some code to be executed when the library is loaded (for example, to initialize global data), you may use a static this module constructor. For cleanup, use a static ~this module destructor, which is called when the library is unloaded.

#!d
static this()
{
    //initialization
}
static ~this()
{
    //cleanup
}
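As a slightly fuller sketch of the same idea, the example below fills a table in a module constructor and reads it from an exported function. The names greetings, processWideCounter and greetingAt are made up for illustration. Keep in mind that a plain static this runs once per thread, while shared static this runs once per process, before the thread-local constructors.

```d
private __gshared int processWideCounter; // one instance for the whole process
private string[] greetings;               // thread-local: one instance per thread

shared static this()
{
    // Runs once, before any thread-local module constructor.
    processWideCounter = 1;
}

static this()
{
    // Runs once per thread, including the main thread.
    greetings = ["hello", "hi"];
}

static ~this()
{
    // Thread-local cleanup.
    greetings = null;
}

export string greetingAt(size_t index)
{
    return greetings[index];
}
```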

Building dynamic library on Windows

Building a .dll on Windows requires more work. First of all you need to write a DllMain function somewhere. Think of DllMain as the library's constructor and destructor. It's better to create a separate source file and place this function there. The default implementation may look like this:

#!d
version(Windows):

import core.sys.windows.windows; // replaces std.c.windows.windows, which is gone from current Phobos
import core.sys.windows.dll;

__gshared HINSTANCE g_hInst;

extern (Windows)
BOOL DllMain(HINSTANCE hInstance, ULONG ulReason, LPVOID pvReserved)
{
    switch (ulReason)
    {
    case DLL_PROCESS_ATTACH:
        g_hInst = hInstance;
        dll_process_attach(hInstance);
        //initialization
        break;
    case DLL_PROCESS_DETACH:
        dll_process_detach(hInstance);
        //cleanup
        break;
    case DLL_THREAD_ATTACH:
        dll_thread_attach();
        break;
    case DLL_THREAD_DETACH:
        dll_thread_detach();
        break;
    default:
        break;
    }
    return true;
}

Another thing you should do is write a .def (module definition) file. For example:

LIBRARY "SomeModule.dll"
EXETYPE NT
SUBSYSTEM WINDOWS
CODE SHARED EXECUTE
DATA WRITE

Actually, the .def file is not strictly necessary, but without it dmd has a bug: some symbols in the library are exported with a redundant underscore at the start of their names. See more about definition files on msdn. You can put code under case DLL_PROCESS_ATTACH to implement a library constructor and code under case DLL_PROCESS_DETACH to implement a library destructor. DllMain does not prevent you from having a static this constructor and a static ~this destructor, but the order in which these functions are called is not obvious. Now type the command:

dmd SomeModule.def some_module.d dlib.d -ofSomeModule.dll (dmd should be in %PATH%)

  • As you see, the .def file is passed to the compiler along with the D source files.
  • dlib.d contains DllMain.
  • The last option specifies the file name of the dynamic library. It will be SomeModule.dll.

Using shared library

So you built your shared library. What's next? Now you can link your project against this library or load it at runtime.

Linking against shared library

This section is unfinished.

Loading library at runtime

The code that a programmer has to write to load a library at runtime is platform-dependent (for Windows read this and for Linux read this). That's where the Dido library is useful. See the next chapter for samples.
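To give an idea of what the platform-dependent part looks like, here is a minimal sketch of the POSIX side using dlopen, dlsym and dlclose from druntime's core.sys.posix.dlfcn. It is not Dido code. The helper name callFromLibrary is made up, and the example assumes a Linux system where the C math library is available as libm.so.6.

```d
import core.sys.posix.dlfcn : dlclose, dlerror, dlopen, dlsym, RTLD_LAZY;
import std.string : fromStringz, toStringz;

// Type of the symbol we expect; the compiler cannot verify this for us.
extern (C) alias UnaryMathFn = double function(double);

double callFromLibrary(string libraryFile, string symbol, double argument)
{
    // Load the library into the process.
    void* handle = dlopen(libraryFile.toStringz, RTLD_LAZY);
    if (handle is null)
        throw new Exception("cannot load " ~ libraryFile ~ ": "
                            ~ fromStringz(dlerror()).idup);
    scope (exit) dlclose(handle); // unload when we leave the function

    // Look up the symbol and cast the raw address to the expected type.
    auto fn = cast(UnaryMathFn) dlsym(handle, symbol.toStringz);
    if (fn is null)
        throw new Exception("cannot resolve " ~ symbol ~ " in " ~ libraryFile);

    return fn(argument);
}
```

On such a system, callFromLibrary("libm.so.6", "cos", 0.0) calls the C cos function through the resolved address. The Windows equivalent would go through LoadLibrary, GetProcAddress and FreeLibrary instead.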

Using Dido

Using Library

So we have our shared library. Let's load it at runtime. Dido provides two interfaces for working with shared libraries. The first one uses the old C-like approach without classes and exceptions. The second one follows the D way: it uses classes, exceptions and templates. That's what we need.

Loading shared library

The next lines load our library into memory:

#!d
string libName = Library.cwd ~ Library.nativeName("SomeModule");
auto lib = new Library(libName);

In the first line we build a string with the file name of our shared library. As you may notice, we don't specify the platform-dependent extension explicitly. The Library.nativeName function does this work for us, which helps keep the code cross-platform. It also adds the 'lib' prefix if needed. The library name may also include a relative or absolute path. We assume that the library is placed in the same directory as the executable, so we prepend Library.cwd to the library name. If we don't specify any path, the library file is searched for in the system paths first and only then in the current directory (this is not true for Linux and some other systems). In the second line we create an instance of the Library class. This also loads the library automatically.
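The kind of name translation described above could be sketched as follows. This is an assumption about the general technique, not Dido's actual implementation. nativeLibraryName and libraryPathIn are hypothetical names.

```d
import std.path : buildPath;

// Hypothetical stand-in for a nativeName-style helper.
string nativeLibraryName(string baseName)
{
    version (Windows)
        return baseName ~ ".dll";            // no prefix on Windows
    else version (OSX)
        return "lib" ~ baseName ~ ".dylib";
    else
        return "lib" ~ baseName ~ ".so";     // Linux and most other Unix systems
}

// Joins a directory and the platform-specific file name.
string libraryPathIn(string directory, string baseName)
{
    return buildPath(directory, nativeLibraryName(baseName));
}
```

Keeping the platform checks in one small function means the rest of the program can refer to the library simply as "SomeModule".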

Loading functions from shared library

Now we want to load functions from our library. Every function has to be specified explicitly, along with its type. There are several ways to do that, but they all use the resolve member function of the Library class. Let's start with loading a function that has C linkage.

#!d
alias int function(int, int) FunctionType;
auto cfoo1 = cast(ExternC!FunctionType) lib.resolve("some_cfunction");
auto cfoo2 = lib.resolve!(ExternC!FunctionType)("some_cfunction");
assert(cfoo1(1,2)==3);
assert(cfoo2(3,2)==5);

First of all we declare the type of the imported function to make our life easier. Then we use two variants of resolve. This is just an example; in an actual program you would choose whichever suits you better. The first variant returns the plain address of the function, i.e. it has type void* (pointer to void), so if we want to call the function we need to cast the returned value to the appropriate type. The second one casts the address automatically according to the template parameter. As you may notice, we also apply the ExternC template to the function type to force it to use the C calling convention. This is not necessary on Linux, but it keeps the code cross-platform. Then we can check how it works. If you remember, our function just adds two numbers and returns the result.
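A template like ExternC can be built on top of std.traits.SetFunctionAttributes. The sketch below shows that technique under the assumption that only the linkage should change. Dido's own ExternC may be implemented differently. WithCLinkage, add_c and callThroughRawAddress are illustrative names.

```d
import std.traits : SetFunctionAttributes, functionAttributes, functionLinkage;

// Same function pointer type, but with C linkage.
alias WithCLinkage(T) = SetFunctionAttributes!(T, "C", functionAttributes!T);

alias FunctionType = int function(int, int);
alias CFunctionType = WithCLinkage!FunctionType;

static assert(functionLinkage!FunctionType == "D");
static assert(functionLinkage!CFunctionType == "C");

// A local extern(C) function standing in for a symbol from a library.
extern (C) int add_c(int a, int b) { return a + b; }

int callThroughRawAddress(int a, int b)
{
    // A plain address, like the void* returned by a non-template resolve.
    void* address = cast(void*) &add_c;

    // The cast is unchecked: the type must match the real function.
    auto fn = cast(CFunctionType) address;
    return fn(a, b);
}
```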

OK, what about loading D functions? The problem is that D names are mangled. That means the resulting name of a D function differs from its name in the source code. Why? The main reason for mangling names is function overloading. The C language has no function overloading, so there is only one function behind each name. That's not true for most modern languages, since they have overloads: many functions may have the same name in the source code but different types. The linker has to distinguish them as different functions, because they actually are. That's where mangling rules are applied. Mangled names are formed from the function signature. Namespaces (in C++), modules (in D) and other named scopes (for example classes and templates) are taken into account as well. Reading mangled names may be painful for human eyes, because they mix digits and abbreviations of argument types with the original name and the names of the enclosing scopes. So how can we load D functions? We could certainly find out the mangled names using utilities that retrieve symbol information from a shared library (for example the nm utility), but as I mentioned above, mangled names look terrible and using them makes your source code graceless. The solution is to use the resolveD function. Let's see how it works.

#!d
auto dfoo = lib.resolveD!FunctionType("some_module.some_function");
assert(dfoo(2,4)==6);

resolveD automagically mangles the function name according to its type. Note also that it requires you to specify the fully qualified name, including the module path. This is not really a problem, since even if the source code of the library is closed, the interface files are still available to users. Note that resolveD is an experimental Dido feature that has not been tested properly, but it works fine in simple cases.
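To see what such a mangled name looks like, druntime's core.demangle module offers mangleFunc, which computes the symbol name from a function pointer type and a fully qualified name. The sketch below only illustrates the idea. Whether resolveD works this way internally is an assumption, and mangledNameOf is a made-up helper.

```d
import core.demangle : mangleFunc;

alias FunctionType = int function(int, int);

// Builds the linker-level name for a D function of type T.
string mangledNameOf(T)(string qualifiedName)
{
    return mangleFunc!T(qualifiedName).idup;
}

// "_D" prefix, length-prefixed module and function names, then the
// signature: F = D function, ii = two int parameters, Z = end of
// parameters, i = int return type.
static assert(mangleFunc!FunctionType("some_module.some_function")
              == "_D11some_module13some_functionFiiZi");
```

The resulting string is what would be passed to the platform's symbol lookup in place of the readable name.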

Unloading the shared library

If we don't need the resolved functions anymore, we can unload the library, thereby freeing resources.

#!d
lib.unload();

Note that you must not call any resolved function after the library has been unloaded. An attempt to do so will cause a segmentation fault or something even worse.

Handling errors

So far we have talked about how to use the Library class to load a library and functions from it. But what if the dynamic loader can't find the file, or the specified file is not a shared library? Or what if we ask the library for a symbol that does not exist? Then we get errors. Errors should be handled; otherwise the program enters an undefined state and further execution may be dangerous. According to the D language reference, the best way to report errors is exception handling. Dido defines several exception classes related to the Library class. In fact, every member function of the Library class used in the previous sections can throw exceptions. Let's rewrite our code to handle exceptions and see the full listing.

Full program listing

#!d
import dido.library;
import std.stdio;

int main(string[] args)
{
    try {
        string libName = Library.cwd ~ Library.nativeName("SomeModule");
        auto lib = new Library(libName);

        alias int function(int, int) FunctionType;
        auto cfoo1 = cast(ExternC!FunctionType) lib.resolve("some_cfunction");
        auto cfoo2 = lib.resolve!(ExternC!FunctionType)("some_cfunction");
        assert(cfoo1(1,2)==3);
        assert(cfoo2(3,2)==5);

        auto dfoo = lib.resolveD!FunctionType("some_module.some_function");
        assert(dfoo(2,4)==6);

        lib.unload();
    }
    catch(LoadException e)
    {
        writefln("Error occurred while loading library %s. It says: %s.", e.fileName, e.msg);
        return 1;
    }
    catch(ResolveException e)
    {
        writefln("Error occurred while resolving symbol %s from library %s. It says: %s.", e.symbol, e.fileName, e.msg);
        return 1;
    }
    catch(UnloadException e)
    {
        writefln("Error occurred while unloading library %s. It says: %s.", e.fileName, e.msg);
        return 1;
    }
    catch(LibraryException e)
    {
        writefln("Error occurred while processing the library. It says: %s.", e.msg);
        return 1;
    }
    return 0;
}

The Library class lives in the dido.library module, so we have to import it. In the main function you can see the familiar lines of code, but now they are surrounded by a try block. The try block is followed by a catch block for each exception type. LibraryException is the base class of all the other exception classes, so if you're not interested in the type of the exception, use it to catch them all.
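The order of the catch blocks matters: the more specific classes have to come before the base class, otherwise the base handler would catch everything first. The sketch below demonstrates this with a small stand-in hierarchy. LibraryError, LoadError and ResolveError are hypothetical classes defined here for illustration, not Dido's exception types.

```d
// Stand-in hierarchy, shaped like the one described above.
class LibraryError : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

class LoadError : LibraryError
{
    string fileName;
    this(string fileName, string msg) { super(msg); this.fileName = fileName; }
}

class ResolveError : LibraryError
{
    string symbol;
    this(string symbol, string msg) { super(msg); this.symbol = symbol; }
}

// Runs the action and reports which handler caught the failure.
string describe(void delegate() action)
{
    try
    {
        action();
        return "ok";
    }
    catch (LoadError e)      // specific handler first
    {
        return "load error for " ~ e.fileName;
    }
    catch (LibraryError e)   // base class catches everything else
    {
        return "library error: " ~ e.msg;
    }
}
```

Here a ResolveError has no handler of its own, so it falls through to the base class handler.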

How to build this example

For Linux it will be:

dmd -Ipath/where/dido/places -L-ldido didotest.d -L-ldl

The Dido library should be in a place where the linker can find it. -L-ldl links against libdl, which Dido depends on.

More about Library class

Library is a class, so you can derive user-defined classes from it. This can be useful if you want to do some work after the library is loaded: just override the load function. Don't forget to call the original function with super.load. In a derived class you can also provide resolve functions for languages other than C and D (if you know the mangling rules and the language is binary compatible with D).
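The override pattern might look roughly like this. Since the exact signature of Dido's load function is not shown here, the sketch uses a stand-in base class (StandInLibrary) in place of Library. Only the shape of the override is the point.

```d
// Stand-in for dido's Library class; an assumption made for this example.
class StandInLibrary
{
    protected string fileName;
    protected bool loaded;

    this(string fileName) { this.fileName = fileName; }

    void load() { loaded = true; } // the real class would call the OS loader
}

class LoggingLibrary : StandInLibrary
{
    string[] log;

    this(string fileName) { super(fileName); }

    override void load()
    {
        super.load();                // keep the original behavior
        log ~= "loaded " ~ fileName; // extra work after loading
    }
}
```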

Using ClassLoader

So you learnt how to load functions using Dido. What about classes? The good news is that you can load them too, but with some limitations. Dido uses the ClassLoader class for this task. It supports loading C++ classes too, but with even more limitations. We consider both cases.

Loading D class

Using ClassLoader is not as trivial as using the Library class. You have to follow some rules to get it working. First of all you should define a class that will be the base for all classes exported from the shared library. That's where the limitations start. The interface of the class must be known on both sides, i.e. in the library and in the application. Let's write an example. Here's the definition of an abstract class:

#!d
module abstract_class;
class AbstractFoo
{
protected:
    int payload;
public:
    abstract int dfoo(int a, int b);
    int getPayload() const {
        return payload;
    }
}

Note that due to a dmd bug you must use an abstract class or a regular class, not an interface. Now let's implement some derived classes and export them from the library.

#!d
module class_impl;
import abstract_class;
import dido.classexport;

class PlusFoo : AbstractFoo
{
public:
    override int dfoo(int a, int b) 
    {
        payload = a+b;
        return payload;
    }
}

class MultFoo : AbstractFoo
{
public:
    override int dfoo(int a, int b) 
    {
        payload = a*b;
        return payload;
    }
}

mixin(ExportClasses!(AbstractFoo, PlusFoo, MultFoo));

The derived classes are not very interesting. The interesting thing here is the ExportClasses template. It generates some code at compile time that makes this library loadable with ClassLoader. The ExportClasses template is variadic, but the first parameter is always the base class for the rest. Due to some issues with D templates we have to pass the result of ExportClasses to mixin.
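To illustrate how a variadic template combined with mixin can generate export code at compile time, here is a minimal sketch that emits one extern(C) factory function per class. This is only a guess at the general technique. exportFactories and the create_ naming scheme are made up, and the code Dido actually generates may look quite different.

```d
abstract class AbstractFoo
{
    abstract int dfoo(int a, int b);
}

class PlusFoo : AbstractFoo
{
    override int dfoo(int a, int b) { return a + b; }
}

class MultFoo : AbstractFoo
{
    override int dfoo(int a, int b) { return a * b; }
}

// Hypothetical generator: returns D source code as a string, one
// factory function per derived class. It is evaluated at compile time.
string exportFactories(Base, Derived...)()
{
    string code;
    foreach (Cls; Derived)
    {
        static assert(is(Cls : Base),
                      Cls.stringof ~ " must derive from " ~ Base.stringof);
        code ~= "export extern (C) " ~ Base.stringof ~ " create_" ~ Cls.stringof
              ~ "() { return new " ~ Cls.stringof ~ "; }\n";
    }
    return code;
}

// Pastes the generated declarations into this module.
mixin(exportFactories!(AbstractFoo, PlusFoo, MultFoo)());
```

After the mixin, the module contains create_PlusFoo and create_MultFoo, which have unmangled names and can therefore be looked up by a loader without knowing D's mangling rules.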
