opensource / PASE

General Learning

explainshell.com - Explains the different parts of a command. Great for everyone.

In Depth Topics

Loading Libs

This response originates from this thread and is authored by Kevin Adler (IBMer).

At program startup, the linker/loader analyzes all of the program's dependent libraries and loads them into the program's address space. Any dependencies of those libraries are loaded too (I don't know if it's done depth-first or breadth-first). Then static initialization occurs, and finally the program entry point (main) is called. These libraries stay loaded in your address space until the program terminates or you call exec(). You can also load libraries dynamically at runtime through the dlopen() API set, and in that case libraries can be unloaded again (though that's not recommended, as many libraries do not handle it properly).
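As a rough illustration of runtime loading, Python's ctypes module loads shared libraries while the program is running (through dlopen() on POSIX systems). The sketch below is only an example under assumptions: the helper names load_math_library and cosine are made up here, and the fallback to ctypes.CDLL(None) assumes the interpreter itself already has the math library loaded.

```python
import ctypes
import ctypes.util


def load_math_library():
    """Load the C math library at runtime and return a handle to it."""
    # find_library() may return None on systems where it cannot locate libm.
    # CDLL(None) then yields a handle to the symbols already loaded in this
    # process (assumption: the interpreter itself links the math library).
    path = ctypes.util.find_library("m")
    return ctypes.CDLL(path)


def cosine(x):
    """Call cos() from the dynamically loaded library."""
    lib = load_math_library()
    # Declare the C signature: double cos(double).
    lib.cos.argtypes = [ctypes.c_double]
    lib.cos.restype = ctypes.c_double
    return lib.cos(x)
```

Nothing here unloads the library again, which matches the advice above that unloading is best avoided.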

What you're asking about is completely different, though. Let's say you're at a shell prompt and run some perzl application:

$ /opt/freeware/bin/curl http://google.com
  • The shell first forks a new child process
  • In the child, it passes the program and its arguments to exec() (or some variant thereof)
  • exec() overwrites the process's address space with the program binary
  • The loader loads all the dependent libraries as detailed above, including /opt/freeware/lib/libcurl.a, which in turn causes /opt/freeware/lib/libssl.a to be loaded
  • curl downloads the URL and echoes it to the screen
  • The parent (shell), which has been waiting for the child to exit, is finally signalled that it has
  • The parent reads the child's exit code and sets $?
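The fork/exec/wait sequence above can be sketched in Python, whose os module exposes the same POSIX calls. This is a minimal sketch, not what any particular shell actually does: the function name run_like_shell is hypothetical, and the exit codes 127 (exec failed) and 128 + signal number only mimic a common shell convention.

```python
import os


def run_like_shell(argv):
    """Fork, exec argv[0] in the child, wait in the parent, return the exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execv(argv[0], argv)
        except OSError:
            # exec failed, so the child must exit without returning to the caller.
            os._exit(127)
    # Parent: block until the child exits, then decode its status,
    # much as a shell does before setting $?.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)
```

For example, run_like_shell(["/opt/freeware/bin/curl", "http://google.com"]) would follow the steps listed above, with the loader pulling in curl's dependent libraries inside the new child only.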

Now you call node:

$ /QOpenSys/QIBM/ProdData/Node/bin/node -v
  • The shell forks a new child process
  • In the child, it again passes node and its arguments to exec()
  • exec() overwrites the child's address space with the node binary
  • The loader loads all of node's dependent libraries, including /QOpenSys/usr/lib/libssl.a
  • node prints out its version and exits with a 0 return code
  • The parent gets signalled that the child has exited
  • The parent reads the child's exit code and sets $?

If you look closely at the steps above, you'll see that each of these programs runs in a new process (job), so there is no way a cached version in the process could have affected it. When you do an exec, everything in the process is replaced by the new program, so you don't have to worry about that.

Now, AIX does have the concept of a system-wide cache. Basically, when a library is first loaded, it is loaded into this system-wide cache and mapped into memory. Then, instead of having to load from disk for every process that wants to use the library, the loader just has to map those pages into the process's address space. You can run slibclean to clear this cache of any libraries that are not in use. This cache is affected by loader domains as well as the permissions set on the library itself. If a library is set executable only by the owner, it is loaded into a private mapping and is not cached.

The other thing to remember is that the library is really just an archive of shared objects (.so or .o) or static objects (.o).

For instance, looking at libcurl.a, we see it depends on libssl.a, or more precisely on the libssl.so.0.9.8 shared object inside the libssl.a archive:

/opt/freeware/lib/libcurl.a[libcurl.so.4]:

                        ***Loader Section***
                      Loader Header Information
VERSION#         #SYMtableENT     #RELOCent        LENidSTR
0x00000001       0x000002b9       0x00000aef       0x00000102

#IMPfilID        OFFidSTR         LENstrTBL        OFFstrTBL
0x00000009       0x0000c4ac       0x00002f31       0x0000c5ae


                        ***Import File Strings***
INDEX  PATH                          BASE                MEMBER
0      /opt/freeware/lib:/opt/freeware/lib:/usr/vac/lib:/usr/lib:/lib
1                                    libc.a              shr.o
2                                    libidn.a            libidn.so.11
3                                    libcrypto.a         libcrypto.so.0.9.8
4                                    libssl.a            libssl.so.0.9.8
5                                    libz.a              libz.so.1
6                                    libldap.a           libldap-2.4.so.2
7                                    liblber.a           liblber-2.4.so.2
8                                    libssh2.a           libssh2.so.1
Index 0 is where you find the compiled-in LIBPATH for this library. This is where the loader will look for these dependent libraries when loading libcurl.a.
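To make the layout of the Import File Strings table concrete, here is a small Python sketch that pulls the compiled-in LIBPATH and the (archive, member) pairs out of dump -H output. The function name parse_import_strings and the abbreviated SAMPLE text are made up for illustration. The parser assumes every numbered row after index 0 ends with a BASE and a MEMBER column, which holds for the output shown here but has not been checked against every form dump can print.

```python
SAMPLE = """\
                        ***Import File Strings***
INDEX  PATH                          BASE                MEMBER
0      /opt/freeware/lib:/opt/freeware/lib:/usr/vac/lib:/usr/lib:/lib
1                                    libc.a              shr.o
4                                    libssl.a            libssl.so.0.9.8
"""


def parse_import_strings(dump_text):
    """Return (libpath_dirs, [(base, member), ...]) from `dump -H` text."""
    libpath = []
    imports = []
    in_table = False
    expect_path = False
    for line in dump_text.splitlines():
        fields = line.split()
        if fields[:2] == ["INDEX", "PATH"]:
            in_table = True
            continue
        if not in_table or not fields:
            continue
        if expect_path and not fields[0].isdigit():
            # A long LIBPATH may be printed on the line after a bare "0".
            libpath = fields[0].split(":")
            expect_path = False
            continue
        if not fields[0].isdigit():
            continue
        expect_path = False
        if fields[0] == "0":
            # Index 0 holds the colon-separated compiled-in LIBPATH.
            if len(fields) > 1:
                libpath = fields[1].split(":")
            else:
                expect_path = True
        elif len(fields) >= 3:
            # Assumption: the last two columns are BASE and MEMBER.
            imports.append((fields[-2], fields[-1]))
    return libpath, imports
```

Calling parse_import_strings(SAMPLE) returns the search directories in order, followed by pairs such as ("libssl.a", "libssl.so.0.9.8").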

When looking at node, however, we see that it depends on the libssl.so.1 shared object in libssl.a:

kadler@wernstrom:~>dump -H  /QOpenSys/QIBM/ProdData/OPS/Node4/bin/node

/QOpenSys/QIBM/ProdData/OPS/Node4/bin/node:

                        ***Loader Section***
                      Loader Header Information
VERSION#         #SYMtableENT     #RELOCent        LENidSTR
0x00000001       0x00005904       0x00013a47       0x000001da

#IMPfilID        OFFidSTR         LENstrTBL        OFFstrTBL
0x00000006       0x001713d4       0x00173f11       0x001715ae


                        ***Import File Strings***
INDEX  PATH                          BASE                MEMBER
0      .:/QOpenSys/opt/freeware/bin/../lib/gcc/powerpc-ibm-aix6.1.0.0/4.8.4/pthread:/QOpenSys/opt/freeware/bin/../lib/gcc/powerpc-ibm-aix6.1.0.0/4.8.4/../../../pthread:/QOpenSys/opt/freeware/bin/../lib/gcc/powerpc-ibm-aix6.1.0.0/4.8.4:/QOpenSys/opt/freeware/bin/../lib/gcc:/QOpenSys/opt/freeware/bin/../lib/gcc/powerpc-ibm-aix6.1.0.0/4.8.4/../../..:/usr/lib:/lib
1                                    libc.a              shr.o
2                                    libpthreads.a       shr_xpg5.o
3                                    libcrypto.a         libcrypto.so.1
4                                    libpthreads.a       shr_comm.o
5                                    libssl.a            libssl.so.1

It has quite a complicated LIBPATH, since by default the linker adds any path you specify with -L to this LIBPATH, and GCC adds a bunch of its own paths automatically. So the only time you would run into a problem is if one of the paths common to the two binaries held a libssl.a that did not contain both dependent objects (libssl.so.0.9.8 and libssl.so.1).
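The conflict scenario can be modelled with a short Python sketch. Everything in it is illustrative: common_dirs and resolve are hypothetical helpers, NODE_LIBPATH is abbreviated to the entries that matter, and ARCHIVES is invented data standing in for the real filesystem. It also assumes the loader stops at the first archive whose base name matches and does not keep searching if the member is missing.

```python
CURL_LIBPATH = ["/opt/freeware/lib", "/opt/freeware/lib", "/usr/vac/lib", "/usr/lib", "/lib"]
NODE_LIBPATH = [".", "/usr/lib", "/lib"]  # abbreviated: GCC-specific entries omitted

# Hypothetical archive contents, keyed by full path.
ARCHIVES = {
    "/opt/freeware/lib/libssl.a": {"libssl.so.0.9.8"},
    "/usr/lib/libssl.a": {"libssl.so.1"},
}


def common_dirs(libpath_a, libpath_b):
    """Directories searched by both binaries, in libpath_a order, without repeats."""
    other = set(libpath_b)
    seen = set()
    result = []
    for directory in libpath_a:
        if directory in other and directory not in seen:
            seen.add(directory)
            result.append(directory)
    return result


def resolve(libpath, base, member, archives):
    """Return (archive_path, member_present) for the first matching archive."""
    for directory in libpath:
        path = directory.rstrip("/") + "/" + base
        if path in archives:
            # Assumption: the search ends at the first archive found.
            return path, member in archives[path]
    return None, False
```

With this made-up data, curl finds its member in /opt/freeware/lib before reaching the shared directories, and node finds its member in /usr/lib, so neither breaks. Trouble would only appear if a binary wanting libssl.so.0.9.8 reached /usr/lib/libssl.a first.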

Anyway, quite a rambling knowledge dump. I should probably polish it a bit and turn it into an article or something...
