(This repo contains a detailed bug report for the Dynatrace agent.)

Summary
==============

Preloading liboneagentproc.so breaks DT_RPATH/DT_RUNPATH processing during dlopen calls.

Background
==============

DT_RPATH/DT_RUNPATH lets one modify the dynamic library search path used by a specific executable or shared library, by means of a link option. For example:

    c++ -o some.exe -Wl,--rpath,/my/custom/libs some.o -lmylib

… and some.exe, when run, will look for libmylib.so in /my/custom/libs in addition to LD_LIBRARY_PATH or ld.so.conf settings (see `man dlopen` for details).

This also works for shared libraries:

    c++ --shared -o libparent.so -Wl,--rpath,/my/custom/libs some.o

… and libparent.so will look in /my/custom/libs for the libraries it needs (both those linked in and those dlopen-ed dynamically).

One particularly useful idiom is $ORIGIN. For example:

    c++ -o some.exe -Wl,--rpath,\$ORIGIN …otherflags…

and some.exe will look for needed libraries in the very directory it is installed in. Things like

    c++ -o some.exe -Wl,--rpath,\$ORIGIN/../lib …otherflags…

work too.

The same resolution works for libraries which are loaded during execution via ``dlopen`` (like plugins) instead of being linked in.

The problem
=================

The Dynatrace agent installation creates the /etc/ld.so.preload file, which causes the liboneagentproc.so library to be preloaded into every process run on the machine.
This library provides its own replacements for various crucial libc functions (connect, execve, dlopen, dlsym, fopen, fclose, open64, open, close, fopen64), and once it is preloaded, those replace the originals (I didn't try debugging, but I guess it records information like "which process opens what" and then calls the original, or does something similar):

    $ objdump -TC /lib/x86_64-linux-gnu/liboneagentproc.so | grep DF
    000000000000d948 g DF .text 000000000000031f connect
    000000000000ad30 g DF .text 000000000000022a execve
    0000000000004010 g DF .init 0000000000000000 _init
    000000000000cf70 g DF .text 0000000000000812 dlopen
    000000000000cdc0 g DF .text 00000000000001a8 dlsym
    000000000000f480 g DF .text 00000000000002b0 fopen
    000000000000f9e0 g DF .text 00000000000001fd fclose
    000000000000ff90 g DF .text 00000000000003a1 open64
    0000000000069fd8 g DF .fini 0000000000000000 _fini
    0000000000004c70 g DF .text 000000000000007d ruxitBareMetalLog
    000000000000fbe0 g DF .text 00000000000003a1 open
    0000000000010340 g DF .text 00000000000001f9 close
    000000000000f730 g DF .text 00000000000002b0 fopen64

In particular, this means that for dlopen calls, instead of the standard sequence

    (my program or library) ---[calls]---> glibc.dlopen

I now have

    (my program or library) ---[calls]---> liboneagentproc.dlopen ---[calls]---> glibc.dlopen

Now: glibc.dlopen inspects the RPATH/RUNPATH of *the calling object* (executable or shared library).

In the normal case, the caller is my program or library, which has RPATH/RUNPATH set, so dlopen considers this setting and finds my library.

With liboneagentproc preloaded, the caller is liboneagentproc.so, which of course doesn't have my runpath set. So dlopen doesn't consider any rpath and doesn't find my library.

Example program
=====================

This repo contains a simple example program (an executable that calls dlopen).
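For orientation, a program of this kind could look roughly like the sketch below. It is not the repo's actual source: the helper name ``load_and_greet`` and the symbol name ``hello_from_mylib`` are assumptions made for illustration, and only the ``dlopen``/``dlsym``/``dlerror``/``dlclose`` calls are the real libdl API.

```cpp
// Sketch only: names marked "hypothetical" are not taken from the repo.
#include <dlfcn.h>   // dlopen, dlsym, dlerror, dlclose
#include <cassert>
#include <cstdio>

// Hypothetical helper: loads `soname` at run time and calls an assumed
// entry point in it. Returns 0 on success, 1 on failure.
int load_and_greet(const char* soname) {
    assert(soname != nullptr);
    std::puts("Hello from program");

    // A bare name (no slash) makes the dynamic loader search, among other
    // places, LD_LIBRARY_PATH and the RPATH/RUNPATH of the object that
    // calls dlopen. That caller is what changes once dlopen is interposed.
    void* handle = dlopen(soname, RTLD_NOW);
    if (handle == nullptr) {
        std::printf("ERROR, dlopen failed %s\n", dlerror());
        return 1;
    }

    // "hello_from_mylib" is a hypothetical exported symbol (extern "C").
    using hello_fn = void (*)();
    auto hello = reinterpret_cast<hello_fn>(dlsym(handle, "hello_from_mylib"));
    if (hello == nullptr) {
        std::printf("ERROR, dlsym failed %s\n", dlerror());
        dlclose(handle);
        return 1;
    }

    hello();
    dlclose(handle);
    return 0;
}
```

A ``main`` that returns ``load_and_greet("libmylib.so")``, linked with ``-Wl,--rpath,\$ORIGIN`` (plus ``-ldl`` on older glibc versions), would then depend on the executable's RUNPATH to find the library sitting next to it.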
To run the example, just execute

    ./compile_and_run.sh

On a system without Dynatrace it prints:

    ===> Compiling lib
    ===> Compiling prog_linked
    ===> Compiling prog_dlopening
    ===> Running prog_linked
    Hello from program
    Hello from MyLib
    ===> Running prog_dlopening
    Hello from program
    Hello from MyLib
    Finished OK

(the same happens if I remove /etc/ld.so.preload from a system where the Dynatrace agent is installed)

On a system with the Dynatrace agent installed and its preload active, it prints:

    ===> Compiling lib
    ===> Compiling prog_linked
    ===> Compiling prog_dlopening
    ===> Running prog_linked
    Hello from program
    Hello from MyLib
    ===> Running prog_dlopening
    Hello from program
    ERROR, dlopen failed libmylib.so: cannot open shared object file: No such file or directory

Note: this is the simplest case. To test the problem in detail, one should also consider the case where one shared library dlopens another shared library.

For more details, try running with LD_DEBUG=all set.

On a system without Dynatrace:

    $ LD_DEBUG=all ./OUTPUT/myprog_dlopening.exe
    …
    4836: file=libmylib.so ; dynamically loaded by OUTPUT/myprog_dlopening.exe
    4836: find library=libmylib.so ; searching
    4836: search path=/oracle/app/oracle/product/11.2.0/client_1/lib:/oracle/tuxedo188.8.131.52/lib (LD_LIBRARY_PATH)
    4836: trying file=/oracle/app/oracle/product/11.2.0/client_1/lib/libmylib.so
    4836: trying file=/oracle/tuxedo184.108.40.206/lib/libmylib.so
    4836: search path=/home/marcink/DEV_hg/bugs/dynatrace_rpath/OUTPUT (RUNPATH from file OUTPUT/myprog_dlopening.exe)
    4836: trying file=/home/marcink/DEV_hg/bugs/dynatrace_rpath/OUTPUT/libmylib.so
    4836:
    4836: file=libmylib.so ; generating link map
    4836: dynamic: 0x00007f8d03cd6d48 base: 0x00007f8d03ad5000 size: 0x0000000000202018
    4836: entry: 0x00007f8d03ad5b70 phdr: 0x00007f8d03ad5040 phnum: 7
    …

With Dynatrace installed:

    $ LD_DEBUG=all ./OUTPUT/myprog_dlopening.exe
    …
    5137: file=libmylib.so ; dynamically loaded by /lib/x86_64-linux-gnu/liboneagentproc.so
    5137: find library=libmylib.so ; searching
    5137: search path=/oracle/app/oracle/product/11.2.0/client_1/lib:/oracle/tuxedo220.127.116.11/lib (LD_LIBRARY_PATH)
    5137: trying file=/oracle/app/oracle/product/11.2.0/client_1/lib/libmylib.so
    5137: trying file=/oracle/tuxedo18.104.22.168/lib/libmylib.so
    5137: search cache=/etc/ld.so.cache
    5137: search path=…various-system-paths…/usr/lib/x86_64:/usr/lib (system search path)
    5137: …
    5137: trying file=/lib/x86_64-linux-gnu/libmylib.so
    5137: …
    5137: trying file=/lib/libmylib.so
    5137: …
    5137: trying file=/usr/lib/x86_64/libmylib.so
    5137: trying file=/usr/lib/libmylib.so
    5137:
    5137: symbol=_dl_exception_create; lookup in file=OUTPUT/myprog_dlopening.exe
    5137: symbol=_dl_exception_create; lookup in file=/lib/x86_64-linux-gnu/liboneagentproc.so

The caller is different (liboneagentproc.so instead of my executable), so the RUNPATH is never searched.

Realistic use cases
=======================

RUNPATH has various realistic use cases. For example, AFAIK Qt uses it to smoothly handle multiple versions installed in parallel on the same system.

My real case is related to Python scripting: RUNPATH makes it easier for pyextension.so to find its dependent libraries, especially when there are various versions of pyextension.so scattered around different virtualenvs on the system (each needing libs from a different location). Attempts to use LD_LIBRARY_PATH in such a case are really painful.

An even more important case is that of setuid/setgid programs, where LD_LIBRARY_PATH doesn't work at all and RUNPATH is the only smooth solution if the libraries aren't installed systemwide.