Error while loading shared libraries

Dear experts,

I had a job running for a couple of days, and this morning I found that it had crashed during the night because it could not load a shared library. The exact error is:

tck2connectome: error while loading shared libraries: libmrtrix.so: cannot open shared object file: No such file or directory

I have already checked the library path, and the file is there. As a workaround I copied the file to /lib/x86_64-linux-gnu/, but I would like to understand what could have happened. Any clue? Thanks in advance.

Best regards,

Manuel

I’ve never heard of anything like this… Assuming everything was working before, the only explanations I can think of involve some form of OS / hardware failure or filesystem corruption. I assume you tried simply restarting the job, or at least invoking some MRtrix3 command without moving libmrtrix.so, and that didn’t work? Did your system logs provide any insights? Maybe you have MRtrix3 installed on a network filesystem (e.g. NFS) and the network went down at some point (in which case you might see hints of stale NFS file handles in the logs)? There are lots of things that could have gone wrong here, but none of the ones I can think of are necessarily related to MRtrix3 as such…
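If you want to check the logs for the NFS scenario, something along these lines might help. This is only a sketch: the helper name `stale_handle_lines` is made up for illustration, and the exact wording of the kernel message varies between systems, so the pattern is deliberately loose.

```shell
# Hypothetical helper: filter log text on stdin for stale file handle messages.
# Matches both "Stale file handle" and the older "Stale NFS file handle" wording.
stale_handle_lines() {
    grep -i 'stale.*file handle'
}

# Usage, assuming a systemd journal (may need root):
#   journalctl -k --since yesterday | stale_handle_lines
# On systems without journalctl:
#   dmesg | stale_handle_lines
```

No output simply means no matching message was logged; it does not rule out other network or filesystem problems.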

If you really want to get to the bottom of this (or if it happens again), one thing you can look into is the output of ldd $(which mrconvert) (or whatever the failing executable actually was) – this lists all the shared libraries required by the executable at runtime, and where the system has located them. On my system, this gives:

$ ldd $(which mrconvert)
	linux-vdso.so.1 (0x00007ffd30b64000)
	libmrtrix.so => /home/jdt13/mrtrix3/bin/../lib/libmrtrix.so (0x00007ff333546000)
	libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007ff33336e000)
	libm.so.6 => /usr/lib/libm.so.6 (0x00007ff333228000)
	libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007ff33320e000)
	libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007ff3331ed000)
	libc.so.6 => /usr/lib/libc.so.6 (0x00007ff333028000)
	/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007ff3338d9000)
	libz.so.1 => /usr/lib/libz.so.1 (0x00007ff332e0f000)
	libtiff.so.5 => /usr/lib/libtiff.so.5 (0x00007ff332d84000)
	libzstd.so.1 => /usr/lib/libzstd.so.1 (0x00007ff332ce4000)
	liblzma.so.5 => /usr/lib/liblzma.so.5 (0x00007ff332abe000)
	libjpeg.so.8 => /usr/lib/libjpeg.so.8 (0x00007ff332a29000)
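If the loader cannot resolve a library, ldd reports it as `=> not found` instead of printing a path. A minimal sketch for picking out such entries, assuming the usual ldd output format shown above (the function name `missing_libs` is just for illustration):

```shell
# Hypothetical helper: read ldd output on stdin and print the names of
# any libraries that the dynamic loader could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage (mrconvert stands in for whichever executable actually failed):
#   ldd "$(which mrconvert)" | missing_libs
```

An empty result means every required library was located at the time the check was run.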

Note that MRtrix3 executables are set to look for the libmrtrix.so file in a location relative to the executable itself (in the ../lib folder). We do this by setting the rpath. You can check that too using the readelf -d $(which mrconvert) command – on my system, this gives:

$ readelf -d $(which mrconvert) 

Dynamic section at offset 0xd1c38 contains 34 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libmrtrix.so]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/../lib]
...

The RPATH line shows the runtime search path baked into the executable, which instructs the OS to look in this folder when trying to locate the required shared libraries listed above. If that’s not set as above, that would indeed cause problems – but then the command shouldn’t ever have worked in the first place…
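To see which folder that rpath actually points to, you can substitute `$ORIGIN` with the directory containing the executable. A rough sketch, assuming the path you pass in is the real location of the executable (not a symlink) and contains no `|` or `&` characters; `expand_origin` is a made-up name:

```shell
# Hypothetical helper: expand $ORIGIN in an rpath string, given the path
# to the executable the rpath was read from.
expand_origin() {
    rpath=$1
    exe=$2
    # $ORIGIN stands for the directory that contains the executable
    origin=$(dirname "$exe")
    printf '%s\n' "$rpath" | sed "s|\\\$ORIGIN|$origin|g"
}

# Usage, with the values from my system:
#   expand_origin '$ORIGIN/../lib' /home/jdt13/mrtrix3/bin/mrconvert
```

Running ls on the resulting folder should show libmrtrix.so; if it doesn't, the library really was missing or unreadable at that location.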

Hi @jdtournier,

Thanks for your reply. I think I found the explanation, but I forgot to report back.

What happened is that the server's backup process started having problems that night, and the administrator re-stamped the partition (this problem also affected some of the FSL libraries). The resulting change in permissions may have upset all the running processes.

Now everything is working fine.

Best regards,

Manuel
