Dear experts,
I am using the MRtrix3 Singularity container to run dwifslpreproc, and the eddy_cuda step was unsuccessful. I’ve seen many threads relating to this issue but couldn’t find a solution to the problem. Your help would be much appreciated!!
Singularity container image: docker://mrtrix3/mrtrix3:latest
NVIDIA GPU: available via job submission (Slurm) to a central processing cluster
CUDA versions supported on the cluster: 9.0 to 12.3
Eddy command called during dwifslpreproc: eddy_cuda10.2 --imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_in$
Result: dwifslpreproc: CUDA version of ‘eddy’ was not successful; attempting OpenMP version
Remarks:
- Running other MRtrix3 commands (e.g., mrconvert, mrcat) works completely fine.
- Running eddy_cuda with the FSL Singularity container is also fine (but it calls eddy_cuda9.1, rather than the eddy_cuda10.2 that dwifslpreproc calls).
- I don’t have FSL installed locally. Is this relevant?
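For reference, this is roughly how I call it from my Slurm job script; the image path, Slurm options, and file names below are placeholders rather than my exact setup:

#!/bin/bash
#SBATCH --job-name=dwipreproc      # placeholder job name
#SBATCH --gres=gpu:1               # request one GPU on the compute node
#SBATCH --time=04:00:00

# Run dwifslpreproc from the MRtrix3 Singularity image (image path is a placeholder;
# not sure whether extra Singularity options are needed here)
singularity exec mrtrix3_latest.sif \
    dwifslpreproc dwi.mif dwi_preproc.mif \
    -rpe_none -pe_dir AP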
I’m a beginner with Singularity, so sorry if any of this sounds stupid!! I’d be most grateful if you could find the cause of the problem. It’s okay if it is unsolvable - I just want to have an idea of what’s wrong.
Thank you so much!!!
Hi. Not sure about Docker, but in Singularity I believe the option --nv is necessary for the container to use the GPU. You could convert your Docker container to Singularity/Apptainer and see if that works (example below)… Good luck!
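Something like this (untested sketch, using the image tag from your post):

# Build a local Singularity/Apptainer image from the Docker image
singularity build mrtrix3_latest.sif docker://mrtrix3/mrtrix3:latest

# Run with --nv so the host NVIDIA driver libraries are made available inside the container
singularity exec --nv mrtrix3_latest.sif nvidia-smi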
Hey - some troubleshooting thoughts:
- As the other user remarked, --nv is necessary in your Singularity call to get GPU passthrough, assuming your job has landed on a GPU compute node.
- You should check with your cluster team that the CUDA version called by dwifslpreproc (i.e., CUDA 10.2) is actually available and configured properly; the fact that eddy_cuda from the FSL Singularity container works suggests a version mismatch. You can check this yourself by opening an interactive session on a GPU node and running nvidia-smi (see the example commands after this list).
- You don’t need FSL to be installed locally! The MRtrix3 container bundles the relevant FSL binaries (including eddy).
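Here’s the kind of check I mean; the Slurm options and image name are cluster-specific placeholders, so treat this as a rough sketch:

# Request an interactive session on a GPU node
srun --gres=gpu:1 --pty bash

# On the node: check the driver and the maximum CUDA version it supports
nvidia-smi

# Check that the GPU is also visible from inside the container
singularity exec --nv mrtrix3_latest.sif nvidia-smi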
Hope that helps!
Ah, it looks like Neurodesk’s container for mrtrix3 works!
Is the problem that we only have CUDA 11 and 12 available on the host, not 9? Is there a version of the MRtrix3 container that supports CUDA 11 or 12?
I’m having the same issue:
$ apptainer run --nv /data/singularity-images/mrtrix3-3.0.4.simg eddy_cuda9.1
INFO: underlay of /usr/bin/nvidia-smi required more than 50 (234) bind mounts
/opt/fsl/bin/eddy_cuda9.1: error while loading shared libraries: libcublas.so.9.1: cannot open shared object file: No such file or directory
This is from apptainer build mrtrix3-3.0.4.simg docker://mrtrix3/mrtrix3:3.0.4
It looks like the container does not ship the CUDA runtime libraries that eddy_cuda9.1 needs (libcublas/libcudart):
Apptainer> ls /opt/fsl/bin/eddy*
/opt/fsl/bin/eddy_cuda9.1 /opt/fsl/bin/eddy_openmp
Apptainer> ldd /opt/fsl/bin/eddy_cuda9.1
linux-vdso.so.1 (0x0000800000136000)
libopenblas.so.0 => /opt/fsl/lib/libopenblas.so.0 (0x00001555533bd000)
libcublas.so.9.1 => not found
libcudart.so.9.1 => not found
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00001555531eb000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00001555530a7000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x000015555308b000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000155552eb6000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x0000155552e94000)
libgfortran.so.3 => /opt/fsl/lib/libgfortran.so.3 (0x0000003c50a00000)
/lib64/ld-linux-x86-64.so.2 (0x0000155555529000)
How can we proceed?