I’ve been facing an issue with eddy_openmp while processing a concatenated (mrcat) diffusion MRI dataset. Not only does eddy_cuda fail to run, but eddy_openmp then gives me an empty error. Has anyone encountered the same problem?
The command I’ve been trying to use is:
dwipreproc DTIcat_denoised.mif DTIcat_preproc.mif -rpe_none -pe_dir AP -eddy_options "--slm=linear " -info
Here’s the dataset:
************************************************
Image: "DTIcat_denoised.mif"
************************************************
Dimensions: 112 x 112 x 69 x 258
Voxel size: 1.96429 x 1.96429 x 2.05 x ?
Data strides: [ -2 -3 4 1 ]
Format: MRtrix
Data type: 32 bit float (little endian)
Intensity scaling: offset = 0, multiplier = 1
Transform: 0.9955 0.08941 0.03142 -116.7
-0.09138 0.9935 0.06804 -96.77
-0.02513 -0.07061 0.9972 -33.34
EchoTime: 0.085
FlipAngle: 85
RepetitionTime: 4.5
command_history: mrcat "/mnt/DTI_p1_b1200_APA_optim - 501/IM-0024-8901-0001.dcm" "/mnt/DTI_p2_b2500_APA_optim - 701/IM-0026-8901-0001.dcm" "DTIcat.mif" (version=3.0_RC3-135-g2b8e7d0c)
dwidenoise "DTIcat.mif" "DTIcat_denoised.mif" (version=3.0_RC3-135-g2b8e7d0c)
comments: HV00 (HV_00) [MR] DTI_p1_b1200_APA_optim
study: RAD smr neuro 60 [ ORIGINAL PRIMARY M_SE M SE ]
DOB: 17/12/2010
DOS: 11/02/2019 10:00:50
dw_scheme: 0.03142273054,0.06804225594,0.9971874952,1200
[258 entries] -0.7987939119,-0.3998337686,0.4495001137,1200
...
-0.9516700506,0.289690882,0.1019967869,0.5
0,0,0,0
mrtrix_version: 3.0_RC3-135-g2b8e7d0c
Here’s the error message:
Command: eddy_cuda --imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_indices.txt --bvecs=bvecs --bvals=bvals --slm=linear --out=dwi_post_eddy
eddy_cuda: error while loading shared libraries: libcudart.so.7.5: cannot open shared object file: No such file or directory
dwipreproc: [WARNING] Command failed: eddy_cuda --imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_indices.txt --bvecs=bvecs --bvals=bvals --slm=linear --out=dwi_post_eddy
dwipreproc: [WARNING] CUDA version of eddy appears to have failed; trying OpenMP version
Command: eddy_openmp --imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_indices.txt --bvecs=bvecs --bvals=bvals --slm=linear --out=dwi_post_eddy
dwipreproc:
dwipreproc: [ERROR] Command failed: eddy_openmp --imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_indices.txt --bvecs=bvecs --bvals=bvals --slm=linear --out=dwi_post_eddy (dwipreproc:853)
dwipreproc: Output of failed command:
dwipreproc:
Traceback (most recent call last):
File "/opt/mrtrix3/bin/dwipreproc", line 853, in <module>
(eddy_stdout, eddy_stderr) = run.command(eddy_openmp_cmd + ' ' + eddy_all_options)
File "/opt/mrtrix3/lib/mrtrix3/run.py", line 224, in command
with open(os.path.join(app.tempDir, 'error.txt'), 'w') as outfile:
IOError: [Errno 5] Input/output error: '/mnt/processed/dwipreproc-tmp-V1BW3X/error.txt'
Thank you for any tips in advance.
Regards,
Daniel
While I’ve certainly heard of eddy failing without providing any information on stdout or stderr, this particular error is different from anything I’ve seen before. The MRtrix3 library is trying to write the output from eddy into a text file inside the script’s temporary running directory, both to store it for later reference and so that it can be printed to the terminal at the “script termination” stage; but the system is returning an “Input/output error” when attempting to open that file for writing. If the Python libraries can’t write files to that temporary directory, though, how has the script proceeded as far as it has? Can you reproduce this fault (i.e. it’s not a sporadic unmounting of shared storage or the like)?
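For context, the failing step can be sketched like this (a simplified, hypothetical rendition, not the actual MRtrix3 code in lib/mrtrix3/run.py): the wrapper captures the child command’s output, and only on failure tries to open error.txt in the temporary directory. An errno 5 (EIO) raised by that open() implicates the storage layer, not eddy itself.

```python
import errno
import os
import subprocess

def run_and_record(cmd, temp_dir):
    """Run a command and, on failure, save its output to
    <temp_dir>/error.txt -- a simplified sketch of the pattern in
    MRtrix3's run.py, not the actual implementation."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        try:
            # This open() is the line that raised in the traceback above:
            # errno 5 (EIO) means the filesystem itself rejected the write.
            with open(os.path.join(temp_dir, 'error.txt'), 'w') as outfile:
                outfile.write(result.stdout + result.stderr)
        except OSError as exc:
            if exc.errno == errno.EIO:
                print('Input/output error: the storage layer is failing')
            raise
    return result.returncode
```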
Regardless, given the line at which the exception occurs, it does indeed seem as though eddy has not provided any information about why it failed (indeed in the next tag update I’ve ensured that MRtrix3 will state more explicitly if the failing command has produced no output, partly to remove ambiguities such as this).
You could try providing here the outputs of both mrinfo and mrstats on eddy_in.nii and eddy_mask.nii, and the contents of the files eddy_config.txt, eddy_indices.txt, bvecs and bvals. If there’s anything obviously wrong, that should be enough to tell; but if not, assuming the data are well-formed, and running eddy without the MRtrix3 dwipreproc wrapper gives the same result, then fundamentally it’s an FSL issue, and you might have a better chance on their mailing list.
You’ve at least not run into the same weird input/output error as the original post, and eddy has at least produced some terminal output; it’s just annoying that it hasn’t produced any output that can be used for debugging purposes. I’ve myself had it crash at that point after the first “calculating parameter updates” step, but there has usually been an exception message of some form afterwards.
It would be worthwhile to start by navigating into the script scratch directory and manually running the eddy_openmp FSL command using the command string shown. I’m interested to know whether any additional terminal information is produced that was, for whatever reason, not captured when it was run from inside the dwifslpreproc script.
Beyond that, I see nothing objectionable in what you’ve shown of your data. You could also experiment with search terms on the forum: I’m sure there have been multiple conversations by now where eddy has not immediately provided error messages, but such topics don’t always have titles that make them easy to find.
Thanks for your response. I tried running the command “eddy_openmp --imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_indices.txt --bvecs=bvecs --bvals=bvals --slm=linear --out=dwi_post_eddy --verbose” from the temp folder that was generated. However, I got the same error as before:
I want to mention that I am trying to run this on our supercomputer. The cluster I am operating on only has newer versions of CUDA (from v8.0.44 to 10.2.89). I originally thought the error was related to CUDA 7.5 being unavailable on our cluster: “eddy_cuda: error while loading shared libraries: libcudart.so.7.5: cannot open shared object file: No such file or directory”. However, when running dwifslpreproc locally, even though I got this same error, the script still ran.
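As an aside on what that loader message means: the eddy_cuda binary was linked against the exact soname libcudart.so.7.5, and the dynamic loader will not substitute a newer libcudart for it, so having CUDA 8.0–10.2 installed does not help. The same lookup can be reproduced from Python with ctypes (an illustrative sketch; the library name is whatever your particular binary requests):

```python
import ctypes

def can_load(soname):
    """Ask the dynamic loader to resolve a shared library by its exact
    soname -- the same search a binary performs at startup."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        return False

# On a cluster with only CUDA 8.0-10.2 installed, the exact soname the
# FSL 5 binary was linked against is still missing, so
# can_load('libcudart.so.7.5') would return False there.
```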
Also note, from our cluster’s documentation, it looks as though OpenMP is always supported (without a need to load it).
Any other directions would be greatly appreciated! Thanks, Rob.
I’m unfortunately kind of limited in my ability to provide advice once you get to the point where a command from another software package is run manually, with valid input data, and does not work, entirely independent of MRtrix3. But I’ll suggest a couple of things anyway:
In FSL 6, eddy_cuda is provided pre-compiled against both CUDA 8.0 and CUDA 9.1. Upgrading may therefore allow you to utilise the CUDA version, which would come with its own benefits.
The fact that you only have access to eddy_cuda7.5, implying that you are using FSL 5, also means that the eddy_openmp you are using must be somewhat out of date; there is therefore some chance that the version of eddy_openmp in FSL 6 has already resolved whatever internal bug you are encountering here.
Just to explain why the dwifslpreproc script still runs anyway, in case it wasn’t clear: it will try the CUDA version of eddy if it finds it in PATH, but will then fall back to the OpenMP version if the former fails, precisely because of this kind of use case.
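That fallback behaviour amounts to something like the following (a hypothetical sketch for illustration, not the actual dwifslpreproc logic):

```python
import subprocess

def run_with_fallback(candidates):
    """Try each candidate command line in turn and return the first one
    that exits with status 0 -- mirroring how dwifslpreproc attempts
    eddy_cuda first and falls back to eddy_openmp if it fails."""
    for cmd in candidates:
        try:
            result = subprocess.run(cmd, capture_output=True)
        except FileNotFoundError:
            continue  # executable not found in PATH; try the next one
        if result.returncode == 0:
            return cmd
    raise RuntimeError('all candidate commands failed')
```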
The fact that “Killed” is printed suggests that there may be some form of job monitor running on the compute cluster that is seeing something it doesn’t like and sending a SIGKILL signal to the process. You could try contacting the system admin, providing e.g. the SLURM job ID, and asking whether they have any logs that might provide insight into why the job was killed.
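One way to confirm a signal death from a supervising script: when a child process is killed by a signal, the wait status carries the signal number, which Python’s subprocess module exposes as a negative return code. A sketch (the self-killing child stands in for eddy_openmp being terminated by a job monitor or the kernel OOM killer):

```python
import signal
import subprocess
import sys

# Spawn a child that kills itself with SIGKILL, standing in for a
# compute job being terminated by a cluster job monitor.
proc = subprocess.run(
    [sys.executable, '-c',
     'import os, signal; os.kill(os.getpid(), signal.SIGKILL)'])

# A negative return code is the signal number; -9 corresponds to the
# bare "Killed" message the shell prints.
if proc.returncode == -signal.SIGKILL:
    print('child was killed by SIGKILL')
```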
I would also note that supercomputer admins really don’t like people running compute-heavy jobs (especially multi-threaded ones) interactively at the terminal on nodes that are only intended for SSH login and job submission. If this happens to be what you were doing, then it actually makes sense for them to be killing your job (though it would be nice for them to contact you and explain why!).
I tried working with our supercomputer tech team, but we are unable to change the version of eddy_cuda on our specific cluster. I ended up running the dwifslpreproc command locally and then transferred the output to our supercomputer using sftp for the remaining preprocessing steps.
Hi Heena,
Thanks for your reply. Both my local computer and the lab server encountered the same error, so I used TOPUP and EDDY in FSL instead of dwifslpreproc, which worked well for me.
Kitty