Dwipreproc: Empty error from eddy_openmp

Hi everyone,

I’ve been facing an issue with eddy_openmp while processing a concatenated (mrcat) diffusion MRI dataset. Not only does eddy_cuda fail during processing, but eddy_openmp also gives me an empty error. Has anyone encountered the same problem?

The command I’ve been trying to use is:

dwipreproc DTIcat_denoised.mif DTIcat_preproc.mif -rpe_none -pe_dir AP -eddy_options "--slm=linear " -info

Here’s the dataset:


************************************************
Image:               "DTIcat_denoised.mif"
************************************************
  Dimensions:        112 x 112 x 69 x 258
  Voxel size:        1.96429 x 1.96429 x 2.05 x ?
  Data strides:      [ -2 -3 4 1 ]
  Format:            MRtrix
  Data type:         32 bit float (little endian)
  Intensity scaling: offset = 0, multiplier = 1
  Transform:               0.9955     0.08941     0.03142      -116.7
                         -0.09138      0.9935     0.06804      -96.77
                         -0.02513    -0.07061      0.9972      -33.34
  EchoTime:          0.085
  FlipAngle:         85
  RepetitionTime:    4.5
  command_history:   mrcat "/mnt/DTI_p1_b1200_APA_optim - 501/IM-0024-8901-0001.dcm" "/mnt/DTI_p2_b2500_APA_optim - 701/IM-0026-8901-0001.dcm" "DTIcat.mif"  (version=3.0_RC3-135-g2b8e7d0c)
                     dwidenoise "DTIcat.mif" "DTIcat_denoised.mif"  (version=3.0_RC3-135-g2b8e7d0c)
  comments:          HV00 (HV_00) [MR] DTI_p1_b1200_APA_optim
                     study: RAD smr neuro 60 [ ORIGINAL PRIMARY M_SE M SE ]
                     DOB: 17/12/2010
                     DOS: 11/02/2019 10:00:50
  dw_scheme:         0.03142273054,0.06804225594,0.9971874952,1200
  [258 entries]      -0.7987939119,-0.3998337686,0.4495001137,1200
                     ...
                     -0.9516700506,0.289690882,0.1019967869,0.5
                     0,0,0,0
  mrtrix_version:    3.0_RC3-135-g2b8e7d0c

Here’s the error message:

Command:  eddy_cuda --imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_indices.txt --bvecs=bvecs --bvals=bvals --slm=linear --out=dwi_post_eddy
          eddy_cuda: error while loading shared libraries: libcudart.so.7.5: cannot open shared object file: No such file or directory
dwipreproc: [WARNING] Command failed: eddy_cuda --imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_indices.txt --bvecs=bvecs --bvals=bvals --slm=linear --out=dwi_post_eddy
dwipreproc: [WARNING] CUDA version of eddy appears to have failed; trying OpenMP version
Command:  eddy_openmp --imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_indices.txt --bvecs=bvecs --bvals=bvals --slm=linear --out=dwi_post_eddy
dwipreproc:
dwipreproc: [ERROR] Command failed: eddy_openmp --imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_indices.txt --bvecs=bvecs --bvals=bvals --slm=linear --out=dwi_post_eddy (dwipreproc:853)
dwipreproc: Output of failed command:
dwipreproc:
Traceback (most recent call last):
  File "/opt/mrtrix3/bin/dwipreproc", line 853, in <module>
    (eddy_stdout, eddy_stderr) = run.command(eddy_openmp_cmd + ' ' + eddy_all_options)
  File "/opt/mrtrix3/lib/mrtrix3/run.py", line 224, in command
    with open(os.path.join(app.tempDir, 'error.txt'), 'w') as outfile:
IOError: [Errno 5] Input/output error: '/mnt/processed/dwipreproc-tmp-V1BW3X/error.txt'

Thank you for any tips in advance.
Regards,
Daniel


Hi Daniel,

While I’ve certainly heard of eddy failing without providing any information on stdout or stderr, this particular error is different from anything I’ve seen before. The MRtrix3 library is trying to write the output from eddy into a text file inside the script’s temporary running directory, both to store it for later reference and so that it can be printed to the terminal for you at the “script termination” stage; but the system is returning an “Input/output error” when attempting to open that file for writing. If the Python libraries can’t write files to that temporary directory, though, how has the script proceeded as far as it has? Can you reproduce this fault (i.e. it’s not a sporadic unmounting of shared storage or the like)?
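As a quick sanity check on that theory, you could test whether the mount accepts writes at all. This is just a sketch (check_writable is a made-up helper name, and the /mnt/processed path is taken from your traceback; substitute your own):

```shell
# Sketch: test whether a directory actually accepts file writes,
# by creating and removing a small throwaway file in it.
check_writable() {
  local testfile="$1/.write_test.$$"
  if touch "$testfile" 2>/dev/null && rm -f "$testfile"; then
    echo "$1: writable"
  else
    echo "$1: NOT writable"
  fi
}

# Path from the traceback; substitute your own scratch location:
check_writable /mnt/processed
```

If this reports NOT writable intermittently, that would point to the storage mount rather than to eddy or MRtrix3.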

Regardless, given the line at which the exception occurs, it does indeed seem as though eddy has not provided any information about why it failed (indeed in the next tag update I’ve ensured that MRtrix3 will state more explicitly if the failing command has produced no output, partly to remove ambiguities such as this).

You could try providing here the outputs of both mrinfo and mrstats on eddy_in.nii and eddy_mask.nii, and the contents of the files eddy_config.txt, eddy_indices.txt, bvecs and bvals. If there’s anything obviously wrong, that should be enough to tell; but if not, assuming the data are well-formed, and running eddy without the MRtrix3 dwipreproc wrapper gives the same result, then it’s fundamentally an FSL issue, and you might have a better chance on their mailing list.

Rob


Hi Rob,

I’m getting a similar error as Daniel above.

Here is the mrinfo + mrstats on eddy_in.nii:

(mrtrix3_HM) [heenamanglani@owens-login04 dwifslpreproc-tmp-B9QTM8]$ mrinfo eddy_in.nii
************************************************
Image name:          "eddy_in.nii"
************************************************
  Dimensions:        128 x 112 x 74 x 65
  Voxel size:        2 x 2 x 2 x 4.57
  Data strides:      [ -1 2 3 4 ]
  Format:            NIfTI-1.1
  Data type:         signed 16 bit integer (little endian)
  Intensity scaling: offset = 0, multiplier = 1
  Transform:               0.9999     0.01571    0.003397      -126.8
                         -0.01366      0.9419     -0.3356      -74.36
                         -0.00847      0.3355       0.942      -124.5
  comments:          TE=85;Time=122522
  mrtrix_version:    3.0.1


(mrtrix3_HM) [heenamanglani@owens-login04 dwifslpreproc-tmp-B9QTM8]$ mrstats eddy_in.nii 
      volume       mean     median        std        min        max      count
       [ 0 ]    674.079        130       1314          0      26839    1060864
       [ 1 ]     204.62        119    229.416          0       4962    1060864
       [ 2 ]    202.853        119    224.101          0       3569    1060864
       [ 3 ]    203.858        119    227.695          0       4904    1060864
       [ 4 ]    204.855        119     229.32          0       4302    1060864
       [ 5 ]    211.984        119    245.562          0       4402    1060864
       [ 6 ]    206.509        119    234.257          0       5895    1060864
       [ 7 ]     206.05        119    232.737          0       5439    1060864
       [ 8 ]    201.056        119     219.88          0       3271    1060864
       [ 9 ]    204.499        119    229.545          0       4800    1060864
      [ 10 ]    201.607        119    221.902          0       4259    1060864
      [ 11 ]     200.88        119    221.007          0       3890    1060864
      [ 12 ]    203.242        119      227.1          0       5448    1060864
      [ 13 ]    210.929        119    243.462          0       4457    1060864
      [ 14 ]    202.231        119    224.028          0       5383    1060864
      [ 15 ]    206.546        119    232.665          0       4678    1060864
      [ 16 ]    204.195        119    228.347          0       5452    1060864
      [ 17 ]    206.248        119    234.551          0       5171    1060864
      [ 18 ]    204.361        119    229.566          0       4784    1060864
      [ 19 ]    201.109        119    220.761          0       3423    1060864
      [ 20 ]    205.789        119    231.511          0       5208    1060864
      [ 21 ]    200.587        119    218.526          0       3878    1060864
      [ 22 ]    206.387        119    232.608          0       5043    1060864
      [ 23 ]     203.52        119    226.843          0       4993    1060864
      [ 24 ]    201.368        119    221.733          0       4237    1060864
      [ 25 ]    206.031        119    232.391          0       4660    1060864
      [ 26 ]    204.213        119     228.41          0       3996    1060864
      [ 27 ]    203.359        119    226.354          0       4677    1060864
      [ 28 ]    204.845        119    230.342          0       4525    1060864
      [ 29 ]    209.665        119    240.723          0       4735    1060864
      [ 30 ]    202.942        119     226.13          0       5042    1060864
      [ 31 ]    208.788        119    237.717          0       4966    1060864
      [ 32 ]    207.628        119    237.105          0       5396    1060864
      [ 33 ]    206.229        119    232.694          0       5289    1060864
      [ 34 ]     208.82        119    239.389          0       4952    1060864
      [ 35 ]    203.791        119    228.286          0       5332    1060864
      [ 36 ]    203.723        119    229.167          0       5339    1060864
      [ 37 ]    202.458        119     224.01          0       3887    1060864
      [ 38 ]    202.287        119    224.262          0       4891    1060864
      [ 39 ]     204.11        119    228.828          0       5034    1060864
      [ 40 ]    208.022        119    238.082          0       5245    1060864
      [ 41 ]    208.699        119    238.429          0       5243    1060864
      [ 42 ]    211.502        119    245.368          0       4928    1060864
      [ 43 ]    204.757        119    231.443          0       5175    1060864
      [ 44 ]    201.173        119    221.812          0       4595    1060864
      [ 45 ]     208.21        119    236.386          0       5048    1060864
      [ 46 ]    209.988        119    242.001          0       4606    1060864
      [ 47 ]    204.961        119    229.846          0       4639    1060864
      [ 48 ]    202.974        119    226.841          0       5115    1060864
      [ 49 ]    210.909        119    242.512          0       4707    1060864
      [ 50 ]    210.899        119    242.301          0       4658    1060864
      [ 51 ]     203.67        119    226.103          0       4348    1060864
      [ 52 ]    211.402        120    243.739          0       5446    1060864
      [ 53 ]    202.441        119    223.633          0       4156    1060864
      [ 54 ]    204.609        119    228.781          0       4795    1060864
      [ 55 ]    205.271        119    229.824          0       4608    1060864
      [ 56 ]    207.342        120    234.206          0       4862    1060864
      [ 57 ]    209.988        119    241.869          0       5133    1060864
      [ 58 ]    208.488        120    237.914          0       4552    1060864
      [ 59 ]    208.714        119    237.329          0       4547    1060864
      [ 60 ]    208.793        120    238.184          0       4528    1060864
      [ 61 ]    206.948        119    236.058          0       5109    1060864
      [ 62 ]    209.174        119    238.417          0       5059    1060864
      [ 63 ]    202.642        119    224.095          0       5036    1060864
      [ 64 ]    203.062        120    223.988          0       3642    1060864

Similarly, here is the mrinfo + mrstats for the eddy_mask.nii:

(mrtrix3_HM) [heenamanglani@owens-login04 dwifslpreproc-tmp-B9QTM8]$ mrinfo eddy_mask.nii
************************************************
Image name:          "eddy_mask.nii"
************************************************
  Dimensions:        128 x 112 x 74
  Voxel size:        2 x 2 x 2
  Data strides:      [ -1 2 3 ]
  Format:            NIfTI-1.1
  Data type:         32 bit float (little endian)
  Intensity scaling: offset = 0, multiplier = 1
  Transform:               0.9999     0.01571    0.003397      -126.8
                         -0.01366      0.9419     -0.3356      -74.36
                         -0.00847      0.3355       0.942      -124.5
  comments:          TE=85;Time=122522
  mrtrix_version:    3.0.1
(mrtrix3_HM) [heenamanglani@owens-login04 dwifslpreproc-tmp-B9QTM8]$ mrstats eddy_mask.nii 
      volume       mean     median        std        min        max      count
       [ 0 ]   0.173349          0   0.378549          0          1    1060864

eddy_config.txt:
0 -1 0 0.1

eddy_indices.txt:
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Suggestions? Please let me know if you need anything else.

Thanks!
Heena

Welcome Heena!

You haven’t run into the same weird input/output error as in the original post, and eddy has at least produced some terminal output; it’s just annoying that it hasn’t produced anything that can be used for debugging purposes. I myself have had it crash at that point, after the first “calculating parameter updates” step, but there’s usually been an exception message of some form afterwards.

It would be worthwhile to start by navigating into the script scratch directory and manually running the eddy_openmp FSL command using the command string shown. I’m interested to know whether any additional terminal information is produced that was for whatever reason not captured when it was run from inside the dwifslpreproc script.
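To make sure nothing gets lost in terminal scrollback when re-running manually, you could capture both streams to files; a minimal sketch (run_logged is just a hypothetical wrapper name, not part of MRtrix3 or FSL):

```shell
# Sketch: run a command while capturing stdout/stderr to log files,
# then report its exit status.
run_logged() {
  "$@" > cmd_stdout.log 2> cmd_stderr.log
  echo "exit status: $?"
}

# Inside the scratch directory, something like:
# run_logged eddy_openmp --imain=eddy_in.nii --mask=eddy_mask.nii \
#     --acqp=eddy_config.txt --index=eddy_indices.txt \
#     --bvecs=bvecs --bvals=bvals --slm=linear --out=dwi_post_eddy --verbose
```

Even if eddy prints nothing, a non-zero exit status (or one above 128, indicating termination by a signal) is itself useful diagnostic information.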

Beyond that, I see nothing objectionable in what you’ve shown of your data. You could also experiment with search terms on the forum: I’m sure there have been multiple conversations by now in which eddy has not immediately provided error messages, but such topics don’t always have titles that make them easy to find.

Cheers
Rob

Hi Rob,

Thanks for your response. I tried running the command “eddy_openmp --imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_indices.txt --bvecs=bvecs --bvals=bvals --slm=linear --out=dwi_post_eddy --verbose” from the temp folder that was generated. However, I got the same error as before.

I want to mention that I am trying to run this on our supercomputer. The cluster that I am operating on only has newer versions of CUDA (from v8.0.44 to v10.2.89). I originally thought the error was related to the unavailability of the required version on our cluster: “eddy_cuda: error while loading shared libraries: libcudart.so.7.5: cannot open shared object file: No such file or directory”. However, when running dwifslpreproc locally, even though I got this same error, the script still ran.

Also note that, from our cluster’s documentation, it looks as though OpenMP is always supported (without a need to load it).

Any other directions would be greatly appreciated! Thanks Rob.

Best,
Heena

Hi Heena,

I’m unfortunately kind of limited in my ability to provide advice once you get to the point where a command from another software package is run manually, with valid input data, and does not work, entirely independent of MRtrix3. But I’ll suggest a couple of things anyway:

  • In FSL6, eddy_cuda is provided pre-compiled against CUDA 8.0 and 9.1, so upgrading may allow you to utilise the CUDA version, which would come with its own benefits.

  • The fact that you only have access to eddy_cuda7.5, implying that you are using FSL5, also means that the eddy_openmp you are using must be somewhat out of date; there’s therefore some chance that the version of eddy_openmp in FSL6 has already resolved whatever internal bug you are encountering here.

  • dwifslpreproc will try to run the CUDA version of eddy if it finds it in PATH, but will then attempt the OpenMP version if the former fails, precisely because of this kind of use case; that (in case it wasn’t clear) is why the script still runs despite the eddy_cuda error.

  • The fact that “Killed” is printed suggests that there may be some form of job monitor running on the compute cluster that is seeing something it doesn’t like and sending a SIGKILL system signal to that process. You could try contacting the system admins, providing e.g. the SLURM job ID, and asking whether they have any private logs that may provide insight into why the job was killed.

  • I would also note that supercomputer admins really don’t like people running compute-heavy jobs (especially multi-threaded ones) interactively at the terminal on nodes that are only intended for SSH login and job submission. If this happens to be what you were doing, then it actually makes sense for them to be killing your job (though it would be nice for them to contact you and explain why!).
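On the “Killed” point above: one common culprit on shared systems is the kernel out-of-memory (OOM) killer, whose activity shows up in the kernel log. A hedged sketch (find_oom_kills is a hypothetical helper; on a real node you would feed it the output of dmesg, which may require admin access, and the SLURM sacct line assumes a SLURM cluster):

```shell
# Sketch: scan a saved kernel log excerpt for OOM-killer activity.
find_oom_kills() {
  grep -i -E 'out of memory|oom-kill|killed process' "$1" \
    || echo "no OOM messages found in $1"
}

# e.g.: dmesg -T > kernel.log && find_oom_kills kernel.log
# On a SLURM cluster, memory accounting for a finished job can be queried with:
#   sacct -j <jobid> --format=JobID,State,ExitCode,MaxRSS
```

A MaxRSS near the job’s memory limit, or a State of OUT_OF_MEMORY, would confirm that the process was killed for exceeding its allocation rather than crashing on its own.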

Cheers
Rob


Hi Heena, I got the same errors. Did you find a solution?

Kitty

Hi Kitty,

I tried working with our supercomputer tech team, but we are unable to change the version of eddy_cuda on our specific cluster. I ended up running the dwifslpreproc command locally and then transferring the output to our supercomputer using sftp for the remaining preprocessing steps.

Sorry this is not more helpful!

-Heena

Hi Heena,
Thanks for your reply. Both my local computer and the lab server encountered the same error, so I used TOPUP and EDDY in FSL directly instead of dwifslpreproc, which worked well for me.
Kitty
