Problem with parallelizing MRtrix pipeline

Hi all,

I am having some problems running preprocessing steps such as 5ttgen, dwi2response and dwi2fod on a cluster.

I suspect I know what the problem is but I’m unsure of the solution. I will describe the problem below:

Since we get charged for time on our cluster by core hours, I want to keep as many cores as possible busy while running the preprocessing.

While most of the MRtrix programs multi-thread nicely, the 5ttgen fsl algorithm calls FSL's first, which apparently only runs single-threaded.

As a workaround I disabled multithreading and ran 10 participants in one go, to utilise 10 threads on the cluster.

This generally works, but I am encountering some problems, which I suspect stem from the fact that the 5ttgen script changes into its temporary directory to do its work, rather than running commands from the original directory using absolute paths to the temporary directories.

I suspect that this changing of directories is the cause of my problems but I really don’t see a way around it.

Any thoughts?

Claude

Hi Claude,

Sorry (yet again…) about being so slow to get back to you.

So indeed, 5ttgen is a script that invokes external commands (typically FSL first & fast), and the level of multi-threading in these executables is obviously outside our control. dwi2response is also a script, and what it does will depend on which algorithm you use. But it should be multi-threaded most of the time, and relatively quick in any case. I’m not sure why you’d have problems with dwi2fod though, that one is fully multi-threaded…?

Regarding your problem with 5ttgen, it’s hard to say without a full dump of the command-line and its output (preferably with the -debug option). But if you suspect the temporary directory is a problem, you can specify it manually with the -tempdir option. If that doesn’t help, please show us the full output of the command so we can try to figure it out.
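
For instance, something along these lines (untested; the output filename and scratch path here are just placeholders, and the -tempdir argument gives the location in which the script will create its temporary folder):

5ttgen fsl T1.mif 5TT.mif -premasked -tempdir /scratch/sub-01/ -debug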

All the best,
Donald.

Hi Donald,

Thanks for your response, I understand that you have been super busy these days.

My issue usually pops up at the 5ttgen stage, so I'm not sure about dwi2fod. As I said, I only get the error when I disable multithreading and run multiple participants in parallel from one script.

I've let that project sit for a short while, but I will get back to it shortly and send you a dump.

What I mean by suspecting the temporary directory is this: I run (for example):

5ttgen fsl 1_1-T1w.mif 1_1-5TT.mif -premasked -nthreads 0 &
5ttgen fsl 2_1-T1w.mif 2_1-5TT.mif -premasked -nthreads 0 &
5ttgen fsl 3_1-T1w.mif 3_1-5TT.mif -premasked -nthreads 0 &
5ttgen fsl 4_1-T1w.mif 4_1-5TT.mif -premasked -nthreads 0 &

All of them start running and all create temporary directories, but the shell can only be in one directory at a time, hence there will be a conflict.

Or at least that is what I think is going on.

As I said I’ll rerun and send some error messages to be sure.

Claude

Hi Claude,

I’m pretty sure that precise usage will indeed cause problems with any of the MRtrix3 Python scripts due to manipulation of the working directory: it’s not generation of a temporary directory that’s the problem, but specifically moving the current working directory to that location.

If the cluster is using a job scheduler like SLURM, it’s usually possible to run a script that will generate a set of batch jobs, one for each subject, that are then all submitted to the job scheduler. If each job appropriately requests only one thread within the job request, then there’s a good chance the cluster will run all of them in parallel on the one node.
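
For example, something along these lines (untested; the file naming, output names and resource requests are placeholders, and whether single-core jobs actually get packed onto one node depends on how your cluster is configured):

# Submit one single-core job per subject (the *_1-T1w.mif naming is just an example)
for t1 in *_1-T1w.mif; do
    subj=${t1%_1-T1w.mif}
    sbatch --job-name=5ttgen_${subj} --cpus-per-task=1 --mem=4G --time=02:00:00 \
           --wrap="5ttgen fsl ${t1} ${subj}_1-5TT.mif -premasked -nthreads 0"
done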

Otherwise, another approach you can try is the foreach script provided with MRtrix3. You could request that a number of jobs be run in parallel, but specify that each individual job gets the -nthreads 0 option.
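
Something like this, for instance (untested; I'm assuming the IN / PRE substitution keywords here, and that -4 requests four jobs be run concurrently):

foreach -4 */*-T1w.mif : 5ttgen fsl IN PRE-5TT.mif -premasked -nthreads 0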

Rob

Thanks Rob,

Yes, that is exactly what I meant by the “temp directory is causing problems”: it's the changing into it, and hence the constant changing of the cwd.

Re: SLURM on our cluster, as far as I am aware each job is granted exclusive use of a node, so that may be a problem (though I will double-check).

I will also have a look at the foreach script.

So far I've ported the Python script to a bash script that doesn't change working directories; that may be the easiest approach.

Claude

Hi @rsmith,

I have been trying alternative approaches to using these scripts and I’m now about to try the foreach script. However, before I go down that route, will the foreach script compensate for the manipulation of the working directory? I cannot see how it will.

Claude

******* UPDATE *******
It does not seem to help

I've been trying to get my head around what the problem might be with the changing of the working directory – it doesn't make sense to me. Each of these jobs should get its own process ID, and so, as far as I know, its own working directory. What makes you think they would interfere?

Also, any chance we could have a look at the exact commands you’ve tried, along with the full debug output of these commands?

Personally, I'd be more inclined to blame SGE for these issues, since it tends to get in the way of commands like this due to FSL's tendency to implicitly invoke sub-tasks via qsub if SGE is detected – see the description of the problems this can cause here. We put in safeguards around this, which were eventually merged into 3.0_RC3, released on 11 May 2018. So if you're not running that version, there's a good chance the error might relate to that…
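
If you're not sure which version you're running, any of the MRtrix3 commands will report it, e.g.:

mrconvert -version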

One way to check is to see whether you get these issues on your desktop – assuming you don’t have SGE installed on that…

Hi,

Re: full output, I will give you as much as I can. I did not run it with the -debug option, unfortunately… I may try it soon though.

Re: the cluster, I do not think this is a cluster-specific issue, and in any case we use SLURM. We also get a dedicated node per user.

Re: exact commands, here we go; this is one command that caused the error (though I did try different variations):

ResultFolder=./DWI_preprocessing

foreach -13 $ResultFolder/*/*-DWI.mif : \
	dwi2response msmt_5tt \
	$ResultFolder/UNI/UNI_1-DWI.mif \
	$ResultFolder/UNI/UNI_1-5TT.mif \
	$ResultFolder/UNI/UNI_1-RF_WM.txt \
	$ResultFolder/UNI/UNI_1-RF_GM.txt \
	$ResultFolder/UNI/UNI_1-RF_CSF.txt \
	-voxels $ResultFolder/UNI/UNI_1-RF_voxels.mif -nthreads 0

During the processing, a couple of messages appear on the command line. These are the ones that I suspect are affecting things (re: the working directory issues):

dwi2response: Generated temporary directory: /path/dwi2response-tmp-HFTGHE/
dwi2response: Changing to temporary directory (/path/dwi2response-tmp-HFTGHE/)

As you can see, for participant 101309 the Python script generates a temporary directory and then moves into it. Fine; all works well up to this point.

The problem arises when participant 112920 also wants to ‘change to temporary directory’.

I don't know if this makes it any clearer.

Claude

OK, I guess if the cluster doesn’t use SGE, then that’s not the issue.

What is the problem exactly? Can we see the error messages that happen at that point…?

And can I just check: the locations of these temporary folders are different, right?

And there is enough storage on that device to hold all of the requisite data?

I will rerun my pipeline with the debug option and then dump the output.

The problem is (I think) that it is trying to be in two directories at the same time.

Re: storage, there is enough

Thanks

The script below simply tests whether the current working directory of the process is identical to the temporary directory that was created by the script for that process. If your hypothesis is correct, then some processes should report their current working directory as being different to the temporary directory they created earlier (supposedly due to some other process having changed the current working directory), and hence this script should throw an error.

Test script
import os, time
from mrtrix3 import app

# Standard MRtrix3 script initialisation
app.init('Robert E. Smith (robert.smith@florey.edu.au)', 'Test use of foreach parallelisation with Python scripts')
app.cmdline.add_argument('input', help='An input string to uniquely label each process')
app.parse()

# Create a temporary directory, then move the current working directory into it
app.makeTempDir()
app.gotoTempDir()

# Give any concurrently-running instances time to do the same
time.sleep(1)

# If the working directory of this process had been altered by another instance, these paths would differ
if os.path.realpath(os.getcwd()) != os.path.realpath(app.tempDir):
  app.error('Directory mismatch detected for input ' + app.args.input + ' (' + app.tempDir + ' <-> ' + os.getcwd() + ')')
app.complete()
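
A quick way to exercise it in parallel is something like the following (assuming the script is saved as test_script.py and the mrtrix3 Python module is importable; the input labels are arbitrary):

foreach -4 one two three four : python test_script.py IN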

However, it doesn't, so I believe the issue you're encountering has a different source.

Just to keep everyone in the loop: I had a look at the full logs, and it seems the error is likely an out-of-space issue on /tmp, occurring during one of the command pipelines within the script. There is plenty of space for the script's temporary folder, but since the temporary files used for piping get created on /tmp, it's not immediately obvious that you also need to check there is enough space there.

I’m thinking hard about alternative approaches to piping that might mitigate or minimise this problem, but I’m not convinced there’s any other way to do this in a way that will be portable across OS’s…
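
In the meantime, one possible workaround (assuming your version honours the TmpFileDir config file entry / MRTRIX_TMPFILE_DIR environment variable, which control where these piped temporary images are written) is to point them at a location with more space, e.g.:

# Write piped temporary images to scratch rather than the default /tmp
mkdir -p /path/to/scratch/mrtrix_tmp
export MRTRIX_TMPFILE_DIR=/path/to/scratch/mrtrix_tmp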


Thanks for the update. Couldn't you just do away with piping? The code will look uglier and more temp files will be generated in the temp folder (not /tmp), but those get deleted once the script finishes anyway, no?

It’s an option. Not my preferred option, I’d rather piping Just Worked™, but we may have to consider this…

I definitely would not want to strip out piping from the provided scripts, given that the intention of the Python scripting library is to be useful for developing scripts beyond just those provided within MRtrix3, and forbidding piping for anyone using that library would be a decent-sized regression. I've listed on GitHub what I think is a better solution.