HCP connectome tutorial

Hi,

I’m still trying to follow this tutorial : http://userdocs.mrtrix.org/en/latest/tutorials/hcp_connectome.html
I had first problems with 5ttgen but that has been solved (for more details see 5ttgen).
I was next able to run all subsequent “Structural image processing” steps.

Regarding, the “Diffusion image processing” steps, I encountered a problem at the first step :

$ mrconvert data.nii.gz DWI.mif -fslgrad bvecs bvals -datatype float32 -stride 0,0,0,1
mrconvert: [100%] uncompressing image "data.nii.gz"
mrconvert: [ERROR] error allocating memory to hold mmap buffer contents

Does that mean I don’t have enough memory on my computer (8GB) for a file data.nii.gz of 1.2GB. If yes, is there anyway I can split the data in order to do the conversion on smaller portions ?

Interestingly, in spite of this error, the file DWI.mif has been generated (but I don’t know if it is corrupted or not).
Anyway, I proceeded up to step 3, where I got the following error :

$ dwi2response msmt_5tt DWI.mif 5TT.mif RF_GM.txt RF_WM.txt RF_CSF.txt -voxels RF_voxels.mif
dwi2response: 
dwi2response: Note that this script makes use of commands / algorithms that have relevant articles for citation; INCLUDING FROM EXTERNAL SOFTWARE PACKAGES. Please consult the help page (-help option) for more information.
dwi2response: 
dwi2response: Generated temporary directory: /tmp/dwi2response-tmp-8YI12J/
Command: mrconvert 5TT.mif /tmp/dwi2response-tmp-8YI12J/5tt.mif  -quiet
Command: mrconvert DWI.mif /tmp/dwi2response-tmp-8YI12J/dwi.mif -stride 0,0,0,1  -quiet
dwi2response: Changing to temporary directory (/tmp/dwi2response-tmp-8YI12J/)
Command: dwi2mask dwi.mif mask.mif  -quiet
Command: dwi2tensor dwi.mif - -mask mask.mif -quiet | tensor2metric - -fa fa.mif -vector vector.mif  -quiet
Command: mrtransform 5tt.mif 5tt_regrid.mif -template fa.mif -interp linear  -quiet
Command: mrconvert 5tt_regrid.mif - -coord 3 0 -axes 0,1,2 -quiet | mrcalc - 0.95 -gt fa.mif 0.2 -lt -mult mask.mif -mult gm_mask.mif  -quiet
Command: mrconvert 5tt_regrid.mif - -coord 3 2 -axes 0,1,2 -quiet | mrcalc - 0.95 -gt mask.mif -mult wm_mask.mif  -quiet
Command: mrconvert 5tt_regrid.mif - -coord 3 3 -axes 0,1,2 -quiet | mrcalc - 0.95 -gt fa.mif 0.2 -lt -mult mask.mif -mult csf_mask.mif  -quiet
dwi2response: Calling dwi2response recursively to select WM single-fibre voxels using 'tournier' algorithm
Command: dwi2response -quiet tournier dwi.mif wm_ss_response.txt -mask wm_mask.mif -voxels wm_sf_mask.mif
fixel2voxel: [ERROR] fixel image is empty
dwi2response: [ERROR] Command failed: fixel2voxel iter0_peaks.msf split_value iter0_amps.mif  -quiet
dwi2response: [ERROR] Command failed: dwi2response -quiet tournier dwi.mif wm_ss_response.txt -mask wm_mask.mif -voxels wm_sf_mask.mif
dwi2response: Changing back to original directory (/home/cdeledal/brainware/hcp/structural_connectome)
dwi2response: Deleting temporary directory /tmp/dwi2response-tmp-8YI12J/ 

Thanks for your help,

Cheers
Charles

Yes, I’m afraid that first error does indicate that the process ran out of memory. I’m not sure what data you’re using, but it’s likely that data.nii.gz file uncompresses to something much larger, maybe 3Gb say. Any processing on this file will need to produce an output file too, and hold that in RAM. So you’re immediately talking about 6Gb of RAM needed for something simple like a straight copy (depending on how well that file was compressed). On top of that, the mrconvert call is requesting float32 data type, which might double the size of the output image. And this is for a simple bit of processing - things will probably get a lot worse for more intense algorithms. I can’t see any way around that other than more RAM, if you’re going to be processing such huge images…

If you really want to figure this out, look at the file size of the uncompressed data.nii.gz (i.e. just run gunzip data.nii.gz and see how big that file is). Chances are it’ll be close to half your RAM, which will make it very difficult to process on that system.

Just to clarify why you still get an output image: it gets created early to reserve the space on the storage device, and abort early in case of error (e.g. no space left on device, permission denied, etc.). But clearly in this case, it subsequently failed to create the buffer to hold the data in RAM, so no copy took place. That output DWI.mif file is going to be empty… This of course means subsequent processing will fail eventually.

You might be able to do something about it depending on what you want to do and what data is in that file. For instance, if that file contains multi-shell data and you’re only interested in one of the shells, you’d be able to use dwiextract to pick it out, and hopefully that file would be small enough to process. But like I said, that depends on what you want to do. In general, I’d invest in the RAM, that data file really is very large…

I once ran into a similar issue. What I did was (spatially) crop the data to a field of view that just contains the brain. You’d be surprised how that’s typically only quite a small portion of the full image space (in 3D)…

That’s a good point: a decent fraction of the compression provided by saving as e.g. .nii.gz actually arises because of the vast volumes of zero-padding outside the brain, which can be represented very efficiently in a compressed format. In my experience, cropping the FoV of such images can reduce the un-compressed memory usage / file size by as much as 50%.

@deledalle Give something like this a try. If your system can muster enough RAM to get through this first step, it should hopefully reduce the RAM requirements of subsequent steps enough for your system to tolerate.

dwi2mask data.nii.gz - -fslgrad bvecs bvals | mrcrop data.nii.gz data_crop.nii.gz -mask -

I’ve just run the conversion from the uncompressed file data.nii

$ gunzip data.nii.gz
$ mrconvert data.nii DWI.mif -fslgrad bvecs bvals -datatype float32 -stride 0,0,0,1

and surprisingly it worked ! The memory usage was at 100%, my computer swapped for few minutes, and then the DWI.mif file was generated. I believe, there must be a leak somewhere when mrconvert is run from compressed data. Next, with the non-empy DWI.mif file, steps 2 and 3 of the “Diffusion image processing” part runs without troubles. I’m running now the following steps and I let you know.

Thanks @Thijs and @rsmith for your pieces of advice, but I won’t have to crop the FoV.

@jdtournier To answer your questions:

look at the file size of the uncompressed data.nii.gz […] I’m not sure what data you’re using

I am doing exactly the tutorial, so it’s the “Human Connectome Project subject 100307” data and the uncompressed data.nii file is 4GB.

You might be able to do something about it depending on what you want to do. For instance, if that file contains multi-shell data…

Well, I want to generate a connectome of the full brain for this subject. I don’t know what multi-shell data means, I don’t know much about Brain imaging, I just want to do some analysis/processing on the connectome of this subject.

OK, great to hear. But it’s not a memory leak: when the file is compressed, MRtrix3 has no option but to uncompress it into a RAM buffer - so that’ll take up 4GB, as will the output buffer that you’re copying onto, and you’re out of RAM. When the file is not compressed, MRtrix3 can use memory-mapping to avoid an explicit load to RAM (and will typically try its hardest to do just that). This doesn’t actually require the RAM to be allocated, it’s up to the system to manage which bits of the file will be loaded to RAM - and crucially it gives it the option of swapping bits of the file in & out of RAM if needed. It’s one of the advantages of using uncompressed images (along with much faster access since there’s no need to uncompress).

The HCP data is multi-shell. I’d strongly recommend you read up on this stuff, there’s a million ways to analyse diffusion MRI data, and most of them are not great… If you have a better idea about what’s going on, you’ll be better placed to make the right decisions.

I have now completed the “Diffusion image processing” steps with success. I now get a new error at step 2 of the “Connectome generation” (SIFT).

tcksift 100M.tck WM_FODs.mif 10M_SIFT.tck -act 5TT.mif -term_number 10M
tcksift: [100%] resampling ACT 5TT image to fixel image space
tcksift: [100%] segmenting FODs
tcksift: [100%] mapping tracks to image
tcksift: [ERROR] Filtering failed; desired number of filtered streamlines is greater than or equal to the size of the input dataset

I tried without the last option and it worked, but the subsequent step seems to produce a corrupted file:

$ tcksift 100M.tck WM_FODs.mif 10M_SIFT.tck -act 5TT.mif
tcksift: [100%] resampling ACT 5TT image to fixel image space
tcksift: [100%] segmenting FODs
tcksift: [100%] mapping tracks to image
tcksift:        Iteration     Removed     Remaining     Cost fn
tcksift: [done] 
tcksift: [100%] Writing filtered tracks output file
$ tck2connectome 10M_SIFT.tck nodes_fixSGM.mif connectome.csv
tck2connectome: [100%] Constructing connectome
tck2connectome: [WARNING] The following nodes do not have any streamlines assigned:
tck2connectome: [WARNING] 2, 5, 9, 11, 12, 15, 24, 25, 31, 32, 33, 39, 42, 43, 47, 48, 49, 50, 51, 54, 58, 61, 62, 63, 64, 65, 71, 74, 80, 81, 82
tck2connectome: [WARNING] (This may indicate a poor registration)
$ mrview nodes_fixSGM.mif -connectome.init nodes_fixSGM.mif -connectome.load connectome.csv
mrview: [ERROR] Input parcellation image must have an integer datatype; try running mrconvert -datatype uint32
mrview: [ERROR] Connectome matrix contains 84 nodes; expected 0

Thanks for your help.

Thanks for the advice. To clarify, it’s a collegue of mine that will be doing the analysis and she knows this kind of stuff and knows about the HCP data. Since I have a more powerful laptop, I am in charge of the generation of the connectome.

:+1:

Looks like your 100M.tck file doesn’t contain 100 million streamlines… See what tckinfo 100M.tck reports under the count field. Just out of interest, did you specify -num 100M at the tckgen stage as suggested in the docs? I have a sneaky suspicion that the handling of the M suffix isn’t working properly - maybe @rsmith can confirm?

The M suffix is indeed not working. It is just generating 10 tracks

@deledalle, a general recommendation that will make your life with MRtrix much easier is to inspect the outcome of every command, to make sure it makes sense.

So each time you run a command that has an image as output, inspect it using mrinfo and load it into mrview.

Each time you run a command that has tracks as output, inspect it using tckinfo and load it through the tractography tool in mrview (although loading 100M of tracks through mrview might not be feasible on your computer).

This is the only way you can make sure that the connectome that you will end up with eventually was produced in the correct way.

OK, looks like the integer multiplier postfix wasn’t working for that specific command & option; so instead of generating 100 million tracks, it probably just generated 100. I’ll push the fix once it’s finished testing. If anybody manages to find another instance where this mechanism fails, please let me know.

Whenever a command fails, it’s a good idea to carefully consider the error message produced; in this case:
tcksift: [ERROR] Filtering failed; desired number of filtered streamlines is greater than or equal to the size of the input dataset
This is a good indication that there aren’t as many tracks in your input file as you think.

2 posts were split to a new topic: Slow performance when changing strides