Tckgen Space and Time Constraints

rlaoprasert · November 14, 2018, 3:30pm

Hello everyone,

I am using MRtrix to perform probabilistic and deterministic whole-brain tractography on a large, high resolution dataset. I am encountering issues with space and runtime constraints. I appreciate that the tractography should take a lot of time and space, but I want to reach out on the off-chance that there is a way to compress the track files during/after runtime or to speed up the process.

I am working with single-shell diffusion imaging data, using the Tournier algorithm for dwi2response and CSD for the FOD generation. The goal is to produce a complete and thorough set of tracts for the entire brain.

For probabilistic tractography, I am using the following command:

tckgen fod_S66971.mif -seed_random_per_voxel mask.mif 50 -minlength 0.5 -maxlength 800 -cutoff 0.1 -angle 45 ~/out_prob_whole_brain_P1.tck -algorithm ifod2

For deterministic tractography, I am using this command:

tckgen fod_S66971.mif -seed_random_per_voxel mask.mif 50 -minlength 0.5 -maxlength 800 -cutoff 0.1 -angle 45 ~/out_det_whole_brain_P1.tck -algorithm sd_stream

I am running MRtrix on a desktop computer with the following specifications:
OS: Mac OS X 10.11.6
Processor: 3.5 GHz 6-Core Intel Xeon E5
RAM: 64 GB 1866 MHz DDR3 ECC

Some specific things that I am wondering are:

Is there a more efficient way of running tckgen, perhaps if I alter the preceding steps?
Is 50 a valid choice for “-seed_random_per_voxel” given the objective of making a thorough set of tracts?
Which parameters can I adjust to improve runtime, and how much effect would they have?
Where would the computational bottleneck likely be? (We can move the process onto a computing cluster if necessary)

I appreciate any help and insight you can provide! Thank you so much!

jdtournier · November 20, 2018, 4:30pm

Not as far as I can see. Where you might be able to save time is in the actual steps you’re doing. In particular:

Any reason you need to use -seed_random_per_voxel seeding strategy, rather than say -seed_gmwmi…?
Why do you need to perform deterministic tractography?

This will depend on what exactly your research question is, and why you’re thinking of using that seeding strategy in the first place – it’s not one we’d typically recommend. Take a look at papers such as the SIFT, SIFT2 or SIFT in connectomics papers to get a feel for what we’d normally use.

In general though, I would say that you won’t need anywhere near as many seeds per voxel for deterministic tractography, since many of the resulting streamlines will be essentially identical – in contrast to the probabilistic variant.

It’s straight CPU usage… tckgen parallelises very well, and typically achieves full CPU usage. It’s not memory-constrained, and it’s not limited by IO either. To speed things up, you’ll want a system with a lot more CPU cores, and/or faster CPU cores.
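Since the limit is CPU, it can help to check how many logical cores a machine exposes before deciding whether a move to a cluster is worthwhile. The snippet below is only a sketch: the cpu_count helper is a hypothetical name, and it assumes one of nproc, sysctl or getconf is available on the system.

```shell
#!/bin/sh
# cpu_count is a hypothetical helper, not part of MRtrix3.
# It prints the number of logical CPU cores using whichever tool is present.
cpu_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc                       # GNU coreutils (typical on Linux)
    elif command -v sysctl >/dev/null 2>&1; then
        sysctl -n hw.ncpu           # macOS / BSD
    else
        getconf _NPROCESSORS_ONLN   # POSIX-style fallback
    fi
}

cores=$(cpu_count)
echo "logical cores: $cores"
```

MRtrix3 commands use all available cores unless told otherwise; the standard -nthreads option can be used to cap that number on a shared machine.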

rlaoprasert · November 20, 2018, 5:12pm

Thank you so much for your reply, it was very helpful and enlightening. I do realize that I did not give enough information for you to answer my questions easily, and for that, I apologize. Please allow me to elaborate on details relating to my situation and first question.

Not as far as I can see. Where you might be able to save time is in the actual steps you’re doing. In particular:

Any reason you need to use -seed_random_per_voxel seeding strategy, rather than say -seed_gmwmi …?

Why do you need to perform deterministic tractography?

The reason that I am attempting to use -seed_random_per_voxel is that my objective is to make a connectivity/tractography atlas of a rodent brain. I am hoping to get as complete a representation as I can with the tractography.

I am not using seed_gmwmi because I am currently not using tissue information/ACT to process the data (at least, I believe that tissue data is necessary to use seed_gmwmi).

The reason that I am doing a deterministic connectome in the first place is to hopefully establish a baseline of what we currently do in our lab (we currently mainly perform deterministic tractography) and to obtain a more thorough set of tracts. However, if you believe that probabilistic tracts would make deterministic ones obsolete, I would really appreciate your insight into that matter.

In this situation, would you believe that “-seed_random_per_voxel” is a valid seeding strategy?

If there is anything that I have yet to communicate effectively, please let me know! And again, thank you so much for your reply!

jdtournier · November 20, 2018, 5:56pm

OK, that would indeed be difficult in this case…

Again, that’s a good reason
But…

This is where ‘thorough’ might need to be better defined. I’m not sure that the -seed_random_per_voxel necessarily gives you a more thorough set of tracts – although it would give you a more thorough set of seed points. But the main point in any case is to ensure that whatever procedure you use produces a sufficiently dense set of streamlines everywhere that it matters.

I don’t think ‘valid’ is the right way to think about it. What matters is that it’s appropriate for the problem you’re trying to solve. If we’re talking purely about performance, I would personally simply go for -seed_image – it isn’t guaranteed to place the same number of seeds in every voxel, but with sufficient numbers, you’ll get more or less uniform seeding regardless. But there are lots of other seeding options in tckgen; I recommend you experiment with them and see what works best for your purposes.
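As a rough illustration of the -seed_image suggestion, the sketch below only prints a tckgen call rather than running it, so the options can be checked first. The build_tckgen_cmd helper, the file names and the streamline count are all hypothetical placeholders; -select is assumed to be the option that sets the target number of streamlines in recent MRtrix3 releases (older versions named it differently), and the -cutoff and -angle values are copied from the commands earlier in the thread.

```shell
#!/bin/sh
# Dry run: print the tckgen call instead of executing it.
# build_tckgen_cmd is a hypothetical helper; arguments are
#   <fod image> <seed mask> <output tck> <number of streamlines to select>
build_tckgen_cmd() {
    fod=$1
    mask=$2
    out=$3
    nselect=$4
    # -seed_image draws seeds at random within the mask, rather than placing
    # a fixed number of seeds in every voxel.
    echo "tckgen $fod $out -algorithm ifod2 -seed_image $mask -select $nselect -cutoff 0.1 -angle 45"
}

# Placeholder inputs, not values from this thread.
build_tckgen_cmd fod.mif mask.mif out.tck 1000000
```

Once the printed command looks right, it can be pasted into the terminal and run directly.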

Well, I don’t think it makes them obsolete as such, I just think that there are so many sources of uncertainty in tractography (imaging noise of course, but also the inherent smoothness of the DW signal, and the inherent ambiguities in fibre arrangements) that simply ignoring them is not a reasonable thing to do…

rsmith · November 25, 2018, 7:45am

I want to reach out on the off-chance that there is a way to compress the track files during/after runtime …

Particularly for deterministic tractography, which uses a smaller step size, I would take a look at the -downsample option. This will reduce the number of vertices stored per streamline, which will reduce the storage required (but have negligible effect on the runtime). iFOD2 by default uses a downsample factor of 3 to “compensate” for the fact that internally it uses a smaller “step size” than what’s reported / provided (long story).
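To get a feel for how much storage downsampling might save, here is a back-of-the-envelope sketch. It assumes the usual .tck layout of three 32-bit floats (12 bytes) per vertex plus one delimiter triplet per streamline, and it approximates downsampling as keeping roughly one vertex in every N. The tck_bytes helper and the input numbers are hypothetical, not measurements from this dataset.

```shell
#!/bin/sh
# tck_bytes is a hypothetical helper for a rough size estimate.
# Usage: tck_bytes <streamline count> <mean vertices per streamline> <downsample factor>
tck_bytes() {
    count=$1
    vertices=$2
    factor=$3
    # Ceiling division: approximate number of vertices kept per streamline.
    kept=$(( (vertices + factor - 1) / factor ))
    # 12 bytes per stored vertex, plus one 12-byte delimiter per streamline.
    echo $(( count * (kept + 1) * 12 ))
}

# Placeholder numbers: one million streamlines with 300 vertices each.
echo "no downsampling:  $(tck_bytes 1000000 300 1) bytes"
echo "downsample by 3:  $(tck_bytes 1000000 300 3) bytes"
```

With these placeholder numbers the estimate drops from about 3.6 GB to about 1.2 GB. For a file that has already been generated, tckresample (which, as far as I know, also accepts a -downsample option) should give a similar reduction after the fact.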

There’s an entire manuscript on a format for tractogram compression. But I’ve not been motivated to implement such in MRtrix3: the gains from simply down-sampling are not all that far off.

If we’re talking purely about performance, I would personally simply go for -seed_image …

-seed_image does not depend on inter-thread locking, whereas -seed_random_per_voxel does. I would not think that this would cause performance issues practically on a modern desktop system, but you never know…

rlaoprasert · November 27, 2018, 4:04pm

Thank you both so much! I have experimented with using seed_image, and it seems to be the route we will take. The point about “…main point in any case is to ensure that whatever procedure you use produces a sufficiently dense set of streamlines” was very enlightening. The downsampling tip is also a promising lead. I think I have a better understanding not only of this specific problem, but also of MRtrix3 in general. I sincerely appreciate your comments, and they have been very helpful.