Seed-based tractography & connection density normalization

Dear MRtrix team,

I am fairly new to the forum and would like to get some feedback on our pipeline. I’ve read a lot on the forum and in the latest Smith 2020 preprint, but am still unsure what applies to my exact problem.

In short,

  1. Can SIFT be applied to non-targeted but seed-based tractography? I’ve read in several posts that it can only be applied to whole-brain tractography and not to targeted tractography, but I am unsure where seed-based tractography without a target falls.

  2. If SIFT cannot be applied, how should we properly normalize for differences in starting seed volume?

In more detail, we’ve run a seed-based tractography (i.e. a starting seed point but no target) with the aim of correlating the resulting streamline densities with behavioral measures. The tractography was run with a fixed number of streamlines selected for the output (100,000, for comparability with an older project). We are now wondering how to properly normalize for the differences in initial seed volume. So far, we’ve included either cortical thickness or surface area (the initial seeds were projected from GM onto the GM/WM boundary) as a covariate when correlating with behavioral measures.

What I’ve gathered from the forum (mostly this post) is that the ideal approach would be to run whole-brain tractography, create a whole-brain connectome and then filter for the streamlines connecting to our ROI. For comparability with an older project, we’d like to stay as close as possible to that older pipeline, which used a seed-based approach.

So, in case of not using SIFT and a whole-brain tractography, do you have any recommendations on how to threshold the obtained streamline density map? I see that there is probably no easy answer to that and that it highly depends on the context. I’ve found this paper comparing the required number of streamlines while thresholding at 0.001 x streamline count and at 0.01 x the maximum trackmap intensity and wondered what you think of this approach.

Thank you very much in advance!

Hi @scchueler,

  1. SIFT (and related methods) are only applicable to whole-brain tractography.
    While the underlying reasons why this is the case are there in the linked preprint, the pending revised version makes this message far more explicit. It’s not about the presence or absence of a “target”; it’s about whether or not it even makes sense to expect correspondence between streamlines density and diffusion-model-based fibre density (and hence treat any discrepancy in such as a bias to be corrected).
    Imagine two fixels with equivalent fibre density, but one is right next to your seed point (and hence has a huge streamlines density) whereas the other is at the opposite end of the brain and is traversed by 1 streamline. Is this indicative of a bias? Well, in a way yes, because you performed your seeding in a “biased” fashion by seeding in some places but not others; but this is not something to be “corrected”. The fibre density in the distant fixel simply belongs to other WM pathways, but you don’t know that because you haven’t performed whole-brain tractography.
    I kind of wish that that publication in particular could be entirely dynamic so that I could keep revising my explanations to plug such holes in comprehension :confused:

  2. The “proper normalization” (or omission thereof) depends on the nuances of the quantity that you are deriving.
    In your case, you have generated a fixed number of streamlines per subject, irrespective of seed size / fraction of streamline seeds that produce valid streamlines / fraction of streamlines that intersect the target.
    As such, you can, for all subjects, convert the “number of streamlines intersecting the target” to a “fraction of valid streamlines seeded from the seed that hit the target”. This highlights the vital difference in interpretation of this kind of probabilistic streamlines experiment as capturing a “probability of connection” as opposed to what we do with e.g. SIFT which is about “density of connection” (I tried to slip this message into this book chapter, section 21.2.2, but it’s a common conflation that could do with being written more explicitly).
    From this understanding, we can instead ask the question: “Do you expect the fraction of streamlines emanating from the seed point that go on to hit the target to vary as a function of the size of the seed?” Depending on the pathway involved, the answer could go either way. But I pose the question in that way for a reason. Imagine if you had instead performed streamlines tractography in such a way that a fixed number of streamline seeds were generated for each voxel in the seeding mask; then, the number of streamlines intersecting the target would be reasonably expected to scale in proportion to the volume of the seed, because you’re simply generating more streamlines for subjects with larger seeds. But because you generate a fixed number of streamlines (irrespective of intersecting the target) per subject, that particular effect disappears.
    I’d suggest that the point I made in the “Intracranial / brain / white matter volume” section of the linked preprint applies equivalently to seed volume in your case. Since there is no algebraic form by which you can “correct” your data for this confound, I would use it as a regressor in the model and let the data remove any influence it may have had.
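As a rough illustration of point 2, the sketch below converts per-subject streamline counts to fractions and then asks how the fraction relates to behaviour once seed volume is accounted for. It is only a minimal sketch: all numbers are made up, the variable names are hypothetical, and a partial correlation is used here as a simple stand-in for including seed volume as a regressor in a full linear model.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after linearly removing z from both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical per-subject values, invented purely for illustration.
n_streamlines = 100000                                      # fixed per subject
hits = [12000, 15500, 9800, 20100, 17300, 11000]            # streamlines reaching the region
seed_volume = [850.0, 910.0, 790.0, 1020.0, 980.0, 805.0]   # mm^3
behaviour = [21.0, 25.0, 19.0, 30.0, 24.0, 22.0]

# Count -> fraction of the fixed number of streamlines ("probability of connection").
fraction = [h / n_streamlines for h in hits]

r_raw = pearson(fraction, behaviour)
r_adj = partial_corr(fraction, behaviour, seed_volume)
print(f"raw r = {r_raw:.3f}, adjusted for seed volume r = {r_adj:.3f}")
```

With more than one nuisance variable (e.g. age in a developmental cohort), a proper general linear model would be the more natural tool than this three-variable formula.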

So, in case of not using SIFT and a whole-brain tractography, do you have any recommendations on how to threshold the obtained streamline density map?

This question is actually somewhat independent of SIFT; you can do whole-brain tractography, apply SIFT(2), extract a bundle of interest, map to a voxel grid, and threshold to produce a binary mask of that pathway. The effect of SIFT(2) within such a pipeline would likely be minimal, but it’s still theoretically preferable to include.

One thing I would suggest if doing that kind of experiment would be, instead of streamline count, use the sum of streamline intersection lengths (tckmap -precise option). This will make the resulting maps more stable against perturbations of the threshold, where otherwise the quantized nature of streamline counts leads to addition / deletion of many voxels from the mask as soon as that numerical threshold crosses an integer value (reason for the “jagged” plots as a function of streamline count shown in the linked manuscript).
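To make the quantization argument concrete, here is a small sketch that sweeps a threshold over two hypothetical sets of voxel values: integer streamline counts, and continuous sums of intersection lengths. The values are invented and `mask_size` / `largest_jump` are illustrative helpers, not MRtrix functionality.

```python
def mask_size(values, threshold):
    """Number of voxels that survive a 'value >= threshold' binarization."""
    return sum(1 for v in values if v >= threshold)

def largest_jump(sizes):
    """Largest change in mask size between consecutive thresholds."""
    return max(abs(a - b) for a, b in zip(sizes, sizes[1:]))

# Hypothetical voxel values (not from real data).
counts = [1, 1, 1, 1, 2, 2, 2, 3, 3, 5]                        # streamline counts
lengths = [0.4, 0.9, 1.3, 1.8, 2.1, 2.6, 3.2, 3.9, 4.4, 7.5]   # summed lengths, mm

# Sweep the threshold from 0.5 to 3.0 in steps of 0.25.
thresholds = [0.5 + 0.25 * i for i in range(11)]
count_sizes = [mask_size(counts, t) for t in thresholds]
length_sizes = [mask_size(lengths, t) for t in thresholds]

for t, c, l in zip(thresholds, count_sizes, length_sizes):
    print(f"threshold {t:4.2f}: count mask = {c:2d} voxels, length mask = {l:2d} voxels")

print("largest jump (counts): ", largest_jump(count_sizes))
print("largest jump (lengths):", largest_jump(length_sizes))
```

In this toy case the count-based mask loses several voxels at once whenever the threshold passes an integer, while the length-based mask changes one voxel at a time.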

I would also make the observation that if you’re devising a voxel map binarizing threshold per subject that is based on the maximal intensity present in each subject, that’s another parameter that varies between subjects that could conceivably influence your between-subject comparisons.
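A minimal sketch of that concern, using made-up values: two hypothetical subjects whose maps are identical except for the single peak voxel end up with different masks when the threshold is defined as a fraction of each subject's maximum.

```python
def relative_mask(values, fraction):
    """Binarize at fraction * (this subject's maximum intensity)."""
    threshold = fraction * max(values)
    return [v >= threshold for v in values]

# Hypothetical voxel intensities; only the first (peak) voxel differs.
subject_a = [1000, 400, 120, 40, 25, 12, 8]
subject_b = [3000, 400, 120, 40, 25, 12, 8]

mask_a = relative_mask(subject_a, 0.01)   # threshold = 10
mask_b = relative_mask(subject_b, 0.01)   # threshold = 30

print("subject A voxels in mask:", sum(mask_a))
print("subject B voxels in mask:", sum(mask_b))
```

Any between-subject difference in mask extent here is driven entirely by the peak value, not by the voxels that actually enter or leave the mask.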

Finally, while it’s not immediately clear whether or not it’s relevant to you (since it’s unclear whether the conversation has shifted from an endpoint-to-endpoint scalar streamline count measure to a voxel-wise streamline count measure), I’ll at least make sure you’re aware of this method (available via tckgen -algorithm nulldist1/2), since it could theoretically be used for masking.


Dear Rob, thanks a lot for the detailed reply! I’ll dig a little more into your suggestions!

Dear @rsmith,

We’ve always used a voxel-wise streamline count measure, not an endpoint-to-endpoint one. It’s a developmental project, so by correlating streamline density with behavioral measures we want to find (unknown) regions (within anatomical tracks) that are relevant for the respective behavioral measure.

I have some followup questions on the null distribution approach you suggested:

  1. First of all, what’s the difference between nulldist 1/2? Is it the number of iterations (1,000 and 20,000, as in the paper)?
  2. How would you implement comparing the created nulldist tracking and the initial tracking? From the paper, I gather that the two are compared using z-statistics, the resulting p-value map is thresholded, and a mask is created from this for the initial density map. This may be a dumb question, but how do I create the z-statistics map?

Thank you!

We’ve always used a voxel-wise streamline count measure, not an endpoint-to-endpoint one.

I’ll also link to this manuscript.

It shows that for whole-brain connectivity, trying to get voxel-wise “quantitative streamline counts” is a red herring, since the voxel-wise fibre density estimates from the diffusion model are what those quantitative tractogram properties are derived from, but you can use them in their native form with far less variance than what you introduce by bringing tractography into the equation.

But if I combine this:

… correlating streamline density with behavioral measures we want to find (unknown) regions (within anatomical tracks) …

with the fact that the conversation started by talking about targeted tracking rather than whole-brain connectivity, I also need to cover off a second prospect. Perhaps, instead of identifying regions with differences in (voxel-wise) “connection density” and discovering that those are within specific WM pathways, you instead wish to extract specific WM pathways, and then look for any differences in (voxel-wise) “connection density”. These are indeed two different things, because in the latter you are attempting to quantify the density of only that fraction of the “connection density” in that voxel that belongs to a specific WM pathway. I can’t really describe this distinction in any greater detail than what’s already in the preprint linked above, but from a pragmatic perspective, the latter involves extracting the pathway of interest (along with SIFT2 weights if applicable), mapping to the voxel grid (using the -precise option in tckmap to get a sum-of-intersection-lengths), and multiplying the result by mu. This would give you a spatial map of fibre volume ascribed to the pathway of interest, which could be compared across individuals in a number of ways.
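The arithmetic of that last step can be sketched as follows. This is not how tckmap is implemented; it is a toy illustration in which the per-voxel streamline segment lengths are assumed to be already known, and the `segments`, `weights` and `mu` values are hypothetical.

```python
from collections import defaultdict

def pathway_density_map(segments, weights, mu):
    """Per-voxel sum of (streamline weight * intersection length), scaled by mu.

    segments: iterable of (streamline_index, voxel, length_mm) tuples,
              one per streamline/voxel intersection (hypothetical layout).
    weights:  per-streamline weights (e.g. from SIFT2); use 1.0 if unweighted.
    mu:       proportionality coefficient of the reconstruction.
    """
    accumulated = defaultdict(float)
    for streamline_index, voxel, length_mm in segments:
        accumulated[voxel] += weights[streamline_index] * length_mm
    return {voxel: mu * total for voxel, total in accumulated.items()}

# Two hypothetical streamlines from the extracted pathway crossing two voxels.
segments = [
    (0, (10, 12, 8), 1.5),
    (0, (10, 13, 8), 2.0),
    (1, (10, 13, 8), 1.0),
]
weights = [0.5, 2.0]   # made-up per-streamline weights
mu = 0.01              # made-up value

for voxel, value in sorted(pathway_density_map(segments, weights, mu).items()):
    print(voxel, round(value, 6))
```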

what’s the difference between nulldist 1/2?

As per the help page, nulldist2 specifically matches the mechanism by which iFOD2 generates candidate paths, whereas nulldist1 can be parameter-tuned to match any first-order algorithm.

How would you implement comparing the created nulldist tracking and the initial tracking?

It’s important to recognise that this method as presented is based on the seed location being a single point; you’d want to be very confident in your understanding of the relevant statistics before attempting to use it in any capacity beyond that.

Otherwise, while I don’t recall having gone through the complete process myself, I expect that it is simply a voxel-wise operation, in which case it is a matter of deriving the appropriate expression using mrcalc.
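For what such a voxel-wise operation might look like, here is a hedged sketch. It assumes a simple one-sample z-test of the observed visitation proportion against the null proportion, treating the null proportion as known; this is an assumption for illustration only, and the exact expression should be taken from the paper rather than from this snippet. The single-point-seed caveat above still applies.

```python
import math

def z_map(observed, null, n_streamlines):
    """Voxel-wise z-score of observed vs null visitation proportions.

    observed, null: per-voxel streamline counts from the real and
                    null-distribution tracking (same number of streamlines).
    Assumed form (not confirmed against the paper):
        z = (p_obs - p_null) / sqrt(p_null * (1 - p_null) / N)
    """
    z = []
    for obs, nul in zip(observed, null):
        p_obs = obs / n_streamlines
        p_null = nul / n_streamlines
        se = math.sqrt(p_null * (1.0 - p_null) / n_streamlines)
        # Undefined where the null never (or always) visits the voxel.
        z.append((p_obs - p_null) / se if se > 0 else float("nan"))
    return z

def upper_tail_p(z):
    """One-sided p-value for a standard normal z-score."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical counts in three voxels, out of 1000 streamlines each.
observed = [150, 100, 20]
null = [100, 100, 0]
z = z_map(observed, null, 1000)
mask = [(not math.isnan(v)) and upper_tail_p(v) < 0.05 for v in z]
print(z, mask)
```

Voxels where the null count is zero are left undefined here rather than forced into or out of the mask; how to treat them is a choice that needs to be made deliberately.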