SIFT (and related methods) are only applicable to whole-brain tractography.
While the underlying reasons are explained in the linked preprint, the pending revised version makes this message far more explicit. It’s not about the presence or absence of a “target”; it’s about whether it even makes sense to expect correspondence between streamline density and diffusion-model-based fibre density (and hence to treat any discrepancy between the two as a bias to be corrected).
Imagine two fixels with equivalent fibre density, where one is right next to your seed point (and hence has a huge streamline density) whereas the other is at the opposite end of the brain and is traversed by only a single streamline. Is this indicative of a bias? In a way, yes: you performed your seeding in a “biased” fashion by seeding in some places but not others; but this is not something to be “corrected”. The fibre density in the distant fixel simply belongs to other WM pathways, which you cannot know because you have not performed whole-brain tractography.
I kind of wish that that publication in particular could be entirely dynamic, so that I could keep revising my explanations to plug such holes in comprehension.
The “proper normalization” (or omission thereof) depends on the nuances of the quantity that you are deriving.
In your case, you have generated a fixed number of streamlines per subject, irrespective of seed size / fraction of streamline seeds that produce valid streamlines / fraction of streamlines that intersect the target.
As such, you can, for all subjects, convert the “number of streamlines intersecting the target” to a “fraction of valid streamlines seeded from the seed region that hit the target”. This highlights a vital difference in interpretation: this kind of probabilistic streamlines experiment captures a “probability of connection”, as opposed to what we do with e.g. SIFT, which is about “density of connection” (I tried to slip this message into this book chapter, section 21.2.2, but it’s a common conflation that could do with being stated more explicitly).
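As a minimal sketch of that conversion (all subject names and counts below are hypothetical), since the total number of generated streamlines is fixed per subject by design, the normalization is a simple division:

```python
# Hypothetical example: each subject has the same fixed number of
# generated streamlines, so the raw target-intersection count can be
# converted to a fraction of streamlines that hit the target.
TOTAL_STREAMLINES = 100_000  # fixed per subject by experimental design

# Hypothetical per-subject counts of streamlines intersecting the target
target_counts = {"subj01": 4200, "subj02": 3100, "subj03": 5050}

target_fractions = {
    subject: count / TOTAL_STREAMLINES
    for subject, count in target_counts.items()
}

print(target_fractions["subj01"])  # 0.042
```

The resulting fractions are comparable across subjects precisely because the denominator was held constant by design; under a different seeding scheme this equivalence would not hold.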
From this understanding, we can instead ask the question: “Do you expect the fraction of streamlines emanating from the seed point that go on to hit the target to vary as a function of the size of the seed?” Depending on the pathway involved, the answer could go either way. But I pose the question in that way for a reason. Imagine if you had instead performed streamlines tractography in such a way that a fixed number of streamline seeds were generated for each voxel in the seeding mask; then, the number of streamlines intersecting the target would be reasonably expected to scale in proportion to the volume of the seed, because you’re simply generating more streamlines for subjects with larger seeds. But because you generate a fixed number of streamlines (irrespective of intersecting the target) per subject, that particular effect disappears.
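The contrast between the two seeding schemes described above can be sketched with hypothetical numbers. Here I assume, purely for illustration, that the per-streamline probability of hitting the target is identical for two subjects whose seeds differ in volume:

```python
# Sketch contrasting two seeding schemes (all numbers hypothetical).
# Assume the per-streamline probability of reaching the target is the
# same for both subjects; only the seed volume differs.
hit_probability = 0.05
seed_voxels = {"small_seed": 100, "large_seed": 400}

# Scheme A: fixed number of seeds per voxel -> the total streamline
# count, and hence the expected target count, scales with seed volume.
seeds_per_voxel = 50
counts_scheme_a = {
    name: voxels * seeds_per_voxel * hit_probability
    for name, voxels in seed_voxels.items()
}

# Scheme B: fixed total number of streamlines per subject -> the
# expected target count is independent of seed volume.
total_streamlines = 20_000
counts_scheme_b = {
    name: total_streamlines * hit_probability for name in seed_voxels
}

print(counts_scheme_a)  # {'small_seed': 250.0, 'large_seed': 1000.0}
print(counts_scheme_b)  # {'small_seed': 1000.0, 'large_seed': 1000.0}
```

Scheme B is the design described above: by fixing the total streamline count per subject, the mechanical scaling of target counts with seed volume disappears, though subtler dependencies on seed size may remain.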
I suggest that the point I made in the “Intracranial / brain / white matter volume” section of the linked preprint applies equivalently to seed volume in your case. Since there is no algebraic form by which you can “correct” your data for this confound, I would instead include it as a regressor in the model and let the data remove any influence it may have had.
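A minimal sketch of that regression approach, using ordinary least squares with simulated data (the effect sizes, group labels and seed volumes are all invented for illustration):

```python
import numpy as np

# Sketch: rather than algebraically "correcting" the connectivity
# metric for seed volume, include seed volume as a nuisance regressor
# in the group-level model and let the data absorb its influence.
rng = np.random.default_rng(0)
n_subjects = 30
group = rng.integers(0, 2, n_subjects)              # effect of interest
seed_volume = rng.normal(500.0, 50.0, n_subjects)   # nuisance covariate
metric = (0.3 * group
          + 0.001 * seed_volume
          + rng.normal(0.0, 0.05, n_subjects))      # simulated measurement

# Design matrix: intercept, group membership, and demeaned seed volume
design = np.column_stack([
    np.ones(n_subjects),
    group,
    seed_volume - seed_volume.mean(),
])
beta, *_ = np.linalg.lstsq(design, metric, rcond=None)
# beta[1] estimates the group effect, adjusted for seed volume
```

Demeaning the nuisance covariate is a common convention so that the intercept retains an interpretable meaning; it does not change the estimated group effect.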
So, in the case of not using SIFT and whole-brain tractography, do you have any recommendations on how to threshold the obtained streamline density map?
This question is actually somewhat independent of SIFT; you can do whole-brain tractography, apply SIFT(2), extract a bundle of interest, map to a voxel grid, and threshold to produce a binary mask of that pathway. The effect of SIFT(2) within such a pipeline would likely be minimal, but it’s still theoretically preferable to include.
One thing I would suggest if doing that kind of experiment: instead of the streamline count, use the sum of streamline intersection lengths (the tckmap -precise option). This will make the resulting maps more stable against perturbations of the threshold; otherwise, the quantized nature of streamline counts leads to the addition / deletion of many voxels from the mask as soon as the numerical threshold crosses an integer value (the reason for the “jagged” plots as a function of streamline count shown in the linked manuscript).
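That instability can be illustrated with synthetic data. The sketch below (entirely hypothetical numbers, with uniformly-jittered lengths standing in as a crude proxy for summed intersection lengths) sweeps a continuous threshold across a quantized map and a continuous one:

```python
import numpy as np

# Sketch of why integer streamline counts make binary masks unstable:
# as a continuous threshold sweeps across an integer value, every voxel
# holding exactly that count flips out of the mask at once, whereas a
# continuous-valued map (e.g. summed intersection lengths) shrinks
# gradually. All data here are synthetic.
rng = np.random.default_rng(1)
counts = rng.poisson(3.0, 10_000).astype(float)     # quantized map
lengths = counts * rng.uniform(0.8, 1.2, 10_000)    # continuous proxy

thresholds = np.linspace(1.5, 2.5, 11)
mask_sizes_counts = [int((counts >= t).sum()) for t in thresholds]
mask_sizes_lengths = [int((lengths >= t).sum()) for t in thresholds]

# The count-based mask size takes only two values across the whole
# sweep (one jump, where the threshold crosses 2); the length-based
# mask size decreases smoothly at every step.
print(len(set(mask_sizes_counts)))
print(len(set(mask_sizes_lengths)))
```

The single large jump in the count-based mask corresponds to the “jagged” behaviour described above: a tiny change in threshold can add or remove thousands of voxels at once.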
I would also observe that if you’re devising a per-subject binarizing threshold based on the maximal intensity present in each subject’s map, that is yet another parameter that varies between subjects and could conceivably influence your between-subject comparisons.
Finally, while it’s not immediately clear whether this is relevant to you (since it’s unclear whether the conversation has shifted from an endpoint-to-endpoint scalar streamline count measure to a voxel-wise streamline count measure), I’ll at least make sure you’re aware of this method (available via the tckgen -algorithm nulldist1/2 options), since it could theoretically be used for masking.