Structural connectivity edge weights

Hi MRtrix community,

I have a question regarding the edge weights of a structural connectivity matrix.
If tracts are more likely to be connected to closer ROIs, why do a lot of papers weight the edges of the matrix by the inverse of tract length, instead of by the tract length itself? Shouldn’t the farther ROIs get weighted more?

Hi @xiaobird,

That’s mainly because those papers aim to correct for a different type of “bias” or effect that generally applies to most tractography algorithms, sometimes referred to as a “seeding bias”. Seeds to initiate tracking are often distributed randomly and spatially uniformly across the brain or white matter. This means longer tracts will get more seeds, and thus more streamlines (that is, if those streamlines then also succeed in running the full length of the tract). You’re right though: on top of this type of “bias”, there is also the effect of seeds in shorter tracts more easily generating a streamline that successfully runs the full length of the tract. And it doesn’t end there in terms of how “easy” certain tracts are to trace successfully; to provide yet another example, it’ll also be more difficult to track tracts that have more curvature along them.
But all these latter kinds of examples relate to what I’d informally call “trackability”, given a seed. The inverse tract length correction, on the other hand, aims to correct for the seeding bias.
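
To make the seeding bias concrete, here is a minimal Python sketch (not MRtrix code). It assumes a toy one-dimensional setup in which two hypothetical tracts have equal cross-section, seeds are placed uniformly along their combined length, and every seed yields a streamline that runs the full tract (i.e. perfect “trackability”). The names `simulate_seeding` and `inverse_length_weighted` are made up for this illustration.

```python
import random


def simulate_seeding(lengths_mm, n_seeds, rng):
    """Place seeds uniformly along the combined tract length; count seeds per tract.

    Toy assumption: each seed produces exactly one streamline that runs the
    full length of the tract it landed in.
    """
    names = list(lengths_mm)
    total = sum(lengths_mm.values())
    counts = {name: 0 for name in names}
    for _ in range(n_seeds):
        x = rng.uniform(0.0, total)
        cumulative = 0.0
        for name in names:
            cumulative += lengths_mm[name]
            if x < cumulative:
                counts[name] += 1
                break
        else:
            # Guard for the rare case x == total.
            counts[names[-1]] += 1
    return counts


def inverse_length_weighted(counts, lengths_mm):
    """Scale each streamline count by the inverse of the tract length."""
    return {name: counts[name] / lengths_mm[name] for name in counts}


if __name__ == "__main__":
    rng = random.Random(0)
    lengths = {"short": 20.0, "long": 100.0}  # hypothetical tract lengths in mm
    counts = simulate_seeding(lengths, 100_000, rng)
    weighted = inverse_length_weighted(counts, lengths)
    print("raw counts:", counts)
    print("inverse-length weighted:", weighted)
```

In this toy setup the raw count of the longer tract exceeds that of the shorter one roughly in proportion to their lengths, and the inverse-length weighting brings the two back to about the same value. It says nothing about the separate “trackability” effects, which is exactly the distinction made above.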

So given all these complexities related to the streamline counts directly following tractography, it’s best not to rely on them directly, but rather to use various underlying quantitative metrics from the diffusion MRI data to quantify how intact or “strong” the “connection” of the tract is, or how damaged it might otherwise be. There’s no single answer to that question that can be “proven” to be “correct” based on dMRI data: the data has inherent limitations to its specificity, the link between structure and function hasn’t been resolved yet (and is probably far from trivial), and finally, for the most commonly used tractography pipelines, there is an overwhelming number of false positive tracts if more advanced prior information isn’t used. But in any case, using underlying metrics of microstructure is definitely preferred over streamline counts to at least assess this.

I hope that provides some insights!

Cheers,
Thijs

Just to add to Thijs’s answer:

The solutions offered by MRtrix are the SIFT and SIFT2 algorithms, which aim to extract underlying quantitative properties from the diffusion MRI data (from voxels in the path of the tract). It is quite complicated, as a single voxel may contribute to several tracts, but these algorithms do their best to handle it. Moreover, the calculation is performed at the level of each streamline, so it doesn’t depend on how the tracts are defined later on.

https://mrtrix.readthedocs.io/en/latest/quantitative_structural_connectivity/sift.html#number-of-streamlines-pre-post-sift
https://mrtrix.readthedocs.io/en/latest/reference/commands/tcksift2.html

Notice however that the distance bias (tracts are more likely to be connected to closer ROIs) cannot always be corrected for by SIFT/SIFT2, because these algorithms cannot make up streamlines where none exist. If, due to the distance between ROIs, the tractography algorithm completely misses part of the tract (there is not even a single streamline there), then SIFT/SIFT2 will not help. If you use probabilistic tractography, it is therefore better to generate more streamlines rather than fewer, as this increases the chances that all parts of the tracts are captured by streamlines.

Also, for comparison between weighing the edge by the inverse of tract length versus SIFT/SIFT2, see the following paper:
Yeh, C. H., Smith, R. E., Liang, X., Calamante, F., & Connelly, A. (2016). Correction for diffusion MRI fibre tracking biases: The consequences for structural connectomic metrics. NeuroImage, 142, 150-162.

Yep, agreed with @orencivier’s explanations. To add to this bit:

That’s of course the double-edged sword of false negatives versus false positives. No “easy” solution there: both are detrimental to any approach that optimises globally using a tractogram facing this challenge. There have been interesting developments on the false-positives front recently though, using diverse approaches. Some use prior information capturing most large bundles (e.g. TractSeg is pretty neat in that regard), others optimise the global fit with more information, e.g. anatomy and sparsity; I think this extension of the COMMIT framework is really nice, pretty striking results too: https://www.biorxiv.org/content/10.1101/608349v2.full .

@ThijsDhollander @orencivier
Thank you so much for the thorough answers!

No worries. For future reference, here’s the Twitter thread on the topic: https://twitter.com/xiaobird/status/1216575894753632256

Essentially the same explanations, but in different words. :slightly_smiling_face: :+1:

Expansion on the not-related-to-seeding bias:

Description of this specific bias originates from targeted tracking experiments, where if a specific seed point were genuinely biologically connected to two different ROIs, but one was very close to the seed whereas the other was very far, then the number of probabilistic streamlines reaching the former ROI would be greater than the latter, as the streamlines would have fewer opportunities to deviate from the genuine underlying trajectory.

Firstly, this is a fundamentally different interpretation of “connectivity” than what we use in MRtrix3 world. The type of experiment described above purports to provide a “probability of connectivity” between the seed location and the target region, based on the number of seeded streamlines that reach said target region. What SIFT / SIFT2 (and others) purport to provide is not a measure of probability of connectivity, but density.

These two interpretations are actually incompatible even before you get to the interpretation. The first is only possible based on targeted tracking (and to be fully precise, is only appropriate if seeding from an infinitesimally small seed location, i.e. all streamline seeds are precisely the same). The second, at least for the quantification techniques referenced, is only possible based on whole-brain tractography (as one must be able to compare the relative streamline densities in different locations in the brain with the relative fibre densities / diffusion-weighted signals in those locations).

So let’s consider the “probability of intersection” of ROI pairs in the context of whole-brain tractography. Moreover, let’s ignore streamline seeding effects, and let’s even ignore the specifics of SIFT / SIFT2. But we need more regions.

      --- B                                     --------------- E
     /                                         /
A ---                         D ---------------

          C                                                     F

A is connected to B but not C; D is connected to E but not F. But when we do a streamlines reconstruction, we’re not going to reconstruct the precise underlying trajectories perfectly: some streamlines intersecting A will intersect C instead of B; some streamlines intersecting D will intersect F instead of E.

Because the pathway from A to B is short, probably most streamlines intersecting A will successfully reach B, and few will instead go to C. Because the pathway from D to E is long, probably many (as the length increases, approaching 50%) of the streamlines intersecting D will hit F instead of E.

Now, is it true that “tracts are more likely to be connected to closer ROIs”? No: D-E will most likely still be a more dense connection than A-C (assuming the number of streamlines intersecting A is equal to the number of streamlines intersecting D), despite A-C being shorter than D-E. It’s also not as simple as a “bias toward shorter tracts”, since D-F is estimated to be more densely connected than A-C despite being longer.
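
The A/B/C versus D/E/F argument can be sketched with a small Monte Carlo simulation in Python. This is only a toy model under stated assumptions, not how any MRtrix3 tracking algorithm works: each streamline is a 2D walk whose heading accumulates Gaussian angular noise at every step, the genuine target sits straight ahead, the false target sits a fixed lateral gap away, and a streamline is assigned to whichever target it ends up closer to. The function name and all parameter values (`angle_sd_rad`, `target_gap_mm`, the path lengths) are hypothetical.

```python
import math
import random


def fraction_misassigned(path_length_mm, n_streamlines, rng,
                         step_mm=1.0, angle_sd_rad=0.05, target_gap_mm=10.0):
    """Fraction of streamlines that end nearer the false target than the true one.

    The true target is centred at lateral offset 0; the false target is
    centred at lateral offset -target_gap_mm (both at the same distance).
    """
    n_steps = int(path_length_mm / step_mm)
    wrong = 0
    for _ in range(n_streamlines):
        heading = 0.0   # radians, relative to the genuine trajectory
        lateral = 0.0   # mm, sideways drift away from the genuine trajectory
        for _ in range(n_steps):
            heading += rng.gauss(0.0, angle_sd_rad)
            lateral += step_mm * math.sin(heading)
        if lateral < -target_gap_mm / 2.0:
            wrong += 1
    return wrong / n_streamlines


if __name__ == "__main__":
    rng = random.Random(1)
    n = 5000  # same number of streamlines intersecting A and D
    a_to_c = fraction_misassigned(20.0, n, rng)    # short pathway, wrong target
    d_to_f = fraction_misassigned(150.0, n, rng)   # long pathway, wrong target
    print("A-B:", 1.0 - a_to_c, "A-C:", a_to_c)
    print("D-E:", 1.0 - d_to_f, "D-F:", d_to_f)
```

With these made-up parameters, only a small fraction of the short-pathway streamlines end up at C, while the long-pathway streamlines split between E and F in proportions that move toward an even split as the length grows. Both D-E and D-F therefore come out denser than A-C, which is the point being made above.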

I have on a number of occasions described this as a “distance-dependent blurring of the connectome”. Long streamlines are more likely to traverse erroneous trajectories. If a region of interest X is biologically connected specifically to some regions and not others, then the further those regions are from X, the more the tractography-estimated connectivity from X will be spread evenly across all of those regions, rather than being attributed only to the subset that is genuinely connected.

The above is of course assuming probabilistic tractography. For deterministic tractography, as the distance increases, it is not that the estimated connectivity becomes erroneously non-specific, but it becomes increasingly likely that the connectivity will be attributed specifically but erroneously to the wrong regions.
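
For contrast, here is a similarly hedged Python sketch of the deterministic case. It makes a strong simplifying assumption: the orientation error is treated as one fixed perturbation per dataset (represented by a hypothetical `noise_seed`), and every streamline in the bundle follows that same single trajectory. The function names and parameter values are again made up for illustration.

```python
import math
import random


def deterministic_target(path_length_mm, noise_seed,
                         step_mm=1.0, angle_sd_rad=0.05, target_gap_mm=10.0):
    """Return 'E' (genuine target) or 'F' (false target) for one dataset.

    The noise realisation is fixed by noise_seed, so the trajectory is fully
    determined by the dataset rather than drawn afresh per streamline.
    """
    rng = random.Random(noise_seed)
    heading = 0.0
    lateral = 0.0
    for _ in range(int(path_length_mm / step_mm)):
        heading += rng.gauss(0.0, angle_sd_rad)
        lateral += step_mm * math.sin(heading)
    return "F" if lateral < -target_gap_mm / 2.0 else "E"


def bundle_counts(path_length_mm, noise_seed, n_streamlines):
    """All streamlines share the one trajectory, so the split is all-or-nothing."""
    counts = {"E": 0, "F": 0}
    counts[deterministic_target(path_length_mm, noise_seed)] = n_streamlines
    return counts


if __name__ == "__main__":
    print(bundle_counts(150.0, noise_seed=3, n_streamlines=1000))
    datasets = range(2000)
    for length in (20.0, 150.0):
        wrong = sum(deterministic_target(length, s) == "F" for s in datasets)
        print(length, "mm: wrong target in", wrong / len(datasets), "of datasets")
```

Within any one dataset the whole bundle lands on a single target, so the connectivity always looks specific; what grows with distance in this toy model is how often that single target is the wrong one.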
