Differences in re-running tractography and connectome generation

Apologies for a long, slightly rambling post with lots of different questions. I’m mostly keen to hear people’s opinions, and to learn of any conventions or glaring oversights I am missing. I’m a Maths PhD student looking at network analysis from dMRI for a particular cohort at my institution.

We are currently running tractography using tckgen with the following options: iFOD2 (the default algorithm) with a cutoff and seed cutoff of 0.1, using -act (supplying a 5TT image) with -backtrack and -crop_at_gmwmi, to generate 10 million streamlines. All other options are defaults.

We then use SIFT to filter down to 1 million streamlines with the tcksift command, passing only the standard input and output files, the target number of streamlines and the ACT image as options.

We are using the Desikan-Killiany atlas from a FreeSurfer segmentation, together with labelconvert and labelsgmfix, and hence get an 84-node network.

Currently in tck2connectome we pass the following options: -symmetric -zero_diagonal -scale_invnodevol -out_assignments

I’m aware that it might not be best to use our current scaling by node volume after applying SIFT; however, I don’t believe this would explain the majority of the differences I think I’m seeing. I am wondering whether any of the options currently passed could be a cause of inconsistencies, or whether there are different options or values that might increase the consistency of the pipeline.

I’m also intrigued by what you would expect the density of the produced connectomes to be, as from my ‘uneducated’ perspective it seems high, ranging from around 58 to 71 percent of potential edges depending on the subject.
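
For reference, with 84 nodes, a symmetric matrix and a zeroed diagonal there are 84 × 83 / 2 = 3486 possible edges, so 58 to 71 percent corresponds to roughly 2020 to 2475 non-zero edges. A minimal sketch of that calculation follows; the function names and the toy matrix are illustrative assumptions, not part of any MRtrix3 tool.

```python
def possible_edges(n_nodes):
    """Number of distinct undirected edges between n_nodes nodes, excluding self-connections."""
    return n_nodes * (n_nodes - 1) // 2


def density(matrix):
    """Fraction of possible edges with a non-zero weight in a symmetric matrix (list of lists)."""
    n = len(matrix)
    present = sum(
        1 for i in range(n) for j in range(i + 1, n) if matrix[i][j] != 0
    )
    return present / possible_edges(n)


# 84 nodes, as in the parcellation described above.
print(possible_edges(84))

# Toy 4-node matrix with made-up weights: 3 of the 6 possible edges are present.
toy = [
    [0.0, 5.0, 0.01, 0.0],
    [5.0, 0.0, 2.0, 0.0],
    [0.01, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(density(toy))
```

Only the upper triangle is counted, so each undirected edge contributes once.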

I have rerun only tckgen, tcksift and tck2connectome for a subset of subjects and have been exploring the different networks generated. Are there particular benchmarks for what is considered a consistent result? I believe I am seeing a substantial change in the output networks, but mostly due to edges with small weights. For example, for one subject the first run produced a network with 2046 edges while the second had 2054; however, 203 edges were unique to the first run and 211 to the second. Looking at node strength, however, these unique edges make up only 0.136% of the total strength of the network for the first run. The first run has a total strength of 76.1 compared to 76.2 for the second (I now realise my code is double counting the strength, but consistently so, so the percentages are still valid), a difference of 0.13% based on the non-rounded values. This subject is one of the more consistent subjects between runs. These are obviously basic measures to look at, but even from these it seems to me that thresholding, and everything that comes with it, may be necessary. However, I want to make sure I have a decent and as consistent as possible pipeline in place before going down that route and dealing with all the nuances that come up there.
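
The comparison described above can be sketched roughly as follows. This is a minimal illustration and not the code actually used: the function names and the two toy matrices are made up. Note that the quoted counts are self-consistent, since 2046 − 203 = 2054 − 211 = 1843 shared edges. Summing only the upper triangle avoids the double counting of strength; summing the whole symmetric matrix doubles every total but leaves the fractions unchanged.

```python
def upper_edges(matrix):
    """Map (i, j) with i < j to the weight of every non-zero upper-triangle entry."""
    n = len(matrix)
    return {
        (i, j): matrix[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if matrix[i][j] != 0
    }


def compare_runs(matrix_a, matrix_b):
    """Edge counts, run-specific edges and the share of strength they carry."""
    edges_a, edges_b = upper_edges(matrix_a), upper_edges(matrix_b)
    only_a = edges_a.keys() - edges_b.keys()
    only_b = edges_b.keys() - edges_a.keys()
    return {
        "edges_a": len(edges_a),
        "edges_b": len(edges_b),
        "shared": len(edges_a.keys() & edges_b.keys()),
        "unique_a": len(only_a),
        "unique_b": len(only_b),
        # Fraction of each run's total weight carried by edges absent from the other run.
        "unique_a_fraction": sum(edges_a[e] for e in only_a) / sum(edges_a.values()),
        "unique_b_fraction": sum(edges_b[e] for e in only_b) / sum(edges_b.values()),
    }


# Two toy "runs" with made-up weights: strong edges agree, weak edges come and go.
run_a = [
    [0.0, 5.0, 0.01, 0.0],
    [5.0, 0.0, 2.0, 0.0],
    [0.01, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
run_b = [
    [0.0, 5.1, 0.0, 0.02],
    [5.1, 0.0, 1.9, 0.0],
    [0.0, 1.9, 0.0, 0.0],
    [0.02, 0.0, 0.0, 0.0],
]
summary = compare_runs(run_a, run_b)
print(summary)
```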

I would also like to add that visualisations of the output weighted connectomes look very similar, but I believe this is due to the consistency of the high-weight edges and the fact that I have not transformed the weights in any way to take into account their different orders of magnitude. Plots of binary matrices (at different thresholds) don’t look as comforting.
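
One way to put a number on how similar the binary matrices are is the Jaccard index of the two edge sets at each threshold. For the subject quoted earlier, with no threshold, that would be 1843 shared edges out of 1843 + 203 + 211 = 2257 in the union, or about 0.82. The sketch below is only an illustration: the helper names, the toy matrices and the threshold values are assumptions.

```python
def binary_edges(matrix, threshold):
    """Set of upper-triangle edges whose weight exceeds the threshold."""
    n = len(matrix)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if matrix[i][j] > threshold
    }


def jaccard(edges_a, edges_b):
    """Size of the intersection divided by size of the union (1.0 if both sets are empty)."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 1.0


# Toy matrices with made-up weights; the weak edges differ between runs.
run_a = [
    [0.0, 5.0, 0.01, 0.0],
    [5.0, 0.0, 2.0, 0.0],
    [0.01, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
run_b = [
    [0.0, 5.1, 0.0, 0.02],
    [5.1, 0.0, 1.9, 0.0],
    [0.0, 1.9, 0.0, 0.0],
    [0.02, 0.0, 0.0, 0.0],
]

overlap = {
    threshold: jaccard(binary_edges(run_a, threshold), binary_edges(run_b, threshold))
    for threshold in (0.0, 0.1)
}
print(overlap)
```

In this toy case the overlap rises from 0.5 to 1.0 once the weakest edges are excluded, which mirrors the observation that the disagreement sits in the low-weight edges.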

As you might be able to tell, I haven’t really got a good intuition yet for what is a decent level of consistency and what the networks themselves should look like.

Ideally I don’t want to discard too much weight information, but with the current large number of (small-weight) edges that differ between reruns of tractography on the same data it seems inevitable. A lot of network metrics, although they have weighted variants, are strongly affected by the existence of these low-weight edges, especially as weighted versions often require an interpretation of weight as a “length”.
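
To illustrate why the interpretation of weight as a “length” matters, the sketch below converts the same toy weights to lengths in two plausible ways, 1/w and −log(w), and finds the shortest path under each. Everything here is an assumption for illustration: the three-node graph, the weights (taken to lie in (0, 1] so that −log(w) is non-negative) and the helper function.

```python
import heapq
import math


def shortest_path(lengths, source, target):
    """Dijkstra over an undirected graph given as {(u, v): length}.

    Assumes non-negative lengths and that target is reachable from source.
    """
    adjacency = {}
    for (u, v), d in lengths.items():
        adjacency.setdefault(u, []).append((v, d))
        adjacency.setdefault(v, []).append((u, d))
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, d in adjacency.get(node, []):
            candidate = dist + d
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    # Walk back from the target to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]


# Made-up weights: one direct but weaker edge, and a two-step route via node 1.
weights = {(0, 1): 0.9, (1, 2): 0.9, (0, 2): 0.5}

inverse_lengths = {edge: 1.0 / w for edge, w in weights.items()}
neglog_lengths = {edge: -math.log(w) for edge, w in weights.items()}

path_inverse = shortest_path(inverse_lengths, 0, 2)
path_neglog = shortest_path(neglog_lengths, 0, 2)
print(path_inverse, path_neglog)
```

With these weights the inverse transform prefers the direct edge (length 2.0 against about 2.22), whereas the negative-log transform prefers the two-step route (about 0.21 against about 0.69), so any metric built on shortest paths depends on a choice that the data do not dictate.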

I am in my first year, so my current priority is to get a consistent model working with which I can explore our data in a structurally sound way, rather than to get a fully developed model right from the outset.

Hi Marshall,

“Reliability” of structural connectome construction is something I’ve only dabbled in a little, though I’ve written various essays on this forum regarding how I think people should be addressing this question. And I have my own personal opinions about how people should be treating these data and quantifying variance; but as typically happens I could spend an inordinate amount of time writing an essay here that could be better invested in other things…

If I could maybe try to summarise your post into a couple of more discrete statements, and answer each in turn:

  1. “Reproducibility of structural connectome construction is not good enough!”

    • Preaching to the choir :stuck_out_tongue: Welcome to tractography!
    • There isn’t a button that can be pressed that magically turns tractography from an unreliable process to a reliable one; if there was, it’d default to being active.
    • In the case of intra-session reproducibility, i.e. re-generating the tractogram from the same image data, one factor that is within your control is the density of the reconstruction. Yes the streamline seeding & propagation is random, but if you generate a number of streamlines approaching infinity, one expects the variance to reduce to zero. Using SIFT2 rather than SIFT is intended to yield benefits in this regard also.
  2. “How do I quantify variance of structural connectomes?”

    Good question (if I do say so myself :upside_down_face: ).

    • If you have a particular intended per-subject analysis (e.g. some graph metric of interest), then you can analyse the variance in that endpoint.
    • In my own attempt (linked above), I tried to address this agnostically to any particular downstream analysis by looking at the variance in individual edge weights, additionally summarising across the connectome by considering that across the suite of possible analyses high-density edges are more likely to either be of a priori interest or more strongly influence analysis outcomes than low-density edges. Not saying that’s “the right answer”, it just shows how one can contemplate how to attack the question of connectome variance and do something more tailored to these data.
  3. What about connectome “density”?

    • With a probabilistic streamlines algorithm with as much dispersion as iFOD2, as you generate more and more streamlines, the “density” (where here “density” is the proportion of possible edges to which at least 1 streamline is assigned) will approach 100%. So it’s a measure of reconstruction as much (or more so) than biology, and I personally consider it a bad measure in the context of such reconstructions.
    • I personally consider the distinction between an edge with 0 streamlines and an edge with 1 streamline to be pointless at best and misleading at worst. If an analysis changes its result drastically based on such a difference in observation, it’s a bad analysis.
    • You make the suggestion of pruning in order to somehow address this inconsistency. We’ve elsewhere made the argument that in the context of such reconstructions, pruning doesn’t make sense as a “data quality improvement” step (also there we advocate use of “pruning” rather than “thresholding” since the latter carries a connotation of additional binarization). Indeed, your observation where there are a large proportion of edges that appear & disappear for re-generation of the tractogram is IMO an argument for not pruning. You will never find a threshold that cuts out these edges but that does not start excising more consequential edges. I personally believe that the interest should instead be in whether or not particular analyses are or are not strongly affected by such observations, and being highly critical of those that are.
    • In the manuscript cited above we had a mean connectome density of 90%, which is more than your “high” 58-71%. That was high-quality HCP data and more dense reconstructions. My suspicion is that there’s some circular reasoning here in the community: people “expect” a connectome density of 5-20% based on prior literature, but that’s the literature from the neuroimaging community where previously analyses have been pruned to have densities in the 5-20% range. Tract tracing suggests a biological reality of around 67% depending on the source.
  4. “What about network metrics based on length?”

    Yeah, I’m highly critical of those too. Partially due to reasons above, partially due to the fact that the transformation of length-based metrics from binary to weighted networks is fundamentally ill-posed. There are an infinite number of possible transformations between connection density and a reciprocal measure that we might call “length”. But crucially I don’t consider pruning low-density connections to be a solution to that problem. Anything that is based on setting a threshold that omits spurious connections and keeping everything else is IMO far too optimistic about the quality of tractography data.
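
Points 1 and 3 above can be illustrated with a toy simulation. This is emphatically not iFOD2 or any real tractography: each “streamline” is simply assigned to one of 100 possible edges at random, with made-up, heavy-tailed edge probabilities. Under those assumptions, as the number of streamlines grows, the proportion of edges receiving at least one streamline should climb towards 100%, and the difference between two independent runs (after normalising by streamline count) should shrink. All names and parameter values are hypothetical.

```python
import random
from collections import Counter


def simulate_run(probabilities, n_streamlines, rng):
    """Assign each streamline to one edge index, drawn with the given relative probabilities."""
    edges = range(len(probabilities))
    return Counter(rng.choices(edges, weights=probabilities, k=n_streamlines))


def density(counts, n_possible):
    """Proportion of possible edges to which at least one streamline was assigned."""
    return len(counts) / n_possible


def l1_difference(counts_a, counts_b, n_streamlines):
    """Sum of absolute differences between two runs, in units of streamline fraction."""
    keys = counts_a.keys() | counts_b.keys()
    return sum(abs(counts_a[k] - counts_b[k]) for k in keys) / n_streamlines


n_possible = 100
# Made-up heavy-tailed edge probabilities: a few strong edges, many weak ones.
probabilities = [1.0 / rank**2 for rank in range(1, n_possible + 1)]

rng = random.Random(0)  # fixed seed so repeated executions give the same numbers
results = {}
for n in (100, 1_000, 10_000, 100_000):
    run_a = simulate_run(probabilities, n, rng)
    run_b = simulate_run(probabilities, n, rng)
    results[n] = (density(run_a, n_possible), l1_difference(run_a, run_b, n))
    print(n, results[n])
```

In such a model the weakest edges are the last to appear and the first to flicker between runs, which is consistent with the run-specific edges reported earlier in the thread carrying very little of the total weight.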

Okay, that’ll do from me for now.
I appreciate this isn’t your native domain, but given your mathematical background I would propose that you consider the question of how to do network analysis utilising such data knowing that it has all of these limitations, rather than trying to fix those limitations.

Hi Rob,

Thank you so much for such a comprehensive response! It has definitely put my original concerns at ease and given me lots of food for thought moving forward, as well as a better understanding of what might be more beneficial to focus on. Your last comment strongly resonates with me and is an angle I have been tending towards from various discussions and exploration of the networks. Apologies for another ramble: below are just my specific thoughts on the individual responses, effectively reiterating the above.

  1. It’s reassuring to know just how difficult tractography, and network analysis based on tractography results, can be, especially coming from nice cosy pure maths to now working on applied projects. It definitely does seem that a lot of my original concern stems from this mindset adjustment, while obviously I still need to be ultra critical about how I produce, analyse and interpret these networks (as well as the rest of the pipeline, but that’s another matter). My original concern here was also fuelled by the fear of missing something particularly egregious before the network analysis, due to my lack of experience.

  2. This is the direction I seem to be going in. With regard to edge weights, it has also been reassuring to find in my data that the stronger edges are the most consistent, and that various metrics based directly on them are the most consistent (as expected).

  3. With regard to density, I feel it’s fair to say I have been somewhat caught up in that circular reasoning, so it is good to know that the density of networks from our data could be reasonable and doesn’t immediately scream that something’s wrong, as I slightly feared. It seems obvious now, but I had not yet internalised the fact that for iFOD2 the “density” tends to one as the number of streamlines increases. With regard to pruning, that makes a lot of sense, and the paper was very interesting and helpful. The suggestion to focus on analyses that are robust to the data, while being wary of ones that are not, is definitely a big takeaway for me. I feel now that my original concerns rested too heavily on thinking in a binary way, and your comment about 1 vs 0 streamlines definitely helps to illustrate this.

  4. It’s nice to see the concern about lengths shared. I will definitely give some thought to the use of such metrics and their tricky interpretability (a large understatement in my opinion).

Once again thank you very much for the response and your time!

All the best