Apologies for a long, slightly rambling post with lots of different questions; I'm mostly keen to hear people's opinions, any conventions, and any glaring oversights I'm missing. I'm a maths PhD student looking at network analysis from dMRI for a particular cohort at my institution.
We are currently running tractography using tckgen with the following options: iFOD2 (the default), a cutoff and seed cutoff of 0.1, and -act (supplying a 5TT image) with -backtrack and -crop_at_gmwmi, generating 10 million streamlines; all other options are defaults.
We then use SIFT to filter down to 1 million streamlines with the tcksift command, passing only the standard input/output files, the target streamline count, and the ACT image as options.
We are using the Desikan-Killiany atlas from a FreeSurfer segmentation, processed with labelconvert and labelsgmfix, hence getting an 84-node network.
Currently, in tck2connectome we pass the following options: -symmetric -zero_diagonal -scale_invnodevol -out_assignments.
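For concreteness, the pipeline above might be sketched as below. All filenames are hypothetical placeholders, and the explicit seed option is an assumption on my part (a seed of some kind has to be supplied), so please adjust to match your actual invocations:

```shell
# Sketch of the pipeline described above; filenames are placeholders.
# iFOD2 is the tckgen default, so it is not named explicitly.
tckgen wmfod.mif tracks_10M.tck \
    -act 5tt.mif -backtrack -crop_at_gmwmi \
    -seed_image mask.mif \
    -cutoff 0.1 -seed_cutoff 0.1 \
    -select 10M

# SIFT down from 10M to 1M streamlines
tcksift tracks_10M.tck wmfod.mif tracks_1M_sift.tck \
    -act 5tt.mif -term_number 1M

# Desikan-Killiany parcellation from FreeSurfer -> 84-node label image
labelconvert aparc+aseg.mgz FreeSurferColorLUT.txt fs_default.txt nodes.mif
labelsgmfix nodes.mif T1.mif fs_default.txt nodes_fixed.mif -premasked

tck2connectome tracks_1M_sift.tck nodes_fixed.mif connectome.csv \
    -symmetric -zero_diagonal -scale_invnodevol \
    -out_assignments assignments.txt
```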
I'm aware that it might not be best to use our current scaling by node volume after applying SIFT; however, I don't believe this would explain the majority of the differences I think I'm seeing. I am wondering whether any of the options currently passed could be a cause of inconsistency, or whether there are different options or values that might increase consistency within the pipeline.
I'm also intrigued by what density you would expect the produced connectomes to have, as from my 'uneducated' perspective it seems high, ranging from around 58 to 71 percent of potential edges depending on the subject.
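For clarity, this is how I am computing density: observed edges over possible edges in an undirected 84-node network (the edge count below is taken from one of my runs; the calculation itself is the point):

```shell
# Density of an undirected network: edges / (nodes * (nodes - 1) / 2)
nodes=84
possible=$(( nodes * (nodes - 1) / 2 ))   # 3486 possible edges for 84 nodes
edges=2046                                 # edge count from one run
density=$(awk -v e="$edges" -v p="$possible" 'BEGIN { printf "%.1f", 100 * e / p }')
echo "${density}%"   # about 58.7% of potential edges
```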
I have rerun only tckgen, tcksift and tck2connectome for a subset of subjects and have been exploring the different networks generated. Are there particular benchmarks for what's considered a consistent result? I believe I am seeing a substantial change in output networks, but mostly due to small-weight edges.

For example, for one subject the first run produced a network with 2046 edges while the second had 2054, yet 203 edges were unique to the first run and 211 to the second. Looking at node strengths, however, these unique edges make up only 0.136% of the total strength of the network for the first run. The first run has a total strength of 76.1 compared to 76.2 for the second (I now realise my code is double-counting the strength, but it does so consistently, so the percentages remain valid), a difference of 0.13% on the non-rounded values. This subject is one of the more consistent ones between runs. These are obviously basic measures, but even from these it seems to me that thresholding, and everything that comes with it, may be necessary. However, I want to make sure I have a decent and as-consistent-as-possible pipeline in place before going down that route and dealing with all the nuances that arise there.
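The edge-overlap comparison between two reruns can be sketched as follows. The edge lists here are toy, hypothetical node pairs; in practice I extract them from the nonzero entries of the tck2connectome output matrices:

```shell
# Toy edge lists from two tractography reruns (hypothetical node pairs,
# written as "i-j" with i < j so each undirected edge has one canonical form).
printf '1-2\n1-3\n2-3\n2-4\n' > run1_edges.txt
printf '1-2\n1-3\n2-4\n3-4\n' > run2_edges.txt

# comm requires sorted input
sort run1_edges.txt -o run1_edges.txt
sort run2_edges.txt -o run2_edges.txt

common=$(comm -12 run1_edges.txt run2_edges.txt | wc -l)  # in both runs
only1=$(comm -23 run1_edges.txt run2_edges.txt | wc -l)   # unique to run 1
only2=$(comm -13 run1_edges.txt run2_edges.txt | wc -l)   # unique to run 2
echo "common=$common only_run1=$only1 only_run2=$only2"
```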
I would also like to add that, when visualised, the output weighted connectomes look very similar, but I believe this is due to the consistency of the high-weight edges and the fact that I have not transformed the weights in any way to account for the different orders of magnitude involved. Plotting binary matrices (at different thresholds) doesn't look as comforting.
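As a sanity check on the visualisation point, a simple log10 rescaling (toy weights chosen by hand here) compresses the orders of magnitude so a heat-map comparison is not dominated by the few largest edges:

```shell
# Toy weights spanning four orders of magnitude (hypothetical values);
# awk has no log10, so use log(x) / log(10).
logw=$(printf '0.001\n0.1\n10\n' \
    | awk '{ printf "%s%.0f", (NR > 1 ? " " : ""), log($1) / log(10) }')
echo "$logw"   # -3 -1 1
```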
As you might be able to tell, I haven't yet got a good intuition for what a decent level of consistency is, or what the networks themselves should look like.
Ideally I don't want to discard too much weight information, but with the current large number of (small-weight) edges that differ between reruns of tractography on the same data, it seems inevitable. A lot of network metrics, although they have weighted variants, are strongly affected by the existence of these low-weight edges, especially as the weighted versions often require an interpretation of weight as a "length".
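On the weight-as-length point, the convention I have in mind (sketched with hypothetical weights) is the reciprocal mapping, length = 1/w, used before running shortest-path style metrics:

```shell
# Map connection weights to "lengths" via the reciprocal, length = 1/w.
# Strong edges (large w) become short and weak edges become long -- which is
# exactly why a mass of tiny, noisy weights can distort path-based metrics.
lengths=$(printf '0.5\n0.01\n2.0\n' \
    | awk '{ printf "%s%.2f", (NR > 1 ? " " : ""), 1 / $1 }')
echo "$lengths"   # 2.00 100.00 0.50
```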
I am in my first year, so my current priority is to get a consistent model working with which I can explore our data in a structurally sound way, rather than getting a fully developed model right from the outset.