It has been a while since I have run whole brain tracking and I am getting back to running some data. I remember that “back in the day” it was recommended to run 10 - 20 million streamlines per participant.
Now that SIFT2 is up and about, have the recommendations for “number of streamlines” for connectome generation changed? I am sure that this question has been answered somewhere but I could not seem to find it …
I reckon, with the amount of time I've spent giving the philosophical answer to this question, I could have done the experiment and published it.
There's no single answer, even within the constraints of a fixed tracking algorithm & parameters. It depends very heavily on what quantitative measure(s) you intend to utilise. E.g.:

- Reliably quantifying surface-vertex-to-surface-vertex connection density would likely require a great deal more streamlines than quantifying connectivity within a parcellation of just the four lobes of each hemisphere.
- Quantifying something like mean FA along an edge may require fewer streamlines than quantifying that edge's connection density.
- Obtaining stable global graph theory metrics may require fewer streamlines than that required for stable quantification of any individual edge.
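As a rough illustration of the first point, here's a toy simulation (my own sketch, not anything measured from real data): distribute N streamlines uniformly over E candidate edges and look at the coefficient of variation of a single edge's count. A uniform edge distribution is of course unrealistic, but it shows how a finer parcellation (more edges, so fewer streamlines per edge) demands a larger N for the same per-edge stability:

```python
import numpy as np

rng = np.random.default_rng(0)

def edge_cv(n_streamlines, n_edges, n_repeats=200):
    """Coefficient of variation of one edge's streamline count when
    n_streamlines are distributed uniformly over n_edges.
    (Toy model: real edge probabilities are far from uniform.)"""
    counts = rng.multinomial(n_streamlines, [1.0 / n_edges] * n_edges,
                             size=n_repeats)[:, 0]
    return counts.std() / counts.mean()

for n_edges in (28, 4950):          # ~8-region lobar vs ~100-node parcellation
    for n in (100_000, 10_000_000):
        print(f"edges={n_edges:5d}  N={n:>10,}  CV={edge_cv(n, n_edges):.3f}")
```

With 4,950 edges, going from 100k to 10M streamlines drops the per-edge CV by roughly a factor of 10 (it scales as 1/sqrt(N·p)), whereas the 28-edge case is already quite stable at 100k.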
I did see this related article in my inbox today; their results may not translate directly to investigation of the structural connectome alone, and the only article they cite that looked into this issue specifically is the same one from @tjroine that I think I linked to in another thread recently. The other recent one is this from @Lee_Reid, but the focus there is on tracts of interest, quantifying tract volume and sampling voxel-wise metrics within that volume, rather than on connection density estimation.
So sorry that I can't just give you a number, but I think the general advice applies to any specific experimental context: quantify the variability in your experimental outcomes that arises from tractogram regeneration, and choose a streamline count large enough to make that variance "acceptable".
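In practice that could look something like this: build several connectomes from independent tractogram regenerations (e.g. repeated `tckgen` / `tck2connectome` runs at your candidate streamline count) and examine the per-edge coefficient of variation. The sketch below stands in simulated Poisson counts for the real repeated connectome matrices; the sizes and the 10% threshold are arbitrary choices for illustration:

```python
import numpy as np

# Stand-in for K connectome matrices built from K independent tractogram
# regenerations; here simulated as Poisson-distributed edge counts around
# a synthetic "true" connection density matrix.
rng = np.random.default_rng(1)
n_nodes, n_repeats = 84, 5
true_density = rng.gamma(shape=0.5, scale=200.0, size=(n_nodes, n_nodes))
connectomes = rng.poisson(true_density, size=(n_repeats, n_nodes, n_nodes))

# Per-edge coefficient of variation across regenerations
mean = connectomes.mean(axis=0)
std = connectomes.std(axis=0, ddof=1)
cv = np.divide(std, mean, out=np.zeros_like(std), where=mean > 0)

# One possible "acceptable variance" criterion for this streamline count:
frac_stable = (cv[mean > 0] < 0.10).mean()
print(f"fraction of non-empty edges with CV < 10%: {frac_stable:.2f}")
```

If the fraction of stable edges (or the stability of whatever downstream metric you actually care about) is too low, increase the streamline count and repeat.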
P.S. The question is also fractionally more complex when considering SIFT2. E.g. if you generate 10M streamlines, but SIFT2 (or some comparable method) assigns very large weights to 10k streamlines and zero / negligible weights to the other 9,990,000, I'd expect that outcome to have comparable reproducibility to generating 10k streamlines and using raw streamline count. So there's a trade-off between the quality of fit of the model to the image data, and the "effective" density of the reconstruction once those weights are taken into account. The regularisation mechanisms within SIFT2 modulate that balance directly.
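One way to put a number on that "effective" density is the Kish effective sample size of the weights, (Σw)² / Σw² — purely an illustrative proxy on my part, not a metric that SIFT2 itself reports:

```python
import numpy as np

def effective_streamline_count(weights):
    """Kish effective sample size of a set of per-streamline weights:
    (sum w)^2 / sum w^2.  Equals N for uniform weights, and approaches
    the number of heavily-weighted streamlines when the distribution is
    highly skewed.  (Illustrative proxy only, not a SIFT2 output.)"""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

# The scenario above: 10M streamlines, almost all weight on 10k of them
w = np.full(10_000_000, 1e-6)
w[:10_000] = 100.0
print(f"{effective_streamline_count(w):,.0f}")  # ~10,000
```

Under this measure, the 10M-streamline tractogram in that extreme scenario behaves like an unweighted reconstruction of roughly 10k streamlines, matching the reproducibility intuition above.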