SC analysis pipeline question

Hi there! I’m working on my pipeline to get structural connectivity matrices. I think I have it mostly down, but I have a few brief questions, if someone doesn’t mind taking a look.

Dataset: multi-shell (10 b=0 volumes; 20, 40 and 80 directions at b = 1000, 2000 and 3000 s/mm²), 2.5 mm isotropic, AP/PA phase encoding.

  1. Denoising
  2. Gibbs de-ringing (should I skip since I used phase-partial Fourier 6/8?)
  3. dwipreproc (topup/eddy)
  4. dwibiascorrect
  5. Calculate RFs (dwi2response dhollander)
  6. Upsample DWI to 1.3mm (mrresize) – Should one upsample?
  7. Create brainmask (dwi2mask)
  8. Compute FODs (dwi2fod msmt_csd) – Should I use individual or group averaged RFs?
  9. Perform joint bias field correction and intensity normalisation (mtnormalise)
  10. Create tractogram (tckgen) – Should I use ACT since I have (relatively) distortion-free DWI?
  11. Reduce reconstruction biases (tcksift2)
  12. Create SC matrix (tck2connectome)
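
To make it concrete, here’s roughly the sequence of commands I had in mind – filenames are placeholders, and some options (the exact -rpe_* handling of the AP/PA data, the 5TT image for ACT, streamline count, etc.) are assumptions on my part rather than a tested script:

```
dwidenoise dwi_raw.mif dwi_den.mif                               # 1. denoising
mrdegibbs dwi_den.mif dwi_deg.mif                                # 2. Gibbs de-ringing (keep or skip?)
dwipreproc dwi_deg.mif dwi_preproc.mif -rpe_all -pe_dir AP       # 3. topup/eddy (exact -rpe_* option
                                                                 #    depends on the AP/PA acquisition)
dwibiascorrect ants dwi_preproc.mif dwi_unbiased.mif             # 4. bias field correction (-ants in older versions)
dwi2response dhollander dwi_unbiased.mif wm.txt gm.txt csf.txt   # 5. response functions
mrresize dwi_unbiased.mif -voxel 1.3 dwi_up.mif                  # 6. upsample (or not?)
dwi2mask dwi_up.mif mask.mif                                     # 7. brain mask
dwi2fod msmt_csd dwi_up.mif wm.txt wmfod.mif \
    gm.txt gm.mif csf.txt csf.mif -mask mask.mif                 # 8. FODs (individual or group RFs?)
mtnormalise wmfod.mif wmfod_norm.mif gm.mif gm_norm.mif \
    csf.mif csf_norm.mif -mask mask.mif                          # 9. joint bias correction + normalisation
5ttgen fsl T1_coreg.mif 5tt.mif                                  #    (only needed if using ACT)
tckgen wmfod_norm.mif tracks.tck -act 5tt.mif -backtrack \
    -seed_dynamic wmfod_norm.mif -select 10M                     # 10. tractography (with or without ACT?)
tcksift2 tracks.tck wmfod_norm.mif weights.csv -act 5tt.mif      # 11. SIFT2 streamline weights
tck2connectome tracks.tck nodes.mif connectome.csv \
    -tck_weights_in weights.csv -symmetric -zero_diagonal        # 12. SC matrix (nodes.mif = parcellation)
```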

Thanks!
Mark

> Gibbs de-ringing (should I skip since I used phase-partial Fourier 6/8?)

Don’t skip this step – I think it’s fine to use by default. See this post for details.

> Upsample DWI to 1.3mm (mrresize) – Should one upsample?

I reckon, given the relatively low resolution of your input data, upsampling would probably be a good idea. But I’m not sure it’s an absolute requirement – some users report good results without…

> Compute FODs (dwi2fod msmt_csd) – Should I use individual or group averaged RFs?

Always use group-averaged RFs if you can – especially for quantitative group studies.
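
In case a concrete example helps, here’s a minimal sketch of what that could look like, assuming per-subject dhollander responses have already been computed – subject paths are placeholders, and responsemean is the current MRtrix3 script (older versions shipped average_response instead):

```
# Average per-subject response functions into a single group response per tissue:
responsemean sub-*/wm.txt  group_wm.txt
responsemean sub-*/gm.txt  group_gm.txt
responsemean sub-*/csf.txt group_csf.txt

# Then compute every subject's FODs using the *group* responses, e.g. for sub-01:
dwi2fod msmt_csd sub-01/dwi_up.mif group_wm.txt sub-01/wmfod.mif \
    group_gm.txt sub-01/gm.mif group_csf.txt sub-01/csf.mif -mask sub-01/mask.mif
```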

> Create tractogram (tckgen) – Should I use ACT since I have (relatively) distortion-free DWI?

Definitely: this is critical for connectomic analysis. If you don’t use ACT, a very significant proportion of your generated streamlines won’t reach GM regions, and hence won’t be taken into account in the construction of the connectome matrix. There is some discussion of this issue in this paper.
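
For completeness, roughly what that adds around the tracking step – the 5ttgen algorithm, seeding strategy and streamline count below are illustrative choices, not prescriptions:

```
# Five-tissue-type segmentation of the coregistered anatomical (needed for ACT):
5ttgen fsl T1_coreg.mif 5tt.mif

# Anatomically-constrained tracking: streamlines are accepted, rejected or truncated
# based on the tissue segmentation, so they terminate where connectome nodes live:
tckgen wmfod_norm.mif tracks.tck \
    -act 5tt.mif -backtrack -crop_at_gmwmi \
    -seed_dynamic wmfod_norm.mif -select 10M
```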


Quick follow up question: For a longitudinal study (i.e. the same group of subjects is scanned multiple times), does it make sense to compute group averaged RFs from all subjects at just the first time point, or across all time points?

I don’t personally think this would make a great deal of difference in practice. But if your hypothesis assumes that the first time point is the most ‘normal’ (i.e. you’re looking at a degenerative process or something like that), then there is a good rationale for using the average response from the first time point as the most representative of ‘healthy tissue’.
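
In practice that would just mean restricting which response files go into the average – a sketch assuming purely illustrative directory naming, and responsemean (average_response in older versions):

```
# Average the WM responses from the first time point only, across all subjects:
responsemean sub-*/ses-01/wm.txt group_wm.txt
```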

Just my 2 cents though… Others may well see it differently.


Thanks, I was thinking along the same lines.

If you don’t mind another question: If I’m upsampling my DWI data to 1.3mm, should I estimate the RFs prior to upsampling (as I listed in my original post), or do so after the upsampling?

My instinct tells me we’d want to estimate the RFs on the original voxels, but I couldn’t give you a good reason why.

Hi @MrRibbits,

To answer and confirm a few things:

It always depends on the exact scenario, cohort and hypothesis, and no 2 settings are exactly the same; but generally I’d agree for sure. Go with the group / data / time point / … where things are most normal, or most “intact” (closest to “healthy”).

Best prior to upsampling, as you guessed correctly. Upsampling isn’t going to provide you with new or better samples for the response functions, so it’ll only be more computationally expensive to perform response function estimation after upsampling. Other than that, the response functions may in fact be slightly less accurate, depending on what choice of interpolation was used in the upsampling; but admittedly, it shouldn’t make too much of a difference (it would be weird / suspicious if it did!).
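
So in terms of ordering, something along these lines (filenames are placeholders, and mrresize as per your original list):

```
# Estimate the responses on the fully preprocessed, native-resolution data...
dwi2response dhollander dwi_preproc.mif wm.txt gm.txt csf.txt

# ...then upsample, and run the CSD itself on the upsampled data:
mrresize dwi_preproc.mif -voxel 1.3 dwi_up.mif
dwi2mask dwi_up.mif mask.mif
dwi2fod msmt_csd dwi_up.mif wm.txt wmfod.mif gm.txt gm.mif csf.txt csf.mif -mask mask.mif
```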

I’ve recently come to observe / realise that 1.3 mm is probably a bit much, actually (or unnecessarily much, depending on how you look at it). These days I’m running some things at e.g. 1.5 mm, which saves quite a bit on storage space, memory requirements and processing time. I’ve been experimenting a bit, but it might even be that upsampling isn’t all that useful after all; well, that is, depending on what kind of pipeline you intend to run. It’s not a given that it’s beneficial in every scenario. Currently I’d say: experiment a bit with it, and see what comes out at the other (far) end of the pipeline. :wink:
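
For what it’s worth, in current MRtrix3 (where mrgrid has superseded mrresize) that’s just a different target voxel size in the regridding call:

```
mrgrid dwi_preproc.mif regrid -voxel 1.5 dwi_up.mif
```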

But in any case, in conclusion on the actual topic: I 100% recommend to run response function estimation on the non-upsampled (yet otherwise fully preprocessed) data.

Cheers,
Thijs


I’ve actually often wondered where the recommendation to upsample, specifically to 1.3 mm, came from. I believe it was also 1.25 mm in the documentation at one point.

Hi @phmag,

Hope you’re doing well!

Well, see some of my writing above; further insight has recently been gained into some of these points through some people’s experiences, but I won’t be discussing most of that here at this point. The facts about the history of the documentation on this front are pretty unambiguous though. This is what happened over time:

  1. Originally, at some point, upsampling was introduced into the practice. Back then this was often done internally by a factor of 2. Most of the in-house data at that point had a spatial resolution of 2.5 mm isotropic, so the factor of 2 in all dimensions brought it to that very particular number of 1.25 mm. Some of those internal data back then had a resolution of 2.3 mm isotropic, so some data has indeed been processed, after upsampling by a factor of 2, at a 1.15 mm isotropic resolution.

  2. However, at some point, the HCP data “happened” (i.e. became publicly available). These have a 1.25 mm original resolution. Some people copy-pasted the instructions with the factor of 2, arriving at a crazy high resolution of 0.625 mm isotropic. That goes way beyond the file sizes and memory usage most computers can handle, so at some point someone ran head-first into this issue. That prompted us to recommend not a “factor of 2”, but rather a fixed absolute number. The 1.25 mm we happened to often use in practice (see point 1) thus became that number. Coincidentally, that also matches the HCP original resolution, which is neat but indeed also doesn’t matter at all. :slightly_smiling_face: But this is how the 1.25 mm showed up… that’s all there is to it.

  3. But 1.25 mm, combined with a typical human brain’s size, the typical number of fixels you’d get using a default cut-off for that step, and single-tissue CSD, automagically gives you a number of fixels somewhere on the order of 500,000. Using the versions of the stats back then, the memory consumption for this number of fixels also very coincidentally landed you in the ball park of 128 GB of RAM, which is also the kind of magical number you’ll find on a lot of machines. So either your memory just cut it, or it didn’t. All of these coincidences added up to a ball-park figure you’d rather avoid, if only ever so slightly (back then; the memory consumption is lower now than it was). So I changed the magical 1.25 to 1.3. That’s all there is to it. :wink: Reasoning: the original 1.25 wasn’t needed, and looked suspiciously “specific” for no good reason. Just rounding it up to 1.3 did the job. (1.25^3)/(1.3^3) = 0.89, so that’s effectively a reduction of roughly 10%, and thus also roughly 10% fewer fixels, with all other defaults remaining at their defaults, in a human brain, etc… etc… So again, this is kind of arbitrary, but that is the one and only reason the 1.25 became 1.3. There is no methodological justification for it.

  4. As mentioned above, I have come to realise there’s little use to upsampling that much. It might even get in the way of some things (!). And given particular kinds of default mechanisms in other steps of the FBA, some of which have changed over versions of the software even, playing with the upsampling resolution has a few very particular downstream effects. Again, this depends on which version of the software you use and which (mechanisms of) defaults you accept or copy-paste. Some funny interactions come into play. I now personally recommend just 1.5 mm. If upsampling is needed or justified at all, 1.5, I find, is certainly more than enough. Some things also become a bit more robust. Some things certainly become faster too, and will use less space and memory.

So well, the gist is: some of those strange numbers were arrived at by coincidence, and some changes were driven by pragmatic evidence and practical concerns. There’s certainly no intricate reason for that highly specific value of 1.25 :wink:. I highly recommend that anyone question and play around with some of this. Practical experience with FBA studies can certainly teach you a lot. :relaxed:

Cheers & take care,
Thijs

Thanks Thijs. A very thorough answer! Someone was asking me the reason for my particular methods and it was one I couldn’t answer… until now!
I’d love to play around with some of this and test a few parameters. It can be difficult though when other researchers are just looking for results asap. :face_with_raised_eyebrow:


No worries at all. So indeed some reasons end up being a bit disappointing maybe, or well, rather arbitrary. :slightly_smiling_face:
Yep, you’re absolutely right in being motivated to play around with some of these parameters and processing choices. There’s a lot of “hidden” interactions in the pipeline that might surprise quite a bit if they turn out to apply to your data at hand. It’s worthwhile to check every once in a while if your studies’ results “survive” changing a few parameters that seem as if they “shouldn’t” matter to the result. :wink:

Since the increase from 1.25 mm to 1.3 mm was driven by RAM requirements exceeding what was typically available, and with 3.0.0 I reduced the RAM usage of fixel-fixel connectivity matrix generation by ~80%, I nudged the documentation back down to its original 1.25 mm. I think that warrants inclusion in a thorough answer.

But it is somewhat arbitrary, and the pipeline documentation should probably reflect that there’s flexibility there.

> It might even get in the way of some things (!)
>
> And given particular kinds of default mechanisms in other steps of the FBA, some of which have changed over versions of the software even, playing with the upsampling resolution has a few very particular downstream effects.
>
> Again, this depends on which version of the software you use and which (mechanisms of) defaults you accept or copy-paste.
>
> Some funny interactions come into play.
>
> Some things also become a bit more robust.
>
> Practical experience with FBA studies can certainly teach you a lot.
>
> There’s a lot of “hidden” interactions in the pipeline that might surprise quite a bit if they turn out to apply to your data at hand.

If you have insights as to what parameters turn out to be appropriate across a spectrum of data, or under what circumstances various parameters should be changed, I’m sure the research community would appreciate a Wiki post sharing those details.

Hadn’t even seen that yet. I don’t think the specific small difference between 1.3 and 1.25 really matters substantially.

Naturally, insights will be disseminated whenever appropriate, as we do in science. I’m not sure the entire research community is highly focused on wiki posts for that purpose, but whenever and wherever insights are publicly published and available, that is of course always a possibility, and it often makes things much easier (we’re all busy people).

For my part, currently the most appropriate advice that generalises to any study out there is to actively play around with these parameters, and simply not take them for granted. If this advice is embraced and put into practice responsibly, it will absolutely lead to the best results. And as a bonus, the research community will naturally gain the best sample of insights over time.

On the topic of Wiki posts, one that would truly be useful to consider at this time is something addressing the confusion or worries about the possible effects of denoising on dwifslpreproc / eddy. This sits right at the start of the pipeline, and motion correction is arguably one of the most important preprocessing steps, if not the most important. This is meant absolutely constructively, should anyone doubt it; I know this is a genuine worry for some. A Wiki post sharing the relevant insights could provide clarity.