Fixel analysis on a multi-site, pre-post dataset

So, I am working with some of my staff on a fixel-based analysis of diffusion data in a clinical sample, collected across two sites with pre and post scans.

By way of background, I’m an fMRI researcher with less experience in diffusion, and I’m muddling my way through the fixel stuff. My RA ran most of the pipelines, which makes things harder because I’m further from the data…

Firstly, I’m wondering if anyone would like to posit a suggestion for combining multi-site data here? There are often big scanner differences in diffusion; it seems easier to model these out in fMRI, where we can handle some of this in post-processing. Because the full sample goes into the fixel pipeline, I’m worried that we should try to harmonize before running MRtrix3 and the fixel analysis… but I’m not entirely clear what the best choice is. I know there is RISH harmonization, but it seems to have a number of issues in actual implementation.

Second, would really appreciate any advice on pre-post analysis. In fMRI, I would often subtract pre and post scans and run a simpler statistical test on the change maps. I’m not sure this is viable for fixel data. I could set up a monster design matrix with 208 columns for participants, but I’m hoping there might be a slightly simpler approach.


Welcome Colin!

Firstly, I’m wondering if anyone would like to posit a suggestion for combining multi-site data here?

Depends on what varies between sites. In other instances where this question has been posed, the “difference in sites” was a red herring, because there was a difference in acquisition protocol underlying it that necessitated earlier intervention. Anything that could invalidate the aggregation of response functions across subjects between sites needs to be looked at: TE, TR, b-values, anything of that sort.

Further downstream, I would personally err toward handling the complexity of the data appropriately within the statistical analysis rather than attempting direct harmonization of the data itself. Not that I’m an expert on the latter, or that it doesn’t have valid applications; I just have an instinctual hesitancy here. The simplest approach is to add columns to the design matrix to regress out any constant offset in quantitative values between scanners. Then, if there is some reasonable a priori reason why the variance in the data across participants may differ between scanners, you can use variance groups to instruct the GLM to compute residual variances separately per site.
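To make that concrete, here is a minimal numpy sketch of what such a design could look like, assuming two sites and a hypothetical 6-subject sample (the subject counts and file names are placeholders, not from your data). The variance-group file is the per-row integer labelling that, if I recall correctly, the MRtrix3 GLM commands accept for heteroscedastic variance estimation:

```python
import numpy as np

# Hypothetical example: 6 subjects, 4 from site A and 2 from site B.
n_site_a, n_site_b = 4, 2
n = n_site_a + n_site_b

# Column 0: global intercept (the effect of interest in a one-sample test).
# Column 1: site indicator, regressing out a constant offset between scanners.
intercept = np.ones(n)
site = np.concatenate([np.zeros(n_site_a), np.ones(n_site_b)])
design = np.column_stack([intercept, site])

# Variance groups: one integer label per subject row, telling the GLM
# to estimate residual variance separately per site.
variance_groups = np.concatenate(
    [np.full(n_site_a, 1), np.full(n_site_b, 2)]
).astype(int)

# Write out in the plain-text format the stats commands expect.
np.savetxt("design_matrix.txt", design, fmt="%d")
np.savetxt("variance_groups.txt", variance_groups, fmt="%d")
```

The key point is that the site column only soaks up a constant between-scanner offset; any site-by-variance difference is handled by the separate variance-group labels, not by adding more regressors.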

Second, would really appreciate any advice on pre-post analysis.

Subtraction of pre and post scans is a fairly conventional strategy for longitudinal analysis, and can be done for fixel data just as it is for voxel data. Personally I advocate computing a rate of change over time rather than a raw difference between the two time points, regardless of the time between acquisitions. Generating the null distribution through non-parametric shuffling requires knowing what can be shuffled within the data: in this case, in the absence of any change over time, one expects the residuals of the model to be both independent of one another and symmetric about zero, and the model is informed of this via the command-line option -errors ise (or -errors both if applicable to your model; I don’t have enough information here to know). The problem with the monster design matrix strategy is that you can’t include nuisance regressors that take the same value for a participant at both pre and post scans, as their effect can’t be separated from the subject mean.
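As a small illustration of the rate-of-change idea, here is a numpy sketch with made-up per-fixel values for two subjects (all numbers and the inter-scan intervals are hypothetical; in practice you would do the subtraction on the fixel data files themselves):

```python
import numpy as np

# Hypothetical per-fixel metric values: rows are subjects, columns are fixels.
pre = np.array([[0.50, 0.42, 0.61],
                [0.47, 0.40, 0.58]])
post = np.array([[0.52, 0.39, 0.66],
                 [0.45, 0.41, 0.60]])

# Assumed time between scans per subject, in years; intervals often differ,
# which is exactly why a raw difference is not comparable across subjects.
interval_years = np.array([1.0, 2.0])

# Rate of change per subject per fixel, rather than a raw pre-post difference.
rate = (post - pre) / interval_years[:, None]

# A one-sample test on these change maps then needs only an intercept column;
# sign-flipping of the residuals (the "ise" assumption) generates the null.
design = np.ones((rate.shape[0], 1))
```

Dividing by the interval makes subjects with different scan gaps comparable, and the resulting one-column design sidesteps the per-participant columns entirely.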