Advice on response function handling in a multi-center study

There is some precedent/advice in this post, which points to this multi-center fixel-based analysis (FBA) by Steven Meisler, where response functions were averaged within site:

Response functions for white matter, gray matter, and CSF were estimated with MRtrix3’s unsupervised dhollander algorithm (Dhollander et al., 2016; Dhollander et al., 2019). For each tissue compartment, site-specific average fiber response functions were calculated across participants (Raffelt et al., 2012b), which enable valid inter-subject comparisons while controlling for scanner differences across sites (Smith et al., 2022).

(Here it is in the code)

I couldn’t find any other advice on this, but averaging within site/scanner makes sense to me. Here’s my thinking, rambling a bit. Why do we average response functions in the first place? Because we want to apply the same spherical deconvolution operator to every subject’s image. That way, intersubject differences observed in FODs can be attributed entirely to what we actually measure: the diffusion-weighted MR signal. With just one study site, we hope that the DWI signal reflects the character of the underlying diffusion process, so that intersubject differences in FODs characterize differences in the underlying diffusion process and nothing else.

Now, if there are multiple sites/scanners, we lose that perk no matter what: whether or not we use the same response function, intersubject differences between subjects from different sites/scanners could be attributed to site/scanner effects. So we might as well estimate separate response functions. Basically, if we expect the signal response to differ between two image acquisitions for the same underlying diffusion (e.g. different sites/scanners), then we should estimate response functions separately when applying CSD to those acquisitions. Using different response functions for different sites/scanners is a form of data harmonization, and in a multi-center study the downstream computed measures would probably need further harmonization anyway.
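
To make the "average within site" idea concrete, here is a minimal Python sketch of the grouping step. It is not the code from the study quoted above. It assumes each subject's response function has already been loaded as a 2D array (one row per b-value shell, one column per SH coefficient), and that all subjects within a site have the same number of shells and coefficients. The function name `average_responses_by_site` and the `site_of` mapping are made up for illustration. It takes a plain arithmetic mean; in practice you would more likely run MRtrix3's `responsemean` once per site, and that command may scale the responses differently from a plain mean.

```python
from collections import defaultdict

import numpy as np


def average_responses_by_site(responses, site_of):
    """Average response functions separately within each site.

    responses: dict mapping subject ID -> 2D array-like (shells x SH coefficients)
    site_of:   dict mapping subject ID -> site/scanner label (hypothetical input)
    Returns a dict mapping site label -> mean response (2D numpy array).
    """
    grouped = defaultdict(list)
    for subject, coeffs in responses.items():
        grouped[site_of[subject]].append(np.asarray(coeffs, dtype=float))

    # Plain arithmetic mean over subjects; assumes identical shapes within a site.
    return {site: np.mean(np.stack(arrays), axis=0)
            for site, arrays in grouped.items()}


# Toy example: two subjects at site A, one at site B (numbers are made up).
toy_responses = {
    "sub-01": [[1.0, 0.0], [2.0, -1.0]],
    "sub-02": [[3.0, 0.0], [4.0, -3.0]],
    "sub-03": [[10.0, 0.0], [20.0, -5.0]],
}
toy_sites = {"sub-01": "A", "sub-02": "A", "sub-03": "B"}
site_means = average_responses_by_site(toy_responses, toy_sites)
```

Each subject would then be deconvolved with the averaged response for their own site, so the deconvolution operator is shared within a site but differs between sites.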