I am relatively new to MRtrix3 and have been working through the protocol by Tahedl et al. (2025) to construct individual structural connectivity matrices. As I understand it, each element in these matrices represents the fibre bundle capacity (FBC) between two brain regions.
I noticed that Smith et al. (2022) emphasize the importance of inter-subject connection density normalisation when comparing such structural connectivity measures across groups. However, the Tahedl et al. protocol does not detail how this normalisation should be implemented.
Could you kindly point me toward the recommended workflow for performing this type of normalisation within the MRtrix3 environment? I’m aware this topic has been discussed in previous posts, but I would like to ensure the workflow aligns with the processing steps of the referenced protocol.
After reviewing recent literature on structural connectome (SC) mapping using MRtrix3, I have observed a general lack of consensus regarding whether and how to perform inter-subject normalization.
Based on Smith et al. (2022) and previous discussions in this forum (e.g., this post), the recommended approach for ensuring comparable SC across subjects appears to be the following:
1. Use a group-average response function for FOD estimation
2. Apply global intensity normalisation with the mtnormalise command
3. Estimate streamline weights using SIFT2 and extract the proportionality coefficient (mu)
4. Generate the SC matrix via tck2connectome, where each element represents the sum of streamline weights
5. Multiply the SC matrix by the proportionality coefficient to obtain the final normalized SC matrix
6. Use the normalized matrix for edge-wise inter-subject analyses, with each element representing fibre bundle capacity (FBC) between two regions
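For concreteness, the steps above might look roughly like the following. This is only a sketch: the filenames, the choice of the dhollander algorithm, and the tractography options are illustrative assumptions on my part, not prescriptions from the protocol, and real use would additionally involve masks, ACT, and so on.

```shell
# 1. Per-subject response functions, averaged into a single common response
for subj in sub-*; do
    dwi2response dhollander $subj/dwi.mif $subj/wm.txt $subj/gm.txt $subj/csf.txt
done
responsemean sub-*/wm.txt group_wm.txt      # likewise for gm and csf

# 2. FOD estimation with the common responses, then mtnormalise (per subject)
dwi2fod msmt_csd dwi.mif group_wm.txt wmfod.mif group_gm.txt gm.mif group_csf.txt csf.mif
mtnormalise wmfod.mif wmfod_norm.mif gm.mif gm_norm.mif csf.mif csf_norm.mif -mask mask.mif

# 3. Tractography, SIFT2 streamline weights, and the proportionality coefficient mu
tckgen wmfod_norm.mif tracks.tck -seed_dynamic wmfod_norm.mif -select 10000000
tcksift2 tracks.tck wmfod_norm.mif weights.txt -out_mu mu.txt

# 4. Connectome of summed streamline weights
tck2connectome tracks.tck nodes.mif connectome.csv -tck_weights_in weights.txt

# 5. Multiply every element by mu (here via awk) to obtain FBC
mu=$(cat mu.txt)
awk -v s=$mu 'BEGIN{FS=OFS=","}{for(i=1;i<=NF;i++)$i*=s;print}' connectome.csv > fbc.csv
```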
If I have followed the Tahedl protocol, my understanding is that the necessary modifications would be:
Using a group-average response function instead of a subject-specific one
Multiplying the resulting SC matrix by the SIFT2 proportionality coefficient to obtain the final normalized matrix
If a common response function is not used, how substantial would the impact be? Using a common response function can be somewhat inflexible in practice. For example, if we initially estimate the response function using all subjects but later exclude some during analysis, the function must be re-estimated. If we instead use only a subset of subjects from the start, the choice of that subset introduces uncertainty, which may affect the reproducibility of the results. If the potential bias introduced by using subject-specific response functions is minimal and does not substantially affect the validity of FBC interpretation, I would prefer to use subject-specific response functions for greater practical flexibility.
As I do not have a strong technical background and find some aspects of Smith et al. (2022) challenging to fully digest, I would greatly appreciate confirmation from the MRtrix3 experts on whether my understanding is correct.
Yes, the inter-subject connection density normalisation was excluded from that manuscript. That may turn out to have been the wrong decision, but its publication had already been disproportionately delayed and we needed to stop making incremental modifications at some point.
Currently, yes, the default procedure would be to use common response functions and then, as long as each subject has DWI data of an identical spatial resolution, simply multiply the resulting matrix by mu. I have long been torn on exactly how best to handle this from a software interface perspective; I think I've landed on a solution in SIFT2: Changes to output streamline weights units by Lestropie · Pull Request #2922 · MRtrix3/mrtrix3 · GitHub, but I've not been able to catch a breath in the last four years or so. Unfortunately there are many ways for it to go wrong, so rather than making things push-button I've tried to impart to the community an understanding of what to do and why. That is currently not working, and I'm uncertain whether I've over-complicated things or am just not as clear a communicator as I've strived to be.
RE common response function:
The requirement is technically a common response function, not an average response function. So if you were to retrospectively remove a subset of subjects, there would not strictly be a need to recompute ODFs. The response functions may no longer be precise numerical averages, but as long as they are representative enough of the remaining cohort, there’s no problem.
Part of my desire to convey the issue comprehensively is to show that use of common response functions is one way to deal with the AFD aspect of the normalisation, but it’s not actually the unique solution. The requirement to generate and utilise group average response functions interrupts the parallelization of connectome construction across participants. It’s technically possible to utilise subject-specific response functions for parallel processing, and then at the point of connectome data aggregation, suitably modulate the connectivity estimates based on the discrepancies in utilised response functions across participants. This is what I realised and implemented all the way back in 2016. But that requires an even deeper understanding of these scaling properties and the nuances of AFD intensity normalisation than the alternative of “utilise common response functions and multiply by mu and trust that that is adequate”.
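To illustrate the flavour of such a post-hoc modulation, and emphatically not the actual implementation referred to above (whose details are not given in this thread), here is a toy sketch. The working assumption, which I am labelling as an assumption rather than established fact, is that AFD-derived connectivity scales approximately inversely with the magnitude of the response function used in deconvolution, so a connectome built with a subject-specific response might be brought toward what a common response would have produced via the ratio of their l=0 amplitudes; all numbers are made up.

```shell
# Hypothetical sketch only; the exact modulation is not specified in the post.
# Assumption: connectivity from a subject-specific response can be rescaled by
# the ratio of that response's l=0 amplitude to the common response's.
r_subject=1200   # toy value: l=0 coefficient of the subject's WM response
r_common=1000    # toy value: l=0 coefficient of the common WM response

printf '10,20\n30,40\n' > connectome.csv   # toy 2x2 connectome matrix

# Scale every element by r_subject / r_common
awk -v a=$r_subject -v b=$r_common 'BEGIN{FS=OFS=","; s=a/b}
    {for(i=1;i<=NF;i++)$i*=s; print}' connectome.csv > connectome_modulated.csv
```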
A follow-up question concerns the spatial resolution of DWI data. As I understand it, if the spatial resolution is consistent across subjects, the “common response function and multiplication by mu” approach should suffice. However, if the resolution varies between subjects, an additional step of multiplying by the voxel volume is necessary.
In such cases, would you recommend resampling the DWI data to a uniform resolution across all subjects before response function estimation, or adjusting for resolution differences after the connectome has been constructed?
I would probably recommend preserving the native spatial resolution of each image acquisition throughout preprocessing and reconstruction, and correcting for that confound within the connectivity quantification. Resampling image data always has nontrivial side-effects, and would introduce a bias into the reconstruction pipelines of different subjects over and above the difference in acquisition. There may be other justifications for resampling images (e.g. upsampling prior to model fitting can perform better than interpolation of model parameters), but given that the resolution difference is handled here with a mere scalar multiplication of the final derivative data, I don't think resampling just to circumvent that scaling is justified.
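That scalar correction can be sketched as follows: multiply each element of the tck2connectome output by mu and by the subject's voxel volume. The numbers here are toy values purely for illustration; in practice mu would come from `tcksift2 -out_mu` and the voxel dimensions from `mrinfo dwi.mif -spacing`.

```shell
# Toy values standing in for real per-subject quantities
mu=0.05            # proportionality coefficient from tcksift2 -out_mu
voxel_volume=8     # e.g. a 2 x 2 x 2 mm acquisition

printf '10,20\n30,40\n' > connectome.csv   # toy 2x2 matrix from tck2connectome

# Scale every element by mu * voxel volume so that FBC remains comparable
# across subjects acquired at different spatial resolutions
awk -v m=$mu -v v=$voxel_volume 'BEGIN{FS=OFS=","; s=m*v}
    {for(i=1;i<=NF;i++)$i*=s; print}' connectome.csv > connectome_scaled.csv
```

For a cohort acquired at a single common resolution the voxel-volume factor is identical for everyone and can simply be dropped, leaving the multiplication by mu alone.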