Response function for group analysis



Dear Mrtrix experts,

I would like to create an atlas from a group of N controls. All the acquisitions have the same parameters (number of directions, b-0 value, etc…).
Do I need to estimate a response function for each subject with the dwi2response command and then run CSD with the N different response functions? Or is it better to use the same average response function for everyone (and is there a command to average the response functions)?

Thanks in advance!
Best regards


Taken from the docs:

…using the same response function when estimating FOD images for all subjects enables differences in the intra-axonal volume (and therefore DW signal) across subjects to be detected as differences in the FOD amplitude (the AFD). To ensure the response function is representative of your study population, a group average response function can be computed by first estimating a response function per subject, then averaging with the script:

foreach * : dwi2response tournier IN/dwi_denoised_preproc_bias_norm.mif IN/response.txt
average_response */response.txt ../group_average_response.txt
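
For anyone curious what the averaging step amounts to, here is a minimal Python sketch of a plain element-wise mean over per-subject response files. It assumes each response.txt contains whitespace-separated numbers (one row per b-value shell), possibly with comment lines starting with "#", and that all files have the same shape. The function names are made up for this illustration, and this is not necessarily what the average_response script does internally (it may, for instance, rescale the responses before averaging).

```python
import glob

def parse_response(text):
    """Parse response-function text into a list of rows of floats."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank and comment lines
            continue
        rows.append([float(v) for v in line.split()])
    return rows

def average_responses(responses):
    """Element-wise arithmetic mean of equally shaped responses."""
    if not responses:
        raise ValueError("no responses given")
    shape = [len(row) for row in responses[0]]
    for resp in responses:
        if [len(row) for row in resp] != shape:
            raise ValueError("responses differ in shape")
    n = len(responses)
    return [[sum(resp[i][j] for resp in responses) / n
             for j in range(shape[i])]
            for i in range(len(shape))]

def average_response_files(pattern, out_path):
    """Hypothetical helper: average all files matching pattern, write the result."""
    responses = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            responses.append(parse_response(f.read()))
    mean = average_responses(responses)
    with open(out_path, "w") as f:
        for row in mean:
            f.write(" ".join(repr(v) for v in row) + "\n")
```

With this sketch, a call such as average_response_files("*/response.txt", "../group_average_response.txt") would mirror the second command above, under the stated assumptions about the file format.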



I second that. The manual isn’t very elaborate in explaining why, and “to ensure the response function is representative of your study population” is slightly vague. I reckon you’d also be fine with, for instance, the average response of only your controls. Or even only your patients. Or even just one subject. But the more important point to emphasise is that you want to use just one response (or one set of tissue responses if doing multi-tissue CSD) for all subjects when doing CSD. In a way, the response function is the unit of the FOD that results from CSD: amplitudes of the FOD are expressed in a unit called something like “times your response function”. When doing any subsequent quantitative analysis across your subjects, it’s important that their FODs are expressed in the same units. You can’t compare apples and oranges!


Dear Thijs and Max,

Thanks for this perfect explanation of why the response function should be averaged. My turn to second Felix (friend and French colleague) on problems that we encountered when using the same response in a group of subjects.
In our experience (mainly focused on diffusion gradient scheme and TWI adaptation for computing cranial and peripheral nerve atlases), using an average response text file instead of the one obtained from each individual dataset gave the tractogram a “denoised” appearance. We saw this on visual analysis of the TWI map, whatever the contrast type (FOD amplitude, length…) and whether or not super-resolution was used.

In other words, small distal tracts or nerves could “disappear” when the response function is averaged, which could be problematic at the group level, depending on the disease model.
I assume this could be less important for brain white matter fascicles… but any help would be appreciated :)

Best, Arnaud


A bit late to chip in with my 2 cents, but better late than never, I guess…

OK, what this sounds like to me, is that the FODs might have ended up scaled differently in different subjects, so that the effective threshold on tracking varies - this would indeed ‘hide’ smaller amplitude, more minor tracts in those subjects where the FODs are smaller than they would be when using the subject-specific response. Conversely, I’d expect messier tracking in those subjects where the FODs ended up larger than expected.
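
To illustrate the thresholding effect, here is a toy Python sketch. The lobe names, amplitudes, scale factors and the cutoff of 0.1 are all invented for the example, and a real tracking algorithm does far more than compare one number against a cutoff; the point is only that a fixed cutoff applied to differently scaled FODs keeps different lobes.

```python
def surviving_lobes(amplitudes, scale, cutoff=0.1):
    """Names of FOD lobes whose scaled amplitude still reaches the cutoff."""
    return [name for name, amp in amplitudes.items() if amp * scale >= cutoff]

# Made-up lobe amplitudes, as obtained with a correctly scaled response.
lobes = {"major bundle": 0.40, "minor tract": 0.12, "noise lobe": 0.08}

print(surviving_lobes(lobes, scale=1.0))  # correctly scaled FODs
print(surviving_lobes(lobes, scale=0.7))  # FODs smaller than they should be
print(surviving_lobes(lobes, scale=1.5))  # FODs larger than they should be
```

In this toy setting, the minor tract drops below the cutoff when the FODs are scaled down, and the noise lobe creeps above it when they are scaled up.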

In my experience, the response is remarkably stable across subjects (assuming the same acquisition protocol, particularly the b-value, is used). What does change is the data scaling: that is determined by the coil loading, the scanner’s calibration, internal FFT scaling, etc. When using the subject-specific response, these global differences in scaling are inherently accounted for, since the response is derived from the same data, and ends up scaled to the same extent. If however this response is then averaged and used to process the same subjects without any attempt at adjusting the scaling in each subject’s raw data, then this will introduce differences in the scaling of the output FODs. This is the same issue that needs to be accounted for in fixel-based analyses, and is a sufficiently important topic that it has its own page in the documentation.
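
As a rough sketch of that argument, the Python toy model below reduces CSD to a single division (FOD amplitude = signal / response), which is of course a gross simplification. The per-subject gains, the response amplitude and the FOD amplitude are invented numbers, and the gain is assumed to be known exactly, whereas in practice it would have to be estimated from the data by some intensity normalisation step.

```python
def toy_fod(signal, response):
    """Toy 'deconvolution': FOD amplitude in units of the response used."""
    return signal / response

def mean(values):
    values = list(values)
    return sum(values) / len(values)

TRUE_RESPONSE = 500.0  # hypothetical single-fibre response amplitude
TRUE_FOD = 0.3         # same underlying FOD amplitude in every subject
gains = {"sub-01": 0.8, "sub-02": 1.0, "sub-03": 1.2}  # made-up global scalings

# Raw data and subject-specific responses both carry the subject's gain.
signals = {s: g * TRUE_RESPONSE * TRUE_FOD for s, g in gains.items()}
own_response = {s: g * TRUE_RESPONSE for s, g in gains.items()}

# 1) Subject-specific response: the gain cancels out.
fod_own = {s: toy_fod(signals[s], own_response[s]) for s in gains}

# 2) Group-average response on unnormalised data: the gain leaks into the FOD.
group_raw = mean(own_response.values())
fod_group_raw = {s: toy_fod(signals[s], group_raw) for s in gains}

# 3) Normalise intensities first, then average the responses and deconvolve.
norm_signals = {s: signals[s] / gains[s] for s in gains}
norm_response = {s: own_response[s] / gains[s] for s in gains}
group_norm = mean(norm_response.values())
fod_group_norm = {s: toy_fod(norm_signals[s], group_norm) for s in gains}

for s in gains:
    print(s, round(fod_own[s], 3), round(fod_group_raw[s], 3),
          round(fod_group_norm[s], 3))
```

In this sketch the first and third columns come out the same for every subject, while the second varies in proportion to each subject's gain, which is the behaviour described above.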

If you were already performing some kind of subject-wise global intensity normalisation, then your experience is unexpected, and I’d like to figure out what the problem might be. But first we’d need to rule out the much simpler explanation above…

That sounds very interesting! I look forward to the results - will this be made available at some point…?