Mtnormalise huge variation


Dear experts,

I ran mtnormalise on my data, and I observed a lot of variation:


The wmfod_norm images look really good, but some of the wmfod images are completely out of scale. Could this be due to the bias field correction stage?

I followed the steps of the multi-shell multi-tissue fixel framework, but using only the WM and CSF response functions.

I think there is something wrong with this, or at least something strange, but I don’t know the reason. Any clue? Thanks in advance.

Best regards,



Hi Manuel,

Nothing wrong here, based on what you report; especially if…

That actually shows mtnormalise did a splendid job! :sunglasses:

As to why those scales were so off to begin with (and mtnormalise then corrected that), that is realistically one of 2 things:

  1. Your raw data showed such huge variations; this can sometimes happen due to how the scanner decides to save the data. I’ve seen cases where a min-max or something determines this, and even a single outlier voxel hence causes this. That’s in principle not a problem, since T2-weighted and other …-weighted images don’t come with a unit.

  2. dwibiascorrect caused this. Again no worries in principle, same reason as above: the solution is determined up to a constant factor. In the end, the only “impact” of this is that certain subjects’ estimated response functions will have had a greater weight in computing the average response. I’ve thought about this long ago, and people have suggested to do some renormalisation of the responses before averaging for this reason; however, there’s no important need for this, as the responses vary very little (in shape/contrast) to begin with. There’s actually not even a reason for why you have to average them all; it’s just the most elegant solution in the absence of arguments for another one. I do have another alternative in mind, which I may at some point suggest in writing here or there; but again, to be honest: this won’t have any noticeable impact on your study.
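To make the weighting effect in point 2 concrete, here is a minimal toy sketch. All numbers are invented, each "response" is reduced to three amplitudes (real response functions are per-shell SH coefficients), and dividing by the first entry is just one assumed way of renormalising, not something any MRtrix3 command is claimed to do.

```python
# Toy illustration only: the subject names and values below are hypothetical.
responses = {
    "sub-01": [1000.0, 600.0, 400.0],  # stored at a large intensity scale
    "sub-02": [10.0, 5.0, 3.0],        # slightly different shape, much smaller scale
}

def mean_response(resps):
    """Element-wise mean across subjects."""
    return [sum(vals) / len(vals) for vals in zip(*resps)]

def unit_scale(resp):
    """Divide by the first entry so only the shape remains (an assumed choice)."""
    return [v / resp[0] for v in resp]

# Plain average: the large-scale subject dominates the resulting shape.
plain_mean = mean_response(list(responses.values()))
plain_shape = unit_scale(plain_mean)

# Renormalise each subject first: both shapes now contribute equally.
renormalised_mean = mean_response([unit_scale(r) for r in responses.values()])

print("shape of plain average:     ", [round(v, 3) for v in plain_shape])
print("shape after renormalisation:", [round(v, 3) for v in renormalised_mean])
```

In this toy case the plain average has almost exactly the shape of sub-01, whereas the renormalised average sits halfway between the two subjects. When the shapes are nearly identical across subjects, as described above, the two averages barely differ.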

So well, while going a bit off topic in point 2, essentially: I’m not surprised, and I’ve seen this more often. Actually, in the multi-tissue pipeline, dwibiascorrect is entirely optional (and at some point in the docs history, it was not even included in that pipeline), as mtnormalise does everything that needs to be done in terms of spatially varying as well as global intensity normalisation (and does it much better than anything that is only driven by the b=0 images). I’m currently doing a relatively in-depth revision of the FBA pipeline documentation, and adding important notes and warnings based on all FBA projects I’ve been involved in so far; one of them is exactly this fact that dwibiascorrect is entirely optional. The only reason why it does still sit in the pipeline (or can at least be considered for inclusion) is for better mask estimation at the dwi2mask step, but then again only really in case of severe bias fields in the original data. But we’ve actually encountered cases where even the opposite is true, and dwibiascorrect is the culprit to be blamed for less-than-great masks (often when bias fields were quite limited in presence and “intensity” in the original data). There are definitely several projects where I’ve advised against it, and we’ve actually ended up not including dwibiascorrect in the pipeline.

I hope this all makes some sense. :slightly_smiling_face: I’ll be posting a link to the updated documentation soon, as the process of getting it properly on master is always causing delays. Better to make sure people have access to the most up to date advice as soon as possible!



RE: dwibiascorrect, there are two slightly different points to disentangle, just to make sure that the reason for the observed variations in scaling parameters is understood.

When data from different subjects vary greatly in magnitude:

  1. Some subjects will contribute to the shape of the average response functions more than other subjects. This isn’t ideal, and shouldn’t be too difficult to fix, but shouldn’t actually cause tremendous issues as already stated.

  2. Deconvolution with an average response function will result in large variations in the magnitude of the resulting tissue densities. If the deconvolution algorithm behaves appropriately linearly and does not contain any non-zero constraints, the relative densities of different tissues and the WM FOD shape will not be affected by global scaling of either response function amplitudes or DWI magnitudes; this will just alter the estimated absolute tissue densities.
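Effect 2 can be sketched with a deliberately tiny, unconstrained toy problem. This is not MSMT-CSD: it assumes just two tissue densities (WM and CSF), two measurements, a made-up 2x2 response matrix, and an exact linear solve with no constraints or regularisation.

```python
# Hypothetical response matrix: rows are measurements (b=0, high b),
# columns are tissues (WM, CSF). Values are invented for illustration.
R = [[1.0, 1.0],
     [0.3, 0.01]]

def solve_2x2(a, b):
    """Solve a @ x = b for a 2x2 system using Cramer's rule."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    return [(b[0] * a22 - a12 * b[1]) / det,
            (a11 * b[1] - a21 * b[0]) / det]

# Signal generated from densities WM = 0.7, CSF = 0.3.
signal = [1.0 * 0.7 + 1.0 * 0.3, 0.3 * 0.7 + 0.01 * 0.3]

# Same subject, but stored with a 50x larger global intensity scale.
scale = 50.0
scaled_signal = [scale * s for s in signal]

base = solve_2x2(R, signal)               # reference densities
scaled = solve_2x2(R, scaled_signal)      # fixed (average) response, rescaled data
scaled_R = [[scale * v for v in row] for row in R]
own = solve_2x2(scaled_R, scaled_signal)  # response rescaled together with the data

print("reference densities:         ", base)
print("rescaled data, same response:", scaled)
print("rescaled data and response:  ", own)
```

With a fixed response, the absolute densities scale with the data while the WM-to-CSF ratio is unchanged; when the response carries the same scale as the data, the densities come out the same as in the reference case. Only the relative scaling between data and response matters in this linear setting.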

mtnormalise is producing highly variable scaling factors because it is appropriately correcting effect 2. This could be caused by:

  • Variations in overall signal magnitude between subjects in your original data;

  • (What I suspect is more likely) These differences are being introduced into the DWI data by dwibiascorrect. This seemingly occurs specifically because the N4 algorithm fails to constrain the global scaling of the bias field as it is estimated; so while it corrects for spatial inhomogeneities within each subject, it can also introduce global intensity differences across subjects. This is another effect that could theoretically be explicitly fixed. Given mtnormalise appropriately corrects for this effect subsequently, it’s not a top priority; but then again, fixing it might prevent users from sounding the alarm due to observing these fluctuations in mtnormalise factors in instances where dwibiascorrect has been used.
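The scale ambiguity in the second bullet can be shown with a small sketch. It assumes a purely multiplicative bias model on four made-up voxels of homogeneous tissue; it does not run or reproduce N4, and the factor of 20 is arbitrary.

```python
from statistics import mean, pstdev

# Hypothetical values: homogeneous tissue seen through a multiplicative bias field.
true_signal = [100.0, 100.0, 100.0, 100.0]
bias_field = [0.8, 0.9, 1.1, 1.2]
observed = [s * b for s, b in zip(true_signal, bias_field)]

def correct(image, field_estimate):
    """Divide the image by an estimated bias field, voxel by voxel."""
    return [v / f for v, f in zip(image, field_estimate)]

def coeff_of_variation(image):
    """Spatial inhomogeneity measure: standard deviation relative to the mean."""
    return pstdev(image) / mean(image)

# Two field estimates that differ only by a global factor.
estimate_a = bias_field
estimate_b = [20.0 * f for f in bias_field]

corrected_a = correct(observed, estimate_a)
corrected_b = correct(observed, estimate_b)

print("inhomogeneity before:", round(coeff_of_variation(observed), 3))
print("inhomogeneity after (a, b):",
      coeff_of_variation(corrected_a), coeff_of_variation(corrected_b))
print("mean intensity after (a, b):", mean(corrected_a), mean(corrected_b))
```

Both estimates remove the spatial inhomogeneity equally well, yet the corrected images differ in global intensity by the arbitrary factor. If that factor varies from subject to subject, it shows up as exactly the kind of inter-subject scaling difference that mtnormalise then has to undo.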

Hope that clarifies the distinction between DWI magnitude scaling, and response function scaling due to averaging where DWI magnitude scaling has occurred.



Agree with everything that’s been said here – the use of an average response together with differently scaled raw DWI data means that the relative scaling between the two will differ between data sets, and that will translate into differently scaled fODFs.

However, this statement is not quite true:

This is almost true for the hard constrained MSMT-CSD algorithm (which you can use for single-shell too, as per the docs). The regular CSD approach regularises negative values towards zero, and the strength of the regularisation is determined relative to the expected size of the fODF – i.e. assuming the response is appropriately scaled for the data. If the scaling is completely off, this will effectively result in a different amount of non-negativity regularisation being applied.

Another minor issue is that both versions of the algorithm include a tiny amount of minimum-norm regularisation on the fODF coefficients (to avoid ill-posed situations, particularly with badly distributed directions and/or super-resolution). Again, the strength of the regulariser is determined relative to the expected fODF amplitude.

So if the scaling is off by a huge amount, this could influence the results over and beyond a simple scaling. I expect it would need to be a very large difference to be noticeable, but it is something to bear in mind nonetheless – and it suggests we should try to fix up dwibiascorrect to avoid this (I agree it’s by far the most likely culprit here).



Thanks @ThijsDhollander, @rsmith and @jdtournier for your help.

Regarding this:

so while it corrects for spatial inhomogeneities within each subject, it can also introduce global intensity differences across subjects

Can this affect the response function calculation? I mean, if a subject has huge spatial inhomogeneities, could the response function algorithm select a few affected voxels? If so, then as I understand it, dwibiascorrect (although far from perfect) improves the FOD calculation. And as long as mtnormalise is applied afterwards, the benefit of this step is noticeable.

I have another question related to this: in the ACT framework based on msmt, dwibiascorrect is necessary in order to apply SIFT. Is it advisable to also apply mtnormalise in this context? As each subject is deconvolved with their own response function, the effect should be minimal; is that true?

Again, thanks for your help.

Best regards,