Preprocessing for AFD estimation

Hi MRtrix experts,
I have a few basic questions on preprocessing of diffusion data for AFD estimation and comparison across populations. As described in the original AFD paper, after correcting for motion and eddy current distortion artifacts, a multiplicative bias field has to be estimated and applied to the entire dataset. After that, the dataset should be normalized by the median intensity of WM voxels within the b=0 image, say medb0val. Once these steps are performed, a common response function has to be created by averaging those from the single subjects, and used to fit the CSD model.
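For concreteness, this is the normalization step I mean, in toy numpy terms (medb0val and all array names here are just illustrative, not MRtrix code):

```python
import numpy as np

def normalise_by_median_b0(dwi, b0, wm_mask):
    """Scale the entire DWI dataset by the median b=0 intensity
    within a white-matter mask (illustrative helper)."""
    medb0val = np.median(b0[wm_mask])
    return dwi / medb0val

# toy data: a 2x2x1 volume with 3 DW directions
b0 = np.array([[[100.0], [200.0]], [[300.0], [400.0]]])
wm_mask = np.ones_like(b0, dtype=bool)
dwi = np.stack([b0 * 0.5] * 3, axis=-1)

norm = normalise_by_median_b0(dwi, b0, wm_mask)
```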

I have two simple questions:

  1. Should the bias field (estimated using FSL's fast command) be applied to the b=0 images as well?
  2. Should the division by medb0val be applied to the diffusion-weighted volumes only, or to the b=0 volumes as well? In the latter case, should I expect to see differences when estimating tensor-based parameters (e.g., FA) with respect to a dataset on which such normalization is not performed?

Best Regards,

Hi Alessandro,

Yes, this is what we normally do anyway. It probably does not matter if you are only performing CSD with a single shell (since the b0 is not used). But if you are performing multi-tissue multi-shell CSD or tensor analysis then it is important.

We normally apply it to the entire image (i.e. b=0 and DW volumes). If everything is scaled together then there should be no effect on the tensor or derived metrics.
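Since diffusivities are computed from the ratio of DW to b=0 signal, a global scale applied to all volumes simply cancels. A one-line sanity check with toy numbers:

```python
import numpy as np

b = 1000.0          # s/mm^2
S0, S = 1000.0, 400.0

# ADC from a single measurement: ln(S0/S) / b
adc = np.log(S0 / S) / b

# apply the same global scale (e.g. division by medb0val) to both volumes
scale = 0.0125
adc_scaled = np.log((S0 * scale) / (S * scale)) / b

# identical: the scale cancels in the ratio
assert np.isclose(adc, adc_scaled)
```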

Just in case you have not already seen it, here are instructions for preprocessing and fixel-based analysis using MRtrix. The dwibiascorrect script will convert your mif images to NIfTI, then call either N4 (ANTs) or FSL to estimate the field on the b=0 volumes, then apply it to all volumes and convert the result back to mif (with gradient directions still included). The dwiintensitynorm script normalises using the median in a conservative WM mask, which is estimated by generating an FA population template, thresholding it, then warping the mask back into each subject’s space. I’ve been thinking about changing this script to be more robust and faster when large populations (>30 subjects) are involved, so you may want to keep an eye out for future updates.



I’m just wondering why the DWI volumes are not typically normalized voxel-wise by the b=0 image instead of by a median intensity value? This would also allow comparison between subjects, and would even make the bias field correction unnecessary (as long as the correction is multiplicative)? Does anybody know why this is commonly done for the tensor model but not for (any?) HARDI models? And I am also curious whether this would negatively influence the AFD measure?


I’m assuming you’re already familiar with the reasons for not normalising to the b=0 voxel-wise (also discussed in some detail on this old thread). As to why we might not want to normalise to the mean b=0 intensity: ideally we want to normalise to the true intensity of typical, healthy, pure white matter. The median is less sensitive to outliers, so it is less affected by the presence of partial volume effects, T2 variations, etc. that might otherwise influence the value you get. We further restrict this to a mask of white matter voxels to avoid undue influence from other tissue types altogether. If we were to use the mean b=0 signal within a whole-brain mask, and the pathology also happened to cause brain atrophy and/or ventricular dilatation, this would badly bias the mean intensity: given how much larger the CSF signal is with typical acquisition protocols, the mean would be inflated simply because the number of CSF voxels is larger, even if the white matter signal were otherwise identical.
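To put some numbers on that robustness argument, a toy numpy sketch with simulated b=0 intensities (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

wm = rng.normal(100.0, 5.0, 1000)    # typical WM b=0 intensities
csf = rng.normal(400.0, 20.0, 200)   # CSF is much brighter at b=0
brain = np.concatenate([wm, csf])

# the whole-brain mean is dragged far above the true WM value by CSF,
# while the median within the WM mask stays near the WM intensity
whole_brain_mean = brain.mean()      # ~150, badly biased
wm_median = np.median(wm)            # ~100, robust
```

Adding more CSF voxels (atrophy, ventricular dilatation) would inflate the whole-brain mean further, while leaving the WM median essentially unchanged.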

So that’s the logic, hopefully that answers your question?

Dear experts,
to follow-up this normalization issue:

Does dwi2tensor also not normalise voxel-wise to the b=0?
It seems to me that dwi2tensor does not normalise, in contrast to dtifit in FSL. When I look at, e.g., the axial diffusivity image derived by dwi2tensor and tensor2metric, in comparison to the axial diffusivity derived by dtifit, they seem to have different contrast.
In AD images from tensor2metric, CSF regions have much greater intensity than white matter, whereas in the AD image derived by FSL’s dtifit the difference between the CSF and white matter signal is much lower.

Therefore, if one wants to use tensor metrics derived by tensor2metric (the only possible exception being FA, which is a ratio by definition), bias field correction and between-subject normalization are also necessary (in contrast to FSL’s dtifit), am I correct?

For an old-style TBSS analysis of tensor metrics, would you suggest using voxel-wise b=0 normalized metrics, or non-normalized ones?

No, DTI inherently assumes normalisation to the b=0, it would make no sense at all if this wasn’t done. dwi2tensor definitely does include the b=0 intensity in the calculation.
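To see how the b=0 enters the fit: in the log-linear formulation, ln(S0) is itself one of the fitted parameters, so the b=0 volumes are simply extra rows in the design matrix. A toy isotropic (single-ADC) sketch, not the actual dwi2tensor code:

```python
import numpy as np

# measurements at b=0 and two non-zero b-values (mono-exponential toy signal)
bvals = np.array([0.0, 0.0, 1000.0, 2500.0])
true_S0, true_adc = 1000.0, 7e-4
S = true_S0 * np.exp(-bvals * true_adc)

# design matrix: the column of ones fits ln(S0), the -b column fits the ADC
X = np.column_stack([np.ones_like(bvals), -bvals])
coef, *_ = np.linalg.lstsq(X, np.log(S), rcond=None)
ln_S0, adc = coef
```

The b=0 rows constrain ln(S0), and every DW signal is effectively compared against it, i.e. the normalisation is built into the model rather than applied to the data.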

As to why you might find differences between dtifit and dwi2tensor, there are many possible reasons for this, since tensor fitting can be done many different ways. I’m not familiar with how dtifit works, but I have also noted some subtle differences between its output and that of dwi2tensor - but nothing as large as what you suggest.

The first thing to verify is that your version of dwi2tensor is up to date - there was a bug in the code that was fixed back in April 2016. That could potentially introduce large errors.

Otherwise, dwi2tensor relies on an iteratively reweighted least-squares fit, as described in this paper. This only accounts for the expected variation in the residuals due to the log-linear transform. Other ways of fitting the tensor model include:

  • simple log-linear least-squares
  • non-linear least-squares
  • constrained least-squares to ensure positivity of eigenvalues
  • Rician-bias corrected least-squares
  • weighted least-squares
  • iteratively reweighted least-squares with outlier rejection
  • any other combination of the above

So I’m not sure what the issue is exactly, but it will depend on the exact algorithm used in both implementations, and on the details of your acquisition, in particular the b-value and whether it’s single- or multi-shell. Maybe if you provide this information, along with some screenshots to demonstrate the problem, we could give more informed feedback on this.
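To illustrate what the weighting does: OLS on log-transformed signals treats all measurements equally, while WLS weights by S² to compensate for the log transform inflating the noise at low signal. A minimal numpy sketch with toy mono-exponential data (not the actual dwi2tensor or dtifit implementations):

```python
import numpy as np

rng = np.random.default_rng(1)

bvals = np.repeat([0.0, 1000.0, 2500.0], 20)
S0, adc = 1000.0, 7e-4
S = S0 * np.exp(-bvals * adc) + rng.normal(0, 15.0, bvals.size)

X = np.column_stack([np.ones_like(bvals), -bvals])
y = np.log(S)

# ordinary least squares: every log-signal counts equally
ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# weighted least squares: weights S^2 down-weight the noisy,
# low-intensity (high-b) measurements
W = S**2
A = X.T @ (X * W[:, None])
rhs = X.T @ (W * y)
wls = np.linalg.solve(A, rhs)
```

An iteratively reweighted scheme would then replace S in the weights with the signal predicted from the current fit, and repeat until convergence.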

Dear Donald,

at that moment I did not understand this implicit b=0 normalization in the DTI model (since the diffusion tensor by definition models signal attenuation), my fault.

My installation is from 3 January 2017, so it should be OK.

My dataset is multishell, b=0 (12 directions), b=1000 (64 directions) and b=2500 (64 directions).
I have compared output from dwi2tensor and dtifit for various scenarios:

  1. using only b=1000 shell
  2. using only b=2500 shell
  3. using both b=1000 and 2500 shells

For dtifit I used both ordinary (OLS) and weighted least squares (WLS) estimation.

My example metric is axial diffusivity.

1. using only b=1000 shell.

In this case the results are virtually identical.


dtifit weighted least squares:

dtifit ordinary least squares:

2. using only b=2500 shell.

  • The signal of CSF is higher for dwi2tensor in contrast to both methods in dtifit.
  • The variance of CSF signal seems to be higher with occasional outlying values in dwi2tensor.
  • The signal of anisotropic WM is slightly higher for dwi2tensor and dtifit WLS in comparison to dtifit OLS.


dtifit weighted least squares:

dtifit ordinary least squares:

3. using both 1000 and 2500 shells.

The difference between methods is largest here:

The dwi2tensor image appears very similar to the b=1000 image, with almost identical CSF signal, only with slightly higher GM/WM values than at b=1000.
The dtifit WLS has slightly lower CSF signal in comparison to dwi2tensor.
The intensity of CSF in dtifit WLS is somewhere between its single-shell alternatives at b=1000 and b=2500.
The dtifit OLS has a similar appearance to dtifit OLS at b=2500 (in contrast to dwi2tensor), with lower overall contrast. It has overall lower intensity in comparison to dtifit WLS and dwi2tensor.


dtifit weighted least squares:

dtifit ordinary least squares:

Therefore, with low b-values and a single-shell scheme the difference between methods is small, while with high b-values and a multi-shell scheme it becomes large.

Could you please comment on the following?

  • Does the divergence of results with the multi-shell scheme indicate that the tensor model assumptions are breaking down at higher b-values (non-exponential decay)?
  • What do you recommend for estimating tensor values with these data? Is there any benefit to estimating tensor values from multi-shell data? Would it be better in this case to discard the high-b data and estimate tensor values from the low-b data only?

I also had a quick look at FA, where I would not expect much difference. Indeed, the contrast of the images looks quite similar for all b-values and all methods.
However, I noticed that for b=1000 in dwi2tensor there are voxels with values greater than 1:

Does it not indicate some persisting problem with tensor estimation in dwi2tensor?


OK, assuming we’re looking at axial diffusivity maps, everything matches with what I’d expect here. DTI has historically been performed using single-shell b=1000s/mm² acquisitions, so it’s reassuring that everything matches in that case.

The issue at high b-values is that the CSF signal will effectively be pure noise - and with magnitude reconstruction, it’ll have a non-zero value on average (the Rician bias). So the ADC values in that regime are going to be very dependent on how these are handled. Using a single shell at b=2500s/mm², I’d expect the CSF axial diffusivity to be essentially meaningless (noise-driven). For the WM, the differences will relate to the differential weighting given to the signals based on their actual (dtifit with WLS) or predicted (dwi2tensor) values, to compensate for the increase in the variance of the low signals due to the log transform. So the fact that OLS performs differently here matches that expectation.
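The Rician bias itself is easy to demonstrate numerically: with zero underlying signal, the magnitude of complex Gaussian noise follows a Rayleigh distribution with mean σ√(π/2) rather than zero. A quick toy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

sigma = 10.0
n = 100_000

# zero true signal + complex Gaussian noise, magnitude reconstruction
noise = rng.normal(0, sigma, n) + 1j * rng.normal(0, sigma, n)
mag = np.abs(noise)

# the mean magnitude is sigma * sqrt(pi/2), i.e. nonzero despite
# the true signal being zero - this is the floor the CSF signal
# sits on at high b-values
rician_floor = mag.mean()
```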

The handling of multi-shell data is however a more recent problem. The fact that dwi2tensor provides similar results to the single-shell b=1000s/mm² case is reassuring to some extent - although it’s not clear that there is a ‘right’ answer here. The problem is that the single tensor model breaks down when looking at multiple b-values (in fact, this is the entire premise behind diffusion kurtosis imaging), so these fits will never be perfect. With dwi2tensor and dtifit with WLS, the fit is inherently weighted towards the lower b-values, so it’s not surprising these look most like the b=1000s/mm² case. Moreover, dwi2tensor updates the weights iteratively based on the predicted signal from the current fit, which will tend to reduce the weights for the high b-values further since these will have lower predictions than their actual measurements (as expected from the Rician bias, but also because the signal decay is not actually mono-exponential), so its fit will be even closer to the b=1000s/mm² case.

So to address your specific questions:

Yes - see my previous comments.

Ahem… I generally don’t recommend to estimate any tensor values… Pretty much my whole career has been focused on highlighting the deficiencies of that model and coming up with better methods. I’m just the wrong person to ask here.

It very much depends what you want to do. In general, it only exposes the limitations of the method, as you’ve yourself witnessed. If you are going to do tensor-based analyses, it’s much more important to ensure reproducibility of the measures, and therefore enforce consistency of the acquisition. Besides the fact that the values will be dependent on the details of the acquisition and the particular reconstruction method used, the values themselves are very difficult to interpret reliably from a biological point of view (too many issues to go into here, but maybe I can promote one of my review articles here :wink:).

Maybe, if only for consistency with the literature (which is overwhelmingly single-shell b=1000s/mm²). But again, I’m probably the wrong person to ask, so I recommend you seek someone else’s opinion if you really want to do diffusion tensor imaging. More than happy to advise on higher-order analyses though.

Yes, dwi2tensor does not enforce any constraints on the positivity of the eigenvalues, while other fitting algorithms might. These FA>1 voxels will result from negative eigenvalues. And these tend to occur where the b=0 signal is lower than the highest DW signal, which in turn tends to be caused by Gibbs ringing in the b=0 image in the areas bordering on CSF - exactly the regions you’re highlighting. I’m not sure there’s necessarily any need to worry about this, but it will depend on what you plan on using these values for - it will certainly have no noticeable impact on tractography (but then hopefully you’re not using the tensor model for tractography :slight_smile:). There are methods out there for removal of Gibbs ringing artefacts (I think there are plans to include one of these in MRtrix3 in the near future - @bjeurissen may be able to provide details here), but otherwise we certainly have no plans to introduce a positive-definite constraint to dwi2tensor.
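To see how a negative eigenvalue pushes FA above 1, here’s the standard FA formula applied to toy eigenvalue triplets (illustrative values only):

```python
import numpy as np

def fa(evals):
    """Fractional anisotropy from the three tensor eigenvalues."""
    evals = np.asarray(evals, dtype=float)
    md = evals.mean()
    return np.sqrt(1.5 * np.sum((evals - md) ** 2) / np.sum(evals ** 2))

# a plausible anisotropic WM tensor: FA stays below 1
fa_wm = fa([1.7e-3, 0.3e-3, 0.2e-3])

# an unconstrained fit returning a negative eigenvalue: FA exceeds 1
fa_bad = fa([1.0e-3, 0.1e-3, -0.3e-3])
```

With all eigenvalues non-negative the formula is bounded by 1; a negative eigenvalue breaks that bound, which is exactly what a positivity constraint would prevent.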

Note that, for multi-shell data (at least 3 distinct b-values, including b=0), you can also do a DKI fit with dwi2tensor, if you explicitly specify the -dkt option (and hence ask for the kurtosis tensor too). It’s maybe not obvious from the dwi2tensor documentation that this option actually switches the fit to the kurtosis model (beyond just producing the kurtosis tensor output)…

(maybe we should have dwi2tensor work with an algorithm parameter too, just like dwi2fod)
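For reference, along a single direction the kurtosis model extends the mono-exponential decay as ln S(b) = ln S0 − bD + (1/6)b²D²K, which is why at least three distinct b-values (including b=0) are needed. A toy numpy sketch of recovering D and K along one direction (not the actual dwi2tensor fit, which estimates the full tensors):

```python
import numpy as np

# simulate the kurtosis signal model along one direction
S0, D, K = 1000.0, 1.0e-3, 0.8
bvals = np.array([0.0, 1000.0, 2500.0])
S = S0 * np.exp(-bvals * D + (bvals * D) ** 2 * K / 6.0)

# fit ln S = ln(S0) - b*D + (b^2 / 6) * (D^2 K):
# three unknowns, three b-values, so an exact solve
X = np.column_stack([np.ones_like(bvals), -bvals, bvals**2 / 6.0])
ln_S0, D_fit, D2K = np.linalg.solve(X, np.log(S))
K_fit = D2K / D_fit**2
```

With only two distinct b-values the quadratic term is not identifiable, which is the multi-shell requirement in a nutshell.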

Dear @jdtournier and @ThijsDhollander,

thank you for your invaluable comments. Just to clarify the primary reasons for my interest in DTI estimation: I was planning to use it for TBSS. Ultimately, I would like to run a fixel-based analysis, but since my design is repeated measures, I currently cannot use fixelcfestats, so I would have trouble running a proper statistical analysis. Another reason is that my collaborators are asking me to run TBSS, since it is quite an established method in the research community (despite its limitations and pitfalls) and its use was also specified in the grant project which funded this study.

One more question, inspired by the DKI fit proposal of @ThijsDhollander: is it possible to obtain the errors in the estimated DTI and DKI terms? It seems to me that this option is currently not available. If one were interested in evaluating a kurtosis metric (despite the fact that kurtosis metrics are currently not implemented in tensor2metric), it would be beneficial to assess the reliability of the estimated DKI terms.

I suppose you’re referring to the residuals of the fit? In this context, I added a -predicted_signal option to dwi2tensor at some point. You could take the output of that, and compute the difference with the original DWI dataset using mrcalc, or even a sum of squared differences via mrcalc followed by mrmath.
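In numpy terms (with measured and predicted standing in for the DWI data and the -predicted_signal output, loaded as arrays), that amounts to:

```python
import numpy as np

# toy signals: one voxel, three volumes
measured = np.array([1000.0, 480.0, 170.0])
predicted = np.array([995.0, 490.0, 160.0])

residuals = measured - predicted   # the mrcalc -subtract step
sse = np.sum(residuals ** 2)       # square, then sum over volumes (mrmath)
```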

Dear @ThijsDhollander,

actually I meant something more specific. I was thinking of the standard errors of the estimates of the individual tensor coefficients or derived metrics. However, I am not sure whether the concept of standard errors even makes sense in the context of the model used for DTI and DKI estimation.
Also, I was wondering whether it is possible to obtain standard errors of the estimates of the response function coefficients?

No, there’s no such option. You could probably obtain something like this by bootstrapping the data, and looking at the distributions of those coefficients and/or metrics. The concept of standard errors would still make perfect sense, I reckon.
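A minimal residual-bootstrap sketch for the isotropic (single-ADC) case, as a stand-in for the full tensor or kurtosis fit (all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# toy log-linear ADC fit
bvals = np.repeat([0.0, 1000.0, 2500.0], 10)
S = 1000.0 * np.exp(-bvals * 7e-4) + rng.normal(0, 10.0, bvals.size)
X = np.column_stack([np.ones_like(bvals), -bvals])
y = np.log(S)

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef

# refit on resampled residuals; the spread of the refitted ADCs
# approximates the standard error of the estimate
boot = []
for _ in range(500):
    y_star = X @ coef + rng.choice(resid, resid.size, replace=True)
    b_star, *_ = np.linalg.lstsq(X, y_star, rcond=None)
    boot.append(b_star[1])

se_adc = np.std(boot)
```

The same idea extends to any derived metric (FA, AD, kurtosis terms, or response function coefficients): compute the metric per bootstrap replicate and look at its distribution.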