FBA rank deficient matrix

SaraPonticorvo · October 7, 2020, 11:57pm

Hi all,
I’m performing an FBA analysis and I’m at the statistical analysis step with fixelcfestats. In my analysis, I have two groups to be compared and 3 covariates. I wrote the design matrix as follows (for an example of 6 subjects, 3 for each group):

1 1 -9.63 0 101632.23
1 1 3.37 1 68083.234
1 1 -8.63 1 -33305.765
1 -1 -5.63 0 54037.234
1 -1 -3.63 1 23039.234
1 -1 2.37 0 27409.234

After the intercept, the 2nd column is the group, followed by the covariates (age, sex and ICV). The contrast file is
0 1 0 0 0
0 0 1 0 0
to test the group effect and the age effect, respectively.
However, when I run fixelcfestats I get a warning about a rank-deficient matrix, and I cannot understand why.
I would be very grateful if you could help me.

Thanks
Sara

rsmith · October 9, 2020, 1:10am

Welcome Sara!

For reference, the warning arises from this block of code.

The specific matrix that you provide here is quite close to the currently specified threshold for detecting rank-deficiency. For matrices that contain floating-point data, you typically don’t want to use a zero threshold for detecting rank-deficiency, as then you could have two columns that are almost collinear and yet the matrix would still be reported as full rank. However it’s entirely possible that in the case of your data, it’s an overly conservative warning. I would also note that the reason this only appears as a warning and not an error is because statistical inference can in fact proceed despite rank-deficiency; it’s just indicative of some underlying mistake more often than not.

Changing this line to a threshold of 1e-6 results in the QR decomposition reporting the rank as 5 and hence not being rank-deficient.
Normalising the ICV column to zero mean and unit standard deviation results in the design matrix no longer being reported as rank-deficient without necessitating modifying the QR pivot threshold for rank calculation.
I would also note that this change results in the condition number of the matrix reducing from 230,000 to 22, meaning that the estimated beta coefficients will be less susceptible to noise.
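
As a rough way to look at this outside MRtrix3, the sketch below runs NumPy on the six-subject example matrix from the first post. It is an illustration only: NumPy's SVD-based `matrix_rank` and `cond` are not the QR-based test that fixelcfestats applies, so this does not reproduce the warning itself, and the exact condition numbers may differ somewhat from the figures quoted above depending on how the standard deviation is computed. The helper name `zscore_column` is made up for this example.

```python
import numpy as np

# Six-subject example design matrix from the first post.
# Columns: intercept, group, age, sex, ICV.
design = np.array([
    [1,  1, -9.63, 0, 101632.23],
    [1,  1,  3.37, 1,  68083.234],
    [1,  1, -8.63, 1, -33305.765],
    [1, -1, -5.63, 0,  54037.234],
    [1, -1, -3.63, 1,  23039.234],
    [1, -1,  2.37, 0,  27409.234],
])


def zscore_column(matrix, col):
    """Return a copy with one column rescaled to zero mean and unit SD.

    Hypothetical helper for this sketch; uses the sample SD (ddof=1).
    """
    out = matrix.astype(float).copy()
    values = out[:, col]
    out[:, col] = (values - values.mean()) / values.std(ddof=1)
    return out


# Normalise only the ICV column (index 4), as described above.
design_norm = zscore_column(design, col=4)

# Condition number = ratio of largest to smallest singular value.
cond_before = np.linalg.cond(design)
cond_after = np.linalg.cond(design_norm)
rank_after = np.linalg.matrix_rank(design_norm)

print(f"condition number, raw ICV:        {cond_before:.3g}")
print(f"condition number, normalised ICV: {cond_after:.3g}")
print(f"rank after normalisation:         {rank_after}")
```

The raw ICV column is several orders of magnitude larger than the other columns, which is what inflates the condition number; rescaling it brings all columns onto a comparable scale without changing what the model can fit.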

In general it’s a good idea to normalise continuous variables entered into a GLM design matrix. So even if your matrix could in theory be reported as full rank, and even if the way rank deficiency is currently detected might not be the ideal choice, I would nevertheless take these results as justification for doing so.
Also since ICV is the culprit here, I’ll throw in a shameless self-citation that it’s better to take the logarithm of ICV prior to normalisation.
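
A minimal NumPy sketch of that ordering (logarithm first, then normalisation) is below. The ICV values are invented placeholders in mm³, not data from this thread: the ICV column posted above contains a negative entry, so it appears to have been demeaned already and cannot be log-transformed as it stands.

```python
import numpy as np

# Hypothetical raw ICV values in mm^3, for illustration only.
icv_raw = np.array([1.52e6, 1.48e6, 1.38e6, 1.47e6, 1.44e6, 1.45e6])

# Take the logarithm first; this requires strictly positive raw volumes.
log_icv = np.log(icv_raw)

# Then normalise to zero mean and unit standard deviation (sample SD).
icv_covariate = (log_icv - log_icv.mean()) / log_icv.std(ddof=1)

print(np.round(icv_covariate, 3))
```

The resulting `icv_covariate` would then replace the ICV column in the design matrix.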

Cheers
Rob

SaraPonticorvo · October 12, 2020, 7:11am

Hi Rob,
Thank you so much for all the useful information you gave me! I will re-run the statistical analysis normalizing the continuous covariates to zero mean and unit standard deviation.

Thanks again,

Sara