Help with design and contrast matrices for FBA

Hello experts! I have three hypotheses to test with my FBA pipeline, but am a little unsure about whether I have constructed my design and contrast matrices properly. My population is separated into three groups: controls, patients with a left-sided tumor, and patients with a right-sided tumor.

My first hypothesis is that controls will have higher fixel metrics than patients with a left-sided tumor. Age (demeaned) and sex (0 for female, 1 male) are the nuisance covariates. My design matrix (just two example rows) looks like this:
                  GI   Group   Sex    Age
Patient (left)  | +1 |   0   |  0  |  5.2 |
Control         | +1 |   1   |  1  | -1.6 |
And my contrast vector is this: 0 1 0 0
My second hypothesis is the same as the above, except testing controls against patients with a right-sided tumor, again with age and sex as nuisance covariates. This is my design matrix:
                  GI   Group   Sex    Age
Patient (right) | +1 |   0   |  0  |  9.9 |
Control         | +1 |   1   |  1  | -1.6 |
And my contrast vector: 0 1 0 0
My third hypothesis is that there will be a difference in the fixel metrics between patients with a left-sided tumor and patients with a right-sided tumor. My design matrix is this:
                  GI   Group   Sex    Age
Patient (left)  | +1 |  -1   |  0  |  5.2 |
Patient (right) | +1 |  +1   |  0  |  9.9 |
My contrast vector is this: 0 1 0 0

Any insight on these matrices would be greatly appreciated! My concern is that I’m not actually testing my hypotheses of interest - please let me know if the construction seems correct :slight_smile:

Welcome @PMT!

I’ll start with the little thing, then move on to the big thing.

A minor observation: you are using the values 0 and 1 for your group / sex dummy variables, whereas -1 and 1 is the more common convention. It makes no difference at all for statistical inference; it only matters if you are going to place any interpretation on the values of the beta coefficients.
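To make that concrete, here is a minimal pure-Python sketch that fits the same GLM under both codings and compares the t-statistic for the contrast 0 1 0 0. The eight subjects and their metric values are invented purely for illustration, and `invert` / `fit_glm_t` are hypothetical helpers, not anything from MRtrix3 (fixelcfestats does its own fitting, enhancement and permutation testing).

```python
# Invented example data: (group: 0 = patient, 1 = control; sex; demeaned age; fixel metric)
subjects = [
    (0, 0,  5.2, 0.41), (0, 1, -3.1, 0.45), (0, 0,  1.4, 0.40), (0, 1, -0.5, 0.47),
    (1, 1, -1.6, 0.52), (1, 0,  2.3, 0.50), (1, 1, -4.0, 0.55), (1, 0,  0.3, 0.49),
]

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_glm_t(design, y, contrast):
    """Ordinary least squares fit; returns (betas, t-statistic for the contrast)."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * obs for row, obs in zip(design, y)) for i in range(p)]
    xtx_inv = invert(xtx)
    betas = [sum(xtx_inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    residuals = [obs - sum(b * x for b, x in zip(betas, row))
                 for row, obs in zip(design, y)]
    sigma2 = sum(r * r for r in residuals) / (len(y) - p)
    effect = sum(c * b for c, b in zip(contrast, betas))
    c_var = sum(contrast[i] * xtx_inv[i][j] * contrast[j]
                for i in range(p) for j in range(p))
    return betas, effect / (sigma2 * c_var) ** 0.5

y = [s[3] for s in subjects]
design_01 = [[1, g, sex, age] for g, sex, age, _ in subjects]          # group coded 0 / 1
design_pm = [[1, 2 * g - 1, sex, age] for g, sex, age, _ in subjects]  # group coded -1 / +1
contrast = [0, 1, 0, 0]

betas_01, t_01 = fit_glm_t(design_01, y, contrast)
betas_pm, t_pm = fit_glm_t(design_pm, y, contrast)
print("group beta (0/1 coding):  ", betas_01[1], " t =", t_01)
print("group beta (-1/+1 coding):", betas_pm[1], " t =", t_pm)
```

The group beta under 0 / 1 coding comes out at twice the value under -1 / +1 coding, while the two t-statistics agree up to floating-point error, which is why the choice only matters once you start interpreting the beta values themselves.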

Now:

My concern is that I’m not actually testing my hypotheses of interest …

I can read this expression at two different levels.

The basic one is whether, given the expression of your hypotheses, your GLM constructions are correct. I don’t see any major issue here myself. The only potential limitation is the fact that for your third hypothesis, it will theoretically be possible for the observation of a left > right effect in one fixel to statistically enhance the observation of a right > left effect in another fixel, due to streamlines crossing the mid-sagittal plane. The way to prevent such erroneous enhancement is utilised in this manuscript.

The more complex interpretation is whether or not the actual formulation of your hypotheses of interest is correct for your scientific investigation. This is something I’ve been observing across multiple projects of late. I think what’s happening is that people have become accustomed to performing a straight patient-control group comparison with FBA using the GLM, and when they start feeding more complex cohorts through this pipeline, they simply mould these experiments into what they’re comfortable with, turning them into a batch of patient-subgroup-vs-control comparisons. If this was what you intended, I personally agree with the nagging voice in your head that’s telling you that this might not be the ideal way to do things, and I can follow up; if not, ignore my ranting :upside_down_face:

Cheers
Rob

Hi Rob,

Thanks for your reply! I had a follow-up question about interpreting the beta coefficient outputs. If I change the values in the group column of my design matrix to be -1 for patient and +1 for control (instead of 0 and +1, respectively), can I interpret the beta0.mif output file from fixelcfestats as the control group mean? Otherwise, with the way I have my design matrices set up currently, would it still be correct to use the beta0.mif file as the control group mean?

Also, if I use the beta0.mif file to calculate the percentage effect and with thresholding using the fwe_pvalue.tsf file I get a range of tract-specific significant values from around 0-40%, is it correct to interpret this as the fixel metrics being higher in the patient group than in the control group?

Thank you!

If I change the values in the group column of my design matrix to be -1 for patient and +1 for control (instead of 0 and +1, respectively), can I interpret the beta0.mif output file from fixelcfestats as the control group mean?

If you have e.g.:

          GI Group Sex   Age
Patient | +1 | -1 | 0 |  5.2 |
Control | +1 |  1 | 1 | -1.6 |

, then beta0.mif will be closer to the cohort mean than the control group mean, but still not actually exactly such. Technically, it’s the value of your quantitative metric that is predicted for a hypothetical subject whose group assignment is 0, sex is 0, and age is 0. If you were to instead want the value predicted for a hypothetical member of the control group (again, akin to the control group mean but not exactly the same) with all other explanatory variables being zero, that would be an inner product of the beta coefficients with the vector 1 1 0 0 (i.e. what would appear as a row in the design matrix for that hypothetical subject), or (beta0.mif + beta1.mif).
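As a small sketch of that inner product for a single fixel: the beta values below are invented, and `predict` is a hypothetical helper rather than anything provided by MRtrix3.

```python
def predict(betas, design_row):
    """Value predicted by the GLM for a subject with the given design matrix row."""
    return sum(b * x for b, x in zip(betas, design_row))

# Invented betas for one fixel; columns: GI, Group (-1 patient / +1 control), Sex, Age
betas = [0.50, 0.04, 0.02, -0.001]

intercept = predict(betas, [1, 0, 0, 0])    # beta0 alone: group = 0, sex = 0, age = 0
control = predict(betas, [1, 1, 0, 0])      # beta0 + beta1
patient = predict(betas, [1, -1, 0, 0])     # beta0 - beta1

print("intercept:", intercept)
print("control:  ", control)
print("patient:  ", patient)
```

With this -1 / +1 coding the intercept sits exactly halfway between the two group predictions, which is why it belongs to neither group.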

I think I’ve made this comment elsewhere, but it might warrant repeating. The whole concept of utilising GLM beta coefficient images to compute group means originates from the fact that prior to version 3.0.0, smoothing of the fixel data occurred inside of the fixelcfestats command; as such, the beta coefficients were the only data available that were based on smoothed data. With fixelconnectivity | fixelsmooth now provided, giving direct user access to the smoothed data, if what you really truly want is the group mean, then you can literally just execute mrmath mean across those fixel data files corresponding to your control group.
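Conceptually, that group mean is nothing more than a per-fixel average across the control subjects' smoothed data. A minimal Python sketch of the arithmetic, using invented values held in plain lists rather than real fixel data files (`control_data`, the subject labels and `fixelwise_mean` are all hypothetical):

```python
# Invented smoothed fixel values for three control subjects (three fixels each)
control_data = {
    "ctrl_01": [0.52, 0.31, 0.44],
    "ctrl_02": [0.50, 0.29, 0.46],
    "ctrl_03": [0.54, 0.33, 0.42],
}

def fixelwise_mean(per_subject_values):
    """Average each fixel across subjects; all subjects must share fixel ordering."""
    return [sum(values) / len(values) for values in zip(*per_subject_values)]

control_mean = fixelwise_mean(list(control_data.values()))
print(control_mean)
```

This relies on fixel correspondence across subjects, i.e. every subject's data being defined on the same template fixels in the same order.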

Otherwise, with the way I have my design matrices set up currently, would it still be correct to use the beta0.mif file as the control group mean?

Nope; if we revert back to:

          GI Group Sex  Age
Patient | +1 | 0 | 0 |  5.2 |
Control | +1 | 1 | 1 | -1.6 |

, then beta0.mif is still “the value of the global intercept”; but because your patient group has the dummy variable value of 0, that value is actually more representative of the patient group mean than of the control group mean (strictly, it is the value predicted for a patient whose sex and age are both 0).

Also, if I use the beta0.mif file to calculate the percentage effect and with thresholding using the fwe_pvalue.tsf file I get a range of tract-specific significant values from around 0-40%, is it correct to interpret this as the fixel metrics being higher in the patient group than in the control group?

I can’t say definitively without having full access to both the design matrix used and the list of commands that produced such. Indeed even in your description, beta0.mif can’t be used to derive the percentage effect in isolation: that calculation involves both a numerator and a denominator.

But if I assume that you’re referring to the use of beta0.mif in the denominator with beta1.mif in the numerator, and that “patient” is encoded as 0 and “control” as 1, then positive values indicate that the relevant fixel metric is greater in controls than in patients.
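Under exactly those assumptions (0 = patient, 1 = control, beta1 over beta0), the calculation and its sign convention can be sketched as follows. The per-fixel beta values are invented and `percentage_effect` is a hypothetical helper; whether this matches what was actually run depends on the real design matrix and commands.

```python
def percentage_effect(beta0, beta1):
    """Per-fixel 100 * beta1 / beta0; assumes beta0 is non-zero in every fixel."""
    return [100.0 * b1 / b0 for b0, b1 in zip(beta0, beta1)]

# Invented per-fixel values; group coded 0 = patient, 1 = control
beta0 = [0.40, 0.50, 0.30]   # intercept: prediction for a patient with sex = 0, age = 0
beta1 = [0.08, -0.05, 0.00]  # group effect: control minus patient

effects = percentage_effect(beta0, beta1)
for value in effects:
    if value > 0:
        direction = "control > patient"
    elif value < 0:
        direction = "patient > control"
    else:
        direction = "no difference"
    print(f"{value:+.1f}%  {direction}")
```

Note that with this coding the percentage is expressed relative to the patient-side intercept, not relative to the control group.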

Rob
