FBA analysis pipeline help

fba

#21

Hi Rob!

Thanks a lot for answering my questions! It will be of great help for my analysis :slight_smile:

I will get in touch if there are further issues :wink:


#22

Hello again :grinning:

I’m trying to run FBA with data acquired at 2 different sites (different scanners, sequence).

I set the design matrix as
0 1 0
1 0 1
with the 0 in the 3rd column representing data from the 1st center & 1 in the 3rd column representing data from the 2nd center.

Similarly, I set the contrast matrix with an additional 0 as 1 -1 0

Is this the right setup :stuck_out_tongue_winking_eye:

Moreover, the pathology is more severe in one of the data-sets as compared to the other.

How can I incorporate this info in my design and contrast matrix :thinking:

Thanks,
Karthik


#23

Is this the right setup

If you had provided more rows, then I believe that your setup would be correct. However the very limited matrix that you provide there would lead to severe problems due to under-determinedness; I honestly don’t know how the current master code would react, but I recently observed my development branch code providing t-values of ±1e9 in a similar circumstance, which leads to essentially a stalled command. But if you have more than two subjects, then what you’re describing is correct.
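To make the under-determinedness point concrete, here is a minimal sketch in plain Python. It is not MRtrix3 code: the `matrix_rank` helper and the two extra subject rows are made up for illustration, and the column order (group 1, group 2, site) is assumed from the question. It checks whether a design matrix has as many independent columns as it has columns in total:

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a list-of-lists matrix via exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# The two-row design from the question (assumed columns: group 1, group 2, site).
two_subjects = [[0, 1, 0],
                [1, 0, 1]]

# Hypothetical extension: one subject from each group at each site.
four_subjects = two_subjects + [[1, 0, 0],
                                [0, 1, 1]]

rank_two = matrix_rank(two_subjects)
rank_four = matrix_rank(four_subjects)
print(rank_two, "independent columns out of 3")
print(rank_four, "independent columns out of 3")
```

With only two rows the rank stays below the number of columns, so the three coefficients cannot be estimated uniquely. Once both groups are present at both sites, the rank matches the column count.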

Moreover, the pathology is more severe in one of the data-sets as compared to the other … How can I incorporate this info in my design and contrast matrix

This kind of question really gets to the fundamentals of design & contrast matrix construction. The answer to this question is: What is your hypothesis?

I’m first assuming that you have both controls and patients scanned at both centres. If this is not the case, then there’s no mathematical way of disentangling the effect of variation in pathology severity between sites from the effect of scan site.

From there, how to proceed depends on which of these two cases has caused you to raise the question:

  • You have some continuous measure of pathology severity, and have noticed that the mean differs in the patient groups between the two sites. How to deal with this depends on whether or not that continuous measure of severity can also be computed for the control group.

  • The patients that were scanned at one site are subjectively / clinically more severe than the patients scanned at the other site, but this is not quantified. If this is the case, then one would expect the magnitude of the difference between patients and controls to differ between the two sites. This should also be possible to deal with. Though going back to the point about the hypothesis, that question extends to whether this is an effect that you expect to observe in your data and hence wish to include in your model in order to provide the best possible fit to the data, or whether you actually want to test whether this effect is non-zero.

There’s a number of possibilities here, and I’m trying to avoid writing a comprehensive GLM instruction manual :cold_sweat:

Rob


#24

Thanks for the detailed reply.

Actually, I do have more rows in the actual experiment :wink: I just wanted to check if the setup was right!

Yes, both controls and patients were scanned at the 2 centres. I also did an ROI analysis on the FA maps and found a difference in magnitude in the FA values between the centres, as expected.

The hypothesis is that the patients should show reduced FBA metrics as compared to controls (also FBA should be more sensitive than DTI in that aspect).

Looking forward to your comments :slight_smile:

Karthik


#25

The hypothesis is that the patients should show reduced FBA metrics as compared to controls

Well, in that case your design matrix would only have two columns; either one for each group, or a column of 1’s (global intercept) and then a column with -1 / +1 group assignment. What I’m trying to communicate by asking about your hypothesis is: any nuisance regressors you add into your design matrix in fact form part of your hypothesis.
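As a rough illustration of those two equivalent two-column designs, here is a minimal sketch in plain Python. The subject list, the group means, the `predict` helper and the coding direction (+1 = control, contrasts written for controls > patients) are all assumptions made for the example, not anything prescribed by the software:

```python
# Hypothetical group labels for six subjects (illustration only).
groups = ["control"] * 3 + ["patient"] * 3

# Option A: one column per group (assumed column order: control, patient).
design_a = [[1, 0] if g == "control" else [0, 1] for g in groups]
contrast_a = [1, -1]  # controls > patients

# Option B: global intercept plus a +1 / -1 group column (+1 = control here).
design_b = [[1, 1] if g == "control" else [1, -1] for g in groups]
contrast_b = [0, 1]  # positive group coefficient means controls > patients


def predict(design, betas):
    """Model prediction for each subject: row of the design times the coefficients."""
    return [sum(x * b for x, b in zip(row, betas)) for row in design]


# Made-up group means, used only to show the two codings describe the same model.
mean_control, mean_patient = 0.5, 0.25
betas_a = [mean_control, mean_patient]
betas_b = [(mean_control + mean_patient) / 2, (mean_control - mean_patient) / 2]

fitted_a = predict(design_a, betas_a)
fitted_b = predict(design_b, betas_b)
effect_a = sum(c * b for c, b in zip(contrast_a, betas_a))
effect_b = sum(c * b for c, b in zip(contrast_b, betas_b))
print(fitted_a == fitted_b)  # both codings predict the same values per subject
print(effect_a, effect_b)    # same sign; option B reports half the group difference
```

Both codings fit the data identically; only the meaning of the individual coefficients, and hence the contrast needed to test the group difference, changes.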

For instance: Imagine that you simply add a column containing a -1 / +1 corresponding to which site each subject was scanned at. Within the GLM, this will add a factor that attempts to model any global difference in metric values between sites. However this will be applied to all subjects; i.e. the model hypothesizes that values for both patients and controls will be equally larger / smaller at one site than the other. Whereas from your description, it sounds like you are additionally expecting that, independently of the control subjects scanned at each site, the patients scanned at one site are expected to be more severe than those scanned at the other; but they are not so grossly different that you wish to examine patients from the two sites as two separate groups.

What I think you’re looking for (more than happy to be corrected on this), is:

                  GI   Group    Site   Severity
Control, site 1 | +1 |   +1   |  -1  |     0    |
Patient, site 1 | +1 |   -1   |  -1  |    -1    |
Control, site 2 | +1 |   +1   |  +1  |     0    |
Patient, site 2 | +1 |   -1   |  +1  |    +1    |
                ---------------------------------
                  b1     b2      b3       b4

  • Coefficient b1 encodes the Global Intercept: The predicted value of your quantitative measure of interest after all other predicted sources of variance in your experiment have been factored out.

  • Coefficient b2 encodes the group (patient / control); a non-zero value here indicates that whether a subject is a patient or a control has an effect on the value of the quantitative measure observed in that subject. This is most likely the feature of the experiment that is of interest to you: the null hypothesis is that there is no difference between patients and controls, i.e. b2 = 0.

  • Coefficient b3 encodes the site of the scan. If this factor is non-zero, then whether a particular subject was scanned at site 1 or site 2 has an effect on the quantitative measure observed in that subject - regardless of whether that subject is a patient or a control.

  • Coefficient b4 encodes the site of scan, but is zero-filled for controls. What this factor therefore represents is the possibility that the severity of the disease (and hence the magnitude of the effect on the quantitative measure relative to controls) may vary between the two sites. For instance, if b4 > 0, then this implies that the difference in the quantitative measure of interest between site 2 and site 1 is larger in patients than it is in controls; that is, patients from site 2 have larger values than patients from site 1 over and above whatever site difference b3 captures in all subjects.
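To see how the four coefficients combine, here is a small worked sketch in plain Python. The coefficient values are invented purely for illustration (chosen so the arithmetic is exact), and the `cells` dictionary simply restates the four rows of the table above:

```python
# Design rows for the four cells: (GI, Group, Site, Severity).
cells = {
    ("control", 1): [1, 1, -1, 0],
    ("patient", 1): [1, -1, -1, -1],
    ("control", 2): [1, 1, 1, 0],
    ("patient", 2): [1, -1, 1, 1],
}

# Made-up coefficients b1..b4; not estimates from any real data.
betas = [1.0, 0.25, 0.125, 0.0625]

# Predicted mean of the quantitative measure in each cell.
predicted = {
    cell: sum(x * b for x, b in zip(row, betas)) for cell, row in cells.items()
}

# Control minus patient difference at each site: 2*b2 + b4 and 2*b2 - b4.
group_diff_site1 = predicted[("control", 1)] - predicted[("patient", 1)]
group_diff_site2 = predicted[("control", 2)] - predicted[("patient", 2)]

# Site 2 minus site 1 difference in each group: 2*b3 for controls, 2*b3 + 2*b4 for patients.
site_diff_controls = predicted[("control", 2)] - predicted[("control", 1)]
site_diff_patients = predicted[("patient", 2)] - predicted[("patient", 1)]

print(group_diff_site1, group_diff_site2)
print(site_diff_controls, site_diff_patients)
```

Under this coding, b2 is half of the control-minus-patient difference averaged over the two sites, and b4 is half of the extra site difference seen in patients beyond the site difference seen in controls.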

octave assures me that this design matrix is not rank-deficient… :crossed_fingers:
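For anyone without Octave to hand, a similar sanity check can be sketched in plain Python. This is only an illustration on the four unique rows of the table (a real design has one row per subject), and the `determinant` helper is written here for the example rather than taken from any library. For a square matrix, a non-zero determinant means it is not rank-deficient:

```python
def determinant(m):
    """Determinant of a square list-of-lists matrix by cofactor expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, value in enumerate(m[0]):
        # Minor: drop the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * determinant(minor)
    return total


# Columns: GI, Group, Site, Severity (one row per group/site combination).
design = [
    [1, 1, -1, 0],    # Control, site 1
    [1, -1, -1, -1],  # Patient, site 1
    [1, 1, 1, 0],     # Control, site 2
    [1, -1, 1, 1],    # Patient, site 2
]

det = determinant(design)
full_rank = det != 0
print(det, full_rank)
```

Adding further subjects only repeats these four row patterns, so the full design keeps four independent columns as long as every group/site combination contains at least one subject.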

Maybe this can act as the example for teaching GLM at the @workshop? :stuck_out_tongue: