Statistical analysis of FD, FC, and FDC

Hi,
I have been trying to run a fixel-based analysis following the instructions (http://mrtrix.readthedocs.io/en/latest/workflows/fixel_based_analysis.html) and this article (http://www.sciencedirect.com/science/article/pii/S1053811915004218).

I do not understand how to proceed. I ran the following command:
fixelcfestats files.txt output_analysis_fixel_mask.msf design_matrix.txt result_contrast_matrix.txt template.tck fd_

I got the following error:
[ERROR] required input file “result_contrast_matrix.txt” not found

I have 16 samples (8 control and 8 experimental).
I have created files.txt and listed the paths of the 16 files in it, for example:
/data/home/uqmalam9/Documents/mamun/FBD/06_fd.msf

The design matrix I used in the design_matrix.txt file is

0 1
0 1
0 1
0 1
0 1
0 1
0 1
0 1
1 0
1 0
1 0
1 0
1 0
1 0
1 0
1 0

Kind regards
Mamun

Hi Mamun,
The contrast_matrix.txt is an input (not a result). It’s a text file with a single row defining your contrast vector. Assuming you have patients listed first (i.e. the first 8 rows), then it should be [1 -1] to test for controls > patients. Note that the input design matrix and contrast for fixelcfestats are the same as what is expected by FSL’s randomise.
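
As a rough illustration of what the two text files contain, here is a minimal Python sketch that writes them for the 16-subject example above. It assumes the first 8 rows of files.txt are patients and the last 8 are controls, and it writes into a temporary directory (a hypothetical location, chosen so that nothing real is overwritten):

```python
import tempfile
from pathlib import Path

# Assumption: the first 8 subjects in files.txt are patients, the last 8 controls.
n_patients, n_controls = 8, 8

# Column 1 models the control mean, column 2 the patient mean,
# matching the "0 1" / "1 0" layout of the design matrix in the question.
design_rows = ["0 1"] * n_patients + ["1 0"] * n_controls
contrast_row = "1 -1"  # tests controls > patients

out_dir = Path(tempfile.mkdtemp())  # hypothetical output location
design_path = out_dir / "design_matrix.txt"
contrast_path = out_dir / "contrast_matrix.txt"
design_path.write_text("\n".join(design_rows) + "\n")
contrast_path.write_text(contrast_row + "\n")

print(design_path.read_text())
print(contrast_path.read_text())
```

The design file has one row per subject, in the same order as files.txt; the contrast file has a single row with one value per design column.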

I’d recommend reading up on GLMs if you are not familiar. Here or here would be a good place to start.
Cheers,
Dave

If I could take this a step further…

When performing fixel-wise correlations with other metrics using fixelcfestats, I understand that it behaves like FSL, except that I need to explicitly add a column of 1s.

I have two groups and an associated (de-meaned) global measure. Am I right in saying that my design matrix should look something like:

1 1 0 3
1 0 1 8
1 0 1 13
1 1 0 4

…where column #1 is the added column of 1s, column #2 denotes the first group, column #3 denotes the second group and column #4 the global metric?

And the contrast matrix would be:

0 0 0 1

Does this seem right?

Hi

I was trying to do something similar while including the groups as covariates, so I demeaned all the columns but the first. The design matrix would then be something like this (if it had only the 4 rows):

1 0.5 0 -4
1 0 0.5 1
1 0 0.5 6
1 0.5 0 -3

With the contrast as you say:
0 0 0 1

I am not entirely sure if it will make a difference or not, but according to the randomise instructions you should do that.


While those design matrices should “work”, they’re not ideal; they’re what’s referred to as rank-deficient. In both cases, you can see that the first column can be constructed as a linear combination of the second and third columns. It is therefore impossible for the linear regression to estimate unique beta coefficients for these columns.

Having said that, the MRtrix3 GLM code is written in such a way as to be compatible with rank-deficient design matrices; and the fact that your contrast vector does not use the beta coefficients for those redundant columns means that the non-uniqueness of the regression shouldn’t actually influence the resulting t-statistic.
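
To make the rank deficiency concrete, the following Python sketch counts the linearly independent columns of the two 4-row examples above using exact Gaussian elimination. The helper matrix_rank is a hypothetical name written for this illustration and is not part of MRtrix3:

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a small matrix via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # this column adds no new direction
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# The two 4-column designs posted above (ones, group 1, group 2, metric).
design_a = [[1, 1, 0, 3], [1, 0, 1, 8], [1, 0, 1, 13], [1, 1, 0, 4]]
design_b = [[1, 0.5, 0, -4], [1, 0, 0.5, 1], [1, 0, 0.5, 6], [1, 0.5, 0, -3]]

rank_a = matrix_rank(design_a)
rank_b = matrix_rank(design_b)
print(rank_a, "of", len(design_a[0]), "columns are independent")
print(rank_b, "of", len(design_b[0]), "columns are independent")
```

Both designs have 4 columns but a rank of 3, which is the rank deficiency described above.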

Using @phmag’s example, more “standard” ways of expressing the design would be either:

1 0  3
0 1  8
0 1 13
1 0  4

(each of the first two columns models the group mean of one of the two groups), or:

1  1  3
1 -1  8
1 -1 13
1  1  4

(the first column estimates the “global intercept” of the regression, while the second directly encodes the difference between the two groups).


Great. I gotcha. Thanks for your help @rsmith and @diagiraldo

As I understand it from http://mrtrix.readthedocs.io/en/latest/fixel_based_analysis/ss_fibre_density_cross-section.html

It still needs a first column of 1s added - unlike FSL which does it automatically for you.

So in both the cases above @rsmith , I’d still need to add another column of 1s at the front?

I found that particular wording RE: “correlation analysis” unhelpful myself when I saw it in the code & documentation. The key point is that unlike FSL, a column of ones will not be automatically added in all circumstances: it is left up to the creator of the design & contrast matrices whether or not their experimental design requires a column of ones, and if it does they need to define it manually in their matrix. The examples I gave show how the same experimental design can be expressed in two different ways: one contains a column of ones and the other does not, yet both model the same experiment. So the documentation should not be interpreted as “this command does not add a column of ones automatically, and therefore you should do it yourself in all scenarios”.

Fundamentally, the role of the ones column is to estimate the “global intercept”, which is the value of the dependent variable when all explanatory variables are zero. If one expects that the dependent variable should be zero if all explanatory variables are zero, then that column may be best omitted. Conversely, if it is included, then the other columns should ideally be constructed in such a way as to avoid rank deficiency; in many cases this won’t have a practical effect (depends on the contents of the contrast vector), but it’s nevertheless more mathematically stable and elegant to do so.
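
A toy illustration of the intercept's role, in Python with made-up numbers (not fixel data): the values below lie exactly on y = 2 + 3x, so a fit that includes a column of ones recovers both numbers, whereas a fit without it is forced through the origin and ends up with a different slope.

```python
from fractions import Fraction

# Made-up data lying exactly on y = 2 + 3x.
x = [Fraction(v) for v in (0, 1, 2, 3)]
y = [2 + 3 * v for v in x]
n = len(x)

# Least squares WITH a column of ones (intercept + slope).
x_mean = sum(x) / n
y_mean = sum(y) / n
slope = (sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
         / sum((a - x_mean) ** 2 for a in x))
intercept = y_mean - slope * x_mean

# Least squares WITHOUT a column of ones (line forced through the origin).
slope_no_intercept = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

print("with ones column:   ", intercept, slope)
print("without ones column:", slope_no_intercept)
```

Omitting the ones column is only harmless here if the dependent variable really is expected to be zero when the explanatory variable is zero, which is not the case for this toy data.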


Thanks @rsmith That’s perfectly clear now. I did get confused by it and couldn’t make out the correct interpretation from the code. Cheers again.

Hi all, I’m stuck at step 21: fba-link for computing statistical analysis.

If I have control1.mif, patient1.mif, control2.mif, patient2.mif … :
Is it right to change the filenames in the fd folder to c1.mif,c2.mif,p1.mif,p2.mif

and make a design_matrix.txt with ‘1 0’ ‘1 0’ ‘0 1’ ‘0 1’… ?
and the contrast_matrix.txt then with ‘1 -1’ ?

And what would the contrast_matrix.txt be if it was 001_control.mif, 001_patient.mif, 002_control.mif, 002_patient.mif?

Best regards and thanks!

Hi Lucius,

In your example, you have:

c1.mif        1  0
c2.mif        1  0
p1.mif        0  1
p2.mif        0  1
------------------
Contrast      1 -1

Your hypothesis is therefore that the mean of your metric of interest is larger in the group consisting of [c1.mif, p1.mif] than in the group consisting of [p1.mif, p2.mif]. I’m guessing this is the hypothesis you intended. If you were to change the file names, both on your file system and in the input text file containing the subject images, this would have zero effect; the GLM doesn’t care what the files are called. However if you alter the order of the files (specifically the order in which they appear in the input subjects text file), this would be equivalent to permuting the rows of the table above, and therefore you would need to perform the appropriate permutation of the design matrix.

I prefer to avoid giving copy-and-paste solutions to GLM questions, because if there is not a fundamental understanding of what’s going on, there is a greater risk of incorrect experimentation or reporting. I would highly recommend understanding as much as possible from FSL’s GLM page, and this article.

Rob

Dear Rob,

thank you very much for your answers, the papers and for the info concerning the order.

I’m still a bit confused though: I’d like to test a hypothesis about two groups: c (controls) and p (patients).
Therefore I’d like to compare the c.mif's with the p.mif's.
In your answer there’s a group consisting of [c1.mif, p1.mif] and a group consisting of [p1.mif, p2.mif].
How can I get the group consisting of [c1.mif, c2.mif] to compare with the group [p1.mif, p2.mif]?
Wouldn’t that still be:

c1.mif    1 0
c2.mif    1 0
p1.mif    0 1
p2.mif    0 1
–––––––––––––
contrast 1 -1

?

Best, Lucius

In your answer there’s a group consisting of [c1.mif, p1.mif] and a group consisting of [p1.mif, p2.mif].

That was a typo on my part. Initially I started drafting a response on the premise of your files being named:

control1.mif, patient1.mif, control2.mif, patient2.mif …

However in the next line you then changed the file names to:

c1.mif,c2.mif,p1.mif,p2.mif

So in the process of that renaming, the order of your files changed, which then alters the appearance of the design matrix. I simply failed to fully correct my example once I realised that inconsistency.

The table is based on the latter of your two naming schemes, and is identical to the one you provide. But for the sake of clarity, imagine that you were to instead use the former naming scheme. In that case, the experimental design would be:

control1.mif    1  0
patient1.mif    0  1
control2.mif    1  0
patient2.mif    0  1
––––––––––––––––––––
contrast        1 -1

Note how the rows of the design matrix have been permuted, corresponding to the permutation of the order of the images when you change from:
control1.mif, patient1.mif, control2.mif, patient2.mif
to
c1.mif,c2.mif,p1.mif,p2.mif
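
The dependence on file order can be sketched in Python. This assumes, purely for illustration, that the group can be read from the first letter of each file name ("c" for control, "p" for patient); the helper design_rows is hypothetical and not an MRtrix3 function:

```python
def design_rows(filenames):
    """Return one design-matrix row per file, in the order given."""
    rows = []
    for name in filenames:
        if name.startswith("c"):      # control -> first column
            rows.append([1, 0])
        elif name.startswith("p"):    # patient -> second column
            rows.append([0, 1])
        else:
            raise ValueError(f"cannot infer group from {name!r}")
    return rows

interleaved = ["control1.mif", "patient1.mif", "control2.mif", "patient2.mif"]
grouped = ["c1.mif", "c2.mif", "p1.mif", "p2.mif"]

rows_interleaved = design_rows(interleaved)
rows_grouped = design_rows(grouped)

for name, row in zip(interleaved, rows_interleaved):
    print(name, *row)
print()
for name, row in zip(grouped, rows_grouped):
    print(name, *row)
```

In both cases the contrast stays 1 -1; only the order of the design-matrix rows follows the order of the files.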

Dear Rob, sorry for my confusion…
Thank you very much for your answer, it’s quite clear to me now!

Hi @rsmith,

Apologies for reviving such an old post, however, I was left with a question after reading it.

Regarding the two identical design matrices you describe:

This:

1 0  3
0 1  8
0 1 13
1 0  4

and this:

1  1  3
1 -1  8
1 -1 13
1  1  4

What would be the contrast matrices for them?

Design 1 seems to be built for this contrast:
-1 1 0

However, design 2 already has the contrast built in… would this be the appropriate (and equivalent) contrast:
0 1 0 ?

Regards,
Claude

What would be the contrast matrices for them?

You got those right.

Though I wouldn’t go so far as to express it as “design 2 already has the contrast built in”. I totally get what you mean, in that it collapses the effect of interest into a single column of the design matrix; it’s just the potential for misinterpretation that gives me the heebie jeebies. There is and always will be a requisite inner product between the regression coefficients and a manually-defined vector of values in order to extract the feature of interest from the model (and note I said inner product there: the values are not actually constrained to be only -1 / 0 / +1). So this always needs to be defined in conjunction with the model in order to properly express your hypothesis, and can’t be “built in”.

There can also be ramifications of these decisions in terms of either the numerical magnitude of the absolute effect extracted from the model (and hence the interpretation of such), or the interpretation of the individual regression coefficients.

Sorry for being hyper-particular on language: it just tends to result in more work for me in the long-term if I don’t get things absolutely precisely accurate the first time. :confounded:

Hi @cbajada ,

I think I’m a little bit confused, but when you say:

Isn’t that, in general, not the case? If I understand it correctly, the strictly equivalent contrast should be 0 -1 0. Is this correct, or am I missing something?

Best regards,

Manuel

Sorry, I haven’t looked at the forum in a long time and have only just read these.

Thanks for your reply @rsmith!

@mblesac yes, you are right! the contrast should be 0 -1 0
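
The sign can be checked with a small Python sketch. It assumes design 2 can be written as design 1 multiplied by a 3x3 matrix T (a name introduced here for illustration), in which case a contrast c1 on design 1 corresponds to the row vector c1·T on design 2:

```python
X1 = [[1, 0, 3], [0, 1, 8], [0, 1, 13], [1, 0, 4]]      # design 1
X2 = [[1, 1, 3], [1, -1, 8], [1, -1, 13], [1, 1, 4]]    # design 2

# Assumed change of coding: X2 = X1 * T.
T = [[1, 1, 0],
     [1, -1, 0],
     [0, 0, 1]]

# Verify the assumption row by row before relying on it.
for row1, row2 in zip(X1, X2):
    assert [sum(row1[k] * T[k][j] for k in range(3)) for j in range(3)] == row2

# Contrast on design 1, then the same hypothesis expressed on design 2.
c1 = [-1, 1, 0]
c2 = [sum(c1[k] * T[k][j] for k in range(3)) for j in range(3)]
print(c2)
```

The result is a positive multiple of 0 -1 0, so it tests the same direction as -1 1 0 on design 1, while 0 1 0 would test the opposite direction. Scaling a contrast by a positive factor changes the size of the extracted effect but not the t-statistic.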


Hi Lucius,
I’m sorry to bother you after such a long time. I also followed the steps you provided on the webpage. How should I use mrview to view the statistical analysis results in the final visualization step? It’s not very clear to me; I hope you can give me some tips. Thank you very much!

Best, Apple

Dear apple,

apologies for this late reply; maybe you have already solved it yourself.
To visualize the results, you should be able to simply threshold them using your “fwe_pvalue.tsf” file.

I cannot explain it better than it is formulated here:
https://mrtrix.readthedocs.io/en/latest/fixel_based_analysis/displaying_results_with_streamlines.html

This entry in the forum here might also help as it provides insights into loaded files in mrview:

Best, Lucius