Normalization of CSV matrices

Hey community! I need your help:-)

I created CSV matrices (84x84) and connectomes for a sample of 25 patients.
I heard that before starting the graph theory analyses, I should normalize the CSV matrices to allow inter-subject comparisons (see: https://osf.io/c67kn/).
Is it necessary? And if so, can anybody tell me how to do it? Is there an MRtrix command?

Thank you in advance!

Leonie

Hi Leonie,

I can’t provide any evidence for this step being “necessary” over and above what is presented in the preprint; there are no studies comparing results with vs. without that step applied. My suspicion is that for most cohorts its consequences will be relatively small given the magnitude of the inter-subject variance in such data. But it is nevertheless the physically appropriate thing to do.

There is no dedicated command provided to perform this step, since it is simply the multiplication of a two-dimensional numerical matrix by a scalar value; that can be done in any environment capable of manipulating numerical data, e.g. Python, MATLAB / Octave, or R.
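For example, a minimal Python sketch (assuming the connectome was written as a comma-separated CSV file and that the subject-specific scaling factor prescribed in the preprint has already been computed; the file names and the value of `scale_factor` below are placeholders):

```python
import numpy as np

# Load the 84x84 connectome matrix exported as CSV (file name is a placeholder)
connectome = np.loadtxt("subject01_connectome.csv", delimiter=",")

# Subject-specific scalar computed as described in the preprint
# (the value here is purely illustrative)
scale_factor = 1.234e-3

# Multiply every edge weight by the scalar and write the normalized matrix out
np.savetxt("subject01_connectome_normalized.csv",
           connectome * scale_factor, delimiter=",")
```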

Rob

Hello Rob, thanks so much for your response.
So I would scale my individual connectome matrices so that all values are between 0 and 1 to achieve a normalization. I think it is more accurate to do this for every file individually rather than across all subjects; do you agree?

And one other question: my data compares patients with CNS inflammation to healthy controls. My preprocessing pipeline included segmentation and parcellation of the MPRAGE images, and denoising, bias field correction, group-averaged response function estimation and CSD of the DWI images. We did NOT perform the DWI group intensity normalization, since we only have a single-tissue response function and all subjects were scanned on the same scanner with the same acquisition protocol. To create the connectomes we performed ACT (20 million streamlines) and then SIFT down to 5 million streamlines.
Do you think this approach is acceptable?

Thanks in advance!

So I would scale my individual connectome matrices so that all values are between 0 and 1 to achieve a normalization. I think it is more accurate to do this for every file individually rather than across all subjects; do you agree?

This is a form of inter-subject connection density normalization, and one to which I probably should have devoted more space in that manuscript, because it’s quite common. But I’m not a fan. It makes the connection density of every single edge in the connectome dependent on the connection density of just one edge: the edge of greatest connection density. So modulate the connection density of that one edge, and the entire connectome modulates with it.

Or as a slightly silly example (but one that I nevertheless considered including), take figure 12 of that manuscript, and imagine adding to that set of proposed connectivity measures the case where you divide the connection density by the maximal connection density for any edge in that subject; you would end up with all 16 subjects having precisely the same “normalized” connection density, and be completely insensitive to the clear inter-subject variance.
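To make that insensitivity concrete, here is a toy numerical demonstration in Python (the matrices are fabricated purely to illustrate the arithmetic; they are not taken from the manuscript):

```python
import numpy as np

# Two toy "subjects" whose connectomes differ only by a global scale factor,
# i.e. genuine inter-subject variance in overall connection density
subject_a = np.array([[0.0, 2.0, 4.0],
                      [2.0, 0.0, 8.0],
                      [4.0, 8.0, 0.0]])
subject_b = 3.0 * subject_a  # three times the connection density everywhere

# Per-subject "divide by the maximum edge" normalization
norm_a = subject_a / subject_a.max()
norm_b = subject_b / subject_b.max()

# The two normalized connectomes are identical: the inter-subject difference
# in overall connection density has been erased entirely
print(np.allclose(norm_a, norm_b))  # True
```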

Also, doing this across all subjects (i.e. finding the single most dense edge across all subjects, and dividing the values of all edges across all subjects by that value) would require that the comparison of connection density measures across participants already be properly robust in order to determine that maximum; so I don’t think it actually solves a problem.

Is there some downstream element that depends on values lying between 0 and 1 that is motivating such a normalisation? Because this might all be a red herring.

Do you think this approach is acceptable?

Even with only one unique non-zero b-value, you can still perform a two-tissue decomposition (WM + CSF) by making use of the b=0 image data, and multi-tissue intensity normalisation still works well enough in that case.
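As a rough sketch of what that could look like (wrapped in Python purely for illustration; it assumes MRtrix3 is installed, that `dwi.mif` and `mask.mif` are your preprocessed DWI series and brain mask, that the `dhollander` algorithm is used for response function estimation, and that all file names are placeholders):

```python
import subprocess

def run(cmd):
    # Thin wrapper around the MRtrix3 command-line tools
    subprocess.run(cmd, check=True)

# Estimate WM / GM / CSF response functions from single-shell + b=0 data
# (the GM response is simply not used in the two-tissue fit below)
run(["dwi2response", "dhollander", "dwi.mif",
     "wm_response.txt", "gm_response.txt", "csf_response.txt"])

# Two-tissue CSD: WM FOD plus a CSF compartment
run(["dwi2fod", "msmt_csd", "dwi.mif",
     "wm_response.txt", "wmfod.mif",
     "csf_response.txt", "csf.mif",
     "-mask", "mask.mif"])

# Multi-tissue intensity normalisation across the two compartments
run(["mtnormalise", "wmfod.mif", "wmfod_norm.mif",
     "csf.mif", "csf_norm.mif",
     "-mask", "mask.mif"])
```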

Rob