Edge-wise connectome analysis: how to define a hub region?

Dear MRtrix3 experts,

Hello, I am currently studying the structural network within a disease group and its relationship with a behavioral score.
I decided to run TFNBS to avoid the arbitrariness of cluster-forming threshold selection. However, since it is an edge-wise analysis, unlike NBS it identifies many isolated single edges as significant (which is to be expected).
I tried to identify hub regions using graph theory metrics (degree, betweenness centrality, etc.) within the significant nodes, but the few resulting hub regions are quite hard to interpret.

I want to narrow down the network so that only a few nodes are significant.
Is there a good approach to identifying more significant nodes/edges in an edge-wise analysis?

All discussion and comments are welcome!

Hi Jean,

I suspect this one is going to require some careful disentanglement… grabs drink

However, since it is an edge-wise analysis, unlike NBS it identifies many isolated single edges as significant (which is to be expected).

Trigger Warning: Rob Rant. :laughing:

You can say that, unlike NBS, TFNBS has the capacity to attribute statistical significance to individual edges. However, from the standpoint of the statistical inference pipeline and its generalisation, you can actually conceptualise NBS as being “an edge-wise analysis”; indeed it is implemented as such in the MRtrix3 code. The catch is that the hypothesis being tested for each edge is “the size of the connected supra-threshold cluster of which this edge is a member is greater than that predicted by chance”. Of course, if the size of such a cluster is greater than predicted by chance, then that statement is equivalently true for all edges within that cluster, and the null hypothesis will be rejected for all of them.
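To make that edge-wise framing concrete, here is a minimal sketch in Python with NumPy/SciPy (not MRtrix3 code; function and variable names are illustrative only): threshold the edge statistics, find the connected components of the supra-threshold graph, and attribute each component's edge count back to every member edge.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def nbs_edge_cluster_sizes(t_stats, threshold):
    """For each supra-threshold edge, record the size (in edges) of the
    connected supra-threshold cluster of which that edge is a member.
    `t_stats` is a symmetric node-by-node matrix of edge-wise statistics."""
    supra = np.triu(t_stats > threshold, k=1)  # unique edges only
    adjacency = supra | supra.T
    n_comp, labels = connected_components(csr_matrix(adjacency), directed=False)
    # Count supra-threshold edges within each connected component
    comp_edge_count = np.zeros(n_comp, dtype=int)
    for i, j in zip(*np.nonzero(supra)):
        comp_edge_count[labels[i]] += 1
    # Attribute the component's edge count back to every member edge
    sizes = np.zeros_like(t_stats, dtype=int)
    for i, j in zip(*np.nonzero(supra)):
        sizes[i, j] = sizes[j, i] = comp_edge_count[labels[i]]
    return sizes
```

In the permutation test, each edge's cluster size would then be compared against the null distribution of the maximal cluster size; every edge within a given cluster passes or fails together, which is exactly the NBS behaviour described above.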

If that comes off as too cryptic, feel free to ignore; but personally I find it insightful to think of NBS in this way.

I tried to identify hub regions using graph theory metrics (degree, betweenness centrality, etc.) within the significant nodes, but the few resulting hub regions are quite hard to interpret.

It’s not clear from your limited description whether this is an a priori defined analysis pipeline, or simply the result of exploring the data. Applying graph theory metrics to a matrix masked by the statistical significance of some prior hypothesis is not guaranteed to make sense. Further, it’s not clear whether the results of that step are then themselves intended to be analysed in some way.

Here’s a different way of thinking about it. I am inferring from your description that you are not satisfied with applying those graph theory metrics to the complete matrix data, and are seeking a way to use only a subset of edges within the matrix in those calculations. You have then decided that that subset of edges is to be selected through explicit statistical testing of some hypothesis (which, by design, differs from whatever implicit or explicit hypothesis you may be intending to test through the use of graph theory metrics). I am, however, skeptical of that last decision. If the primary experimental output at which you are aiming is the application of graph theory metrics, then explicit edge-wise statistical inference leading up to it seems to me to be out of place. If my first statement (re: wanting to do graph theory but only involving a subset of edges) is correct, then simply applying a threshold to select some fraction of edges where the association is strongest would seem to make more sense. But it’s hard to know without being intimately familiar with your whole project.
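As an illustration of that last suggestion, here is a sketch (assuming NumPy and a symmetric connectome matrix with an empty diagonal; the `keep_fraction` parameter is arbitrary) of retaining only the strongest fraction of edges by absolute effect size, rather than by statistical significance:

```python
import numpy as np

def mask_strongest_edges(effect, keep_fraction=0.1):
    """Keep only the strongest `keep_fraction` of unique edges
    (by absolute effect size); zero out the rest. Ties at the
    cutoff may retain slightly more than the requested fraction."""
    iu = np.triu_indices_from(effect, k=1)
    vals = np.abs(effect[iu])
    k = max(1, int(round(keep_fraction * vals.size)))
    cutoff = np.sort(vals)[-k]  # k-th largest absolute value
    masked = np.where(np.abs(effect) >= cutoff, effect, 0.0)
    np.fill_diagonal(masked, 0.0)
    return masked
```

Graph theory metrics would then be computed on the masked matrix, with the chosen fraction reported (and ideally varied, to check the robustness of the conclusions).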

I want to narrow down the network so that only a few nodes are significant.

There’s an immediate issue here; even though it’s quite possibly just a communication error, it’s nevertheless worth exploring.

Both NBS and TFNBS attribute statistical significance to edges, not nodes. While you can, e.g., look post hoc at the subset of nodes for which there is at least one statistically significant edge, statistical significance is not ascribed to the nodes themselves.
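For completeness, that post hoc node subset is trivial to extract; a sketch assuming a symmetric matrix of edge-wise p-values (names illustrative only):

```python
import numpy as np

def nodes_with_significant_edge(p_values, alpha=0.05):
    """Post hoc: indices of nodes that participate in at least one
    statistically significant edge. Note that this does NOT ascribe
    statistical significance to the nodes themselves."""
    significant = p_values < alpha
    np.fill_diagonal(significant, False)  # ignore self-connections
    return np.nonzero(significant.any(axis=1))[0]
```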

That’s not to say that statistical significance can’t be ascribed to nodes. Any quantitative value calculated on a per-node basis can be tested for statistical significance. If the nodes are sufficiently large as to be considered independent of one another, you could use the vectorstats command (which does not apply statistical enhancement of any form). Alternatively, something I’ve thought about in the past but never been sufficiently motivated to implement is the concept of taking complete connectome matrix data as input, but performing statistical inference upon nodes. The GLM would be fitted for each edge, but some statistical enhancement algorithm could then be defined that generates an enhanced statistic per node; the simplest case would be to take the mean statistic across all edges connecting each node, though anything higher-order could theoretically be used. It’s worth contemplating whether or not this is actually what you’re looking for…
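The simplest case just described (mean statistic across all edges connecting each node) would look something like the following; this is a hypothetical sketch, not an existing MRtrix3 feature:

```python
import numpy as np

def nodewise_enhanced_statistic(edge_stats):
    """Hypothetical node-level enhancement: for each node, take the mean
    of the edge-wise test statistics across all edges incident to it.
    `edge_stats` is a symmetric node-by-node matrix with a zero diagonal."""
    n = edge_stats.shape[0]
    # Each row sums the statistics of the (n - 1) edges incident to a node
    return edge_stats.sum(axis=1) / (n - 1)
```

Inference would then proceed per node, e.g. by comparing each node's enhanced statistic against a permutation-derived null distribution of the maximal node-wise statistic.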

Is there a good approach to identifying more significant nodes/edges in an edge-wise analysis?

Well, it would be nice to have a magic button that yields more extensive results from statistical inference, but such buttons are in short supply. It does happen to be the case in both NBS and TFNBS that perturbations to the statistical enhancement parameters can have a considerable effect on the results of inference. However, such a proposal rather reinforces my prior point: apart from introducing a multiple comparisons problem by trialling different enhancement parameters, it is entirely unconventional to modulate the extent of the results of statistical inference in order to tailor them to the needs of some downstream analysis. This is why I think simply masking the connectome based on the magnitude of the effect may in fact be more sensible.

Cheers
Rob