Hi Lena,
Understanding the output of connectomestats using the nbs algorithm requires a slightly different interpretation of what NBS is doing fundamentally. A large part of the justification for this is that the code path for NBS is almost precisely equivalent to that of NBSE, and indeed is really not all that different to that of CFE. Making the code and algorithms as generalised as possible not only reduces our total volume of code and means that improvements can trivially be made available across multiple commands, but can also provide insight into the fundamental nature of the data manipulations.
Note that NBSE intrinsically performs statistical inference at the edge level. Even though the results it produces most frequently form connected sub-networks, inference is not performed upon those sub-networks as a whole; each edge is independently statistically significant. The connectedness of these networks is an emergent property of the statistical inference step, arising from the fact that connected edges enhance the statistical belief in one another under the hypothesis of a common genuine biological effect; but the inference is per-edge. This is similar to CFE, where macroscopic white matter bundles emerge as being statistically significant, even though inference was performed at the fixel level.
As you said (I think), a perhaps more intuitive output from NBS would be the following: for each connected sub-network that survives p < 0.05 following permutation testing (note that there may be more than one), write a text file containing a binary matrix that highlights those edges that are part of the significant sub-network, and possibly also provide the p-value for that network. This would however clearly be quite inconsistent with both NBSE and CFE, which report significance at the edge / fixel level, and would therefore require a considerably different code path.
The above equates to the question:
“Are there any sub-networks that are larger than predicted by chance? If so, show me the network and the p-value.”
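If output of that form were wanted, it could in principle be reconstructed post hoc from the files connectomestats already writes, by thresholding the corrected p-values and extracting connected components. Here is a minimal Python sketch of the idea; note that the file name fwe_1mpvalue.csv (a matrix of 1 minus the FWE-corrected p-value per edge) and its formatting are assumptions on my part, so adjust them to match what your version of the command actually produces.

```python
# Minimal sketch (not part of MRtrix3): extract one binary matrix and one
# p-value per significant sub-network from the corrected p-value output.
import numpy as np
from scipy.sparse.csgraph import connected_components

# Assumed file name: a symmetric matrix of 1 - (FWE-corrected p-value) per edge;
# adjust the name / delimiter to match your connectomestats output.
one_minus_p = np.loadtxt("fwe_1mpvalue.csv")
significant = one_minus_p > 0.95                        # edges with corrected p < 0.05

# Group nodes into connected components using only the significant edges
n_comp, node_labels = connected_components(significant, directed=False)
for comp in range(n_comp):
    in_comp = node_labels == comp
    subnet = significant & np.outer(in_comp, in_comp)   # binary matrix for this component
    if not subnet.any():
        continue                                        # isolated node: no sub-network here
    pval = 1.0 - one_minus_p[subnet].min()              # identical for every edge of an NBS sub-network
    np.savetxt(f"subnetwork_{comp}.csv", subnet.astype(int), fmt="%d", delimiter=",")
    print(f"sub-network {comp}: {subnet.sum() // 2} edges, corrected p = {pval:.4f}")
```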
connectomestats using the nbs algorithm instead re-formulates this question as:
“Is the size of the sub-network comprising this particular edge larger than predicted by chance?”
Pragmatically this has a number of implications. In both CFE and NBSE, the file enhanced.* provides the enhanced t-statistic value for each fixel / edge, as resulting from statistical enhancement. This then allows the data from each fixel / edge to be compared against the null distribution independently. In NBS, identification of the sub-network size acts as the equivalent of “statistical enhancement”: the original t-statistic is “enhanced” by replacing it with the size of the connected sub-network within which all t-values exceed some prescribed threshold. Therefore, it is the total size of the connected sub-network that is compared to the null distribution. But instead of doing this once for each sub-network, in connectomestats this value is simply duplicated across all edges comprising the sub-network (these are the values you see in enhanced.csv), and comparison to the null distribution is done (somewhat redundantly) independently for each edge. Hence, despite inference being applied by connectomestats at the edge level rather than the sub-network level (and therefore being entirely in common with NBSE), in the case of NBS the resulting corrected p-value is equivalent for all edges within the sub-network, and hence the entire sub-network is invariably extracted as a whole when applying a p-value threshold.
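To make the “enhancement equals sub-network size” idea concrete, here is a simplified Python sketch of that step; this is purely an illustration of the concept under the assumptions noted in the comments, not the actual MRtrix3 implementation.

```python
# Simplified illustration of NBS "enhancement" (not the MRtrix3 implementation):
# each supra-threshold edge's t-value is replaced by the size of the connected
# sub-network it belongs to; that size is what gets compared to the null distribution.
import numpy as np
from scipy.sparse.csgraph import connected_components

def nbs_enhance(tvalues: np.ndarray, threshold: float) -> np.ndarray:
    """tvalues: symmetric matrix of per-edge t-statistics (assumption for this sketch)."""
    supra = tvalues > threshold                          # binarise at the chosen t-threshold
    _, node_labels = connected_components(supra, directed=False)
    enhanced = np.zeros_like(tvalues, dtype=float)
    for comp in np.unique(node_labels):
        in_comp = node_labels == comp
        comp_edges = supra & np.outer(in_comp, in_comp)
        size = comp_edges.sum() // 2                     # symmetric matrix: each edge counted twice
        enhanced[comp_edges] = size                      # same value duplicated across the whole sub-network
    return enhanced
```

The return value of such a function corresponds to what you see in enhanced.csv for NBS.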
Hopefully this makes sense…
Regarding the other output files, we probably need to add somewhere (either in the online documentation or in the commands’ help pages) descriptions of the output files, since they’re consistent across all statistical inference commands.
Rob