Generating vertex-wise connectivity matrix

Hi all,
I am attempting to construct a connectivity matrix wherein each node is a vertex of a freesurfer white matter surface. I have transformed the tractogram to anatomical space, extracted the endpoints of each fiber, and visually checked that they are close to the freesurfer white matter surface (there are small regions of discrepancy because an FSL segmentation was used to generate the 5tt image, so the wm/gm boundary differs from the freesurfer surface). I originally intended to write a nearest neighbor routine that would assign each fiber endpoint to its nearest white matter surface vertex, and manually construct the connectivity matrix, but I am wondering if there is a way that I can make use of the tck2connectome tool to construct this matrix. Is there a way that I can create a labelled parcellation image in which each node refers to a specific point on the freesurfer white matter surface and use this in tck2connectome?
Thanks so much!

Hi @hptaylor,

We will hopefully eventually have tools for doing various types of processing based on surface data, including per-vertex connectomes, but there’s a decent amount of work required for such.

I’d also note that per-vertex connectivity matrices require either huge numbers of streamlines (extrapolate this data to your number of vertices), or some form of connectivity smoothing to account for the relative sparsity of streamline counts (I’ve seen a few papers do this, but can’t think of any off the top of my head to cite).

The issue with going from vertex labelling to a 3D image is that the inherent inaccuracies of that process are grossly exaggerated with more fine-grained labelling. With < 100 nodes per hemisphere, the inappropriate labelling of streamlines that terminate right at the interface between parcels when using an image rather than the surface itself is probably not of huge consequence; but when the distance between vertex-based labels gets close to the image voxel size, such inappropriate assignments become more dominant. With per-vertex labelling this could potentially be quite problematic (just imagine if two vertices resided in the same image voxel). Therefore, if you really wanted to use tck2connectome, I would suggest generating a template image at a higher spatial resolution (maybe a voxel size of 1/4 of the distance between vertices), and projecting each vertex label to the voxel in which it resides. The default radial search assignment mechanism in tck2connectome may need to have its maximal distance increased in order to find parcels that are only labelled sparsely within the image, rather than being “filled in”; but utilising that capability would be easier than performing any kind of “dilation” of labels (an alternative solution to the streamline-to-node assignment issue, but I don’t have a script for doing such, and the radial search mechanism can be thought of as a more accurate version of it).
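Something along these lines would be a starting point (a rough sketch only: the filenames, the upsampling factor, and the assumption that the surface coordinates have already been converted into the scanner space of the reference image are all placeholders):

```python
import numpy as np
import nibabel as nib

# Hypothetical inputs: a reference anatomical image and a white matter surface
# whose vertex coordinates are already in the same world space as that image.
ref = nib.load('T1.nii.gz')
vertices, faces = nib.freesurfer.read_geometry('lh.white')

upsample = 4  # e.g. voxel size ~1/4 of the inter-vertex spacing
new_affine = ref.affine.copy()
new_affine[:3, :3] /= upsample
new_shape = tuple(np.array(ref.shape[:3]) * upsample)

# Map world coordinates to voxel indices in the upsampled grid
ijk = nib.affines.apply_affine(np.linalg.inv(new_affine), vertices)
ijk = np.round(ijk).astype(int)

# Label image: voxel value = vertex index + 1 (0 remains background).
# Note: if two vertices still fall in the same voxel, the later one wins.
labels = np.zeros(new_shape, dtype=np.uint32)
inside = np.all((ijk >= 0) & (ijk < new_shape), axis=1)
labels[tuple(ijk[inside].T)] = np.arange(len(vertices), dtype=np.uint32)[inside] + 1

nib.save(nib.Nifti1Image(labels, new_affine), 'vertex_labels.nii.gz')
```

The resulting image could then be passed to tck2connectome, with -assignment_radial_search set to something a little larger than the inter-vertex spacing so that the sparse labels can actually be found.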

Rob

Hi Rob,
Thanks so much for your response! To your point about the sparsity of the adjacency matrix, I should have mentioned that in my formulation, vertices on the white matter surface are connected both via the triangular mesh connections that define the surface and by the connections derived from fiber tracking, which is the same method as used in Atasoy et al. 2016. Maybe that is one of the papers you were referring to that applies a form of connectivity smoothing? This method also has the benefit of encapsulating information about the shape of the manifold the adjacency matrix describes. Additionally, I am able to vary the number of vertices in the white matter surface mesh (and therefore the size of the adjacency matrix), which has been helpful for tweaking the accuracy of the vertex assignment for streamline endpoints.

It probably wasn’t the most efficient method, but I actually ended up using a kd tree nearest neighbors search routine, and just filtered the streamline connections based on a tolerance for the distance between the endpoints and their nearest surface vertex. I computed the laplacian matrix of the resulting adjacency matrix, and solved for its eigenvectors/values, and color mapped the eigenvectors back onto the white matter surface, which generated some very pretty images that parcellate the surface into regions of increasing spatial frequency (and look promising for the remainder of my analysis).
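In case it’s useful, the core of it looks roughly like this (a simplified sketch rather than my actual script; the filenames, the tolerance value and the number of eigenpairs are just placeholders):

```python
import numpy as np
import nibabel as nib
from scipy.spatial import cKDTree
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import eigsh

# Placeholder inputs: the surface and pre-extracted streamline endpoints,
# assumed to already be in the same anatomical space.
vertices, faces = nib.freesurfer.read_geometry('lh.white')   # (V, 3), (F, 3)
endpoints = np.load('endpoints.npy')                          # (S, 2, 3)
tol = 2.0   # mm: tolerance for accepting an endpoint-to-vertex assignment

# Assign each streamline endpoint to its nearest surface vertex
tree = cKDTree(vertices)
d0, i0 = tree.query(endpoints[:, 0, :])
d1, i1 = tree.query(endpoints[:, 1, :])
keep = (d0 < tol) & (d1 < tol) & (i0 != i1)

# Combine mesh edges (from the triangulation) with streamline-derived edges
mesh_edges = np.vstack([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]])
track_edges = np.column_stack([i0[keep], i1[keep]])
edges = np.vstack([mesh_edges, track_edges])

# Binary, symmetric adjacency matrix
V = len(vertices)
A = coo_matrix((np.ones(len(edges)), (edges[:, 0], edges[:, 1])), shape=(V, V))
A = ((A + A.T) > 0).astype(float)

# Graph Laplacian and its lowest-frequency eigenpairs
L = laplacian(A.tocsr())
vals, vecs = eigsh(L, k=50, sigma=-1e-3, which='LM')   # shift-invert near zero
```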

Interestingly, when I increase the number of streamlines (before the nearest neighbor filtering) from 250k to 1M, regardless of the resolution of the surface mesh, the resulting eigenvectors of the graph laplacian matrix seem to be meaningless. Specifically, the eigenvectors with the lowest eigenvalues do not divide the surface into the expected regions, and instead have nonzero values focused solely in the corpus callosum/thalamus. This is the issue I am currently trying to address. Do you know of any difficulties that arise from spatially transforming large numbers of fibers? I used the same process to transform both the 250k and 1M tractograms (described by @jdtournier in a previous thread) from diffusion space to the anatomical space in which the white matter surface resides. If not, do you have any input as to why the graph laplacian derived using 1M streamlines and the method described above might give seemingly less meaningful eigenvectors?

Thanks so much for reading and for your help/input!

Patrick

It probably wasn’t the most efficient method, but I actually ended up using a kd tree nearest neighbors search routine, and just filtered the streamline connections based on a tolerance for the distance between the endpoints and their nearest surface vertex.

I believe an octree is the way to go, though you need to be able to deal with the same vertex appearing in multiple lookup domains. @chunhungyeh is the person to talk to if you’re really into the guts of this stuff.

I computed the laplacian matrix of the resulting adjacency matrix, and solved for its eigenvectors/values, and color mapped the eigenvectors back onto the white matter surface, which generated some very pretty images that parcellate the surface into regions of increasing spatial frequency (and look promising for the remainder of my analysis).

Looking forward to seeing it!

Interestingly, when I increase the number of streamlines (before the nearest neighbor filtering) from 250k to 1M, regardless of the resolution of the surface mesh, the resulting eigenvectors of the graph laplacian matrix seem to be meaningless.

1M is still a very small number for a whole-brain tractogram looking at vertex-wise connectivities, regardless of the nature of the processing of said data. For comparative purposes I’d typically use ~ 10M streamlines for ~ 100 nodes.

As far as the graph Laplacian is concerned, I’m not familiar with the method (and am not really in a position to be familiarising myself right now given the HBM deadline). But from a naive perspective, there must be some kind of ‘weighting’ between streamlines-based vertex-vertex connectivity and surface-based vertex adjacency if they are to be combined into a single adjacency matrix. So if the contribution of streamlines is not ‘appropriately’ scaled according to the total number of streamlines in the tractogram, the relative influence of each of these two sources of information will be modulated by changing the number of streamlines, which may influence the outcome.
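To make that concrete, I’m imagining something along these lines (purely hypothetical variable names and weighting; I don’t know how the method actually combines the two):

```python
import numpy as np

def combined_adjacency(A_mesh, C_streamlines, weight=1.0):
    """Combine mesh adjacency with streamline connectivity.

    A_mesh: binary (V x V) adjacency from the surface triangulation.
    C_streamlines: (V x V) streamline counts between vertex pairs.
    Normalising the counts by their total means that doubling the number of
    streamlines in the tractogram does not double their influence on the graph.
    """
    return A_mesh + weight * (C_streamlines / C_streamlines.sum())
```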

Anyone else who knows more about the method, feel free to jump in.

Rob

Thanks for the input!
I’m still not sure I understand your point about the number of streamlines necessary for a given number of vertices. My interpretation of the paper you pointed me to was that a large number of streamlines are necessary to achieve acceptably low variability in individual vertex-vertex connectivity weights. In the formulation I am using, streamline-based edges and surface-based edges are 1 if two vertices are connected, and 0 if they are not (so to your point about weighting the two, I have yet to fiddle with it). Since my goal is to use the eigenvectors of the laplacian matrix as a basis for functional activation, it seems like the reproducibility of individual edge values isn’t of paramount importance. Instead, I am concerned with the reproducibility of the geometry of the eigenvectors of the graph laplacian. This leads me to think that as long as the number of streamline-based connections isn’t dwarfed by the number of surface connections, I shouldn’t need an obscenely large number of streamlines to achieve geometrically reproducible eigenvectors. Am I crazy, overlooking something obvious, or both?

I guess stated differently, my question is: how many streamlines are needed to accurately represent the large scale geometry of structural connectivity? I know this depends on many factors; I have been using preprocessed HCP WU-Minn data, single shell response function estimation, and tcksift to generate tractograms.

Thanks again,
Patrick

I’m still not sure I understand your point about the number of streamlines necessary for a given number of vertices.

That tends to happen a bit on here as I try to give concise answers so as to actually have some time for my own work :-/

In the formulation I am using, streamline-based edges and surface-based edges are 1 if two vertices are connected, and 0 if they are not …

OK. A useful way of thinking about what you’re doing here is that you’re producing a weighted connectivity matrix, but then binarising the matrix by applying a threshold of 1 streamline. So any two vertices connected by at least one streamline have a value of 1 in the matrix, and all other vertex pairs have a value of 0.

Because you are (implicitly) applying a hard-coded threshold that does not scale with the number of streamlines generated, the density of that binary connectome (the fraction of all edges with a value of 1) will progressively increase as you generate more streamlines; potentially even all the way to 100% density if you were to leave a probabilistic streamlines tractography algorithm running long enough. So the matrix you get, and hence any concomitant calculations you perform, are going to be highly sensitive to the number of streamlines you generate, unless you use a deterministic tractography algorithm. This effect is over and above the issue of variability, whereby doing the tracking experiment twice independently (with the same number of streamlines each time) may give considerably different matrix outputs. Exactly how much such variability may “propagate” into higher-level calculations based on those matrices depends on the numerical properties of those calculations.
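As a toy illustration of the effect (completely synthetic connection probabilities, just to show the trend):

```python
import numpy as np

rng = np.random.default_rng(0)
V = 1000                                     # number of vertices
# Synthetic "true" connection probabilities, heavily skewed towards weak edges
p = rng.pareto(3.0, size=(V, V))
p = np.triu(p / p.sum(), k=1)                # probability that any one streamline lands on each edge

n_edges = np.triu(np.ones((V, V), dtype=bool), k=1).sum()
for n_streamlines in (250_000, 1_000_000, 10_000_000):
    counts = rng.poisson(p * n_streamlines)  # simulated streamline counts per edge
    binary = counts >= 1                     # binarisation with a fixed threshold of 1 streamline
    print(f"{n_streamlines:>10d} streamlines -> connectome density {binary.sum() / n_edges:.3f}")
```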

It’s also worth noting that using SIFT doesn’t make a great deal of sense in this context: it’s predicated on modulating weighted connection densities, which are subsequently discarded if you binarise the connectome with a threshold of one streamline.

How many streamlines are needed to accurately represent the large scale geometry of structural connectivity?

That’s very hard to answer given your binarisation procedure. The brain is known to show orders of magnitude differences in inter-areal connection strengths, so discarding that information makes it very difficult to justify any subsequent experimental details on the grounds of biological accuracy.

Thanks so much for taking the time to elaborate, it’s a massive help for me to get advice from someone so knowledgeable about the tools I’m using. I understand you have work of your own, so feel free to forego a response in order to focus on it.

The way you phrase it as a threshold does seem very helpful.

For what it’s worth, it is very easy for me to weight edges by the number of streamlines connecting the vertices that define that edge, and I have tried implementing this (i.e. edge weight = number of streamlines connecting the vertices). For the number of tracks I have used (up to 10M), it does not seem to make a huge difference in at least the low-frequency eigenvectors. My best guess is that this is due to the very high resolution of the surface mesh that defines the vertices of the graph; my most recent implementation has about 300k vertices. Given this and a cursory inspection of the total number of connections per vertex, it doesn’t seem like many of the edges are being given a weight greater than 1.

I guess a simple way to investigate the effect of edge weighting would be to resample the surface mesh to a lower resolution with something like 20k or even 10k vertices, relax the exclusion tolerance in my nearest neighbors method, and recalculate the eigenvectors of the laplacian matrix resulting from the weighted adjacency matrix. My only worry here is that with a lower mesh resolution, information about individual subjects’ white matter surface geometry might be lost?

To your point about probabilistic vs deterministic tracking, I will re-run the probabilistic tracking and all subsequent calculations, compute the normalized dot product between each eigenvector from both runs, see how close to 1 they are, and let you know. If it turns out they are not close to 1, I will switch to a deterministic method.
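Concretely, I’m planning something like this (a sketch; the eigenvector arrays are assumed to hold the eigenvectors column-wise, one array per run):

```python
import numpy as np

# vecs_a, vecs_b: (V, k) arrays of eigenvectors (columns) from two independent runs.
# The sign of each eigenvector is arbitrary, so compare absolute dot products.
def eigenvector_similarity(vecs_a, vecs_b):
    a = vecs_a / np.linalg.norm(vecs_a, axis=0)
    b = vecs_b / np.linalg.norm(vecs_b, axis=0)
    return np.abs(np.einsum('vk,vk->k', a, b))   # one value per eigenvector; 1 = identical
```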

As for using SIFT, I mostly was using it because, to my best understanding, it eliminates fibers that under my binarisation procedure would be irrelevant. With fewer fiber endpoints, my python script that identifies nearest neighbors, constructs the adjacency and laplacian matrices, and finds a specified number of lowest frequency eigenpairs requires less RAM and runs quicker. I guess my underlying thought was that if I originally generate 10M tracks, then sift them down to ~3M based on their redundancy, I get a better representation of the major white matter geometry under my graph construction than if I was to generate 3M fibers and use them as is. Is there any sense to that, or am I conceptualizing what SIFT actually does incorrectly? I suppose I can just compare the eigenvectors resulting from each method to see.

Finally, just a thought on biological accuracy: what I’m seeking to do is construct a basis of steady-state spatial patterns of functional activation on the cortex. Given the complete set of eigenvectors, regardless of the laplacian matrix they are calculated from, it is guaranteed that any state of functional activation on the cortex can be reconstructed at the resolution of the surface mesh, simply because the eigenvectors are orthogonal and complete. So, the goal of most importance is really to construct this basis so as to give low reconstruction error for a given state of functional activation with a relatively small portion of the spectrum. Eigenvectors of the surface alone accomplish this to some degree, but the idea of including white matter fibers in the formulation of the laplacian matrix intuitively makes sense to me, as functional activation patterns / functional networks must be rooted in structural connectivity. All of this is to say, the ‘biological accuracy’ of the laplacian matrix I construct is important to my project only insofar as it affects the efficiency of its spectrum in representing patterns of functional activation and connectivity on the cortex.
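In code terms, the quantity I care about is something like the following (a sketch; `vecs` is assumed to hold the lowest-frequency laplacian eigenvectors column-wise, and `x` a functional activation map sampled at the mesh vertices):

```python
import numpy as np

def reconstruction_error(x, vecs, k):
    """Relative error when x is reconstructed from the first k eigenvectors."""
    basis = vecs[:, :k]            # (V, k), orthonormal columns
    coeffs = basis.T @ x           # spectral coefficients
    x_hat = basis @ coeffs         # truncated reconstruction
    return np.linalg.norm(x - x_hat) / np.linalg.norm(x)
```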

Hopefully, if any tractography experts make it onto my (physics) thesis panel, I can use a similar rationale to explain my choices.

My only worry here is that with a lower mesh resolution, information about individual subjects’ white matter surface geometry might be lost?

The HCP map everything to a 32k mesh in addition to a higher resolution one, and from memory that provides a ~ 2mm spacing between vertices, which is about the spatial resolution of the fMRI acquisition and hence doesn’t incur either major losses in resolution or major redundancy when resampling the acquired data into that space. I would expect a 2mm spacing to still preserve subject differences in cortical folding or the like. Besides, any inter-subject differences finer than that likely couldn’t be reconstructed with diffusion MRI tractography anyways.

As for using SIFT, I mostly was using it because, to my best understanding, it eliminates fibers that under my binarisation procedure would be irrelevant.

That’s probably not wholly accurate. I commonly find misinterpretations that SIFT somehow removes false positives or isolated (i.e. highly unique) trajectories, neither of which is really the case. It’s better thought of as a modulation of the relative densities of different trajectories; following on from which, if the individual trajectories of interest (at whatever resolution / granularity they may be) are not adequately densely reconstructed, it becomes impossible to appropriately modulate their relative densities, particularly with SIFT, which is constrained to operate using “whole streamline retain / whole streamline remove” operations only.

With fewer fiber endpoints, my python script that identifies nearest neighbors, constructs the adjacency and laplacian matrices, and finds a specified number of lowest frequency eigenpairs requires less RAM and runs quicker.

Unsurprising; but is it feasible with more streamlines? Particularly in a proof-of-concept stage, I’d err toward what makes the most sense in terms of scientific validity and robustness rather than what can or cannot be run on a local system. Do you have access to local HPC infrastructure? Locally here we can get compute nodes with as much as 1.5TB of RAM.

I guess my underlying thought was that if I originally generate 10M tracks, then sift them down to ~3M based on their redundancy, I get a better representation of the major white matter geometry under my graph construction than if I was to generate 3M fibers and use them as is. Is there any sense to that, or am I conceptualizing what SIFT actually does incorrectly?

In general, yes; but the theoretical benefits of SIFT will only be achieved if the magnitude of the modulation of streamlines-based connection densities, in the space in which you are quantifying them, is not exceeded by experimental variability. This is why there are measurable benefits when generating 100M streamlines, reducing to 10M using SIFT, and assigning to ~ 100 nodes, whereas when generating 10M, reducing to 3M, and assigning to ~ 100,000 nodes, the effects specifically from SIFT may be swamped by variance.