Tckgen - Number of streamlines generated

Hi,

I’m using tckgen to generate a certain number of streamlines between two ROIs. However, I’m having trouble understanding the termination criteria for the algorithm. I understand that tckgen terminates when one of the following three criteria is met:

  • When the number of fibers generated reaches the desired maximum value (as given by the -maxnum parameter).

  • When the number of fibers selected reaches the desired value (as given by the -number parameter).

  • When the number of seeds is finite and all of them have been used.

I notice in my case that tckgen most often does not generate the number of streamlines specified by -maxnum, and instead stops as soon as it has selected 1000 fibers. In fact, it seems to terminate either when it reaches the specified maximum number of fibers generated or when it has selected 1000 fibers, whichever happens first. I have attached two snippets of code that I used to identify the problem: one where I set -maxnum to 50000 and another where I set -maxnum to 500000. In the former case, tckgen stopped as soon as 50000 streamlines were generated, whereas in the latter it stopped as soon as 1000 fibers were selected, even though the requested maximum number of generated streamlines had not been reached.

tckgen -force -include -46.17,15.89,38.81,15 -include -30.53,-63.44,36.25,15 -seed_sphere -46.17,15.89,38.81,15 -seed_sphere -30.53,-63.44,36.25,15 -act ACT_5tt.nii -backtrack -crop_at_gmwmi CSD.mif track_1.tck -maxnum 50000
tckgen: [WARNING] existing output files will be overwritten
tckgen: [100%]    50000 generated,      310 selected

tckgen -force -include -46.17,15.89,38.81,15 -include -30.53,-63.44,36.25,15 -seed_sphere -46.17,15.89,38.81,15 -seed_sphere -30.53,-63.44,36.25,15 -act ACT_5tt.nii -backtrack -crop_at_gmwmi CSD.mif track_1.tck -maxnum 500000
tckgen: [WARNING] existing output files will be overwritten
tckgen: [100%]   162618 generated,     1000 selected

I’m not sure why this is happening. Any help would be appreciated.

Thanks,
Varsha

So I think the confusion here stems from the fact that tckgen will always stop once -number streamlines have been selected, or -maxnum streamlines have been generated. The latter is a safeguard to avoid having tckgen stuck generating tracks that don’t ever meet the selection criteria. As far as I can tell, it’s -number you want to set here…

Actually, I think the issue is that the ‘hidden default’ of 1000 streamlines selected is being applied despite the fact that the -maxnum option has been provided. I think the expectation is that the default behaviour only kicks in if both options are absent. So if @Varsha provides -maxnum 500000, that should behave equivalently to -maxnum 500000 -number 0. This makes sense to me, and shouldn’t be too hard to implement.
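
For anyone reading this later, here is a rough Python sketch of the stopping rule as described in this thread. It is a toy model, not the tckgen source: run_tracking and accept are made-up names, the fixed acceptance pattern (one in every 162 streamlines, loosely matching the logs above) is an assumption, and treating 0 as "no limit" follows the -number 0 usage mentioned above.

```python
def run_tracking(accept, number=1000, maxnum=0):
    """Toy model of the stopping rule discussed in this thread.

    accept(i) returns True if the i-th generated streamline is selected.
    number: stop once this many streamlines are selected (0 = no limit).
            The default of 1000 mirrors the 'hidden default' described above.
    maxnum: stop once this many streamlines are generated (0 = no limit).
    """
    if not number and not maxnum:
        raise ValueError("at least one of number/maxnum must be non-zero")
    generated = selected = 0
    while True:
        if number and selected >= number:
            break  # enough streamlines selected
        if maxnum and generated >= maxnum:
            break  # cap on generated streamlines reached
        generated += 1
        if accept(generated):
            selected += 1
    return generated, selected


# Assumed acceptance pattern: every 162nd streamline passes the ROI criteria.
accept = lambda i: i % 162 == 0

print(run_tracking(accept, maxnum=50000))             # (50000, 308)
print(run_tracking(accept, maxnum=500000))            # (162000, 1000)
print(run_tracking(accept, number=0, maxnum=500000))  # (500000, 3086)
```

The second call reproduces the behaviour reported in the original post: the default selection target of 1000 ends the run long before the -maxnum cap is reached. The third call shows why -number 0 makes the cap the only stopping condition.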

That is exactly what I was assuming would happen. In the meantime, setting -maxnum 500000 -number 0 works well for me: I can see -maxnum streamlines being generated. Thank you for your input.

Another question. How does tckgen generate and select streamlines? Given a seed point, does it generate a streamline (which could be running anywhere across the brain) and then check whether it satisfies the selection criteria (such as whether or not it traverses the specified regions)? If so, am I right in saying that imposing stringent selection criteria will result in the generation of a very large number of streamlines before a desired number can be selected? Wouldn’t this take a really long time to run, particularly if one is implementing ACT?

OK, I can see how that would make sense depending on what people’s expectations are. The original intent of -maxnum was to allow users to tell tckgen to try really hard to find the desired number of streamlines, even if that meant fewer than 1% of those generated actually ended up selected. And the idea of placing a cap on the number generated was that you sometimes ended up with never-ending tckgen runs if there was simply no way to connect the ROIs (contrary to popular opinion, probabilistic streamlines doesn’t necessarily allow you to connect any arbitrary pair of ROIs…).

That approach was designed to suit a targeted tractography use case, where the aim is to get a representative delineation of an individual tract. It still works in a whole-brain context for most applications, for example where you want the same overall density of streamlines across subjects, etc.

But it seems many users do expect a fixed density of seeds, irrespective of how many seeds may fail. There are situations where such an approach might be appropriate (e.g. tractography-based parcellation of the thalamus), but these are catered for explicitly with the -seed_*_per_voxel options. I’m not convinced I can think of any other use cases where a fixed number of attempts makes sense: it implies that the number of resulting streamlines is being interpreted in some way, which is not considered a good idea. If you do want to interpret streamline counts, then an approach like SIFT (or SIFT2) is required, and in this case I’d argue that a fixed number of selected streamlines is the right target.

So to cut a long story short: while I can appreciate that users may expect that setting -maxnum would override -number, I can’t think of many use cases where that would be the recommended approach. I would prefer that we clarify in the documentation that the intention of -maxnum is only to increase the cap on the number of generated streamlines, with the final criterion still being the number of selected streamlines. Maybe also point users to the -seed_*_per_voxel options at this point, as they will most likely be more appropriate for legitimate use cases. It might be nice to also find clearer terminology for what we mean by a generated streamline, since I can appreciate it isn’t so obvious unless you understand the inner workings of tractography and our particular implementation of it. I’m open to suggestions on that front…

Yes, that’s exactly right. We can’t know whether a streamline will satisfy the criteria for selection until it’s fully generated (or enters an exclusion ROI, in which case we throw it out straight away).
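
As a rough illustration of that generate-then-check logic, here is a minimal Python sketch. It is not how tckgen is implemented: classify and in_sphere are hypothetical helpers, a streamline is just a list of points, and the ROIs are spheres given as (x, y, z, radius), as in the -include arguments above. The exclusion sphere is invented purely for the example.

```python
import math

def in_sphere(point, sphere):
    """True if point (x, y, z) lies inside sphere (cx, cy, cz, radius)."""
    cx, cy, cz, radius = sphere
    return math.dist(point, (cx, cy, cz)) <= radius

def classify(points, include, exclude=()):
    """Return (status, points_examined) for one toy streamline.

    Entering an exclusion ROI rejects the streamline immediately;
    inclusion ROIs can only be judged once every point has been seen.
    """
    visited = [False] * len(include)
    for n, p in enumerate(points, start=1):
        if any(in_sphere(p, s) for s in exclude):
            return "rejected", n  # thrown out straight away
        for i, s in enumerate(include):
            if in_sphere(p, s):
                visited[i] = True
    status = "selected" if all(visited) else "rejected"
    return status, len(points)

# The two inclusion spheres from the commands at the top of the thread.
roi_a = (-46.17, 15.89, 38.81, 15)
roi_b = (-30.53, -63.44, 36.25, 15)

# A straight 11-point path from the centre of roi_a to the centre of roi_b.
path = [tuple(a + t / 10 * (b - a) for a, b in zip(roi_a[:3], roi_b[:3]))
        for t in range(11)]

print(classify(path, [roi_a, roi_b]))       # ('selected', 11)
print(classify(path[:6], [roi_a, roi_b]))   # ('rejected', 6): never reaches roi_b
# Hypothetical exclusion sphere placed at the midpoint of the path.
print(classify(path, [roi_a, roi_b], exclude=[(-38.35, -23.775, 37.53, 5)]))  # ('rejected', 6)
```

The second and third calls both reject the streamline, but only the exclusion case can bail out early; a streamline that simply fails to reach an inclusion ROI has to be tracked to its end before that is known.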

Spot on again. Some tracts are notoriously hard to delineate (e.g. optic radiations), and it can take an enormous number of generated streamlines before any significant number ends up selected. This is where it pays to be savvy with ROI placement and other tricks to help the process along (e.g. with -initdirection, -unidirectional, etc). I’m not sure using ACT necessarily makes such a massive difference here, but yes, it will undoubtedly require more streamlines to be generated than otherwise. Note that this is a problem for targeted tracking in particular, not so much for whole-brain approaches. But this is the nature of probabilistic streamlines; there’s no way around it that I can think of…
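
To put rough numbers on this, the selection rate from a short pilot run can be used to guess how many streamlines a longer run will need to generate. The sketch below is back-of-the-envelope arithmetic only; estimate_generated is a made-up helper, and it assumes the selection rate stays roughly constant over the run.

```python
def estimate_generated(target_selected, pilot_generated, pilot_selected):
    """Estimate how many streamlines must be generated to select
    target_selected, given the counts from a pilot run (rounded up)."""
    if pilot_selected <= 0:
        raise ValueError("pilot run selected no streamlines; cannot estimate")
    # Integer ceiling of target_selected * pilot_generated / pilot_selected.
    return -(-target_selected * pilot_generated // pilot_selected)

# Pilot figures from the first log in this thread: 50000 generated, 310 selected.
print(estimate_generated(1000, 50000, 310))  # 161291
```

That estimate is close to the 162618 streamlines the second run in this thread actually generated before reaching 1000 selected, so a small pilot run gives a reasonable idea of what to pass to -maxnum.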