Tcksift: only 1 tracks read from input file

aszymanski · March 6, 2017, 3:37pm

Hello,

It appears that tcksift has run into a problem with the tracks generated by tckgen. I got this warning:

    tcksift: [WARNING] Only 1 tracks read from input track file; expected 10000000

However, it looks like tckgen was able to create 10,000,000 tracks, based on this output:

    tckgen: [100%] 18546421 generated, 10000000 selected

I used -seed_gmwmi for this instance of tckgen. I also ran tckgen with -seed_image, and that output did not cause the tcksift error. Any suggestions as to what the problem is?

Thank you!

jdtournier · March 6, 2017, 5:36pm

This is unexpected. Any chance you might have run out of space on your storage device…? What does tckinfo report for this file?

aszymanski · March 6, 2017, 7:00pm

It seems it was a lack of storage space. I’ve rerun that portion of the script and it’s all good now. Thank you

aszymanski · December 1, 2017, 3:42pm

Hello,

I’m running into this problem again. I’m doing the Fixel/AFD walkthrough and just created 20 million tracks with tckgen, but tcksift only seems to find 1 track. I ran tckinfo -count and it returned “actual count in file: 19999920”. I’m working on a cluster and cleared out my home directory in case space was the issue again, but that didn’t help. Here are the exact commands I used:

    $ tckgen -angle 22.5 -maxlen 250 -minlen 10 -power 1.0 FOD_template.mif -seed_image voxel_mask.mif -mask voxel_mask.mif -select 20000000 tracks_20_million.tck
    tckgen: [100%] 32695691 seeds, 22442383 streamlines, 20000000 selected

    [as695@blade04 template5]$ tcksift tracks_20_million.tck FOD_template.mif tracks_2_mill_sift.tck -term_number 2000000
    tcksift: [100%] Creating homogeneous processing mask
    tcksift: [100%] segmenting FODs
    tcksift: [100%] mapping tracks to image
    tcksift: [WARNING] Only 1 tracks read from input track file; expected 20000000
    tcksift: [ERROR] Filtering failed; desired number of filtered streamlines is greater than or equal to the size of the input dataset

    $ tckinfo tracks_20_million.tck -count
    ***********************************
      Tracks file: "tracks_20_million.tck"
        count:                20000000
        downsample_factor:    3
        fod_power:            1.0
        init_threshold:       0.100000001
        lmax:                 8
        max_angle:            22.5
        max_dist:             250
        max_num_seeds:        20000000000
        max_num_tracks:       20000000
        max_seed_attempts:    1000
        max_trials:           1000
        method:               iFOD2
        min_dist:             10
        mrtrix_version:       3.0_RC2-82-gbb77205e
        output_step_size:     1.07646811
        rk4:                  0
        samples_per_step:     4
        sh_precomputed:       1
        source:               FOD_template.mif
        step_size:            1.07646811
        stop_on_all_include:  0
        threshold:            0.100000001
        timestamp:            1512053545.4279363155
        total_count:          32695691
        unidirectional:       0
        ROI:                  mask voxel_mask.mif
        ROI:                  mask voxel_mask.mif
        ROI:                  seed voxel_mask.mif
    tckinfo: [done] counting tracks in file
    actual count in file: 19999920

Is the issue potentially that there are 19,999,920 tracks and not 20 million exactly?

Thank you much,
A

jdtournier · December 1, 2017, 4:20pm

I’m not too familiar with this part of the code (@rsmith might want to help me out here…), but this does look a bit odd… So to try to get to the bottom of this:

what is the size of the tracks_20_million.tck:
```
ls -l tracks_20_million.tck
```
what does this report:
```
tckstats tracks_20_million.tck
```
are you sure about the space issue:
```
df -h tracks_20_million.tck
```
has anything happened to FOD_template.mif between tckgen and tcksift? Anything that would cause the streamlines to no longer overlap with the image…?

aszymanski · December 1, 2017, 4:34pm

Thanks for the quick response! The answers to your questions:

    $ ls -l tracks_20_million.tck
    -rwx------ 1 root 15387839012 Nov 30 20:25 tracks_20_million.tck
    $ ls -lh tracks_20_million.tck
    -rwx------ 1 root 15G Nov 30 20:25 tracks_20_million.tck

    $ tckstats tracks_20_million.tck
    tckstats: [100%] Reading track file
    tckstats: [WARNING] expected 20000000 tracks according to header; read 19999920
             mean       median    std. dev.          min          max       count
          66.7682      57.4064      40.4398      10.0399      5914.83     19999920

    $ df -h tracks_20_million.tck
    Filesystem                               Size  Used Avail Use% Mounted on
    //path/to/dir                            4.0T  3.4T  675G  84% /path/to/dir/

has anything happened to FOD_template.mif between tckgen and tcksift? Anything that would cause the streamlines to no longer overlap with the image…?

No, not that I’m aware of. I started tckgen last night and it finished this morning, and immediately ran tcksift once I saw it was done.

Thank you again!

jdtournier · December 1, 2017, 4:51pm

OK, nothing suspicious in there at all… I’m out of ideas. Can you try running the tcksift command with the -debug flag, see if that shows up anything useful?

jdtournier · December 1, 2017, 4:54pm

Actually, there is something a bit odd in your tckstats output: the max length is over 5000 mm, so you have a streamline more than 5 m long in there…?!? And this despite your explicit -maxlen 250 option to tckgen… Something doesn’t sound right here, but I’m not sure what could possibly cause this…

aszymanski · December 1, 2017, 5:01pm

Ah jeez, that’s no good. Good catch, I didn’t even notice that. I’m currently running tcksift with -debug and while it’s not finished, there seem to be issues with finding SH peaks…

    tcksift: [  0%] segmenting FODs... 
    tcksift: [DEBUG] launching thread "sink"...
    tcksift: [DEBUG] waiting for completion of thread "source"...
    tcksift: [ 33%] segmenting FODs... 
    tcksift: [DEBUG] failed to find SH peak!
    tcksift: [ 34%] segmenting FODs... 
    tcksift: [DEBUG] failed to find SH peak!
    tcksift: [DEBUG] failed to find SH peak!
    tcksift: [DEBUG] failed to find SH peak!
    tcksift: [DEBUG] failed to find SH peak!
    tcksift: [DEBUG] failed to find SH peak!

It continues like that until FOD segmentation reaches 100%. It’s currently 57% of the way through mapping tracks to the image.

Also, not sure if this is relevant, but it can’t seem to find a config file?:

    $ tcksift tracks_20_million.tck FOD_template.mif tracks_2_mill_sift.tck -term_number 2000000 -debug
    tcksift: [INFO] reading config file "/etc/mrtrix.conf"...
    tcksift: [DEBUG] reading key/value file "/etc/mrtrix.conf"...
    tcksift: [DEBUG] No config file found at "/home/linux/.mrtrix.conf"

aszymanski · December 1, 2017, 5:21pm

Wait, this could be the maxlen issue:

From the tckgen page:
-maxlength value set the maximum length of any track in mm (default is 100 x voxelsize).

The AFD/Fixel walkthrough (step 19) instructs users to run this command, which uses -maxlen/-minlen rather than -maxlength/-minlength:

    tckgen -angle 22.5 -maxlen 250 -minlen 10 -power 1.0 wmfod_template.mif -seed_image voxel_mask.mif -mask voxel_mask.mif -select 20000000 tracks_20_million.tck

I’ll rerun tckgen with the altered flags and see if the max length value changes.

Also, here’s the rest of the debugging output from tcksift:

    tcksift: [DEBUG] reading key/value file "tracks_20_million.tck"...
    tcksift: [DEBUG] initialising threads...
    tcksift: [DEBUG] launching thread "source"...
    tcksift: [DEBUG] launching 2 threads "pipe"...
    tcksift: [  0%] mapping tracks to image... 
    tcksift: [DEBUG] launching 2 threads "sink"...
    tcksift: [DEBUG] waiting for completion of thread "source"...
    tcksift: [100%] mapping tracks to image
    tcksift: [DEBUG] no writers left on queue "source->pipe"
    tcksift: [DEBUG] thread "source" completed OK
    tcksift: [DEBUG] waiting for completion of threads "pipe"...
    tcksift: [DEBUG] no writers left on queue "pipe->sink"
    tcksift: [DEBUG] no readers left on queue "source->pipe"
    tcksift: [DEBUG] threads "pipe" completed OK
    tcksift: [DEBUG] waiting for completion of threads "sink"...
    tcksift: [DEBUG] no readers left on queue "pipe->sink"
    tcksift: [DEBUG] threads "sink" completed OK
    tcksift: [WARNING] Only 1 tracks read from input track file; expected 20000000
    tcksift: [INFO] Proportionality coefficient after streamline mapping is 3.7425955080267309e-05
    tcksift: [DEBUG] deleting scratch buffer for image "SIFT model processing mask"...
    tcksift: [DEBUG] image "SIFT model processing mask" unloaded
    tcksift: [DEBUG] deleting scratch buffer for image "fixel map voxels"...
    tcksift: [DEBUG] image "fixel map voxels" unloaded
    tcksift: [DEBUG] unmapping file "FOD_template.mif"
    tcksift: [DEBUG] image "FOD_template.mif" unloaded
    tcksift: [ERROR] Filtering failed; desired number of filtered streamlines is greater than or equal to the size of the input dataset

jdtournier · December 1, 2017, 6:50pm

No need to worry about the ‘failed to find SH peak’ messages, they don’t necessarily indicate an outright failure - just that a particular starting point failed, but there are typically multiple restarts (note to self: check that one). Also, no point in retrying with the full option names: that’s a feature of MRtrix3 - it would have failed if there had been any ambiguity. I don’t see anything suspicious in what you’re showing…

aszymanski · December 1, 2017, 7:08pm

Also, no point in retrying with the full option names: that’s a feature of MRtrix3 - it would have failed if there had been any ambiguity.

I see! That’s clever & useful. Good to know.

Yeah, I don’t really know what’s going on. I will say that I ran into something similar yesterday with a much smaller tckgen threshold on a subject: tcksift gave me the same error of only finding one streamline. I reran tckgen on the subject and tcksift worked the second time around. I didn’t change anything except the order of the flags in tckgen. Hopefully this second run of tckgen on my FOD_template will lead to the same result of tcksift deciding to work! Thanks for your help, I’ll keep looking into my data and post if I see anything out of the ordinary.

rsmith · December 3, 2017, 10:48am

tcksift: [WARNING] Only 1 tracks read from input track file; expected 20000000
tcksift: [INFO] Proportionality coefficient after streamline mapping is 3.7425955080267309e-05

OK, that’s weird: It claims only 1 track was read, yet the proportionality coefficient is of the order of magnitude I would expect had a 20M whole-brain tractogram been read successfully. That would suggest there’s something wrong with the delimiters in the track file, and the whole tractogram is being read as a single streamline. But more fundamentally, tckstats and tcksift use the same back-end code to read streamlines files, so there shouldn’t be such a drastic difference in outcomes…
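To illustrate the delimiter hypothesis above, here is a minimal Python sketch. It assumes the documented .tck data layout (float32 triplets, a NaN triplet closing each streamline, an Inf triplet closing the file) and uses made-up coordinates; it is not MRtrix3's actual reader. It shows how a single lost delimiter merges two streamlines into one implausibly long one, which would be consistent with the over-long `max` values reported by tckstats earlier in this thread.

```python
import math
import struct

NAN3 = (math.nan, math.nan, math.nan)
INF3 = (math.inf, math.inf, math.inf)


def pack_triplets(triplets):
    """Pack (x, y, z) triplets as little-endian float32 (Float32LE assumed)."""
    return b"".join(struct.pack("<3f", *t) for t in triplets)


def split_streamlines(data):
    """Split raw triplet data: a NaN triplet ends a streamline, Inf ends the file."""
    streamlines, current = [], []
    for x, y, z in struct.iter_unpack("<3f", data):
        if math.isinf(x):
            break  # end-of-file marker; any unterminated points are dropped
        if math.isnan(x):
            streamlines.append(current)
            current = []
        else:
            current.append((x, y, z))
    return streamlines


def length_mm(points):
    """Sum of distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# Two synthetic 10 mm streamlines, 100 mm apart (hypothetical coordinates).
a = [(float(i), 0.0, 0.0) for i in range(11)]
b = [(float(i), 100.0, 0.0) for i in range(11)]

intact = pack_triplets(a + [NAN3] + b + [NAN3, INF3])
damaged = pack_triplets(a + b + [NAN3, INF3])  # delimiter between a and b lost

intact_lengths = [length_mm(s) for s in split_streamlines(intact)]
damaged_lengths = [length_mm(s) for s in split_streamlines(damaged)]
print(len(intact_lengths), len(damaged_lengths))  # 2 1
print(damaged_lengths[0] > 100.0)                 # True
```

The merged streamline picks up the jump between the end of one track and the start of the next, so its length far exceeds either original.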

If you can’t find a solid way to reproduce the behaviour, we’ll probably need access to some example data.

aszymanski · December 14, 2017, 5:36pm

Forgive me for the delayed response! I was able to generate a SIFT file with my second run of tckgen, where I altered -max/minlen to -max/minlength. I’d be more than happy to share the original track file that SIFT failed on, and also to generate a new track file the same way I did the original one to see if the error replicates. Let me know if you’re still interested and I’ll send over the files via Dropbox.

aszymanski · December 18, 2017, 4:21pm

Blah, happened again. I’m doing whole-brain tractography & generated 30 million tracks on a subject with the aim of SIFTing down to 1 million.

    $ ls -lh sub_seed_image_mif.tck
    -rwx------ 1 as695 root 20G Dec 16 09:19 sub_seed_image_mif.tck

    $ tckstats sub_seed_image_mif.tck
    tckstats: [100%] Reading track file
    tckstats: [WARNING] expected 30000000 tracks according to header; read 29999807
             mean       median    std. dev.          min          max       count
          59.6605      49.5079      41.2023      10.7101      5979.78     29999807

This has happened on 3/5 subjects I’ve run the tractography on.

These are the commands used to generate the track/SIFT files:

    tckgen ${OUTPUT}/TS_FOD_${SUBJ1}.mif -seed_image ${TEMPLATE}/voxel_mask.mif ${OUTPUT}/${SUBJ1}_seed_image_mif.tck -select 30M -maxlength 250
    tcksift ${OUTPUT}/${SUBJ1}_seed_image_mif.tck ${OUTPUT}/TS_FOD_${SUBJ1}.mif ${OUTPUT}/${SUBJ1}_SIFT_seed_image_mif.tck -term_number 1M  -force

So it looks like I’m going to recreate the track files & then run SIFT on the new ones and see if that helps. Will update on the outcome.

anege · November 20, 2020, 5:37pm

Any news on how you solved the issue? I’m facing the same problem after running tckgen (20 M tracks) and tcksift (reducing to 2M tracks) on the cluster:

tcksift: [WARNING] Only 1 tracks read from input track file; expected 20000000

According to tckinfo, I have 20000000 tracks:

    Tracks file: "WB_tracks_20M_iFOD2.tck"
        act:                  Analysis/sub-blnd02/5tt_NatDiff.nii.gz
        backtrack:            0
        count:                20000000
        downsample_factor:    3
        fod_power:            1.0
        init_threshold:       0.100000001
        lmax:                 8
        max_angle:            45
        max_dist:             250
        max_num_seeds:        20000000000
        max_num_tracks:       20000000
        max_seed_attempts:    1000
        max_trials:           1000
        method:               iFOD2
        min_dist:             10
        mrtrix_version:       unknown
        output_step_size:     0.625
        rk4:                  0
        samples_per_step:     4
        sh_precomputed:       1
        source:               Analysis/sub-blnd02/wm_fod_norm.mif
        step_size:            0.625
        stop_on_all_include:  0
        threshold:            0.10
        timestamp:            1605769712.8068575859
        total_count:          101814876
        unidirectional:       0
        ROI:                  seed dwi_mask_up.mif

However, according to tckstats, I only have 5409747 tracks.

    tckstats: [100%] Reading track file
    tckstats: [WARNING] expected 20000000 tracks according to header; read 5409747
             mean       median    std. dev.          min          max       count
          60.6027      54.7897       37.128      9.97026          250      5409747

This is weird since, while tckgen was running, I saw that at least 18M tracks had been created.

Best,
A

rsmith · November 22, 2020, 8:51am

Hi @anege,

I have no recollection of having received exemplar data from @aszymanski (correct me if I’m wrong), so I don’t know if the source of the problem was isolated or if some mitigating strategy was found.

According to tckinfo, I have 20000000 tracks:

It’s important to know here that tckinfo is only reporting the contents of the .tck file header, which is just a set of keys and values similar to that used by the .mif / .mih format. So “count: 20000000” is just a pair of text strings near the start of the file, which is echoed without ever actually reading the streamlines data. That number is updated by tckgen dynamically as the .tck file is generated; so we can at least say that tckgen thinks that it has written 20M streamlines.
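As a rough illustration of the header-versus-data distinction, the following Python sketch reads only the text header of a .tck file and stops at the `END` line, so the streamline data are never touched. The function name `read_tck_header` and the example header contents (including the data offset of 200) are made up for illustration; this is not how tckinfo itself is implemented.

```python
import io


def read_tck_header(stream):
    """Return the .tck header as a dict mapping each key to a list of values.

    Only text lines up to 'END' are read; streamline data are never examined.
    Keys can occur more than once (ROI entries, for instance), hence the lists.
    """
    first = stream.readline().decode("ascii", "replace").strip()
    if first != "mrtrix tracks":
        raise ValueError("missing 'mrtrix tracks' line; not a .tck file?")
    header = {}
    for raw in stream:
        line = raw.decode("utf-8", "replace").strip()
        if line == "END":
            return header
        key, sep, value = line.partition(":")
        if sep:
            header.setdefault(key.strip(), []).append(value.strip())
    raise ValueError("header has no END line")


# Synthetic, abridged header for demonstration (values are hypothetical).
example = io.BytesIO(
    b"mrtrix tracks\n"
    b"count: 20000000\n"
    b"datatype: Float32LE\n"
    b"max_num_tracks: 20000000\n"
    b"file: . 200\n"
    b"END\n"
)
header = read_tck_header(example)
print(header["count"][0])  # 20000000
```

For a real file you would pass a handle opened in binary mode, e.g. `open("tracks_20_million.tck", "rb")`. Whatever `count` says here is only what the writer recorded, not proof of what the data section contains.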

If you were to run tckinfo -count, which explicitly reads through all streamline data in order to determine the streamline count, I would expect it to report the same number of streamlines as does tckstats. I still do not know why the number of streamlines reported by tcksift is different again, but (combined also with the seemingly intermittent nature of the issue) it nevertheless points to some form of data corruption.
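For anyone wanting to see what such a count involves, here is a hedged Python sketch that compares the header `count` with the number of streamline delimiters actually present in the data. It assumes a Float32LE file with the documented layout (NaN triplet per streamline, Inf triplet at the end); `count_streamlines` and `make_tck` are hypothetical helper names, and the test files are synthetic. A pure-Python loop would be very slow on a multi-gigabyte tractogram, so `tckinfo -count` remains the practical tool.

```python
import io
import math
import struct


def count_streamlines(stream, chunk_triplets=1 << 16):
    """Return (header_count, actual_count, saw_end_marker) for a Float32LE .tck."""
    header_count, offset = None, None
    for raw in stream:
        line = raw.decode("utf-8", "replace").strip()
        if line == "END":
            break
        key, _, value = line.partition(":")
        if key.strip() == "count":
            header_count = int(value)
        elif key.strip() == "file":
            offset = int(value.split()[-1])  # "file: . <byte offset of data>"
    if offset is None:
        raise ValueError("no 'file' entry found in header")
    stream.seek(offset)
    actual, saw_end = 0, False
    while not saw_end:
        chunk = stream.read(12 * chunk_triplets)
        usable = len(chunk) - len(chunk) % 12  # ignore a trailing partial triplet
        if usable == 0:
            break  # data ran out before the Inf end marker: likely truncated
        for x, _y, _z in struct.iter_unpack("<3f", chunk[:usable]):
            if math.isnan(x):
                actual += 1
            elif math.isinf(x):
                saw_end = True
                break
    return header_count, actual, saw_end


def make_tck(header_count, n_streamlines, with_end_marker):
    """Build a tiny synthetic .tck file in memory (made-up coordinates)."""
    body = b""
    for i in range(n_streamlines):
        body += struct.pack("<3f", 0.0, 0.0, float(i))
        body += struct.pack("<3f", 1.0, 0.0, float(i))
        body += struct.pack("<3f", math.nan, math.nan, math.nan)
    if with_end_marker:
        body += struct.pack("<3f", math.inf, math.inf, math.inf)
    header = (f"mrtrix tracks\ncount: {header_count}\n"
              "datatype: Float32LE\nfile: . 128\nEND\n").encode()
    return io.BytesIO(header.ljust(128, b"\n") + body)  # pad header to offset 128


complete = count_streamlines(make_tck(3, 3, True))
truncated = count_streamlines(make_tck(3, 2, False))
print(complete)   # (3, 3, True)
print(truncated)  # (3, 2, False)
```

A file whose writer was interrupted would typically look like the second case: the header claims more streamlines than the data hold, and the end marker is missing.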

I’d need to have access to some raw data in order to properly investigate. However, given I’m currently constrained by satellite internet, if you are indeed able and willing to share such data, it would be very much appreciated if you could first reproduce the fault with a smaller streamline count before uploading.

Rob

anege · November 30, 2020, 9:49am

Indeed, it reported the same number of streamlines.

I think that the corruption of data was due to the fact that I was exceeding my quota limit in the cluster I’m using (hence it could not continue writing the file), so it has nothing to do with MRtrix. I’m testing this at the moment.

Thank you,
ane
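Since storage limits turned out to be the cause more than once in this thread, a quick pre-flight check before a long tckgen run may be worthwhile. The Python sketch below scales the file size reported earlier in the thread (15387839012 bytes for 20,000,000 streamlines) to a new streamline count and compares it with the free space on the target filesystem. The helper names and the 1.2 safety factor are assumptions; bytes per streamline vary with step size and track length, and `shutil.disk_usage` reports filesystem free space, not a per-user cluster quota, which needs whatever quota tool your site provides.

```python
import shutil

# Reference point taken from the `ls -l` output earlier in this thread.
REF_BYTES = 15387839012
REF_STREAMLINES = 20_000_000


def estimated_tck_bytes(n_streamlines):
    """Scale the reference file size linearly to another streamline count."""
    return n_streamlines * REF_BYTES // REF_STREAMLINES


def has_room(path, needed_bytes, safety_factor=1.2):
    """True if the filesystem holding `path` has needed_bytes plus a margin free."""
    free = shutil.disk_usage(path).free
    return free >= needed_bytes * safety_factor


needed = estimated_tck_bytes(30_000_000)
print(f"{needed / 1e9:.1f} GB estimated")  # 23.1 GB estimated
print("enough free space:", has_room(".", needed))
```

The estimate is only a rough guide, so checking the finished file with tckstats or `tckinfo -count` is still the more reliable test.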

AmirHussein · April 3, 2021, 8:26am

Hi everyone,

The same issue is being encountered here. I may have come up with a reason and a possible solution.

I am running the HCP’s analysis pipeline on a set of 80 HCP-YA subjects, with the same pipeline applied to all. However, tcksift stops with the same warning for some of the subjects (1 tract is read instead of 25M). Here is the tckstats output for one of those subjects (subject ID for a possible replication: #206222):

    tckstats: [100%] Reading track file
    tckstats: [WARNING] expected 25000000 tracks according to header; read 24788701
             mean       median    std. dev.          min          max       count
          52.8172      38.7908      45.7504          2.5      249.239     24788701

Here is the tckstats output for one of the successfully SIFTed subjects (ID: 102513):

    tckstats: [100%] Reading track file
             mean       median    std. dev.          min          max       count
          56.5981      41.8272      48.3274          2.5      249.137     25000000

The tckgen command I used:

    tckgen WM_FODs.mif 25M.tck -act 5TT.mif -backtrack -crop_at_gmwmi -seed_dynamic WM_FODs.mif -maxlength 250 -select 25M -cutoff 0.06 -nthreads 20

The tcksift command I used:

    tcksift 25M.tck WM_FODs.mif 5M_SIFT.tck -act 5TT.mif -term_number 5M -nthreads 20

According to this comparison and previous posts, I guess the problem lies in the mismatch between the expected and the actual number of tracts. So, I think editing the header with the actual number of tracts may be an option to overcome this problem but I don’t know how to do it. Any ideas?

Bests,
Amir

rsmith · April 15, 2021, 1:37am

Hi Amir,

I am running the HCP’s analysis pipeline

Just repeating the reminder at the head of that page that it’s intended to act as a historical reference, not as a “recommended pipeline” that is kept up-to-date and maintained.

tcksift stops with the same warning for some of the subjects (1 tract is read instead of 25M).

With the tckstats example you show, it’s only fractionally less than the intended 25M streamlines that are read, which is quite drastically different from 1. Can you confirm that this is just a matter of different subjects having been used as exemplars for the tcksift vs. tckstats warning messages? Or are tcksift and tckstats reading a different number of streamlines from the same track file?

I think editing the header with the actual number of tracts may be an option to overcome this problem but I don’t know how to do it. Any ideas?

The only problem that this manipulation would overcome is the warning itself about the mismatch. (This is actually what tckgen and other commands do internally: as more streamlines are written to an output track file in batches, the “count” field in the header of the file is updated accordingly.) It would not alter the number of streamlines that can actually be read by any MRtrix3 command, just their expectation of how many will be read. So if the problem is that tcksift can only read one streamline from the file, then modifying the header so that tcksift expects only one streamline doesn’t fix your fundamental problem, which is that only one streamline can be read, and SIFT doesn’t work too well in that scenario.

Nevertheless, if you’re interested in this kind of data “hacking”, the kind of software you’re looking for is colloquially referred to as a “hex editor”. This is the kind of software that a developer such as myself may well employ in trying to diagnose the origin of such a fault.

Rob