tck2connectome: different results between macOS and Linux

Dear MRtrix community,
I am writing because I found a problem with the output of tck2connectome.
I have always used Mac computers for my work. Last week I installed Ubuntu 20.04 and ran my pre-processing script with MRtrix3 on it.
The MRtrix3 version is the same on both computers:

e.g. from Ubuntu:

== tck2connectome 3.0.3 ==
64 bit release version, built Jul 19 2021, using Eigen 3.3.7
Author(s): Robert E. Smith (robert.smith@florey.edu.au)
Copyright (c) 2008-2021 the MRtrix3 contributors.

This Source Code Form is subject to the terms of the Mozilla Public
License, v. 2.0. If a copy of the MPL was not distributed with this
file, You can obtain one at http://mozilla.org/MPL/2.0/.

Covered Software is provided under this License on an "as is"
basis, without warranty of any kind, either expressed, implied, or
statutory, including, without limitation, warranties that the
Covered Software is free of defects, merchantable, fit for a
particular purpose or non-infringing.
See the Mozilla Public License v. 2.0 for more details.

For more details, see http://www.mrtrix.org/.

When I ran tck2connectome on the same files (same tracks, same atlas), the outputs were completely different.
I know that there are differences between versions (Tck2connectome different outputs), but I cannot understand why there is such a big difference with the same version on a different OS.

Thanks,
Maurizio

Dear @mbergamino

Although things should work similarly on different operating systems, there are too many factors involved. Usually there is something different in how you set up the environment, which can be tracked down and hopefully corrected, but in some cases the underlying libraries simply work a bit differently on different operating systems (e.g., different numerical precision), which is not really under our control.

My best recommendation is to put your whole pipeline inside a container, and then reproducibility is guaranteed. If you’re not familiar with containers, or if you want a simpler solution, I’d recommend NeuroDesk, the neuroimaging analysis virtual desktop that I co-develop. It is basically a completely reproducible environment (which can be run on any operating system), and it includes MRtrix: https://neurodesk.github.io/ (to see which MRtrix versions are available in NeuroDesk, check Applications | NeuroDesk).

If you use NeuroDesk, you can keep your existing scripts (even if they call other packages in addition to MRtrix), but under the hood, everything will run within containers that guarantee 100% reproducible results across operating systems. I contribute to both MRtrix and NeuroDesk, so you’re welcome to post here any questions you might have about this solution.

All the best,
Oren

As @orencivier mentioned, it’s not always trivial to guarantee consistent behaviour across platforms. Version changes in the libraries we use can also introduce some minor differences, but that’s generally not a problem. We rely on very few external dependencies, and I don’t expect these to cause any trouble.

That said, we do run checks to ensure results are consistent to numerical precision. I’m surprised there would be differences for that command in particular, given that all tests are required to pass on all platforms before we merge - and that includes 3 tests for the tck2connectome command, checking that results match exactly with those expected for all 3 platforms we support (Linux, macOS, Windows).

With that in mind, I’m surprised to hear that you would end up with different results – let alone very large discrepancies. Can you be more specific as to what the differences are, how large the errors are relative to the expected values, and what the commands that you used were exactly?

Can you also state what hardware you’re running this on? We have definitely had issues with the newer ARM-based mac M1 range, though they’ve all been related to OpenGL rather than numerical accuracy…

Hi,
Thank you so much for your answers.
I understand that small differences can exist between operating systems; however, the differences I see here are very big.

The Ubuntu version is:

Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04

The Mac version (Mojave) is:

ProductName: Mac OS X
ProductVersion: 10.14.6
BuildVersion: 18G103

I tried to upload the matrices here but had some difficulties, so I have pasted only part of each matrix (11 rows × 11 columns):

Command on Ubuntu:
tck2connectome tracks.tck NODES.mif connectome_original_symmetric.csv -out_assignments assignments_symmetric.txt -symmetric -zero_diagonal -force (version=3.0.3)

0 218 1621 208 2 0 3652 128 4 0 1763
218 0 50 2776 0 1 29 4270 0 5 4
1621 50 0 393 573 15 7903 357 69 24 379
208 2776 393 0 24 298 368 10086 10 365 41
2 0 573 24 0 17 377 7 920 14 3
0 1 15 298 17 0 10 80 4 1013 0
3652 29 7903 368 377 10 0 388 503 19 1000
128 4270 357 10086 7 80 388 0 7 889 42
4 0 69 10 920 4 503 7 0 8 1
0 5 24 365 14 1013 19 889 8 0 0
1763 4 379 41 3 0 1000 42 1 0 0

Command on Mac:
tck2connectome tracks.tck NODES.mif connectome_original_symmetric.csv -out_assignments assignments_symmetric.txt -symmetric -zero_diagonal -force (version=3.0.3)

0 113 834 123 2 0 1941 71 2 0 939
113 0 20 1475 0 0 14 2273 0 4 3
834 20 0 194 293 8 4288 197 38 18 191
123 1475 194 0 14 162 204 5394 6 192 24
2 0 293 14 0 8 194 2 512 7 0
0 0 8 162 8 0 3 44 2 540 0
1941 14 4288 204 194 3 0 200 282 15 520
71 2273 197 5394 2 44 200 0 3 461 22
2 0 38 6 512 2 282 3 0 4 1
0 4 18 192 7 540 15 461 4 0 0
939 3 191 24 0 0 520 22 1 0 0

For the atlas file, I tried both .nii.gz and .mif (here NODES.mif), but the results are the same (the same differences between the matrices).

Maurizio

Sorry about the slow response – trying to get all my teaching in order for next week… :tired_face:

Looking at your results, it’s clear this is not a numerical precision issue – we did see some differences introduced in the past due to the use of different rounding modes on the CPU floating-point unit, but that’s clearly not it.

What’s interesting here is that most values in the macOS results seem to be consistently just over half those in the Linux results. What I’m thinking is that for some reason, the tracks.tck file may have got corrupted / truncated on the mac. Is the output of tckinfo -count tracks.tck identical on both platforms?
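
As a rough check of the "just over half" observation, here is a minimal Python sketch that divides the first macOS row by the first Linux row posted above. The function name ratios and the min_count cut-off of 100 are arbitrary choices for illustration, used only to skip small counts where the ratio is too noisy to mean much.

```python
# First row of each matrix excerpt, copied from the post above.
linux_row = [0, 218, 1621, 208, 2, 0, 3652, 128, 4, 0, 1763]
macos_row = [0, 113, 834, 123, 2, 0, 1941, 71, 2, 0, 939]


def ratios(reference, other, min_count=100):
    """Return other/reference for entries whose reference count is at
    least min_count (an arbitrary cut-off chosen for this illustration)."""
    return [o / r for r, o in zip(reference, other) if r >= min_count]


row_ratios = ratios(linux_row, macos_row)
mean_ratio = sum(row_ratios) / len(row_ratios)

# Print the per-entry ratios and their mean for inspection.
print([round(x, 3) for x in row_ratios])
print(round(mean_ratio, 3))
```

A ratio that stays in a narrow band across entries points to a different number of streamlines being counted, rather than to floating-point effects.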

Hi,
sorry for my late reply, but I wanted to try several options before continuing the discussion.
I tried the same command on a Mac with High Sierra and (I do not understand why…) I got the same matrix as on Ubuntu. The matrix with the different values was generated on a laptop (Mac 15.1) running Catalina. I really have no explanation for that.
However, I ran:
tckinfo -count tracks.tck
and I have:

***********************************
  Tracks file: "tracks.tck"
    act:                  5tt_to_FA.nii.gz
    backtrack:            1
    count:                2000000
    downsample_factor:    3
    fod_power:            0.25
    init_threshold:       0.0599999987
    lmax:                 8
    max_angle:            45
    max_dist:             150
    max_num_seeds:        2000000000
    max_num_tracks:       2000000
    max_seed_attempts:    1000
    max_trials:           1000
    method:               iFOD2
    min_dist:             2.5
    mrtrix_version:       unknown
    output_step_size:     0.625
    rk4:                  0
    samples_per_step:     4
    sh_precomputed:       1
    source:               wmfod_norm.mif
    step_size:            0.625
    stop_on_all_include:  0
    threshold:            0.06
    timestamp:            1641421735.7773880959
    total_count:          3091897
    unidirectional:       0
    ROI:                  mask data_bias_final_nodif_brain_mask.nii.gz
    ROI:                  mask data_bias_final_nodif_brain_mask.nii.gz
    ROI:                  seed gmwmSeed_coreg.mif
tckinfo: [done] counting tracks in file
actual count in file: 2000000
***********************************

on both systems, which makes sense because I used the same tractography.
Honestly, it would be fantastic if someone could help me understand the differences between Catalina and High Sierra/Ubuntu.

Thanks
Maurizio

There’s no way a difference of that magnitude could arise purely from running on different OSs and have remained undetected until now. A truncation of the track file is the most likely culprit, but in general you want to do as much testing as you can to make sure that you’re performing a fair comparison. Personally, I would:

  1. Transfer files from both machines into a common location and execute the diff command to make sure that the files that you are using on the two systems are indeed precisely identical.

  2. Run tck2connectome with the -keep_unassigned option and without the -symmetric and -zero_diagonal options. This will yield the unmodified matrix, on which you can e.g. calculate the sum across all rows / columns to see how many streamlines were read (since in this use case all read streamlines are assigned to the matrix, even if they’re not successfully assigned to a node). I’m expecting that on the Mac the matrix will contain fewer streamlines than are present in the input track file.
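
The two checks above could be scripted roughly as follows. This is only a sketch: the helper names sha256_of and connectome_total and the file name connectome_raw.csv are hypothetical, and it assumes the connectome file is plain text with comma- or whitespace-separated values, possibly with comment lines starting with "#". A matching SHA-256 digest on both machines serves the same purpose as running diff on the transferred files.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large .tck file need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def connectome_total(path):
    """Sum every entry of a connectome matrix stored as text.

    Assumption: values are separated by commas or whitespace, and any
    line starting with '#' is a comment to be skipped.
    """
    total = 0.0
    with open(path) as handle:
        for line in handle:
            if line.lstrip().startswith("#"):
                continue
            fields = line.replace(",", " ").split()
            total += sum(float(value) for value in fields)
    return total


# Example usage (hypothetical file names; run on each machine and compare):
#   print(sha256_of("tracks.tck"))
#   print(sha256_of("NODES.mif"))
#   print(connectome_total("connectome_raw.csv"))
```

Here connectome_raw.csv stands for the output of tck2connectome run with -keep_unassigned and without -symmetric and -zero_diagonal. Its total can be compared against the 2000000 streamlines reported by tckinfo -count; a clearly lower total on one machine would point to streamlines not being read there.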

We can discuss further diagnosis strategies beyond that point once both of these have been done.

Cheers
Rob