Dear MRtrix community,
I am writing here because I found a problem in the output from tck2connectome.
For my work, I have always used Mac computers. Last week I installed Ubuntu 20.04 and tried my pre-processing script with MRtrix3.
The version of MRtrix3 is the same on both computers;
e.g. from Ubuntu:
== tck2connectome 3.0.3 ==
64 bit release version, built Jul 19 2021, using Eigen 3.3.7
Author(s): Robert E. Smith (robert.smith@florey.edu.au)
Copyright (c) 2008-2021 the MRtrix3 contributors.
This Source Code Form is subject to the terms of the Mozilla Public
License, v. 2.0. If a copy of the MPL was not distributed with this
file, You can obtain one at http://mozilla.org/MPL/2.0/.
Covered Software is provided under this License on an "as is"
basis, without warranty of any kind, either expressed, implied, or
statutory, including, without limitation, warranties that the
Covered Software is free of defects, merchantable, fit for a
particular purpose or non-infringing.
See the Mozilla Public License v. 2.0 for more details.
For more details, see http://www.mrtrix.org/.
When I ran tck2connectome on the same files (same tracks, same atlas) on both computers, the outputs were completely different.
I know that there are differences between versions (Tck2connectome different outputs), but I cannot understand why there is such a big difference between the same version running on different operating systems.
Although things should work similarly on different operating systems, there are many factors involved. Usually there is something different in how you set up the environment, which can be tracked down and hopefully corrected, but in some cases the underlying libraries simply work a bit differently on different operating systems (e.g., different numerical precision), which is not really under our control.
My best recommendation is to put your whole pipeline inside a container, and then reproducibility is guaranteed. If you’re not familiar with containers, or if you want a simpler solution, I’d recommend the NeuroDesk neuroimaging analysis virtual desktop that I co-develop. It is basically a completely reproducible environment (which can be run on any operating system), and it includes MRtrix: https://neurodesk.github.io/ (to see what MRtrix versions are available in NeuroDesk, check Applications | NeuroDesk).
If you use NeuroDesk, you can keep your existing scripts (even if they call other packages in addition to MRtrix), but under the hood, everything will run within containers that guarantee 100% reproducible results across operating systems. I contribute to both MRtrix and NeuroDesk, so you’re welcome to post any questions you might have about this solution here.
As @orencivier mentioned, it’s not always trivial to guarantee consistent behaviour across platforms. Version changes in the libraries we use can also introduce some minor differences, but that’s generally not a problem. We rely on very few external dependencies, and I don’t expect these to cause any trouble.
That said, we do run checks to ensure results are consistent to numerical precision. I’m surprised there would be differences for that command in particular, given that all tests are required to pass on all platforms before we merge - and that includes 3 tests for the tck2connectome command, checking that results match exactly with those expected for all 3 platforms we support (Linux, macOS, Windows).
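To illustrate what "consistent to numerical precision" means in practice, here is a minimal Python sketch of an element-wise comparison with a tolerance. It is not the MRtrix3 test harness: the function name `matrices_match` and the tolerance values are assumptions chosen purely for illustration.

```python
import math


def matrices_match(a, b, rel_tol=1e-5, abs_tol=0.0):
    """Return True if two matrices (lists of rows) agree element-wise within tolerance.

    Hypothetical helper; the tolerances are illustrative, not MRtrix3's own.
    """
    if len(a) != len(b):
        return False
    for row_a, row_b in zip(a, b):
        if len(row_a) != len(row_b):
            return False
        for x, y in zip(row_a, row_b):
            if not math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True


# Toy data: a reference matrix, one copy with rounding-level noise,
# and one copy where every value has been halved.
reference = [[0.0, 12.0], [12.0, 0.0]]
rounding_noise = [[0.0, 12.00000001], [12.00000001, 0.0]]
halved = [[0.0, 6.0], [6.0, 0.0]]

print(matrices_match(reference, rounding_noise))  # True
print(matrices_match(reference, halved))          # False
```

A difference that survives a comparison like this with a sensible tolerance is not a floating-point rounding effect and points to a difference in the inputs or in how they were read.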
With that in mind, I’m surprised to hear that you would end up with different results – let alone very large discrepancies. Can you be more specific as to what the differences are, how large the errors are relative to the expected values, and what the commands that you used were exactly?
Can you also state what hardware you’re running this on? We have definitely had issues with the newer ARM-based Mac M1 range, though they’ve all been related to OpenGL rather than numerical accuracy…
Hi,
Thank you so much for your answers.
I can understand that some small differences can exist between different operating systems; here, however, I have very big differences.
The Ubuntu version is:
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
The Mac version (Mojave):
ProductName: Mac OS X
ProductVersion: 10.14.6
BuildVersion: 18G103
I tried to upload the matrices here, but I had some difficulties, so I have inserted only part of each matrix (12 rows × 12 columns):
Sorry about the slow response – trying to get all my teaching in order for next week…
Looking at your results, it’s clear this is not a numerical precision issue – we did see some differences introduced in the past due to the use of different rounding modes on the CPU floating-point unit, but that’s clearly not it.
What’s interesting here is that most values in the macOS results seem to be consistently just over half those in the Linux results. What I’m thinking is that, for some reason, the tracks.tck file may have been corrupted or truncated on the Mac. Is the output of tckinfo -count tracks.tck identical on both platforms?
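The "just over half" observation can be checked across the whole matrix rather than by eye. The following is a minimal Python sketch under a few assumptions: the connectomes were written as comma-delimited text, any header lines begin with `#`, and the helper names `load_matrix` and `ratio_summary` are made up for this example. The numbers are toy values, not the matrices from this thread.

```python
import statistics


def load_matrix(path, delimiter=","):
    """Read a delimited text matrix, skipping blank lines and '#' comment lines."""
    rows = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            rows.append([float(v) for v in line.split(delimiter) if v.strip()])
    return rows


def ratio_summary(numerator, denominator):
    """Summarise element-wise ratios, ignoring entries whose denominator is zero."""
    ratios = [
        n / d
        for row_n, row_d in zip(numerator, denominator)
        for n, d in zip(row_n, row_d)
        if d != 0
    ]
    if not ratios:
        return None
    return {
        "count": len(ratios),
        "median": statistics.median(ratios),
        "min": min(ratios),
        "max": max(ratios),
    }


# Toy stand-ins for the two outputs (replace with load_matrix(...) on real files).
macos = [[0.0, 5.0, 10.0], [5.0, 0.0, 21.0], [10.0, 21.0, 0.0]]
linux = [[0.0, 10.0, 20.0], [10.0, 0.0, 40.0], [20.0, 40.0, 0.0]]

print(ratio_summary(macos, linux))
```

A narrow spread of ratios around a single value would suggest that one run saw a fixed fraction of the streamlines, which is what a truncated input would produce.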
Hi,
Sorry for my late reply, but I wanted to try several options before continuing the discussion.
I tried the same command on a Mac running High Sierra and (I do not understand why…) I got the same matrix as on Ubuntu. The matrix with the different values was generated on a Mac 15.1 laptop running Catalina. I really have no explanation for that.
However, I ran:
tckinfo -count tracks.tck
and I have:
for both systems, which is logical because I used the same tractography.
Honestly, it would be fantastic if someone could help me understand the differences between Catalina and High Sierra/Ubuntu.
There’s no way a difference of that magnitude could be present due only to execution on different OS’s that would have remained undetected to this point. A truncation of the track file is the most likely culprit, but in general you want to do as much testing as you can to make sure that you’re performing a fair comparison. Personally, I would:
Transfer files from both machines into a common location and execute the diff command to make sure that the files that you are using on the two systems are indeed precisely identical.
Run tck2connectome with the -keep_unassigned option and without the -symmetric and -zero_diagonal options. This will yield the unmodified matrix, on which you can e.g. calculate the sum across all rows / columns to see how many streamlines were read (since in this use case all read streamlines are assigned to the matrix, even if they’re not successfully assigned to a node). I’m expecting that on the Mac the matrix will contain fewer streamlines than are expected to be present in the input track file.
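Both checks can also be scripted. The following is a minimal Python sketch, not an MRtrix3 tool: it compares files by SHA-256 hash as a stand-in for diff, and sums a connectome matrix to recover the streamline count. It assumes the matrix was written as comma-delimited text with any header lines starting with `#`, and that each streamline contributes a weight of 1 (no scaling options). The helper names and the file paths in the usage comments are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large track files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(path_a, path_b):
    """Byte-for-byte identity check via hashes (same verdict as diff for this purpose)."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)


def load_matrix(path, delimiter=","):
    """Read a delimited text matrix, skipping blank lines and '#' comment lines."""
    rows = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            rows.append([float(v) for v in line.split(delimiter) if v.strip()])
    return rows


def matrix_total(matrix):
    """Sum every entry; for an unmodified count matrix this is the number of streamlines."""
    return sum(sum(row) for row in matrix)


# Toy upper-triangular matrix standing in for an unmodified connectome.
toy = [[2.0, 3.0, 1.0], [0.0, 0.0, 4.0], [0.0, 0.0, 5.0]]
print(matrix_total(toy))  # 15.0

# Usage on real data (hypothetical paths):
# files_identical("from_mac/tracks.tck", "from_ubuntu/tracks.tck")
# matrix_total(load_matrix("connectome_unmodified.csv"))
```

If the total on one machine is lower than the count reported by tckinfo for the same track file, that machine is not reading every streamline.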
We can discuss further diagnosis strategies beyond that point once both of these have been done.