Capture standard output of topup and eddy in dwipreproc

Antonin_Skoch · October 31, 2017, 10:19am

Dear experts,

is it possible to get the standard output of topup and eddy in dwipreproc? I cannot get it even with the -debug option, and I did not find any log file containing it in the temporary directory; log.txt contains only the commands invoked.

Regards,

Antonin

Alistair_Perry · October 31, 2017, 10:30pm

Hi @Antonin_Skoch,

What is this standard output you are referring to? I presume the motion parameters and translations etc?

If you add -nocleanup to dwipreproc, it will not delete the working directory, so all the files generated by topup and eddy are kept.

Best,

Alistair

Antonin_Skoch · October 31, 2017, 11:18pm

No, I meant all the diagnostic messages that eddy sends to standard output during its execution when invoked directly (i.e. without dwipreproc) with the parameter --verbose or --very_verbose. When eddy is invoked as part of dwipreproc, no message from eddy is sent to standard output (despite adding --very_verbose to -eddy_options), and there seems to be no file in the temporary working directory containing those messages.

rsmith · October 31, 2017, 11:48pm

Hi Antonin,

The issue here is that these commands write their text output to stdout rather than stderr. This is fundamentally inappropriate use of these streams: stdout should be the “output” of the command (particularly in the context of piping data between Unix commands), and be silent in instances where output is entirely file-based; stderr should provide diagnostic and error messages to the terminal.

Since the Python back-end is capable of dealing with piped commands, it has to explicitly connect stdout from one command to stdin of the next in order for data to flow from one command to the next. Furthermore, due to the use of subprocess, the only reason that stderr data can appear at the command-line during MRtrix3 script execution is because there is code explicitly written to make this happen.
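A minimal sketch of that explicit wiring with Python's subprocess module. The two child processes are stand-ins (small Python one-liners, not MRtrix3 or FSL commands), and this is not the actual MRtrix3 run code:

```python
import subprocess
import sys

# Stand-in commands (hypothetical): one emits two lines, the other sorts them.
producer_cmd = [sys.executable, "-c", "print('b'); print('a')"]
consumer_cmd = [sys.executable, "-c",
                "import sys; sys.stdout.write(''.join(sorted(sys.stdin)))"]

# stdout of the first command has to be requested as a pipe explicitly...
producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
# ...and handed to the second command as its stdin for data to flow.
consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# Close the parent's copy so the producer sees a broken pipe if the consumer exits early.
producer.stdout.close()

# Both streams of the last command are now captured; nothing reaches the
# terminal unless the script writes it there itself.
piped_out, piped_err = consumer.communicate()
producer.wait()
result = piped_out.decode().split()
print(result)
```

Because stdout and stderr of the last command are both captured here, anything the user should see has to be forwarded to the terminal by the calling script.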

While technically it might be possible to print both stdout and stderr to the terminal within that code block (specifically only in the case where no command-line piping is occurring), this would:

Be quite awkward to implement (there’s been more than enough fragility around this code in the past, it’d require a lot more code branching, and avoiding hangs when there’s data on one stream but not the other might be tough);
Likely produce garbled, intermingled text for any command that writes to both stdout and stderr;
Possibly lead to garbage terminal output in the case where a command is being run with the genuine intention of capturing the contents of stdout into a local variable, but those data are not text.

So I’m rather hesitant to try to address this at my end; having topup and eddy appropriately write to stderr would be a much easier & more targeted solution.

Rob

jdtournier · November 1, 2017, 3:47pm

I agree with what @rsmith says, although I think the statement that writing diagnostics to stdout is a “fundamentally inappropriate use of these streams” is a bit too strong:

There’s plenty of perfectly legitimate and widely-used Unix commands that send diagnostic / user feedback to stdout - but they’re just not designed to be used as part of larger pipelines (e.g. cp -v, tar, rsync). But as soon as you want to allow piped operations, then it becomes essential to keep stdout clean of any diagnostic feedback so that the data passed to subsequent commands is bona fide output and nothing else. Unfortunately, that means they need to use stderr for both regular user feedback and actual warnings/errors, and this is what MRtrix3 commands do - an issue already raised a while back.

@rsmith: would it be possible within the current run.command() call to output stdout for the last command on the stack…? This would more or less mirror the output you’d see on the command line, and basically fix this issue. The one thing it won’t do is preserve the order for mixed stdout & stderr output - but that’s fragile anyway, given that stdout is buffered (by default), whereas stderr is not…
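The ordering problem can be illustrated with a small sketch. It assumes a Python stand-in child whose stdout goes through the usual buffered layer when attached to a pipe, while os.write(2, ...) emulates an unbuffered stderr; a C program using stdio behaves comparably by default. With both streams merged into one pipe, the stderr line would normally be expected to arrive first even though it was written second:

```python
import os
import subprocess
import sys

# Stand-in child (hypothetical): writes to stdout first, then to stderr.
child_code = (
    "import os\n"
    "print('first, to stdout')\n"             # sits in a buffer: stdout is a pipe
    "os.write(2, b'second, to stderr\\n')\n"  # unbuffered: written immediately
)

# Make sure the child is not forced into unbuffered mode by the environment.
env = dict(os.environ)
env.pop("PYTHONUNBUFFERED", None)

done = subprocess.run([sys.executable, "-c", child_code],
                      stdout=subprocess.PIPE, stderr=subprocess.STDOUT, env=env)
merged = done.stdout.decode().splitlines()
print(merged)  # arrival order, which need not match the order of the writes
```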

rsmith · November 1, 2017, 11:17pm

“Would it be possible within the current run.command() call to output stdout for the last command on the stack…?”

That’d be slightly more extensive than my suggestion of only writing stdout contents when there’s a single command being executed. The same difficulties apply though. For instance, I’m pretty sure that either process.stdout.read(1) or process.stderr.read(1) will hang until process completion if the process doesn’t write anything to that stream; which means that if you “guess incorrectly” with regard to which one you try to read first, you won’t get any dynamic output from that command; instead you’ll get the entire contents of the other stream written in one go at process completion. Unless you can wrap each read(1) call in some kind of timeout exception catch?
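One common way around the blocking read, sketched here as an alternative to a timeout, is to drain each stream in its own thread, so a silent stream only blocks its own reader. The child process below is a stand-in, and the drain helper is hypothetical rather than part of MRtrix3:

```python
import subprocess
import sys
import threading

def drain(stream, label, collected):
    """Read one stream line by line; blocks only the calling thread."""
    for line in iter(stream.readline, b""):
        # A real script would forward the line to the terminal here.
        collected.append((label, line.decode().rstrip()))
    stream.close()

# Stand-in child (hypothetical) that writes one line to each stream.
child_code = ("import sys; print('from stdout'); "
              "print('from stderr', file=sys.stderr)")
process = subprocess.Popen([sys.executable, "-c", child_code],
                           stdout=subprocess.PIPE, stderr=subprocess.PIPE)

collected = []
threads = [
    threading.Thread(target=drain, args=(process.stdout, "stdout", collected)),
    threading.Thread(target=drain, args=(process.stderr, "stderr", collected)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
process.wait()
print(sorted(collected))
```

The relative order of lines from the two streams is still not guaranteed, so this addresses the hang but not the intermingling concern.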

The other option, if the stdout contents of only specific commands are sought (e.g. topup, eddy), is to simply capture these from the output of run.command() (which returns the (stdout, stderr) contents as a tuple), and write these to the terminal if the output verbosity is high enough, and/or write them to files in the temporary directory. That’d be much easier, as long as the list of interesting commands is short enough.
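A rough sketch of that capture-then-write approach. run_and_log is a hypothetical helper built on subprocess.run, not the MRtrix3 run.command() function, and the child process is a stand-in for a command such as eddy:

```python
import os
import subprocess
import sys
import tempfile

def run_and_log(cmd, log_dir, name, verbose=False):
    """Run cmd, save both streams under log_dir, optionally echo stdout."""
    done = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    for suffix, text in (("stdout", done.stdout), ("stderr", done.stderr)):
        with open(os.path.join(log_dir, name + "_" + suffix + ".txt"), "w") as f:
            f.write(text)
    if verbose:
        # Echo to stderr so the script's own stdout stays clean.
        sys.stderr.write(done.stdout)
    return done.stdout, done.stderr

# Stand-in command (hypothetical) that prints one diagnostic line to stdout.
stand_in = [sys.executable, "-c", "print('diagnostic line')"]

with tempfile.TemporaryDirectory() as scratch:
    out, err = run_and_log(stand_in, scratch, "eddy")
    with open(os.path.join(scratch, "eddy_stdout.txt")) as f:
        logged = f.read()
```

Since the text is only available once the command has finished, this gives a log after the fact rather than live progress output.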

ThijsDhollander · November 2, 2017, 12:00am

Just my 2 cents, but this seems to make the most sense to me.

Romain_Valabregue · November 2, 2017, 8:13am

Just my 1 cent too:
a similar solution would be to add an argument (e.g. log_file.txt) to your run.command() function. If this log file is defined, the output is redirected to the given file.
That way you do not need to handle every possible case; it becomes the responsibility of whoever writes the script to specify which commands (topup, eddy) should write a log file. So it becomes an explicit choice you can make …
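A minimal sketch of such an optional log-file argument. The run_command wrapper and its log_file parameter are hypothetical and only illustrate the idea; they do not reflect the signature of the MRtrix3 function:

```python
import os
import subprocess
import sys
import tempfile

def run_command(cmd, log_file=None):
    """Run cmd; if log_file is given, send stdout and stderr to that file."""
    if log_file is None:
        # Default behaviour in this sketch: streams are simply inherited.
        return subprocess.call(cmd)
    with open(log_file, "ab") as log:
        return subprocess.call(cmd, stdout=log, stderr=subprocess.STDOUT)

with tempfile.TemporaryDirectory() as scratch:
    log_path = os.path.join(scratch, "eddy_log.txt")
    # Stand-in command (hypothetical) in place of topup or eddy.
    status = run_command([sys.executable, "-c", "print('progress message')"],
                         log_file=log_path)
    with open(log_path) as f:
        log_text = f.read()
```

Handing the file object directly to the child avoids reading from pipes in the parent, so the blocking-read problem discussed above does not arise for logged commands.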