Hi Mark,
Thanks for the data. I can confirm the output is wrong with 3.0.3 but not with 3.0.2, as you stated. It’s fine with the changes that are pending on this pull request (full details are actually on this one, for anyone who’s interested), though the order of volumes is different, as I mentioned earlier.
After a bit of investigation, it turns out the problem was introduced in this commit, which tried to simplify the handling a bit but ended up picking up the wrong InstanceNumber entry in your specific dataset. Since this is the last line of defence for sorting the slices (everything else is otherwise identical between volumes), this ends up messing up the sorting.
The reason why we end up picking the wrong InstanceNumber is that there are actually 3 such tags in your data. For example, this is what I can see for one slice (file) in your dataset:
dcminfo -a DICOM/1.3.46.670589.11.78208.5.0.3428.2021021712041673004-201-108-woqsd.dcm | grep InstanceNumber
[DCM] 0020 0013 IS 2 1134 InstanceNumber 0
[DCM] 0020 0013 IS 4 2598 InstanceNumber 108
[DCM] 0020 0013 IS 10 5428 InstanceNumber 522119740
but only the middle one is the right one… The first one is nested within a standard DICOM ReferencedPerformedProcedureStepSequence sequence, while the last one is nested within a private Philips-specific sequence (with tag 2001 9000; I don’t actually know what that corresponds to). I can get it to work if I also ignore tags nested within private (non-standard) sequences, which can be done with this change at line 38 in core/file/dicom/image.cpp (in the 3.0.3 branch):
       // ignore anything within IconImageSequence:
       if (seq.group == 0x0088U && seq.element == 0x0200U)
         return;
+      // ignore anything within sequences with unknown (private) group:
+      if (seq.group & 1U)
+        return;
     }
I’m considering adding this to the master branch as well, but I’ll need to check that this doesn’t break anything else first…
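For anyone wondering why testing the lowest bit of the group number is enough: in DICOM, private (vendor-specific) tags always have an odd group number, so 2001 is private while 0008 and 0020 are standard. Here is a minimal standalone sketch of that idea. The Tag and Element types and the drop_private_nested() helper are made up for illustration and are not the actual MRtrix3 classes, and I am assuming the middle InstanceNumber in the listing above is the top-level one:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified types for illustration only (not MRtrix3's own).
struct Tag {
  uint16_t group, element;
};

struct Element {
  Tag tag;                   // the element's own tag, e.g. (0020,0013)
  std::vector<Tag> parents;  // enclosing sequences, outermost first
  std::string value;
};

// Private DICOM tags have an odd group number.
inline bool is_private (const Tag& t) {
  return (t.group & 1U) != 0;
}

// True if any enclosing sequence has a private (odd) group.
inline bool within_private_sequence (const Element& e) {
  for (const auto& seq : e.parents)
    if (is_private (seq))
      return true;
  return false;
}

// Keep only the elements that are not nested inside a private sequence.
inline std::vector<Element> drop_private_nested (const std::vector<Element>& in) {
  std::vector<Element> out;
  for (const auto& e : in)
    if (!within_private_sequence (e))
      out.push_back (e);
  return out;
}

// The three InstanceNumber entries from the listing above.
// Assumption: the middle one (value 108) sits at the top level.
inline std::vector<Element> example_instance_numbers () {
  return {
    { {0x0020, 0x0013}, { {0x0008, 0x1111} }, "0" },          // in ReferencedPerformedProcedureStepSequence
    { {0x0020, 0x0013}, {},                   "108" },        // assumed top-level
    { {0x0020, 0x0013}, { {0x2001, 0x9000} }, "522119740" }   // in the private Philips sequence
  };
}
```

Calling drop_private_nested (example_instance_numbers()) discards only the entry inside the 2001 9000 sequence. The entry nested in the standard sequence is still there, so the sketch covers only the private-group check from the change above and not the rest of the tag selection.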
One of the (many) intricacies of DICOM parsing is that manufacturers like to put important parameters in different places, sometimes as top-level items, sometimes nested within other sequences, and these things change between versions of the same manufacturer’s software (and even more so across manufacturers). Here, they’ve used a standard DICOM tag in multiple places, and we need to figure out which of them is relevant, and to do so in a way that works across manufacturers and scanner variants. Unfortunately, it’s very much a process of trial and error, so the only way to ensure things keep working correctly is to test on as many real-world datasets as we can get our hands on. Therefore: a big thanks for adding to my pile of test data!
All the best,
Donald.