Extracting dixels from a large population

Dear MRtrix community,

I have recently started processing a large dataset (~42k subjects from the UK Biobank) using various MRtrix tools. The main aim is to extract latent factors from the diffusion data, optimizing the feature extraction in a hypothesis-free way while maximizing the association of the latent variables with genetic factors. To that end, I would like to keep preprocessing as exploratory as possible and then use neural networks to extract genetically informative features from the (dixel-wise?) diffusion data. I think I need to run the MRtrix pipeline up to the stage at which the diffusion data are aligned, and therefore comparable, across individuals in both 2D spherical and 3D Cartesian coordinates (dixels in MRtrix terminology, I guess?).

I have a few questions:

-Am I correct in understanding that FOD amplitudes become comparable across subjects after nonlinear registration (+ reorientation) of the individual FODs to the FOD template?

-The data have two non-zero shells (b=1000 and b=2000, each with 50 directions), so I used the dhollander response model with an average response function, and then applied mtnormalise without bias field correction and without T1-weighted segmentation information. Is this a sensible choice given the multi-shell data?

-I’m thinking about obtaining the amplitudes of the template-warped FOD images with the sh2amp command, sampled along ~200 directions on the unit sphere generated with dirgen. Considering the sample size and the computational/memory constraints of our work, is this a reasonable sampling scheme?

-Because of the computational burden, I skipped denoising and upsampling. Am I right to assume that omitting these steps does not systematically bias the results?

-I generated the FOD population template from 895 randomly selected subjects of the study. The population_template command was run with nthreads=192 and default settings. The script took a few days to finish and the output files seem sensible. However, the last lines of the log file read “population_template: Optimising template with non-linear registration (stage 1 of 16)…”, followed by a few blank lines and then the final housekeeping commands (deleting the scratch directory, copying output files, etc.). Does this mean that only one nonlinear registration stage was run successfully?

-Any further insights or advice would be greatly appreciated!
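
To put the memory question about sampling at ~200 directions in context, here is a rough back-of-envelope sketch in Python. The subject and direction counts are the approximate figures mentioned above; the brain-mask voxel count and the float32 storage are assumptions for illustration only, not measured values, so the result is an order-of-magnitude guide at best.

```python
def dixel_storage_bytes(n_subjects, n_voxels, n_directions, bytes_per_value=4):
    """Raw size in bytes of a dense subjects x voxels x directions array."""
    return n_subjects * n_voxels * n_directions * bytes_per_value


N_SUBJECTS = 42_000    # approximate cohort size quoted in this post
N_DIRECTIONS = 200     # approximate number of directions quoted in this post
N_VOXELS = 100_000     # ASSUMPTION: hypothetical voxel count inside the template mask
BYTES_PER_VALUE = 4    # ASSUMPTION: amplitudes stored as float32

# Size of one subject's dixel data, and of the whole cohort, uncompressed.
per_subject = dixel_storage_bytes(1, N_VOXELS, N_DIRECTIONS, BYTES_PER_VALUE)
total = dixel_storage_bytes(N_SUBJECTS, N_VOXELS, N_DIRECTIONS, BYTES_PER_VALUE)

print(f"per subject:  {per_subject / 1e6:.1f} MB")
print(f"whole cohort: {total / 1e12:.2f} TB")
```

Under these assumed numbers the full array would not fit in memory at once, which is why I am asking whether ~200 directions is a sensible trade-off.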

Cheers,
Sourena