It depends. If it is purely for rough visualisation / artistic purposes, I’d say it is fine to use averaged ODFs from subject-specific response functions.
When response functions are estimated from the data, one reason for using a single response function is to fix the basis with which the signal is deconvolved across all images. That way we can focus on analysing the coefficients (ODFs) without also having to take subject-specific response functions into account. Your question really boils down to what you want your template to represent. If you are after the group-average density, then you’d need to measure each data point (subject) using the same units.
As an example, two images could very well have different WM RFs due to genuine signal differences in single-fibre WM voxels, for instance caused by pathology. When you deconvolve the DWI signal using the respective subject-native RFs, you might end up with identical ODFs, as the difference in the signal could be captured by the different RFs. If you used the average of both response functions instead, you would most likely see a difference in the ODFs, because the difference in signal (be it differences in tissue density or other effects) now needs to be explained by a difference in the ODFs. Note that it does not necessarily need to be the average response function of all subjects; any common, representative response function will do (see here). Averaging has the advantage that it is unbiased towards specific subjects and can average out uncertainty in the parameters. If your response function estimation is not robust, or you are interested in ODFs with respect to a certain population, you might be better off using a subset of the RFs.
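To make that example concrete, here is a minimal toy sketch in Python. It assumes a heavily simplified model in which the signal, the response function and the ODF are each described by a few zonal spherical harmonic coefficients, so that spherical convolution is just a per-order multiplication and deconvolution a per-order division. Real CSD is constrained and non-linear, so this is not what dwi2fod does internally; all numbers and names are made up for illustration.

```python
# Toy model (assumption): one coefficient per harmonic order l = 0, 2, 4.
# Spherical convolution then reduces to an element-wise product.

def convolve(odf, rf):
    """Forward model: signal coefficients = ODF coefficients * RF coefficients."""
    return [o * r for o, r in zip(odf, rf)]

def deconvolve(signal, rf):
    """Unconstrained inverse of the toy forward model (not real CSD)."""
    return [s / r for s, r in zip(signal, rf)]

# Hypothetical values: both subjects share the same fibre configuration.
odf_true = [1.0, 0.6, 0.3]

rf_a = [1.0, -0.5, 0.2]          # subject A response function
rf_b = [0.8 * c for c in rf_a]   # subject B: 20% lower single-fibre WM signal

signal_a = convolve(odf_true, rf_a)
signal_b = convolve(odf_true, rf_b)

# Subject-native RFs: the signal difference is absorbed by the RFs,
# so both ODFs come out the same.
odf_a_native = deconvolve(signal_a, rf_a)
odf_b_native = deconvolve(signal_b, rf_b)

# Common (average) RF: the signal difference now shows up in the ODFs.
rf_mean = [(a + b) / 2 for a, b in zip(rf_a, rf_b)]
odf_a_common = deconvolve(signal_a, rf_mean)
odf_b_common = deconvolve(signal_b, rf_mean)

print("native:", odf_a_native, odf_b_native)
print("common:", odf_a_common, odf_b_common)
```

With the native RFs the two ODFs are indistinguishable, while with the common RF subject B's ODF is scaled by the same 20% that separates the signals. That is the sense in which a common response function puts all subjects in the same units.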
The same holds for tracking (and filtering) in template space: you’d want a sensible, well-defined measure of density.
A scenario where you might not end up using a single response function is when you have reason to believe that subject-specific response functions are more representative of the quantity you are interested in than a single (group-average) response. This is very rarely the case, but we did use subject-specific response functions to derive ODFs for multi-contrast registration in neonatal data, as group-average RFs would have biased tissue boundaries (link). I’d say this is an exception to the rule.
If you want to dig deeper and get some more context, have a look at Daan Christiaens’ (in particular section 2.4.2) or Maxime Descoteaux’ (in particular chapter 9) theses.