Perhaps we should start classifying the filters by some mathematical measures derived from the filter curve itself.
For example, quantities such as the following, as some initial suggestions:
- rolloff (height at 1 or about that distance),
- blur/sigma (a fit of a Gaussian curve to the first lobe? or some other statistical measure),
- ringing (height of first negative lobe)
I don't know if these are sensible suggestions (though I think they may be), but they would help us start to qualify what makes a specific filter 'good' and how the filters relate to each other. You could in fact produce a data cloud of the filters that are found to be good for specific purposes (or test images).
If that shows some type of correspondence between 2-, 3-, and 4-lobe filters, at least for specific images, then we will have verification that we have a good measure for evaluating filters.
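
As a rough sketch of how the three quantities suggested above might be pulled out of a filter curve, here is some Python (standard library only). Everything in it is an assumption for illustration: the function names are made up, the filters are plain 1-D sinc-windowed Lanczos and the Mitchell cubic used as stand-ins, rolloff and ringing are taken relative to the height at 0, and "sigma" is not a real curve fit but simply the sigma of the Gaussian that has the same half-width at half-maximum as the first lobe.

```python
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def lanczos(lobes):
    """1-D sinc-windowed-sinc filter with the given number of lobes (stand-in)."""
    def f(x):
        x = abs(x)
        return sinc(x) * sinc(x / lobes) if x < lobes else 0.0
    return f

def mitchell(x, b=1.0 / 3.0, c=1.0 / 3.0):
    """Mitchell-Netravali cubic with parameters B and C, support 2 (stand-in)."""
    x = abs(x)
    if x < 1.0:
        return ((12 - 9 * b - 6 * c) * x**3
                + (-18 + 12 * b + 6 * c) * x**2
                + (6 - 2 * b)) / 6.0
    if x < 2.0:
        return ((-b - 6 * c) * x**3 + (6 * b + 30 * c) * x**2
                + (-12 * b - 48 * c) * x + (8 * b + 24 * c)) / 6.0
    return 0.0

def rolloff(f, at=1.0):
    """Height of the curve at distance `at`, relative to the height at 0."""
    return f(at) / f(0.0)

def sigma_from_half_width(f, step=1e-3, limit=10.0):
    """Sigma of the Gaussian whose half-width at half-maximum matches f.

    Assumes f decreases monotonically from 0 down through half its peak.
    """
    half = 0.5 * f(0.0)
    x = 0.0
    while f(x + step) > half:          # coarse scan to bracket the crossing
        x += step
        if x > limit:
            raise ValueError("curve never drops to half its peak")
    lo, hi = x, x + step
    for _ in range(60):                # bisection inside the bracket
        mid = 0.5 * (lo + hi)
        if f(mid) > half:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / math.sqrt(2.0 * math.log(2.0))

def ringing(f, support, per_unit=10000):
    """Lowest sampled value in the first negative lobe, relative to f(0).

    Returns 0.0 if the curve never goes negative within the support.
    """
    lowest, in_lobe = 0.0, False
    for i in range(int(support * per_unit) + 1):
        v = f(i / per_unit)
        if v < 0.0:
            in_lobe = True
            lowest = min(lowest, v)
        elif in_lobe:
            break                      # left the first negative lobe
    return lowest / f(0.0)

# One (rolloff, sigma, ringing) point per filter: the start of a data cloud.
filters = {
    "lanczos2": (lanczos(2), 2),
    "lanczos3": (lanczos(3), 3),
    "lanczos4": (lanczos(4), 4),
    "mitchell": (mitchell, 2),
}
for name, (f, support) in filters.items():
    print(name, round(rolloff(f), 4),
          round(sigma_from_half_width(f), 4),
          round(ringing(f, support), 4))
```

Note that for these 1-D sinc-based stand-ins the height at exactly 1 is zero by construction, so the rolloff number only separates them from filters like the cubic; measuring at a slightly different distance (the `at` argument) may be more informative, which is perhaps what "or about that distance" should mean in practice.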