To capture clinical decision making in the “real world” as nearly as possible, methods for testing reliability should allow for both forms of noise. After all, it is the noise that makes diagnosis difficult and (sometimes) variable in practice, so researchers who want a true picture of clinical reality must account for it in their study design. Test methods that fail to do so yield inflated reliability scores.