Article
A simulation study into the performance of “optimal” diagnostic thresholds in the population: “large” effect sizes are not enough
Record in:
HIRSCHFELD, Gerrit; BRASIL, Pedro Emmanuel Alvarenga Americano do. A simulation study into the performance of “optimal” diagnostic thresholds in the population: “large” effect sizes are not enough. Journal of Clinical Epidemiology, v.67, n.4, p.449–453, 2014.
0895-4356
10.1016/j.jclinepi.2013.07.018
1878-5921
Authors
Hirschfeld, Gerrit
Brasil, Pedro Emmanuel Alvarenga Americano do
Abstract
Objectives: Many diagnostic studies are aimed at defining “optimal” thresholds. Here, we evaluate the performance of empirically
defined optimal thresholds (1) in the sample in which they were defined and (2) in the population from which the sample was drawn.
Study Design and Setting: We simulated test results for 120,000 samples varying the number of people without a disease (n between
20 and 500), number of people with a disease (m between 20 and 500), the magnitude of the difference between group means [effect size
(ES) between 0.5 and 4], and distributions (normal and log-normal). The thresholds associated with the maximal Youden index were
defined as optimal. Performance was defined as the percentage of correct classifications in the sample and when applied to the whole
population.
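The design described above can be illustrated with a minimal sketch. It is not the authors' code: it covers only the normal case (not the log-normal one), assumes unit variance in both groups, and assumes a balanced population (prevalence 0.5) when computing population-level accuracy. The function names, the seed, and the chosen n, m, and ES are hypothetical values picked for illustration.

```python
import random
from statistics import NormalDist, mean


def youden_threshold(healthy, diseased):
    """Return the observed value that maximises J = sensitivity + specificity - 1.

    Values >= threshold are classified as diseased.
    """
    best_t, best_j = None, -1.0
    for t in sorted(set(healthy) | set(diseased)):
        sens = sum(x >= t for x in diseased) / len(diseased)
        spec = sum(x < t for x in healthy) / len(healthy)
        j = sens + spec - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t


def sample_accuracy(t, healthy, diseased):
    """Fraction of correct classifications in the sample that defined t."""
    correct = sum(x < t for x in healthy) + sum(x >= t for x in diseased)
    return correct / (len(healthy) + len(diseased))


def population_accuracy(t, es, prevalence=0.5):
    """Fraction of correct classifications in the population.

    Assumption: healthy ~ N(0, 1), diseased ~ N(es, 1), and the given
    prevalence (0.5 is an illustrative choice, not taken from the paper).
    """
    spec = NormalDist(0, 1).cdf(t)
    sens = 1 - NormalDist(es, 1).cdf(t)
    return prevalence * sens + (1 - prevalence) * spec


def simulate_once(n, m, es, rng):
    """Draw one sample, define its 'optimal' threshold, and score it twice."""
    healthy = [rng.gauss(0, 1) for _ in range(n)]
    diseased = [rng.gauss(es, 1) for _ in range(m)]
    t = youden_threshold(healthy, diseased)
    return sample_accuracy(t, healthy, diseased), population_accuracy(t, es)


# Illustrative run: 200 samples with n = m = 30 and ES = 0.8.
rng = random.Random(2014)
results = [simulate_once(30, 30, 0.8, rng) for _ in range(200)]
mean_sample = mean(r[0] for r in results)
mean_population = mean(r[1] for r in results)
print(f"mean accuracy in sample:     {mean_sample:.3f}")
print(f"mean accuracy in population: {mean_population:.3f}")
```

Under these assumptions, no threshold can exceed a population accuracy of Φ(ES/2), which is about 0.66 for ES = 0.8. In-sample accuracy at the Youden-optimal cut point tends to be higher than that, because the threshold is tuned to the same data it is scored on.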
Results: At the population level, the thresholds defined for the four ESs (0.5, 0.8, 2, and 4) yielded a median of 59%, 65%, 83%, and
97% correct classifications, respectively. At the sample level, the samples with similar characteristics yielded widely varying estimates of
the performance that were systematically higher than at the population level.
Conclusion: Researchers need to be careful when defining cut points for mean differences that are traditionally considered “large”
(ES = 0.8). The diagnostic utility of optimal thresholds needs to be assessed in prospective studies. © 2014 Elsevier Inc. All rights
reserved.