Journal articles
Adaptative significance levels using optimal decision rules: Balancing by weighting the error probabilities
Record in:
0103-0752
Authors
Pericchi Guerra, Luis R.
Pereira, Carlos
Institution
Abstract
Our purpose is to recommend a change in the paradigm of testing by generalizing a very natural idea, which perhaps originated with Jeffreys [Proceedings of the Cambridge Philosophical Society 31 (1935) 203–222; The Theory of Probability (1961) Oxford Univ. Press] and was clearly expounded by DeGroot [Probability and Statistics (1975) Addison-Wesley], with the aim of developing an approach that is attractive to all schools of statistics, resulting in a procedure better suited to the needs of science. The essential idea is to base the testing of statistical hypotheses on minimizing a weighted sum of type I and type II error probabilities instead of the prevailing paradigm, which fixes the type I error probability and minimizes the type II error probability. For simple vs simple hypotheses, the optimal criterion is to reject the null using the likelihood ratio as the evidence (ordering) statistic, with a fixed threshold value instead of a fixed tail probability. By defining expected type I and type II error probabilities, we generalize the weighting approach and find that the optimal region is defined by the evidence ratio, that is, a ratio of averaged likelihoods (with respect to a prior measure), and a fixed threshold. This approach yields an optimal theory in complete generality, which the classical theory of testing does not. This can be seen as a Bayesian/non-Bayesian compromise: using a weighted sum of type I and type II error probabilities is Frequentist, but basing the test criterion on a ratio of marginalized likelihoods is Bayesian. We give arguments to push the theory still further, so that the weighting measures (priors) of the likelihoods do not have to be proper and highly informative, but just “well calibrated,” that is, priors that give rise to the same evidence (marginal likelihoods) using minimal (smallest) training samples.
The theory that emerges, similar to the theories based on objective Bayesian approaches, is a powerful response to criticisms of the prevailing approach of hypothesis testing. For criticisms see, for example, Ioannidis [PLoS Medicine 2 (2005) e124] and Siegfried [Science News 177 (2010) 26–29], among many others.
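The simple vs simple case described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the normal model, the function name `optimal_test`, and the weights `a` (on the type I error) and `b` (on the type II error) are assumptions chosen for illustration. For n observations from N(mu, 1), testing H0: mu = 0 against H1: mu = mu1 > 0, minimizing a*alpha + b*beta leads to rejecting H0 when the likelihood ratio f1/f0 exceeds the fixed threshold a/b, which is equivalent to a cutoff on the sample mean.

```python
import math
from statistics import NormalDist


def optimal_test(n, mu1=1.0, a=1.0, b=1.0):
    """Cutoff and error probabilities for H0: mu = 0 vs H1: mu = mu1 (sigma = 1).

    Illustrative assumption: a and b are the weights on the type I and
    type II error probabilities, and mu1 > 0.
    """
    # Minimizing a*alpha + b*beta rejects H0 when f1(x)/f0(x) > a/b.
    # For normal data with unit variance this is the same as xbar > cutoff.
    cutoff = mu1 / 2 + math.log(a / b) / (n * mu1)
    std_normal = NormalDist()
    # Under H0 the sample mean is N(0, 1/n); under H1 it is N(mu1, 1/n).
    alpha = 1 - std_normal.cdf(math.sqrt(n) * cutoff)
    beta = std_normal.cdf(math.sqrt(n) * (cutoff - mu1))
    return cutoff, alpha, beta


# With a fixed threshold a/b, both error probabilities shrink as n grows,
# so the implied significance level adapts to the sample size.
for n in (5, 20, 80):
    cutoff, alpha, beta = optimal_test(n, a=4.0, b=1.0)
    print(f"n={n:3d}  cutoff={cutoff:.4f}  alpha={alpha:.6f}  beta={beta:.6f}")
```

In this sketch the threshold on the likelihood ratio stays fixed while the type I error probability varies with n, in contrast to the conventional practice of fixing the type I error probability in advance.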