Abstract
The ease with which data can be collected and analyzed on a personal computer makes it tempting to “peek” at the data before the target sample size is reached. The tactic is appealing because, if a peek reveals a significant effect, data collection can be stopped early and valuable resources saved. Unfortunately, such data snooping comes at a cost. When the null hypothesis is true, the Type I error rate is inflated, sometimes quite substantially. When the null hypothesis is false, premature significance testing leads to inflated estimates of power and effect size. The SNOOP program provides simulation results for a wide variety of premature and repeated null hypothesis testing scenarios, so that researchers can know in advance the consequences of data peeking and take appropriate corrective action.
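The inflation of the Type I error rate under repeated testing can be illustrated with a small Monte Carlo simulation. The sketch below is not the SNOOP program and makes simplifying assumptions of its own: observations are standard normal, the test is a two-sided one-sample z-test with known unit variance, and sampling stops at the first significant look. The function name `peeking_rejection_rate` and all of its parameters are hypothetical, chosen only for this illustration.

```python
import random
from statistics import NormalDist


def peeking_rejection_rate(n_max=100, first_look=10, step=10, effect=0.0,
                           alpha=0.05, n_sims=2000, seed=1):
    """Estimate how often H0 is rejected at any look before or at n_max.

    Assumptions (illustrative only): data are N(effect, 1), the test is a
    two-sided z-test with known sigma = 1, and data collection stops at the
    first look that reaches significance.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Sample sizes at which the data are tested; the final n is always a look.
    looks = set(range(first_look, n_max + 1, step)) | {n_max}
    rejections = 0
    for _ in range(n_sims):
        total = 0.0
        for n in range(1, n_max + 1):
            total += rng.gauss(effect, 1.0)
            if n in looks:
                # z = mean * sqrt(n) / sigma = total / sqrt(n) when sigma = 1
                z = total / n ** 0.5
                if abs(z) > z_crit:
                    rejections += 1
                    break  # stop collecting data after a significant peek
    return rejections / n_sims


if __name__ == "__main__":
    single = peeking_rejection_rate(first_look=100)   # one test at n = 100
    peeking = peeking_rejection_rate(first_look=10)   # tests at n = 10, 20, ..., 100
    print(f"single test at the target n: {single:.3f}")
    print(f"peeking every 10 cases:      {peeking:.3f}")
```

With `effect=0.0` the null hypothesis is true, so every rejection is a Type I error. The single-test rate should fall near the nominal .05, whereas the rate with ten looks should be markedly higher. Setting `effect` to a nonzero value gives a comparable look at how early stopping changes the rejection rate when the null hypothesis is false.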
Additional information
This program is available from the author via e-mail attachment, via FTP at www.artsci.wustl.edu/~socpsy (executable file name: snoop.exe, zipped files for Visual Basic, Version 5 Professional: snoop.zip), or on disk (send a self-addressed and stamped disk mailer to the author).
Cite this article
Strube, M.J. SNOOP: A program for demonstrating the consequences of premature and repeated null hypothesis testing. Behavior Research Methods 38, 24–27 (2006). https://doi.org/10.3758/BF03192746
DOI: https://doi.org/10.3758/BF03192746