Using a Nonparametric Bootstrap to Obtain a Confidence Interval for Pearson’s r with Cluster Randomized Data: A Case Study

  • Original Paper
The Journal of Primary Prevention

Abstract

A nonparametric bootstrap was used to obtain an interval estimate of Pearson’s r and to test the null hypothesis of no association between 5th-grade students’ positive substance use expectancies and their intentions not to use substances. The students were participating in a substance use prevention program in which the unit of randomization was a public middle school. The bootstrap estimate indicated that expectancies explained 21% of the variability in students’ intentions (r = 0.46, 95% CI = [0.40, 0.50]). This case study illustrates the use of a nonparametric bootstrap with cluster randomized data and the danger posed if outliers are not identified and addressed.

Editors’ Strategic Implications: Prevention researchers will benefit from the authors’ detailed description of this nonparametric bootstrap approach for cluster randomized data and their thoughtful discussion of the potential impact of cluster sizes and outliers.
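As a rough illustration of the general idea, the sketch below computes a percentile bootstrap interval for Pearson’s r by resampling whole clusters (schools) with replacement, so that students from the same school stay together in every replicate. It is not the authors’ procedure: the abstract does not say which resampling scheme or interval type they used, and the function name `cluster_bootstrap_r`, its parameters, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def cluster_bootstrap_r(x, y, cluster, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for Pearson's r, resampling whole clusters.

    Hypothetical helper for illustration; not taken from the article.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster = np.asarray(cluster)
    rng = np.random.default_rng(seed)

    ids = np.unique(cluster)
    # Row indices belonging to each cluster, computed once up front.
    rows = [np.flatnonzero(cluster == c) for c in ids]

    # Point estimate from the full sample.
    r_hat = np.corrcoef(x, y)[0, 1]

    boot = np.empty(n_boot)
    for b in range(n_boot):
        # Draw as many clusters as were observed, with replacement.
        picks = rng.integers(0, len(ids), size=len(ids))
        idx = np.concatenate([rows[p] for p in picks])
        boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]

    # Percentile interval from the bootstrap distribution.
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return r_hat, lo, hi


# Synthetic data for illustration only (not the study's data):
# 30 schools of 40 students, with a shared school-level effect.
rng = np.random.default_rng(42)
n_schools, n_per = 30, 40
school = np.repeat(np.arange(n_schools), n_per)
school_effect = rng.normal(0, 0.5, n_schools)[school]
expectancy = rng.normal(0, 1, school.size) + school_effect
intention = 0.5 * expectancy + rng.normal(0, 1, school.size) + school_effect

r_hat, lo, hi = cluster_bootstrap_r(expectancy, intention, school, n_boot=1000)
print(f"r = {r_hat:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

Resampling schools rather than individual students is one common way to respect the unit of randomization. Resampling students as if they were independent would typically understate the interval width when students within a school resemble one another.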



Acknowledgments

The project described was supported by Grant Number DA005629 awarded by the National Institute on Drug Abuse to The Pennsylvania State University (Grant Recipient), Michael Hecht, Principal Investigator, with Arizona State University as the collaborating subcontractor. The data used in the present study would not have been available had it not been for the dedication of the Drug Resistance Strategies Project team members in Phoenix, Arizona. These researchers are led by Drs. Flavio Marsiglia, Stephen Kulis, and Patricia Dustman. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute on Drug Abuse or the National Institutes of Health. Finally, we would like to thank Drs. Eric Loken and Michael Rovine for helpful comments and suggestions on the preparation of this article.

Author information

Corresponding author

Correspondence to David A. Wagstaff.

About this article

Cite this article

Wagstaff, D.A., Elek, E., Kulis, S. et al. Using a Nonparametric Bootstrap to Obtain a Confidence Interval for Pearson’s r with Cluster Randomized Data: A Case Study. J Primary Prevent 30, 497–512 (2009). https://doi.org/10.1007/s10935-009-0191-y

  • DOI: https://doi.org/10.1007/s10935-009-0191-y
