Published in: Journal of Psychopathology and Behavioral Assessment, Issue 3/2017

Published: 22-04-2017

Quality Vs. Quantity: Assessing Behavior Change over Time

Authors: Andrew L. Moskowitz, Jennifer L. Krull, K. Alex Trickey, Bruce F. Chorpita

Abstract

Beyond the typical design factors that impact a study’s power (e.g., participant sample size), planning longitudinal research involves additional considerations such as assessment frequency and participant retention. Because this type of research relies so heavily on individual commitment, investigators must be judicious in determining how much information is necessary to study the phenomena in question; collecting too little information renders the data less useful, but requiring excessive participant investment will likely lower participation rates. We conducted a simulation study to empirically examine statistical power and the trade-off between assessment quality (as a function of instrument length) and assessment frequency across a number of sample sizes with intermittently missing data or attrition. Results indicated that reductions in power resulting from shorter, less reliable measurements can be at least somewhat offset by increasing assessment frequency. Because study planning involves a number of factors competing for finite resources, equations were derived to find the balance points between pairs of design characteristics affecting statistical power. These equations allow researchers to calculate how much a particular design factor (e.g., assessment frequency) would need to increase to yield the same improvement in power as increasing an alternative factor (e.g., measurement reliability). Applications of the equations are discussed.
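To make the quality-versus-quantity trade-off concrete, the sketch below runs a toy Monte Carlo power analysis in which per-occasion reliability and the number of assessment waves are varied for a two-level growth model. It is a minimal illustration only: the function names, effect size, variance components, and the mapping from reliability to error variance are assumptions made for the example, not the models or parameter values used in the study.

```python
# Minimal Monte Carlo sketch of the quality-vs-quantity trade-off described above.
# All parameter values (effect size, variance components, per-occasion reliability)
# are illustrative assumptions, not the values used in the published simulations.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2017)

def simulate_trial(n_subj=60, n_waves=5, reliability=0.80, slope_diff=0.15):
    """Generate one two-group longitudinal data set with a given assessment
    frequency (n_waves) and measurement reliability, then return the p-value
    for the treatment-by-time interaction from a two-level growth model."""
    time = np.tile(np.arange(n_waves), n_subj)
    pid = np.repeat(np.arange(n_subj), n_waves)
    tx = np.repeat(rng.integers(0, 2, n_subj), n_waves)       # 0 = control, 1 = treatment
    u0 = np.repeat(rng.normal(0, 1.0, n_subj), n_waves)       # random intercepts
    u1 = np.repeat(rng.normal(0, 0.20, n_subj), n_waves)      # random slopes
    true_score = u0 + (u1 - slope_diff * tx) * time           # treated cases improve faster
    # Error variance chosen so that reliability = true variance / total variance (assumed).
    err_sd = np.sqrt(np.var(true_score) * (1 - reliability) / reliability)
    y = true_score + rng.normal(0, err_sd, n_subj * n_waves)
    df = pd.DataFrame(dict(y=y, time=time, tx=tx, pid=pid))
    fit = smf.mixedlm("y ~ time * tx", df, groups=df["pid"], re_formula="~time").fit()
    return fit.pvalues["time:tx"]

def empirical_power(reps=200, **design):
    return np.mean([simulate_trial(**design) < 0.05 for _ in range(reps)])

# Fewer waves with a longer, more reliable instrument vs. more waves with a shorter one:
print(empirical_power(n_waves=4, reliability=0.90))
print(empirical_power(n_waves=8, reliability=0.75))
```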
Footnotes
1. As a reviewer has noted, carryover effects resulting from increased measurement frequency may also threaten measurement reliability to some degree. Thus, in practice, improvements in power may actually be dampened. In this manuscript, we consider the reliability of a measurement instrument collected at a given number of time points, regardless of how or why that reliability is reduced. Measurement reliability remains constant across time points.
 
2. Although three-level measurement models were used to generate parameter estimates for the item-level data necessary for the simulation study, these data would normally be modeled as two-level, with a repeatedly measured composite problem behavior score nested within participants. The measurement model was necessary to estimate the proper amount of measurement error per item so that total measurement reliability could be manipulated by altering the number of items. The two-level approach was used after the data generation phase of the simulation study, when modeling treatment effects.
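As a hypothetical illustration of the two-level analysis step described in this footnote, the snippet below averages item-level columns into an occasion-level composite and fits a growth model with occasions nested within participants. The column names and model syntax are assumptions for the example, not the authors' code.

```python
# Illustrative only: items are averaged into a composite per occasion, and that
# composite (occasions nested within participants) is analyzed as a two-level
# growth model. Column names (item_1 ... item_k, pid, time, tx) are assumed.
import statsmodels.formula.api as smf

def fit_two_level_model(item_df, n_items):
    item_cols = [f"item_{i}" for i in range(1, n_items + 1)]
    df = item_df.copy()
    df["composite"] = df[item_cols].mean(axis=1)   # occasion-level composite score
    # Two-level model: repeated composites (level 1) nested within participants
    # (level 2), with random intercepts/slopes and a treatment-by-time effect.
    model = smf.mixedlm("composite ~ time * tx", df, groups=df["pid"], re_formula="~time")
    return model.fit()
```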
 
3. Supplementary materials include tables of empirical power for all missingness conditions and figures similar to Figure 1 for all of the equations in Table 5.
 
4. Non-linear effects of predictors were tested and found to be unnecessary, as the non-linearity inherent in the logistic model itself was sufficient to accommodate any non-linearity in the associations between the predictors and the outcome.
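A sketch of the kind of check described here: a logistic (binomial GLM) model of the per-replication significance indicator is fit with and without quadratic terms and the fits are compared. The simulated data frame and predictor names below are hypothetical placeholders, not the study's actual variables.

```python
# Hypothetical replication-level results: one row per simulated data set, with a
# binary indicator of whether the treatment-by-time effect reached significance.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
sims = pd.DataFrame({
    "n_subj": rng.choice([30, 60, 90], 3000),
    "n_waves": rng.choice([4, 8, 12, 16], 3000),
    "reliability": rng.choice([0.70, 0.80, 0.90], 3000),
})
logit_p = -4.0 + 0.02 * sims.n_subj + 0.10 * sims.n_waves + 2.0 * sims.reliability
sims["sig"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

linear = smf.glm("sig ~ n_subj + n_waves + reliability",
                 data=sims, family=sm.families.Binomial()).fit()
quadratic = smf.glm("sig ~ n_subj + n_waves + reliability"
                    " + I(n_waves ** 2) + I(reliability ** 2)",
                    data=sims, family=sm.families.Binomial()).fit()

# Likelihood-ratio comparison: a negligible improvement in fit indicates that the
# logistic link already captures the non-linearity, as the footnote concludes.
lr_stat = 2 * (quadratic.llf - linear.llf)
print("LR statistic:", round(lr_stat, 2), "on",
      int(quadratic.df_model - linear.df_model), "df")
```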
 
5. The 75- and 60-item conditions were not included in this particular study, but the results presented in Table 4 suggest that any increases in measurement reliability and consequent power would be very small for instruments of more than 50 items. Thus, the 50-item conditions with the appropriate assessment frequency are used as proxies for the 75- and 60-item measures in this example.
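One way to see why gains beyond 50 items would be small is the Spearman-Brown prophecy formula, which relates reliability to test length. The calculation below assumes, purely for illustration, a reliability of .90 for the 50-item instrument; that anchor value is not taken from the study.

```python
# Back-of-the-envelope check, assuming reliability scales with test length according
# to the Spearman-Brown prophecy formula; the anchor reliability of .90 for the
# 50-item instrument is an illustrative assumption, not a value from the study.
def spearman_brown(rho, length_factor):
    """Reliability of a test lengthened (or shortened) by `length_factor`."""
    return length_factor * rho / (1 + (length_factor - 1) * rho)

rho_50 = 0.90
for n_items in (25, 50, 60, 75):
    print(n_items, round(spearman_brown(rho_50, n_items / 50), 3))
# Gains above 50 items are small (.90 -> ~.915 at 60 items, ~.931 at 75 items),
# consistent with treating the 50-item conditions as proxies.
```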
 
6. To display these results graphically, the extraneous factor (i.e., the factor that was not changing) was held at an arbitrary constant.
 
Metadata
Title
Quality Vs. Quantity: Assessing Behavior Change over Time
Authors
Andrew L. Moskowitz
Jennifer L. Krull
K. Alex Trickey
Bruce F. Chorpita
Publication date
22-04-2017
Publisher
Springer US
Published in
Journal of Psychopathology and Behavioral Assessment / Issue 3/2017
Print ISSN: 0882-2689
Electronic ISSN: 1573-3505
DOI
https://doi.org/10.1007/s10862-017-9602-1
