Published in: Journal of Psychopathology and Behavioral Assessment, Issue 3/2017

Published: 22-04-2017

Quality Vs. Quantity: Assessing Behavior Change over Time

Authors: Andrew L. Moskowitz, Jennifer L. Krull, K. Alex Trickey, Bruce F. Chorpita

Abstract

Beyond the typical design factors that impact a study’s power (e.g., participant sample size), planning longitudinal research involves additional considerations such as assessment frequency and participant retention. Because this type of research relies so heavily on individual commitment, investigators must be judicious in determining how much information is necessary to study the phenomena in question; collecting too little information renders the data less useful, but requiring excessive participant investment will likely lower participation rates. We conducted a simulation study to empirically examine statistical power and the trade-off between assessment quality (as a function of instrument length) and assessment frequency across a number of sample sizes with intermittently missing data or attrition. Results indicated that reductions in power resulting from shorter, less reliable measurements can be at least somewhat offset by increasing assessment frequency. Because study planning involves a number of factors competing for finite resources, equations were derived to find the balance points between pairs of design characteristics affecting statistical power. These equations allow researchers to calculate how much a particular design factor (e.g., assessment frequency) would need to increase to yield the same improvement in power as increasing an alternative factor (e.g., measurement reliability). Applications of the equations are discussed.
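To make the quality-versus-quantity trade-off concrete, the sketch below runs a toy Monte Carlo power analysis in which per-occasion reliability and the number of assessment waves are varied for a two-level growth model. It is a minimal illustration only: the function names, effect size, variance components, and the mapping from reliability to error variance are assumptions made for the example, not the models or parameter values used in the study.

```python
# Minimal Monte Carlo sketch of the quality-vs-quantity trade-off described above.
# All parameter values (effect size, variance components, per-occasion reliability)
# are illustrative assumptions, not the values used in the published simulations.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2017)

def simulate_trial(n_subj=60, n_waves=5, reliability=0.80, slope_diff=0.15):
    """Generate one two-group longitudinal data set with a given assessment
    frequency (n_waves) and measurement reliability, then return the p-value
    for the treatment-by-time interaction from a two-level growth model."""
    time = np.tile(np.arange(n_waves), n_subj)
    pid = np.repeat(np.arange(n_subj), n_waves)
    tx = np.repeat(rng.integers(0, 2, n_subj), n_waves)       # 0 = control, 1 = treatment
    u0 = np.repeat(rng.normal(0, 1.0, n_subj), n_waves)       # random intercepts
    u1 = np.repeat(rng.normal(0, 0.20, n_subj), n_waves)      # random slopes
    true_score = u0 + (u1 - slope_diff * tx) * time           # treated cases improve faster
    # Error variance chosen so that reliability = true variance / total variance (assumed).
    err_sd = np.sqrt(np.var(true_score) * (1 - reliability) / reliability)
    y = true_score + rng.normal(0, err_sd, n_subj * n_waves)
    df = pd.DataFrame(dict(y=y, time=time, tx=tx, pid=pid))
    fit = smf.mixedlm("y ~ time * tx", df, groups=df["pid"], re_formula="~time").fit()
    return fit.pvalues["time:tx"]

def empirical_power(reps=200, **design):
    return np.mean([simulate_trial(**design) < 0.05 for _ in range(reps)])

# Fewer waves with a longer, more reliable instrument vs. more waves with a shorter one:
print(empirical_power(n_waves=4, reliability=0.90))
print(empirical_power(n_waves=8, reliability=0.75))
```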
Footnotes
1. As a reviewer has noted, carryover effects resulting from increased measurement frequency may also threaten measurement reliability to some degree. Thus, in practice, improvements in power may actually be dampened. In this manuscript, we consider the reliability of a measurement instrument collected at a given number of time points, regardless of how or why that reliability is reduced. Measurement reliability remains constant across time points.
 
2. Although three-level measurement models were used to generate parameter estimates for the item-level data necessary for the simulation study, these data would normally be modeled as two-level, with a repeatedly measured composite problem behavior score nested within participants. The measurement model was necessary to estimate the proper amount of measurement error per item so that total measurement reliability could be manipulated by altering the number of items. The two-level approach was used after the data generation phase of the simulation study, when modeling treatment effects.
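As a hypothetical illustration of the two-level analysis step described in this footnote, the snippet below averages item-level columns into an occasion-level composite and fits a growth model with occasions nested within participants. The column names and model syntax are assumptions for the example, not the authors' code.

```python
# Illustrative only: items are averaged into a composite per occasion, and that
# composite (occasions nested within participants) is analyzed as a two-level
# growth model. Column names (item_1 ... item_k, pid, time, tx) are assumed.
import statsmodels.formula.api as smf

def fit_two_level_model(item_df, n_items):
    item_cols = [f"item_{i}" for i in range(1, n_items + 1)]
    df = item_df.copy()
    df["composite"] = df[item_cols].mean(axis=1)   # occasion-level composite score
    # Two-level model: repeated composites (level 1) nested within participants
    # (level 2), with random intercepts/slopes and a treatment-by-time effect.
    model = smf.mixedlm("composite ~ time * tx", df, groups=df["pid"], re_formula="~time")
    return model.fit()
```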
 
3. Supplementary materials include tables of empirical power for all missingness conditions and figures similar to Figure 1 for all of the equations in Table 5.
 
4. Non-linear effects of predictors were tested and found to be unnecessary, as the non-linearity inherent in the logistic model itself was sufficient to accommodate any non-linearity in the associations between the predictors and the outcome.
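A sketch of the kind of check described here: a logistic (binomial GLM) model of the per-replication significance indicator is fit with and without quadratic terms and the fits are compared. The simulated data frame and predictor names below are hypothetical placeholders, not the study's actual variables.

```python
# Hypothetical replication-level results: one row per simulated data set, with a
# binary indicator of whether the treatment-by-time effect reached significance.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
sims = pd.DataFrame({
    "n_subj": rng.choice([30, 60, 90], 3000),
    "n_waves": rng.choice([4, 8, 12, 16], 3000),
    "reliability": rng.choice([0.70, 0.80, 0.90], 3000),
})
logit_p = -4.0 + 0.02 * sims.n_subj + 0.10 * sims.n_waves + 2.0 * sims.reliability
sims["sig"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

linear = smf.glm("sig ~ n_subj + n_waves + reliability",
                 data=sims, family=sm.families.Binomial()).fit()
quadratic = smf.glm("sig ~ n_subj + n_waves + reliability"
                    " + I(n_waves ** 2) + I(reliability ** 2)",
                    data=sims, family=sm.families.Binomial()).fit()

# Likelihood-ratio comparison: a negligible improvement in fit indicates that the
# logistic link already captures the non-linearity, as the footnote concludes.
lr_stat = 2 * (quadratic.llf - linear.llf)
print("LR statistic:", round(lr_stat, 2), "on",
      int(quadratic.df_model - linear.df_model), "df")
```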
 
5. The 75- and 60-item conditions were not included in this particular study, but the results presented in Table 4 suggest that any increases in measurement reliability and consequent power would be very small for instruments of more than 50 items. Thus, the 50-item conditions with the appropriate assessment frequency are used as proxies for the 75- and 60-item measures in this example.
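One way to see why gains beyond 50 items would be small is the Spearman-Brown prophecy formula, which relates reliability to test length. The calculation below assumes, purely for illustration, a reliability of .90 for the 50-item instrument; that anchor value is not taken from the study.

```python
# Back-of-the-envelope check, assuming reliability scales with test length according
# to the Spearman-Brown prophecy formula; the anchor reliability of .90 for the
# 50-item instrument is an illustrative assumption, not a value from the study.
def spearman_brown(rho, length_factor):
    """Reliability of a test lengthened (or shortened) by `length_factor`."""
    return length_factor * rho / (1 + (length_factor - 1) * rho)

rho_50 = 0.90
for n_items in (25, 50, 60, 75):
    print(n_items, round(spearman_brown(rho_50, n_items / 50), 3))
# Gains above 50 items are small (.90 -> ~.915 at 60 items, ~.931 at 75 items),
# consistent with treating the 50-item conditions as proxies.
```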
 
6. To display these results graphically, the extraneous factor (i.e., the factor that was not changing) was held at an arbitrary constant.
 
Metadata
Title
Quality Vs. Quantity: Assessing Behavior Change over Time
Authors
Andrew L. Moskowitz
Jennifer L. Krull
K. Alex Trickey
Bruce F. Chorpita
Publication date
22-04-2017
Publisher
Springer US
Published in
Journal of Psychopathology and Behavioral Assessment / Issue 3/2017
Print ISSN: 0882-2689
Electronic ISSN: 1573-3505
DOI
https://doi.org/10.1007/s10862-017-9602-1
