Following debates in psychology on the importance of replication research, pleas have also begun to appear for a more prominent role for replication research in medical education. To enable replication research, it is of paramount importance to carefully study the reliability of the instruments we use. Cronbach’s alpha has been the most widely used estimator of reliability in the field of medical education, notably as a kind of quality label for test or questionnaire scores based on multiple items, or for the reliability of assessment across exam stations. However, as this narrative review outlines, Cronbach’s alpha and alternative reliability statistics may complement but cannot replace psychometric methods such as factor analysis. Moreover, multiple-item measurements should be preferred over single-item measurements, and when single-item measurements are used, coefficients such as Cronbach’s alpha should not be interpreted as indicators of the reliability of a single item when that item is administered after fundamentally different activities, such as learning tasks that differ in content. Finally, if we want to follow up on recent pleas for more replication research, we have to start studying the test-retest reliability of the instruments we use.
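To make the distinction between internal consistency and test-retest reliability concrete, the following minimal Python sketch computes Cronbach’s alpha from the standard variance-based formula and a Pearson correlation between total scores at two occasions. It is an illustration only, not the procedure used in the review: the function names, the three-item questionnaire, and all scores are hypothetical, and a Pearson correlation is just one simple way to express test-retest agreement.

```python
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    items = list(zip(*scores))  # regroup the scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: four respondents, three items, two occasions.
# Within each occasion the items agree perfectly, but the rank order
# of the respondents changes between occasions.
occasion_1 = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
occasion_2 = [[2, 2, 2], [4, 4, 4], [1, 1, 1], [3, 3, 3]]

alpha_1 = cronbach_alpha(occasion_1)
alpha_2 = cronbach_alpha(occasion_2)

# Test-retest correlation of the total scores across the two occasions.
retest_r = pearson_r(
    [sum(row) for row in occasion_1],
    [sum(row) for row in occasion_2],
)

print(alpha_1, alpha_2, retest_r)
```

In this deliberately extreme constructed example, alpha is about 1 at each occasion while the test-retest correlation is about 0, which shows that a high alpha at a single occasion says nothing by itself about the stability of scores over time.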
- We need more replication research – A case for test-retest reliability
- Bohn Stafleu van Loghum