Abstract
Assessment of clinical competence is complex and inference-based. Trustworthy and defensible assessment processes must be supported by favourable validity evidence, particularly where decisions are considered high stakes. We aimed to organize, collect and interpret validity evidence for a high-stakes simulation-based assessment strategy for certifying paramedics, using Kane’s validity framework, which some report as challenging to implement. We describe our experience using the framework, identifying challenges, decision points, interpretations and lessons learned. We considered data related to four inferences (scoring, generalization, extrapolation, implications) occurring during assessment and treated validity as a series of assumptions to be evaluated, resulting in several hypotheses and proposed analyses. We then interpreted our findings across the four inferences, judging whether the evidence supported or refuted our proposed uses of the assessment data. Data evaluating “Scoring” included: (a) desirable tool characteristics, with acceptable inter-item correlations; (b) strong item-total correlations; (c) low error variance for items and raters; and (d) strong inter-rater reliability. Data evaluating “Generalization” included: (a) a robust sampling strategy capturing the majority of relevant medical directives, skills and national competencies; and (b) good overall and inter-station reliability. Data evaluating “Extrapolation” included low correlations between assessment scores, by dimension, and clinical errors in practice. Data evaluating “Implications” included low error rates in practice. Interpreting our findings according to Kane’s framework, we suggest the evidence for scoring, generalization and implications supports use of our simulation-based paramedic assessment strategy as a certifying exam; the extrapolation evidence, however, was weak, suggesting exam scores did not predict clinical error rates.
Our analysis represents a worked example others can follow when using Kane’s validity framework to evaluate, and iteratively develop and refine assessment strategies.
References
Brennan, R. L. (2001). Generalizability theory. New York, NY: Springer.
Brennan, R. L. (2013). Commentary on “Validating the interpretations and uses of test scores”. Journal of Educational Measurement, 50(1), 74–83.
Clauser, B. E., Margolis, M. J., Holtman, M. C., Katsufrakis, P. J., & Hawkins, R. E. (2012). Validity considerations in the assessment of professionalism. Advances in Health Sciences Education Theory and Practice, 17(2), 165–181.
Cook, D., Brydges, R., Ginsburg, S., & Hatala, R. (2015). A contemporary approach to validity arguments: A practical guide to Kane’s framework. Medical Education, 49(6), 560–575.
Cook, D. A., & Hatala, R. (2016). Validation of educational assessments: A primer for simulation and beyond. Advances in Simulation, 1(1), 31.
Cook, D. A., Zendejas, B., Hamstra, S. J., Hatala, R., & Brydges, R. (2014). What counts as validity evidence? Examples and prevalence in a systematic review of simulation-based assessment. Advances in Health Sciences Education Theory and Practice, 19(2), 233–250.
Cronbach, L. J. (1989). Construct validation after thirty years. Intelligence Measurement Theory and Public Policy, 3, 147–171.
Frank, J., Snell, L., & Sherbino, J. (2014). Draft CanMEDS 2015 physician competency framework—series III. Ottawa, Ontario: The Royal College of Physicians and Surgeons of Canada.
Hatala, R., Cook, D. A., Brydges, R., & Hawkins, R. (2015). Constructing a validity argument for the objective structured assessment of technical skills (OSATS): A systematic review of validity evidence. Advances in Health Sciences Education Theory and Practice, 20(5), 1149–1175.
Humphrey-Murto, S., & MacFadyen, J. (2002). Standard setting: A comparison of case-author and modified borderline-group methods in a small-scale OSCE. Academic Medicine, 77(7), 729–732.
Kane, M. T. (1999). Validating measures of performance. Educational Measurement: Issues and Practice, 20(10), 5–17.
Kane, M. T. (2012). Validating score interpretations and uses. Language Testing, 29(1), 3–17.
Kane, M. T. (2013a). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73.
Kane, M. T. (2013b). Validity. In R. L. Brennan (Ed.), Educational measurement. Westport, CT: Praeger Publishers.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement. Old Tappan, NJ: Macmillan.
Ministry of Health and Long-Term Care, Emergency Health Services Branch. (2007). Basic life support patient care standards version 2.0. Toronto, Ontario: Publications Ontario.
Ministry of Health and Long-Term Care, Emergency Health Services Branch. (2015). Advanced life support patient care standards version 3.2. Toronto, Ontario: Publications Ontario.
Mylopoulos, M., & Regehr, G. (2011). Putting the expert together again. Medical Education, 45(9), 920–926.
Paramedic Association of Canada. (2016). National occupational competency profile 2016. Retrieved February 22, 2017, from http://paramedic.ca/site/nocp?nav=02.
Ponton-Carass, J., Kortbeek, J. B., & Ma, I. W. Y. (2016). Assessment of technical and nontechnical skills in surgical residents. The American Journal of Surgery, 212(5), 1011–1019.
Roch, S. G., Woehr, D. J., Mishra, V., & Kieszczynska, U. (2011). Rater training revisited: An updated meta-analytic review of frame-of-reference training. Journal of Occupational and Organizational Psychology, 85(2), 370–395.
St-Onge, C., Young, M., Eva, K. W., & Hodges, B. (2017). Validity: One word with a plurality of meanings. Advances in Health Sciences Education Theory and Practice. https://doi.org/10.1007/s10459-016-9716-3.
Streiner, D. L., & Norman, G. R. (2008). Health measurement scales: A practical guide to their development and use (4th ed.). Oxford: Oxford University Press.
Sunnybrook Centre for Prehospital Medicine. (2016). Regional Base Hospital 2015–2016 annual report. Toronto, Ontario.
Tavares, W., Boet, S., Theriault, R., Mallette, T., & Eva, K. (2012). Global rating scale for the assessment of paramedic clinical competence. Prehospital Emergency Care, 17(1), 57–67.
Tavares, W., Bowles, R., & Donelon, B. (2016). Informing a Canadian paramedic profile: Framing concepts, roles and crosscutting themes. BMC Health Services Research, 16, 477.
Tavares, W., LeBlanc, V. R., Mausz, J., Sun, V., & Eva, K. W. (2014). Simulation-based assessment of paramedics and performance in real clinical contexts. Prehospital Emergency Care, 18(1), 116–122.
Woehr, D., & Huffcutt, A. (1994). Rater training for performance appraisal: A quantitative review. Journal of Occupational and Organizational Psychology, 67(3), 189–205.
Acknowledgements
The authors would like to thank the Ontario Base Hospital Group for their support in completing this study.
Cite this article
Tavares, W., Brydges, R., Myre, P. et al. Applying Kane’s validity framework to a simulation based assessment of clinical competence. Adv in Health Sci Educ 23, 323–338 (2018). https://doi.org/10.1007/s10459-017-9800-3