
Validity: one word with a plurality of meanings

Abstract

Validity is one of the most debated constructs in our field; debates abound about what is legitimate and what is not, and the word continues to be used in ways that are explicitly disavowed by current practice guidelines. The resultant tensions have not been well characterized, yet their existence suggests that different uses may hold value for users in ways that need to be better understood. We conducted an empirical form of Discourse Analysis to document the multiple ways in which validity is described, understood, and used in the health professions education field. We created and analyzed an archive of texts identified from multiple sources, including formal databases such as PubMed, ERIC, and PsycINFO, as well as the authors’ personal assessment libraries. An iterative analytic process was used to identify, discuss, and characterize emerging discourses about validity. Three discourses of validity were identified. Validity as a test characteristic is underpinned by the notion that validity is an intrinsic property of a tool and can, therefore, be seen as content and context independent. Validity as an argument-based evidentiary chain emphasizes the importance of supporting the interpretation of assessment results with ongoing analysis, such that validity does not belong to the tool or instrument itself; the emphasis is on process-based validation (the journey rather than the goal). Validity as a social imperative foregrounds the consequences of assessment at the individual and societal levels, be they positive or negative. The existence of different discourses may explain, in part, results observed in recent systematic reviews that highlighted discrepancies and tensions between recommendations for practice and the validation practices that are actually adopted and reported. Some of these practices, despite contravening accepted validation ‘guidelines’, may nevertheless respond to different and somewhat unarticulated needs within health professions education.



Acknowledgments

The authors would like to thank Catherine Côté for her help with data management in NVivo, and Tim Dubé, Ph.D., and the anonymous reviewers for their feedback on a previous version of this manuscript. Funding for this project was provided by the Société des médecins de l’Université de Sherbrooke Research Chair in Medical Education, held by Christina St-Onge.

Corresponding author

Correspondence to Christina St-Onge.


Cite this article

St-Onge, C., Young, M., Eva, K.W. et al. Validity: one word with a plurality of meanings. Adv in Health Sci Educ 22, 853–867 (2017). https://doi.org/10.1007/s10459-016-9716-3
