Abstract
An Objective Structured Clinical Examination (OSCE) is an effective method for evaluating competencies. However, scores obtained from an OSCE are vulnerable to many potential measurement errors that cases, items, or standardized patients (SPs) can introduce. Monitoring these sources of error is an important quality control mechanism for ensuring valid interpretations of the scores. We describe how one can use generalizability theory (GT) and many-faceted Rasch measurement (MFRM) approaches in quality control monitoring of an OSCE. We examined the communication skills OSCE of 79 residents from one Midwestern university in the United States. Each resident performed six communication tasks with SPs, who rated each resident's performance on 18 five-category rating-scale items. We analyzed the ratings with generalizability and MFRM studies. The generalizability study revealed that the largest source of error variance, apart from the residual error variance, was SPs/cases. The MFRM study identified specific SPs/cases and items that introduced measurement errors and suggested the nature of those errors. SPs/cases differed significantly in their levels of severity/difficulty. Two SPs gave inconsistent ratings, which suggested problems related to the ways they portrayed the case, their understanding of the rating scale, and/or the case content. SPs interpreted two of the items inconsistently, and the rating scales for two items did not function as five-category scales. We concluded that generalizability and MFRM analyses provide useful complementary information for monitoring and improving the quality of an OSCE.
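The generalizability study rests on decomposing observed rating variance into components attributable to residents (the intended signal), SPs/cases, and residual error. The following is a minimal, illustrative sketch of that decomposition for a fully crossed residents × SPs/cases design, using synthetic ratings: the dimensions (79 residents, 6 stations) follow the study design, but the simulated effect sizes and the exact variance-component formulas shown here are illustrative assumptions, not the authors' analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_p, n_c = 79, 6  # residents x SP/case stations, as in the study design

# Synthetic ratings: resident ability + SP/case severity + residual noise
# (effect sizes below are arbitrary, chosen only for illustration)
person = rng.normal(0.0, 0.5, size=(n_p, 1))
case = rng.normal(0.0, 0.4, size=(1, n_c))
X = 3.5 + person + case + rng.normal(0.0, 0.6, size=(n_p, n_c))

grand = X.mean()
p_means = X.mean(axis=1, keepdims=True)   # each resident's mean rating
c_means = X.mean(axis=0, keepdims=True)   # each SP/case's mean rating

# Two-way ANOVA sums of squares (one observation per cell)
ss_p = n_c * ((p_means - grand) ** 2).sum()
ss_c = n_p * ((c_means - grand) ** 2).sum()
ss_res = ((X - p_means - c_means + grand) ** 2).sum()

ms_p = ss_p / (n_p - 1)
ms_c = ss_c / (n_c - 1)
ms_res = ss_res / ((n_p - 1) * (n_c - 1))

# Variance-component estimates from expected mean squares
var_res = ms_res
var_p = max((ms_p - ms_res) / n_c, 0.0)   # residents (object of measurement)
var_c = max((ms_c - ms_res) / n_p, 0.0)   # SPs/cases (error source)

# Relative generalizability coefficient for a 6-station average
g_coef = var_p / (var_p + var_res / n_c)
print(f"var_residents={var_p:.3f}  var_cases={var_c:.3f}  "
      f"var_residual={var_res:.3f}  G={g_coef:.3f}")
```

A large `var_c` relative to `var_p`, as the study found for SPs/cases, signals that station severity/difficulty contributes substantial error and is a target for quality control.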
Appendix
Items Used in Communication Skills Performance Rating
1. I felt you greeted me warmly upon entering the room.
2. I felt you were friendly throughout the encounter. You were never crabby or rude to me.
3. I felt that you treated me like we were on the same level. You never “talked down” to me or treated me like a child.
4. I felt you let me tell my story and were careful to not interrupt me while I was speaking.
5. I felt you were telling me everything; being truthful, up front and frank; not keeping things from me.
6. I felt you showed interest in me as a “person.” You never acted bored or ignored what I had to say.
7. I felt that you discussed options with me.
8. I felt you made sure that I understood those options.
9. I felt you asked my opinion, allowing me to make my own decision.
10. I felt you encouraged me to ask questions.
11. I felt you displayed patience when I asked questions.
12. I felt you answered my questions, never avoiding them.
13. I felt you clearly explained what I needed to know about my problem; how and why it occurred.
14. I felt you clearly explained what I should expect next.
15. I felt you were careful to use plain language and not medical jargon when speaking to me.
16. I felt you approached sensitive/difficult subject matters, such as religion, sexual history, tobacco/drug/alcohol history, sexual orientation, giving bad news, etc., with sensitivity and without being judgmental.
17. I felt the resident displayed a positive attitude during the verbal feedback session.
18. If given the choice in the future, I would choose this resident as my personal physician.
Cite this article
Iramaneerat, C., Yudkowsky, R., Myford, C.M. et al. Quality control of an OSCE using generalizability theory and many-faceted Rasch measurement. Adv in Health Sci Educ 13, 479–493 (2008). https://doi.org/10.1007/s10459-007-9060-8