
Quality control of an OSCE using generalizability theory and many-faceted Rasch measurement

Original Paper
Advances in Health Sciences Education

Abstract

An Objective Structured Clinical Examination (OSCE) is an effective method for evaluating competencies. However, scores obtained from an OSCE are vulnerable to many potential measurement errors that cases, items, or standardized patients (SPs) can introduce. Monitoring these sources of errors is an important quality control mechanism to ensure valid interpretations of the scores. We describe how one can use generalizability theory (GT) and many-faceted Rasch measurement (MFRM) approaches in quality control monitoring of an OSCE. We examined the communication skills OSCE of 79 residents from one Midwestern university in the United States. Each resident performed six communication tasks with SPs, who rated the performance of each resident using 18 5-category rating scale items. We analyzed their ratings with generalizability and MFRM studies. The generalizability study revealed that the largest source of error variance besides the residual error variance was SPs/cases. The MFRM study identified specific SPs/cases and items that introduced measurement errors and suggested the nature of the errors. SPs/cases were significantly different in their levels of severity/difficulty. Two SPs gave inconsistent ratings, which suggested problems related to the ways they portrayed the case, their understanding of the rating scale, and/or the case content. SPs interpreted two of the items inconsistently, and the rating scales for two items did not function as 5-category scales. We concluded that generalizability and MFRM analyses provided useful complementary information for monitoring and improving the quality of an OSCE.
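A worked sketch may help make the generalizability-study decomposition concrete. The following is a minimal illustration, not the authors' actual analysis: it estimates variance components for a simple fully crossed residents-by-cases (p × c) random-effects design from expected mean squares. The function name and toy ratings are hypothetical.

```python
import numpy as np

def g_study_p_x_c(scores):
    """Variance components for a fully crossed p x c random-effects design.

    scores: 2-D array, rows = persons (residents), columns = cases (SPs/cases).
    Returns (var_person, var_case, var_residual), estimated by the
    ANOVA (expected mean squares) method.
    """
    n_p, n_c = scores.shape
    grand = scores.mean()
    p_means = scores.mean(axis=1)
    c_means = scores.mean(axis=0)

    ss_p = n_c * ((p_means - grand) ** 2).sum()
    ss_c = n_p * ((c_means - grand) ** 2).sum()
    ss_pc = ((scores - grand) ** 2).sum() - ss_p - ss_c

    ms_p = ss_p / (n_p - 1)
    ms_c = ss_c / (n_c - 1)
    ms_pc = ss_pc / ((n_p - 1) * (n_c - 1))

    var_pc = ms_pc                           # residual: p x c interaction + error
    var_p = max((ms_p - ms_pc) / n_c, 0.0)   # negative estimates truncated to 0
    var_c = max((ms_c - ms_pc) / n_p, 0.0)
    return var_p, var_c, var_pc

# Hypothetical ratings: 4 residents x 3 SPs/cases
ratings = np.array([[4, 3, 5],
                    [2, 2, 3],
                    [5, 4, 5],
                    [3, 3, 4]], dtype=float)
var_p, var_c, var_pc = g_study_p_x_c(ratings)
# A large var_c relative to var_p would flag SPs/cases as a major
# error source, as the abstract reports for this OSCE.
```

A D-study then projects these components: with n' cases, the generalizability coefficient is var_p / (var_p + var_pc / n'), and the SP/case component additionally enters the absolute error term as var_c / n'.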



Author information


Correspondence to Cherdsak Iramaneerat.

Appendix


Items Used in Communication Skills Performance Rating

  1. I felt you greeted me warmly upon entering the room.

  2. I felt you were friendly throughout the encounter. You were never crabby or rude to me.

  3. I felt that you treated me like we were on the same level. You never “talked down” to me or treated me like a child.

  4. I felt you let me tell my story and were careful to not interrupt me while I was speaking.

  5. I felt you were telling me everything; being truthful, up front and frank; not keeping things from me.

  6. I felt you showed interest in me as a “person.” You never acted bored or ignored what I had to say.

  7. I felt that you discussed options with me.

  8. I felt you made sure that I understood those options.

  9. I felt you asked my opinion, allowing me to make my own decision.

  10. I felt you encouraged me to ask questions.

  11. I felt you displayed patience when I asked questions.

  12. I felt you answered my questions, never avoiding them.

  13. I felt you clearly explained what I needed to know about my problem; how and why it occurred.

  14. I felt you clearly explained what I should expect next.

  15. I felt you were careful to use plain language and not medical jargon when speaking to me.

  16. I felt you approached sensitive/difficult subject matters, such as religion, sexual history, tobacco/drug/alcohol history, sexual orientation, giving bad news, etc., with sensitivity and without being judgmental.

  17. I felt the resident displayed a positive attitude during the verbal feedback session.

  18. If given the choice in the future, I would choose this resident as my personal physician.
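Each item above is scored on a 5-category rating scale. Under the many-faceted Rasch rating-scale model used in the study, the log-odds of an SP awarding one category rather than the next lower one is modeled as resident ability minus SP/case severity, item difficulty, and a category threshold. The following minimal sketch computes the resulting category probabilities; the parameter values and names are illustrative, not estimates from the paper.

```python
import math

def mfrm_category_probs(ability, severity, difficulty, thresholds):
    """Category probabilities under a many-facet Rasch rating-scale model.

    Adjacent-category log-odds: log(P_k / P_{k-1}) =
        ability - severity - difficulty - thresholds[k-1].
    A 5-category scale has 4 thresholds.
    """
    logits = [0.0]  # cumulative logit for the lowest category
    for tau in thresholds:
        logits.append(logits[-1] + (ability - severity - difficulty - tau))
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative values: an average resident, a somewhat severe SP,
# a slightly easy item, and ordered thresholds for a 5-category scale.
probs = mfrm_category_probs(ability=0.5, severity=0.3,
                            difficulty=-0.2,
                            thresholds=[-1.5, -0.5, 0.5, 1.5])
# probs[k] is the probability of category k+1 (1..5); the five sum to 1.
```

Disordered threshold estimates or rarely used categories in such a model are the kind of evidence that led the authors to conclude that the rating scales for two items did not function as 5-category scales.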


Cite this article

Iramaneerat, C., Yudkowsky, R., Myford, C.M. et al. Quality control of an OSCE using generalizability theory and many-faceted Rasch measurement. Adv in Health Sci Educ 13, 479–493 (2008). https://doi.org/10.1007/s10459-007-9060-8
