Abstract
An Objective Structured Clinical Examination (OSCE) is an effective method for evaluating competencies. However, scores obtained from an OSCE are vulnerable to many potential measurement errors that cases, items, or standardized patients (SPs) can introduce. Monitoring these sources of error is an important quality control mechanism for ensuring valid interpretations of the scores. We describe how one can use generalizability theory (GT) and many-faceted Rasch measurement (MFRM) approaches in quality control monitoring of an OSCE. We examined the communication skills OSCE of 79 residents from one Midwestern university in the United States. Each resident performed six communication tasks with SPs, who rated each resident's performance on 18 five-category rating-scale items. We analyzed the ratings with generalizability and MFRM studies. The generalizability study revealed that the largest source of error variance, apart from the residual error variance, was SPs/cases. The MFRM study identified specific SPs/cases and items that introduced measurement errors and suggested the nature of those errors. SPs/cases differed significantly in their levels of severity/difficulty. Two SPs gave inconsistent ratings, which suggested problems related to the ways they portrayed the case, their understanding of the rating scale, and/or the case content. SPs interpreted two of the items inconsistently, and the rating scales for two items did not function as five-category scales. We concluded that generalizability and MFRM analyses provide useful complementary information for monitoring and improving the quality of an OSCE.
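The generalizability study rests on decomposing observed rating variance into components attributable to residents (the intended signal), SPs/cases, and residual error. The following is a minimal, illustrative sketch of that decomposition for a fully crossed residents × SPs/cases design, using synthetic ratings: the dimensions (79 residents, 6 stations) follow the study design, but the simulated effect sizes and the exact variance-component formulas shown here are illustrative assumptions, not the authors' analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_p, n_c = 79, 6  # residents x SP/case stations, as in the study design

# Synthetic ratings: resident ability + SP/case severity + residual noise
# (effect sizes below are arbitrary, chosen only for illustration)
person = rng.normal(0.0, 0.5, size=(n_p, 1))
case = rng.normal(0.0, 0.4, size=(1, n_c))
X = 3.5 + person + case + rng.normal(0.0, 0.6, size=(n_p, n_c))

grand = X.mean()
p_means = X.mean(axis=1, keepdims=True)   # each resident's mean rating
c_means = X.mean(axis=0, keepdims=True)   # each SP/case's mean rating

# Two-way ANOVA sums of squares (one observation per cell)
ss_p = n_c * ((p_means - grand) ** 2).sum()
ss_c = n_p * ((c_means - grand) ** 2).sum()
ss_res = ((X - p_means - c_means + grand) ** 2).sum()

ms_p = ss_p / (n_p - 1)
ms_c = ss_c / (n_c - 1)
ms_res = ss_res / ((n_p - 1) * (n_c - 1))

# Variance-component estimates from expected mean squares
var_res = ms_res
var_p = max((ms_p - ms_res) / n_c, 0.0)   # residents (object of measurement)
var_c = max((ms_c - ms_res) / n_p, 0.0)   # SPs/cases (error source)

# Relative generalizability coefficient for a 6-station average
g_coef = var_p / (var_p + var_res / n_c)
print(f"var_residents={var_p:.3f}  var_cases={var_c:.3f}  "
      f"var_residual={var_res:.3f}  G={g_coef:.3f}")
```

A large `var_c` relative to `var_p`, as the study found for SPs/cases, signals that station severity/difficulty contributes substantial error and is a target for quality control.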
Appendix
Items Used in Communication Skills Performance Rating
1. I felt you greeted me warmly upon entering the room.
2. I felt you were friendly throughout the encounter. You were never crabby or rude to me.
3. I felt that you treated me like we were on the same level. You never “talked down” to me or treated me like a child.
4. I felt you let me tell my story and were careful to not interrupt me while I was speaking.
5. I felt you were telling me everything; being truthful, up front and frank; not keeping things from me.
6. I felt you showed interest in me as a “person.” You never acted bored or ignored what I had to say.
7. I felt that you discussed options with me.
8. I felt you made sure that I understood those options.
9. I felt you asked my opinion, allowing me to make my own decision.
10. I felt you encouraged me to ask questions.
11. I felt you displayed patience when I asked questions.
12. I felt you answered my questions, never avoiding them.
13. I felt you clearly explained what I needed to know about my problem; how and why it occurred.
14. I felt you clearly explained what I should expect next.
15. I felt you were careful to use plain language and not medical jargon when speaking to me.
16. I felt you approached sensitive/difficult subject matters, such as religion, sexual history, tobacco/drug/alcohol history, sexual orientation, giving bad news, etc., with sensitivity and without being judgmental.
17. I felt the resident displayed a positive attitude during the verbal feedback session.
18. If given the choice in the future, I would choose this resident as my personal physician.
Cite this article
Iramaneerat, C., Yudkowsky, R., Myford, C.M. et al. Quality control of an OSCE using generalizability theory and many-faceted Rasch measurement. Adv in Health Sci Educ 13, 479–493 (2008). https://doi.org/10.1007/s10459-007-9060-8