What is the validity evidence for assessments of clinical teaching?

  • Clinical Review
Journal of General Internal Medicine

Abstract

BACKGROUND: Although a variety of validity evidence should be utilized when evaluating assessment tools, a review of teaching assessments suggested that authors pursue a limited range of validity evidence.

OBJECTIVES: To develop a method for rating validity evidence and to quantify the evidence supporting scores from existing clinical teaching assessment instruments.

DESIGN: A comprehensive search yielded 22 articles on clinical teaching assessments. Using standards outlined by the American Psychological Association and the American Educational Research Association, we developed a method for rating the 5 categories of validity evidence reported in each article. We then quantified the validity evidence by summing the ratings for each category. We also calculated weighted κ coefficients to determine interrater reliability for each category of validity evidence.

MAIN RESULTS: Content and Internal Structure evidence received the highest ratings (27 and 32, respectively, of 44 possible). Relation to Other Variables, Consequences, and Response Process received the lowest ratings (9, 2, and 2, respectively). Interrater reliability was good for Content, Internal Structure, and Relation to Other Variables (κ range 0.52 to 0.96, all P values <.01), but poor for Consequences and Response Process.

CONCLUSIONS: Content and Internal Structure evidence is well represented among published assessments of clinical teaching. Evidence for Relation to Other Variables, Consequences, and Response Process receives little attention, and future research should emphasize these categories. The low interrater reliability for Response Process and Consequences likely reflects the scarcity of reported evidence. With further development, our method for rating the validity evidence should prove useful in various settings.
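The scoring and agreement steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes each article receives a rating of 0, 1, or 2 per category (inferred from the reported maximum of 44 across 22 articles) and that κ uses quadratic disagreement weights; the abstract states neither detail. The function names and the example ratings are hypothetical.

```python
from collections import Counter


def category_total(ratings):
    """Sum one category's per-article ratings (each assumed to be 0, 1, or 2)."""
    return sum(ratings)


def weighted_kappa(rater_a, rater_b, levels=(0, 1, 2)):
    """Cohen's weighted kappa for two raters.

    Quadratic disagreement weights are an assumption of this sketch;
    the abstract does not say which weighting scheme was used.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of articles")
    n = len(rater_a)
    k = len(levels)
    index = {level: i for i, level in enumerate(levels)}

    # Joint and marginal counts of the two raters' ratings.
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * (marg_a[i] / n) * (marg_b[j] / n)

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single level")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical ratings for a sparsely reported category across 22 articles:
# each rater finds evidence in exactly one article, but not the same one.
sparse_a = [0] * 20 + [1, 0]
sparse_b = [0] * 20 + [0, 1]
sparse_kappa = weighted_kappa(sparse_a, sparse_b)
```

In the hypothetical sparse example the two raters agree on 20 of 22 articles, yet `sparse_kappa` comes out at about -0.05, because almost all of that agreement is expected by chance when nearly every rating is 0. This is consistent with the abstract's suggestion that low κ for Response Process and Consequences reflects how rarely such evidence is reported.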

Author information

Correspondence to Thomas J. Beckman MD.

Additional information

None of the authors have any conflicts of interest to declare for this paper.

About this article

Cite this article

Beckman, T.J., Cook, D.A. & Mandrekar, J.N. What is the validity evidence for assessments of clinical teaching? J Gen Intern Med 20, 1159–1164 (2005). https://doi.org/10.1111/j.1525-1497.2005.0258.x

  • DOI: https://doi.org/10.1111/j.1525-1497.2005.0258.x
