Journal of Surgical Education

Volume 75, Issue 2, March–April 2018, Pages 370-376

Original reports
Nonspecialist Raters Can Provide Reliable Assessments of Procedural Skills

https://doi.org/10.1016/j.jsurg.2017.07.003

Background

Competency-based learning has become a crucial component of medical education. Despite the advantages of competency-based learning, there are still challenges that need to be addressed. Currently, the common perception is that specialist assessment is needed to evaluate procedural skills, which is difficult owing to the limited availability of faculty time. The aim of this study was to explore the validity of assessments of video recorded procedures performed by nonspecialist raters.

Methods

This study was a blinded observational trial. Twenty-three novices (senior medical students) and 9 experienced doctors were video recorded while each performed 2 flexible cystoscopies on patients. The recordings were anonymized, placed in random order, and then rated by 2 experienced cystoscopists (specialist raters) and 2 medical students (nonspecialist raters). Flexible cystoscopy was chosen as it is a simple procedural skill that is crucial to master in a resident urology program.

Results

The internal consistency of assessments was high, Cronbach’s α = 0.93 and 0.95 for nonspecialist and specialist raters, respectively (p < 0.001 for both coefficients). The interrater reliability was significant (p < 0.001) with a Pearson’s correlation of 0.77 for the nonspecialists and 0.75 for the specialists. The test-retest reliability showed the biggest difference between the 2 groups, 0.59 and 0.38 for the nonspecialist raters and the specialist raters, respectively (p < 0.001).

Conclusion

Our study suggests that nonspecialist raters can provide reliable and valid assessments of video recorded cystoscopies. This could make mastery learning and competency-based education more feasible.

Introduction

Medical education is changing rapidly, and the way doctors train procedural skills is shifting from traditional apprenticeship and time-based learning to competency-based attainment of skills.1, 2 Competency-based learning is being favored as it implies that trainees will pass when they are competent, not after a certain prescribed time or after a certain number of procedures have been performed, which does not necessarily reflect competence.3, 4, 5, 6 Competency-based learning requires specialist assessment of procedural skills. Despite the advantages of competency-based learning, there are still challenges that need to be addressed, e.g., the limited availability of faculty time.7, 8, 9, 10 Some studies also suggest that knowing the identity of the trainee can influence assessment.11, 12 Technology holds some promise, as video recordings of performances create more flexibility and reduce the risk of bias.13, 14, 15

Studies show that rater training is beneficial and even suggest that a 1-hour frame-of-reference training session can train raters sufficiently to use a simple evaluation instrument for the assessment of procedural skills.16, 17 Studies have shown that medical students can be used in teaching settings instead of professors,18, 19, 20 and this could be translated to competency-based assessment, where nonspecialist raters could be used to further reduce the time specialists spend on assessment. The common perception is currently that specialists need to assess procedural skills, but previous studies have shown that even nonmedically trained individuals can be used to assess surgical skills.21, 22

Using nonspecialist raters would not only decrease the workload of specialists and minimize interpersonal bias but also provide a more economical solution in areas where competency-based assessment is needed. The use of nonspecialist raters needs to be proven reliable and valid before it can be implemented as part of competency-based learning assessments.

The aim of this study was to explore the validity of assessments of video recorded procedures performed by nonspecialist raters.


Design

This study was a blinded observational trial. Novices (senior medical students) and experienced doctors were video recorded while performing 2 flexible cystoscopies each. The recordings were anonymized and placed in random order and then rated by 2 experienced cystoscopists (specialist raters) and 2 medical students (nonspecialist raters). Flexible cystoscopy was chosen as it is a simple procedural skill that is crucial to master in a resident urology program.23
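The anonymization and randomization step can be illustrated with a short script. The following is a minimal sketch, assuming the recordings are stored as individual video files; the file-naming scheme, the fixed seed, and the anonymous-code format are illustrative assumptions, not details taken from the study:

    import random
    import secrets

    # 32 participants (23 novices + 9 experienced doctors), 2 procedures each
    # = 64 videos. The file-naming scheme is hypothetical.
    recordings = [f"cystoscopy_{participant:02d}_{procedure}.mp4"
                  for participant in range(1, 33)
                  for procedure in (1, 2)]

    rng = random.Random(42)  # fixed seed only to make the example reproducible
    rng.shuffle(recordings)  # one random presentation order for all raters

    # Map each video to an anonymous code; only the study coordinator keeps the key.
    blinding_key = {f"video_{secrets.token_hex(4)}": original
                    for original in recordings}

    for anon_id in blinding_key:
        print(anon_id)  # raters see only anonymous IDs, in randomized order

Since dicts preserve insertion order in Python 3.7+, iterating over the blinding key reproduces the shuffled presentation order while keeping the identities hidden from the raters.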

Participants

The novices participating in this study were 23 senior medical students, and the experienced group comprised 9 doctors; each participant performed 2 flexible cystoscopies on patients.

Results

Twenty-three novices and 9 experienced doctors participated in this study, giving a total of 64 video recordings, all rated by a pair of specialist raters and a pair of nonspecialist raters (256 ratings in total). The internal consistency of assessments was high, Cronbach’s α = 0.93 and 0.95 for nonspecialist and specialist raters, respectively (p < 0.001 for both coefficients). The interrater reliability was significant (p < 0.001) with a Pearson’s correlation of 0.77 for the nonspecialists and 0.75 for the specialists. The test-retest reliability showed the biggest difference between the 2 groups, 0.59 and 0.38 for the nonspecialist raters and the specialist raters, respectively (p < 0.001).
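The reported reliability measures can all be computed with standard formulas. Below is a minimal sketch using NumPy and SciPy on synthetic scores; the assumed 7-item rating instrument, the score distributions, and the variable names are illustrative assumptions, not the study’s actual data or analysis code:

    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(0)

    def cronbach_alpha(items):
        # items: (n_performances, n_items) matrix of one rater pair's item scores
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_vars / total_var)

    # Internal consistency: 64 recordings scored on a hypothetical 7-item instrument.
    scores = rng.normal(3, 1, size=(64, 7))
    print(f"Cronbach's alpha: {cronbach_alpha(scores):.2f}")

    # Interrater reliability: correlate the two raters' total scores per video.
    rater_a = rng.normal(20, 4, 64)
    rater_b = rater_a + rng.normal(0, 2, 64)  # second rater agrees up to noise
    r, p = pearsonr(rater_a, rater_b)
    print(f"Interrater Pearson r = {r:.2f} (p = {p:.4f})")

    # Test-retest reliability: correlate each participant's first and second procedure.
    proc1 = rng.normal(20, 4, 32)
    proc2 = proc1 + rng.normal(0, 3, 32)
    print(f"Test-retest r = {pearsonr(proc1, proc2)[0]:.2f}")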

Discussion

Our study showed that nonspecialist raters can provide reliable and valid assessments of video recorded cystoscopies when comparing their internal consistency, interrater reliability, and test-retest reliability to those of specialist raters. The required strength of a reliability coefficient always depends on the purpose and the consequences of the assessment.25 Overall, the interrater reliability was reasonable, approaching the 0.80 level needed for high-stakes assessment.
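One way to gauge how close the observed interrater reliability is to that threshold is the Spearman-Brown prophecy formula, which predicts the reliability of a score averaged over several parallel raters. The following illustration is our addition, not an analysis from the paper:

    def spearman_brown(single_rater_r, n_raters):
        # Predicted reliability of the mean of n_raters parallel ratings.
        return n_raters * single_rater_r / (1 + (n_raters - 1) * single_rater_r)

    # Averaging the 2 nonspecialist raters (pairwise r = 0.77) would be
    # expected to yield a composite reliability of about 0.87.
    print(f"{spearman_brown(0.77, 2):.2f}")

Under that standard assumption, pooling even a small panel of nonspecialist raters could clear the 0.80 level cited for high-stakes assessment.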

Conclusion

Our study suggests that nonspecialist raters can provide reliable assessments of video recorded cystoscopies.

References (46)

  • D.C. Leach. Building and assessing competence: the potential for evidence-based graduate medical education. Qual Manag Health Care (2002)
  • M.G. Tolsgaard et al. The assessment of clinical skills is imperative in postgraduate specialty training. Ugeskr Laeger (2014)
  • R.E. Hawkins et al. Implementation of competency-based medical education: are we addressing the concerns and challenges? Med Educ (2015)
  • K.E. Hauer et al. Reviewing residents' competence: a qualitative study of the role of clinical competency committees in performance assessment. Acad Med (2015)
  • J.R. Kogan et al. Reconceptualizing variable rater assessments as both an educational and clinical care problem. Acad Med (2014)
  • E.S. Holmboe et al. The role of assessment in competency-based medical education. Med Teach (2010)
  • L. Konge et al. Reliable and valid assessment of competence in endoscopic ultrasonography and fine-needle aspiration for mediastinal staging of non-small cell lung cancer. Endoscopy (2012)
  • Y. Subhi et al. An integrable, web-based solution for easy assessment of video-recorded performances. Adv Med Educ Pract (2014)
  • D. Dath et al. Toward reliable operative assessment: the reliability and feasibility of videotaped assessment of laparoscopic technical skills. Surg Endosc (2004)
  • J.R. Kogan et al. How faculty members experience workplace-based assessment rater training: a qualitative study. Med Educ (2015)
  • M.G. Tolsgaard et al. Student teachers can be as good as associate professors in teaching clinical skills. Med Teach (2007)
  • S.A. Josephson et al. A new first-year course designed and taught by a senior medical student. Acad Med (2002)
  • S. Kassab et al. Student-led tutorials in problem-based learning: educational outcomes and students' perceptions. Med Teach (2005)