
World Neurosurgery

Volume 127, July 2019, Pages e230-e235

Original Article
A Comparison of Visual Rating Scales and Simulated Virtual Reality Metrics in Neurosurgical Training: A Generalizability Theory Study

https://doi.org/10.1016/j.wneu.2019.03.059

Background

Adequate assessment and feedback remain a cornerstone of psychomotor skills acquisition, particularly within neurosurgery, where the consequences of adverse operative events are significant. However, a critical appraisal of the reliability of visual rating scales in neurosurgery is lacking. We therefore designed a study to compare visual rating scales with simulated metrics in a neurosurgical virtual reality task.

Methods

Neurosurgical faculty rated anonymized participant video recordings of the removal of simulated brain tumors using a visual rating scale made up of seven composite elements. Scale reliability was evaluated using generalizability theory, and scale subcomponents were compared with simulated metrics using Pearson correlation analysis.
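The core of a generalizability analysis for a fully crossed persons × raters design can be sketched as follows. This is an illustrative reconstruction, not the study's actual code: the data are simulated, and the variance components come from the standard expected-mean-squares decomposition with one observation per cell.

```python
import numpy as np

def g_study(scores):
    """Variance components for a fully crossed persons x raters design
    (one observation per cell), estimated via expected mean squares."""
    n_p, n_r = scores.shape
    grand = scores.mean()
    p_means = scores.mean(axis=1)
    r_means = scores.mean(axis=0)
    ss_p = n_r * ((p_means - grand) ** 2).sum()
    ss_r = n_p * ((r_means - grand) ** 2).sum()
    ss_pr = ((scores - grand) ** 2).sum() - ss_p - ss_r
    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))
    var_pr = ms_pr                              # residual (p x r interaction + error)
    var_p = max((ms_p - ms_pr) / n_r, 0.0)      # person (true-score) variance
    var_r = max((ms_r - ms_pr) / n_p, 0.0)      # rater (leniency/severity) variance
    return var_p, var_r, var_pr

def g_coefficient(var_p, var_pr, n_r):
    """Relative generalizability coefficient for a design with n_r raters."""
    return var_p / (var_p + var_pr / n_r)

# Hypothetical data: 16 participants x 4 raters, Likert-style scores
rng = np.random.default_rng(0)
true_skill = rng.normal(0, 1.0, size=(16, 1))
rater_bias = rng.normal(0, 0.4, size=(1, 4))
scores = 4 + true_skill + rater_bias + rng.normal(0, 0.7, size=(16, 4))

vp, vr, vpr = g_study(scores)
print(g_coefficient(vp, vpr, 4))  # reliability with all four raters
print(g_coefficient(vp, vpr, 2))  # D-study projection with two raters
```

The second call mirrors the decision-study logic behind the paper's finding that an acceptable reliability can be retained with only two raters: averaging over fewer raters shrinks the error term var_pr / n_r, so the coefficient degrades gradually rather than collapsing.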

Results

Four staff neurosurgeons evaluated 16 medical student neurosurgery applicants. Overall scale reliability and internal consistency were 0.73 and 0.90, respectively. A reliability of 0.71 was achieved with two raters. Individual participants, raters, and scale items accounted for 27%, 11%, and 0.6% of the data variability, respectively. The hemostasis scale component correlated with the greatest number of simulated metrics, whereas respect for no-go zones and respect for tissue correlated with none. Metrics relating to instrument force and patient safety (brain volume removed and blood loss) were captured by the fewest rating scale components.
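The item-by-metric comparison reported above rests on Pearson correlation between each rating-scale component and each simulator metric. A minimal illustration follows; the metric names, the 5-point hemostasis score, and all numbers are fabricated assumptions for demonstration, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16  # participants

# Hypothetical simulator exports (names are illustrative assumptions)
blood_loss = rng.normal(50, 15, n)                                   # mL
instrument_force = 0.02 * blood_loss + rng.normal(1.0, 0.2, n)       # N, correlated by construction
tumor_removed_pct = rng.normal(85, 8, n)                             # % of tumor volume

# Hypothetical rating-scale component: a 5-point hemostasis score,
# built to track blood loss inversely
hemostasis = np.clip(np.round(5 - 0.05 * blood_loss + rng.normal(0, 0.5, n)), 1, 5)

metrics = {"blood_loss": blood_loss,
           "instrument_force": instrument_force,
           "tumor_removed_pct": tumor_removed_pct}

for name, m in metrics.items():
    r = np.corrcoef(hemostasis, m)[0, 1]  # Pearson r between scale item and metric
    print(f"hemostasis vs {name}: r = {r:+.2f}")
```

In the study's actual analysis, each of the seven scale components would occupy the role of `hemostasis` here, and a component such as respect for no-go zones "correlating with none" corresponds to all of its Pearson coefficients failing to reach significance.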

Conclusions

To our knowledge, this is the first study comparing participants' ratings with simulated performance. Given that rating scales capture instrument force, the quantity of brain volume removed, and blood loss less well, we suggest adopting a hybrid educational approach: visual rating scales in the operative environment, supplemented by simulated sessions to uncover potentially problematic surgical technique.

Introduction

As residency programs continue to evolve toward a competency-based curriculum, there is an increasing need for assessment of resident technical skills. Adequate assessment and feedback remain a cornerstone of psychomotor skills acquisition, particularly within neurosurgery where the consequence of adverse operative events is great.1 Visual rating scales remain convenient tools for generating organized formative assessments. Different rating scales for surgery have been developed, including the Objective Structured Assessment of Technical Skills (OSATS), which has been used previously in a neurosurgical context.2, 3, 4 A theoretical limitation of visual rating scales is the risk of rater subjectivity in skills assessment. Furthermore, little information exists on the ability of rating scales to capture subtler aspects of performance, including instrument force applied during a procedure. This last point is particularly important because consistent evidence from the neurosurgical simulation literature suggests that applied force differentiates levels of expertise.5, 6, 7, 8, 9, 10 In addition, a recent study found that excess force applied during live neurosurgical operations is associated with increased intraoperative bleeding.11

The objective of the project was to conduct a generalizability study to better understand the use of a visual rating scale of operative performance in neurosurgery and to compare it with computerized metrics generated during a virtual reality neurosurgical operative procedure. We hypothesize that both methods will measure the same underlying construct, namely, surgical performance.


Subjects

Medical student applicants to a single Canadian neurosurgery program in 2015 were recruited to participate in a trial involving a simulated brain tumor resection task.8 Sixteen of the 17 applicants participated, comprising over 70% of the national neurosurgical applicant pool for that study year.11 Data were collected at a single time point within the Neurosurgical Simulation and Artificial Intelligence Learning Centre, in a controlled laboratory environment devoid of distracting noise. No

Results

Four staff neurosurgeons evaluated 16 medical students, for a total of 64 observations. Table 1 includes a descriptive analysis, demonstrating use of the full range of the Likert scale. Demographic information is available in a previous publication8 and can be summarized as follows: 7 of the 16 participants (44%) had previously used a simulator, the mean number of neurosurgery elective weeks was 11.2 ± 4.6 (range, 4–22), and the mean number of surgical skin closures was 10.9 ± 6.3 (range, 1–25).

Five

Discussion

Based on studies of technical performance in neurosurgery, we have recently introduced a conceptual framework to understand surgical expertise in neurosurgery.14 Although it is clear that many nontechnical factors, such as clinical decision-making, contribute to expertise, having a framework allows one to better structure research questions relating to the interaction of cognitive and motor domains and how these contribute to operative outcomes, particularly at a challenging juncture in the

Conclusions

The visual rating scale can be administered reliably by as few as two raters and appears to reflect operative performance as measured on the simulator. However, the force exerted during the neurosurgical procedure, the quantity of brain volume removed, and blood loss were less well captured by the visual rating scale. To our knowledge, this is the first study able to concurrently compare participants' ratings with their computationally measured performance and operative complications. We

Acknowledgments

The authors thank Dr. Valérie Dory and Dr. Meredith Young for their input. The authors also thank all the medical students and raters who participated in this study and Robert DiRaddo, Group Leader, Simulation, Life Sciences Division, National Research Council of Canada at Boucherville, and his team, including Denis Laroche, Valérie Pazos, Nusrat Choudhury, and Linda Pecora, for their support in the development of the scenarios used in these studies. The authors also thank all the members of

References (19)



Conflict of interest statement: This work was supported by the Di Giovanni Foundation, Montreal English School Board, Colannini Foundation, and Montreal Neurological Institute and Hospital. A. Winkler-Schwartz is supported by a doctoral training grant for applicants with a professional degree issued by the Fonds de recherche du Québec – Santé. R. Del Maestro is the William Feindel Emeritus Professor in Neuro-Oncology at McGill University.
