02-01-2018 | Original article | Issue 2/2018 | Open Access

Perspectives on Medical Education 2/2018

Detecting rater bias using a person-fit statistic: a Monte Carlo simulation study

Journal:
Perspectives on Medical Education > Issue 2/2018
Authors:
André-Sébastien Aubin, Christina St-Onge, Jean-Sébastien Renaud
Important notes
Editor’s Note: Commentary by: A. Harris, https://doi.org/10.1007/s40037-017-0396-3

Abstract

Introduction

With the Standards voicing concern for the appropriateness of response processes, we need strategies to identify inappropriate rater response processes. Although certain statistics can help detect rater bias, their use is complicated either by a lack of data about their actual power to detect rater bias or by the difficulty of applying them in health professions education. This exploratory study aimed to establish whether the l_z statistic is worth pursuing as a means of detecting rater bias.

Methods

We conducted a Monte Carlo simulation study to investigate the power of a specific detection statistic: the standardized likelihood l_z person-fit statistic (PFS). Our primary outcome was the detection rate of biased raters, that is, raters whom we manipulated into being either stringent (giving lower scores) or lenient (giving higher scores). We computed this rate using the l_z statistic while controlling for the number of biased raters in a sample (6 levels) and the rate of bias per rater (6 levels).
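To make the procedure concrete, the following is a minimal sketch of how l_z can be computed for one rater's dichotomous checklist and how a detection rate can be estimated by simulation. It is not the authors' simulation design: the function names, the number of checklist items, the uniform range of item probabilities, the one-sided cutoff of -1.645, and the assumption that item probabilities are known rather than estimated are all illustrative choices made here.

```python
import numpy as np


def lz_statistic(responses, probs):
    """Standardized log-likelihood (l_z) of one dichotomous response vector.

    responses: 0/1 checklist scores given by one rater.
    probs: model-implied probability of a score of 1 on each item
           (assumed known here, not estimated).
    """
    u = np.asarray(responses, dtype=float)
    # Clip to avoid log(0) for extreme probabilities.
    p = np.clip(np.asarray(probs, dtype=float), 1e-9, 1 - 1e-9)
    log_lik = np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))
    expected = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    variance = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (log_lik - expected) / np.sqrt(variance)


def simulate_detection_rate(kind, bias_rate, n_items=40, n_reps=2000,
                            cutoff=-1.645, seed=0):
    """Share of simulated biased raters flagged by l_z < cutoff.

    kind: "stringent" forces manipulated items to 0, "lenient" forces them to 1.
    bias_rate: proportion of checklist items manipulated per rater.
    All default values are hypothetical, chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    n_biased = int(round(bias_rate * n_items))
    forced_score = 0 if kind == "stringent" else 1
    flagged = 0
    for _ in range(n_reps):
        # Hypothetical item probabilities: most checklist items are easy.
        probs = rng.uniform(0.6, 0.95, size=n_items)
        # Unbiased scores drawn from the model.
        scores = (rng.random(n_items) < probs).astype(int)
        # Overwrite a random subset of items to mimic rater bias.
        idx = rng.choice(n_items, size=n_biased, replace=False)
        scores[idx] = forced_score
        if lz_statistic(scores, probs) < cutoff:
            flagged += 1
    return flagged / n_reps


if __name__ == "__main__":
    for kind in ("stringent", "lenient"):
        rate = simulate_detection_rate(kind, bias_rate=0.6)
        print(kind, rate)
```

Under these assumed settings, large negative l_z values indicate scores that are less likely than the model expects, so a one-sided lower cutoff is used to flag raters. Because the hypothetical items are mostly easy, forcing scores to 0 produces far more surprising patterns than forcing them to 1.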

Results

Overall, stringent raters (M = 0.84, SD = 0.23) were easier to detect than lenient raters (M = 0.31, SD = 0.28). More biased raters were easier to detect than less biased raters (60% bias: M = 0.62, SD = 0.37; 10% bias: M = 0.43, SD = 0.36).

Discussion

The l_z PFS shows potential for identifying biased raters. We observed detection rates as high as 90% for stringent raters for whom we manipulated more than half of their checklist. Although these results are promising, they cannot be generalized to the use of PFS with estimated item/station parameters or with real data. Studies under those conditions are needed to assess the feasibility of using PFS to identify rater bias.