Elsevier

Nurse Education Today

Volume 30, Issue 6, August 2010, Pages 539-543

A comparison of the psychometric properties of three- and four-option multiple-choice questions in nursing assessments

https://doi.org/10.1016/j.nedt.2009.11.002

Summary

In multiple-choice tests, four-option items are the standard in nursing education. There are few evidence-based reasons, however, for MCQs to have four or more options: studies have shown that three-option items perform equally well and that the additional options most often do not improve test reliability or validity. The aim of this study was to examine and compare the psychometric properties of four-option items with the same items rewritten as three-option items. Using item-analysis data to eliminate the distractor with the lowest response rate, we compared three- and four-option versions of 41 multiple-choice items administered to two student cohorts over two subsequent academic years. Removing the non-functioning distractor resulted in minimal changes in item difficulty and discrimination. Three-option items contained more functioning distractors despite having fewer distractors overall, and existing distractors became more discriminating when infrequently selected distractors were removed. Overall, three-option items performed as well as four-option items. Since three-option items require less time to develop and administer, and additional options provide no psychometric advantage, teachers are encouraged to adopt three-option items as the standard on multiple-choice tests.

Introduction

In nursing education, multiple-choice questions (MCQs) are one of the most popular written assessment formats. Single best-answer MCQs consist of a question (the stem), one correct or best response, and one or more incorrect options (the distractors) from which examinees must identify the correct answer. While MCQs are often criticized for largely assessing factual recall rather than higher-order thinking (Pamphlett and Farnill, 1995), they offer many advantages over other types of written assessment. Despite what many teachers believe, MCQs are adaptable to different, although not all, levels of learning outcomes (Gronlund and Waugh, 2008). High-quality MCQs present clinical vignettes that mimic actual clinical problems and assess application of knowledge rather than simple factual recall (Case and Swanson, 2003), and well-constructed MCQs can accurately discriminate between high- and low-ability students (Schuwirth and van der Vleuten, 2003). MCQs are objective, and they allow teachers to test a wider range of content and educational objectives than many other written assessment methods. They also allow teachers to assess large numbers of candidates efficiently, as they are easy to administer and score (McCoubrie, 2004). Furthermore, because of this broader sampling of content and because MCQ items can be subjected to post-test review using item-analysis procedures, MCQ tests have higher validity than methods such as short-answer or essay-style questions (Gronlund and Waugh, 2008).

Four-option MCQs remain the standard in nursing, both on in-house developed tests (Tarrant et al., 2006) and in the test banks and textbooks used in nursing education (Masters et al., 2001). In other health-science disciplines, such as medicine, five-option items are more common (Haladyna and Downing, 1993). Although measurement specialists have long shown that there are few evidence-based reasons for MCQs to have four or five options, many introductory books on item writing continue to recommend this practice and a majority of teachers continue to follow the recommendation (Owen and Froman, 1987). Three-option items, however, have several advantages over four- and five-option items: they require less time to construct and less testing time to complete. Alternatively, because three-option items are completed more quickly, teachers can increase the number of items administered on a test and thereby increase the amount of content tested (Haladyna and Downing, 1993). Furthermore, researchers have shown that in both teacher-generated (Tarrant et al., 2009) and professionally developed (Haladyna and Downing, 1993) four- and five-option MCQs, students rarely select more than two or three of the options.
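One trade-off of dropping an option, not quantified in the passage above, is a higher chance-level score: blind guessing succeeds on 1/3 of three-option items versus 1/4 of four-option items. A minimal sketch of that arithmetic (the function name and test length are illustrative, not from the study):

```python
def chance_score(n_items: int, n_options: int) -> float:
    """Expected number of correct answers if every item is guessed at random."""
    return n_items / n_options

# On a hypothetical 60-item test, blind guessing yields 15 correct answers
# with four options per item versus 20 with three options per item.
print(chance_score(60, 4))  # 15.0
print(chance_score(60, 3))  # 20.0
```

The increase is modest, which is one reason adding extra items (made possible by the shorter testing time) can more than offset it.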

In most nursing programmes, the amount of content that requires assessment can be overwhelming. A substantial proportion of a teacher’s time is spent on developing written assessments and since a substantial proportion of those assessments will likely contain MCQs, it is important that teachers are basing those practices on the best available research evidence. Additionally, student numbers are generally increasing to meet workplace shortages, while at the same time the number of available teaching faculty is often getting smaller (Broome, 2009). Therefore, because of their efficiency and ability to assess different learning outcomes, MCQs are likely to continue to remain an important component of written assessment in many nursing programmes for the foreseeable future. Thus if the time required to develop multiple-choice tests can be reduced without reducing the reliability and validity of the assessment, this is an important consideration for nursing faculty.

Section snippets

Background

Numerous research studies have compared three-, four-, and five-option MCQs, and most have found that three-option items perform as well as or better than four- or five-option items (Crehan et al., 1993; Schuwirth and van der Vleuten, 2004). Sidick et al. (1994), for example, rewrote 68 five-option items on public-sector employment tests by removing the two least functional distractors. Overall, there was little difference in the psychometric properties of the three- and five-option items.

Methods

Data for this study consisted of two tests administered to two cohorts of students in an undergraduate public health nursing course over two subsequent academic years. The first test consisted of 50 four-option items administered to 36 students at the end of the fall semester in 2006. The second test consisted of 70 three-option items administered to a subsequent cohort of 106 students at the same time the following year. Using item-analysis data from the four-option test administered in 2006, the distractor with the lowest response rate was eliminated from each item to create the three-option versions.
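The item-analysis quantities the study relies on — item difficulty (proportion correct), point-biserial discrimination, and the response rate of each distractor — can be sketched as follows. This is a minimal illustration with invented response data, not the authors' software or dataset:

```python
import math
from collections import Counter

def item_difficulty(item_correct):
    """Proportion of examinees answering the item correctly (the p-value)."""
    return sum(item_correct) / len(item_correct)

def point_biserial(item_correct, total_scores):
    """Correlation between a dichotomous item score and the total test score:
    r_pb = (mean total of correct responders - overall mean) / sd * sqrt(p/q)."""
    n = len(item_correct)
    mean_total = sum(total_scores) / n
    sd_total = math.sqrt(sum((t - mean_total) ** 2 for t in total_scores) / n)
    p = sum(item_correct) / n
    correct_mean = (sum(t for c, t in zip(item_correct, total_scores) if c)
                    / sum(item_correct))
    return (correct_mean - mean_total) / sd_total * math.sqrt(p / (1 - p))

def least_chosen_distractor(responses, key):
    """The incorrect option selected by the fewest examinees."""
    counts = Counter(r for r in responses if r != key)
    return min(counts, key=counts.get)

# Hypothetical data: one item, ten examinees, keyed option "A".
responses = ["A", "B", "A", "C", "B", "A", "A", "D", "A", "D"]
key = "A"
totals = [8, 3, 7, 4, 2, 9, 6, 5, 7, 3]   # invented total test scores

correct = [1 if r == key else 0 for r in responses]
print(item_difficulty(correct))                   # 0.5
print(round(point_biserial(correct, totals), 3))  # 0.891
print(least_chosen_distractor(responses, key))    # C
```

In the study's design, the option returned by the last step is the one removed to produce the three-option version of each item.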

Results

Table 1 shows the summary characteristics of the two tests and the subsets of 41 items. In total, 142 students were tested. On the original tests, overall mean test scores and the range of test scores were similar for both the 2006 and 2007 cohorts. The pass rate for the 2007 cohort was marginally lower than the 2006 cohort and the reliability was lower for both subsets of 41 items when compared with the whole test. However, this would be expected with fewer test items. The 41-item subset of
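The reliability drop for the 41-item subsets noted above is expected because internal-consistency reliability for dichotomously scored tests (typically Kuder-Richardson 20) falls as the number of items falls. A minimal KR-20 sketch with invented 0/1 score data (not the study's data):

```python
def kr20(score_matrix):
    """Kuder-Richardson 20 reliability for dichotomous (0/1) item scores.
    score_matrix: one row per examinee, one column per item."""
    n = len(score_matrix)          # examinees
    k = len(score_matrix[0])       # items
    totals = [sum(row) for row in score_matrix]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    # Sum of item variances p*(1-p) across items.
    pq = sum((s := sum(row[j] for row in score_matrix) / n) * (1 - s)
             for j in range(k))
    return k / (k - 1) * (1 - pq / var_t)

# Four hypothetical examinees, three items.
matrix = [[1, 1, 1],
          [1, 1, 0],
          [1, 0, 0],
          [0, 0, 0]]
print(kr20(matrix))  # 0.75
```

Holding item quality constant, a longer test yields a higher KR-20, which is why the whole 50- and 70-item tests were more reliable than their 41-item subsets.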

Discussion

To our knowledge, this is the first study in nursing and only the second in a health-science discipline (Cizek and O'Day, 1994) to specifically compare the item characteristics of three- and four-option MCQs. Although findings from this study are consistent with other research on this topic, generalizability may be limited by several factors. Since our study examined only two undergraduate nursing examinations, further research in other settings should be done to determine the applicability of our findings.

Conclusion

Results from this study of teacher-generated MCQs lend further support to the conclusion that, in most circumstances, three-option items are the more feasible and practical choice when compared with four-option items. Given the time constraints of most nursing faculty today, and the increasing focus on evidence-based education, teachers involved in developing MCQs for nursing assessments are encouraged to use three-option items. Three-option items perform equally as well as the longer four-option versions.

Acknowledgements

Financial support for this study was provided by the Leung Kau Kui/Run Run Shaw Research and Teaching Endowment Fund, the University of Hong Kong.

References (29)

  • Broome, M.E., 2009. The faculty shortage in nursing: global implications. Nursing Outlook.
  • Tarrant, M., et al., 2006. The frequency of item writing flaws in multiple-choice questions used in high stakes nursing assessments. Nurse Education Today.
  • Aamodt, M.G., et al., 1992. A meta-analytic investigation of the effect of various test item characteristics on test scores. Public Personnel Management.
  • Beuchert, A.K., et al., 1979. A Monte Carlo comparison of ten item discrimination indices. Journal of Educational Measurement.
  • Case, S.M., et al., 2003. Constructing Written Test Questions for the Basic and Clinical Sciences.
  • Cizek, G.J., et al., 1994. Further investigations of nonfunctioning options in multiple-choice test items. Educational and Psychological Measurement.
  • Crehan, K.D., et al., 1993. Use of an inclusive option and the optimal number of options for multiple-choice items. Educational and Psychological Measurement.
  • Downing, S.M., 2002. Threats to the validity of locally developed multiple-choice tests in medical education: construct-irrelevant variance and construct underrepresentation. Advances in Health Sciences Education.
  • Ebel, R.L., et al., 1991. Essentials of Educational Measurement.
  • Gronlund, N.E., et al., 2008. Assessment of Student Achievement.
  • Haladyna, T.M., 2004. Developing and Validating Multiple-Choice Test Items.
  • Haladyna, T.M., et al., 1993. How many options is enough for a multiple-choice test item? Educational and Psychological Measurement.
  • Haladyna, T.M., et al., 2002. A review of multiple-choice item-writing guidelines for classroom assessment. Applied Measurement in Education.
  • Masters, J.C., et al., 2001. Assessment of multiple-choice questions in selected test banks accompanying text books used in nursing education. Journal of Nursing Education.