Elsevier

Nurse Education Today

Volume 30, Issue 6, August 2010, Pages 539-543

A comparison of the psychometric properties of three- and four-option multiple-choice questions in nursing assessments

https://doi.org/10.1016/j.nedt.2009.11.002

Summary

In multiple-choice tests, four-option items are the standard in nursing education. There are few evidence-based reasons, however, for MCQs to have four or more options: studies have shown that three-option items perform equally well and that the additional options most often do not improve test reliability or validity. The aim of this study was to examine and compare the psychometric properties of four-option items with the same items rewritten as three-option items. Using item-analysis data to eliminate the distractor with the lowest response rate, we compared three- and four-option versions of 41 multiple-choice items administered to two student cohorts over two subsequent academic years. Removing the non-functioning distractor resulted in minimal changes in item difficulty and discrimination. Three-option items contained more functioning distractors despite having fewer distractors overall, and existing distractors became more discriminating when infrequently selected distractors were removed. Overall, three-option items performed as well as four-option items. Since three-option items require less time to develop and administer, and additional options provide no psychometric advantage, teachers are encouraged to adopt three-option items as the standard on multiple-choice tests.

Introduction

In nursing education, multiple-choice questions (MCQs) are one of the most popular written assessment formats. Single best-answer MCQs consist of a question (the stem), one correct or best response, and one or more incorrect options (the distractors) from which examinees must identify the correct answer. While MCQs are often criticized for largely assessing factual recall rather than higher-order thinking (Pamphlett and Farnill, 1995), they offer many advantages over other types of written assessment. Despite what many teachers believe, MCQs are adaptable to different, although not all, levels of learning outcomes (Gronlund and Waugh, 2008). High-quality MCQs present clinical vignettes that mimic actual clinical problems and assess application of knowledge rather than simple factual recall (Case and Swanson, 2003), and well-constructed MCQs can accurately discriminate between high- and low-ability students (Schuwirth and van der Vleuten, 2003). MCQs are objective, and they allow teachers to test a wider range of content and educational objectives than many other written assessment methods. They also allow teachers to assess large numbers of candidates efficiently, as they are easy to administer and score (McCoubrie, 2004). Furthermore, because of this broader sampling of content and because MCQ items can be subjected to post-test review using item-analysis procedures, MCQ tests have higher validity than methods such as short-answer or essay-style questions (Gronlund and Waugh, 2008).

Four-option MCQs remain the standard in nursing, both on in-house developed tests (Tarrant et al., 2006) and in the test banks and textbooks used in nursing education (Masters et al., 2001). In other health-science disciplines, such as medicine, five-option items are more common (Haladyna and Downing, 1993). Although measurement specialists have long shown that there are few evidence-based reasons for MCQs to have four or five options, many introductory books on item writing continue to recommend this practice and a majority of teachers continue to follow the recommendation (Owen and Froman, 1987). Three-option items, however, have several advantages over four- and five-option items: they require less time to construct and less testing time to complete. Alternatively, because three-option items are completed more quickly, teachers can increase the number of items administered on a test and thereby increase the amount of content tested (Haladyna and Downing, 1993). Furthermore, researchers have shown that in both teacher-generated (Tarrant et al., 2009) and professionally developed (Haladyna and Downing, 1993) four- and five-option MCQs, students rarely select more than two or three of the options.
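One trade-off of dropping an option, not quantified in the passage above, is a higher chance-level score: blind guessing succeeds on 1/3 of three-option items versus 1/4 of four-option items. A minimal sketch of that arithmetic (the function name and test length are illustrative, not from the study):

```python
def chance_score(n_items: int, n_options: int) -> float:
    """Expected number of correct answers if every item is guessed at random."""
    return n_items / n_options

# On a hypothetical 60-item test, blind guessing yields 15 correct answers
# with four options per item versus 20 with three options per item.
print(chance_score(60, 4))  # 15.0
print(chance_score(60, 3))  # 20.0
```

The increase is modest, which is one reason adding extra items (made possible by the shorter testing time) can more than offset it.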

In most nursing programmes, the amount of content that requires assessment can be overwhelming. A substantial proportion of a teacher’s time is spent on developing written assessments and since a substantial proportion of those assessments will likely contain MCQs, it is important that teachers are basing those practices on the best available research evidence. Additionally, student numbers are generally increasing to meet workplace shortages, while at the same time the number of available teaching faculty is often getting smaller (Broome, 2009). Therefore, because of their efficiency and ability to assess different learning outcomes, MCQs are likely to continue to remain an important component of written assessment in many nursing programmes for the foreseeable future. Thus if the time required to develop multiple-choice tests can be reduced without reducing the reliability and validity of the assessment, this is an important consideration for nursing faculty.

Section snippets

Background

Numerous research studies have compared three-, four-, and five-option MCQs, and most have found that three-option items perform as well as or better than four- or five-option items (Crehan et al., 1993; Schuwirth and van der Vleuten, 2004). Sidick et al. (1994), for example, rewrote 68 five-option items on public-sector employment tests by removing the two least functional distractors. Overall, there was little difference in the psychometric properties of the three- and five-option items.

Methods

Data for this study consisted of two tests administered to two cohorts of students in an undergraduate public health nursing course over two subsequent academic years. The first test consisted of 50 four-option items administered to 36 students at the end of the fall semester in 2006. The second test consisted of 70 three-option items administered to a subsequent cohort of 106 students at the same time the following year. Using item-analysis data from the four-option test administered in 2006, the distractor with the lowest response rate was eliminated from each item to create the three-option versions.
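The item-analysis quantities the study relies on — item difficulty (proportion correct), point-biserial discrimination, and the response rate of each distractor — can be sketched as follows. This is a minimal illustration with invented response data, not the authors' software or dataset:

```python
import math
from collections import Counter

def item_difficulty(item_correct):
    """Proportion of examinees answering the item correctly (the p-value)."""
    return sum(item_correct) / len(item_correct)

def point_biserial(item_correct, total_scores):
    """Correlation between a dichotomous item score and the total test score:
    r_pb = (mean total of correct responders - overall mean) / sd * sqrt(p/q)."""
    n = len(item_correct)
    mean_total = sum(total_scores) / n
    sd_total = math.sqrt(sum((t - mean_total) ** 2 for t in total_scores) / n)
    p = sum(item_correct) / n
    correct_mean = (sum(t for c, t in zip(item_correct, total_scores) if c)
                    / sum(item_correct))
    return (correct_mean - mean_total) / sd_total * math.sqrt(p / (1 - p))

def least_chosen_distractor(responses, key):
    """The incorrect option selected by the fewest examinees."""
    counts = Counter(r for r in responses if r != key)
    return min(counts, key=counts.get)

# Hypothetical data: one item, ten examinees, keyed option "A".
responses = ["A", "B", "A", "C", "B", "A", "A", "D", "A", "D"]
key = "A"
totals = [8, 3, 7, 4, 2, 9, 6, 5, 7, 3]   # invented total test scores

correct = [1 if r == key else 0 for r in responses]
print(item_difficulty(correct))                   # 0.5
print(round(point_biserial(correct, totals), 3))  # 0.891
print(least_chosen_distractor(responses, key))    # C
```

In the study's design, the option returned by the last step is the one removed to produce the three-option version of each item.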

Results

Table 1 shows the summary characteristics of the two tests and the subsets of 41 items. In total, 142 students were tested. On the original tests, overall mean test scores and the range of test scores were similar for both the 2006 and 2007 cohorts. The pass rate for the 2007 cohort was marginally lower than the 2006 cohort and the reliability was lower for both subsets of 41 items when compared with the whole test. However, this would be expected with fewer test items. The 41-item subset of
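The reliability drop for the 41-item subsets noted above is expected because internal-consistency reliability for dichotomously scored tests (typically Kuder-Richardson 20) falls as the number of items falls. A minimal KR-20 sketch with invented 0/1 score data (not the study's data):

```python
def kr20(score_matrix):
    """Kuder-Richardson 20 reliability for dichotomous (0/1) item scores.
    score_matrix: one row per examinee, one column per item."""
    n = len(score_matrix)          # examinees
    k = len(score_matrix[0])       # items
    totals = [sum(row) for row in score_matrix]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    # Sum of item variances p*(1-p) across items.
    pq = sum((s := sum(row[j] for row in score_matrix) / n) * (1 - s)
             for j in range(k))
    return k / (k - 1) * (1 - pq / var_t)

# Four hypothetical examinees, three items.
matrix = [[1, 1, 1],
          [1, 1, 0],
          [1, 0, 0],
          [0, 0, 0]]
print(kr20(matrix))  # 0.75
```

Holding item quality constant, a longer test yields a higher KR-20, which is why the whole 50- and 70-item tests were more reliable than their 41-item subsets.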

Discussion

To our knowledge, this is the first study in nursing and only the second in a health-science discipline (Cizek and O'Day, 1994) to specifically compare the item characteristics of three- and four-option MCQs. Although findings from this study are consistent with other research on this topic, generalizability may be limited by several factors. Since our study examined only two undergraduate nursing examinations, further research in other settings should be done to determine the applicability of our findings.

Conclusion

Results from this study of teacher-generated MCQs lend further support to the conclusion that, in most circumstances, three-option items are the more feasible and practical choice when compared with four-option items. Given the time constraints of most nursing faculty today, and the increasing focus on evidence-based education, teachers involved in developing MCQs for nursing assessments are encouraged to use three-option items. Three-option items perform equally as well as the longer four-option versions.

Acknowledgements

Financial support for this study was provided by the Leung Kau Kui/Run Run Shaw Research and Teaching Endowment Fund, the University of Hong Kong.

References (29)

  • Broome, M.E., 2009. The faculty shortage in nursing: global implications. Nursing Outlook.
  • Tarrant, M., et al., 2006. The frequency of item writing flaws in multiple-choice questions used in high stakes nursing assessments. Nurse Education Today.
  • Aamodt, M.G., et al., 1992. A meta-analytic investigation of the effect of various test item characteristics on test scores. Public Personnel Management.
  • Beuchert, A.K., et al., 1979. A Monte Carlo comparison of ten item discrimination indices. Journal of Educational Measurement.
  • Case, S.M., et al., 2003. Constructing Written Test Questions for the Basic and Clinical Sciences.
  • Cizek, G.J., et al., 1994. Further investigations of nonfunctioning options in multiple-choice test items. Educational and Psychological Measurement.
  • Crehan, K.D., et al., 1993. Use of an inclusive option and the optimal number of options for multiple-choice items. Educational and Psychological Measurement.
  • Downing, S.M., 2002. Threats to the validity of locally developed multiple-choice tests in medical education: construct-irrelevant variance and construct underrepresentation. Advances in Health Sciences Education.
  • Ebel, R.L., et al., 1991. Essentials of Educational Measurement.
  • Gronlund, N.E., et al., 2008. Assessment of Student Achievement.
  • Haladyna, T.M., 2004. Developing and Validating Multiple-Choice Test Items.
  • Haladyna, T.M., et al., 1993. How many options is enough for a multiple-choice test item? Educational and Psychological Measurement.
  • Haladyna, T.M., et al., 2002. A review of multiple-choice item-writing guidelines for classroom assessment. Applied Measurement in Education.
  • Masters, J.C., et al., 2001. Assessment of multiple-choice questions in selected test banks accompanying text books used in nursing education. Journal of Nursing Education.