Development of a new reading comprehension assessment: Identifying comprehension differences among readers

https://doi.org/10.1016/j.lindif.2014.03.003

Highlights

  • We report preliminary findings for a reliable and valid reading comprehension assessment with diagnostic qualities.

  • This assessment identifies the types of reading comprehension processes used during reading.

  • This assessment can be used to identify reading comprehension processing differences between good, average, and poor comprehenders.

  • This assessment can be used to identify reading comprehension processing differences between subtypes of poor comprehenders.

  • This assessment is easy to use in a variety of educational settings.

Abstract

The purpose of this study was to evaluate the Multiple-choice Online Cloze Comprehension Assessment (MOCCA), designed to identify individual differences in reading comprehension. Data were collected from two cohorts of 3rd- through 5th-grade students across two years: 92 students participated in Year 1 and 98 students participated in Year 2 to address the primary research questions, and an additional 94 students (N = 192) participated in Year 2 to address limitations related to test administration time. Participants were group administered the MOCCA and a standardized reading proficiency assessment, and individually administered other reading measures. Preliminary analyses indicated that the MOCCA produced reliable and valid scores as a new reading comprehension assessment for identifying the types of comprehension processes used during reading, as well as for identifying individual differences in those processes. Findings are discussed in terms of developing a new measure to identify cognitive reading comprehension processes used during reading. Future research is needed to provide additional support for the technical adequacy of the assessment.

Introduction

Many students struggle with reading, and in particular, reading comprehension. As students advance in school, they transition from learning to read (e.g., learning to decode and developing fluency and comprehension skills) to reading to learn (e.g., using comprehension skills to learn from text; Chall, 1996). This transition is often most evident in the upper elementary grades, when many readers begin to encounter difficulties with new comprehension requirements (Shanahan and Shanahan, 2008, Shanahan and Shanahan, 2012).

Assessments are needed to determine why readers experience comprehension difficulties in order to develop appropriate instruction to meet their individual needs, yet few such assessments are available. Thus, the purpose of this study was to report preliminary findings from a new reading comprehension assessment, the Multiple-choice Online Cloze Comprehension Assessment (MOCCA), developed to identify individual differences in reading comprehension. In this paper, we first discuss theories of reading comprehension that guided the development of MOCCA. Second, we describe existing reading comprehension assessments used to measure specific aspects of comprehension, and how they have informed the development of MOCCA. Finally, we report initial evidence of the reliability and validity of MOCCA, and discuss how the present study extends the reading comprehension assessment literature.

Reading comprehension is a complex and multidimensional construct; thus, the development of reading comprehension assessments should be guided by theory (August et al., 2006, Fletcher, 2006). Reading comprehension theories help identify the constructs that operate during comprehension and specify the relationships among them, so that researchers can better operationalize the dimensions to be assessed.

Reading comprehension theories suggest that successful reading comprehension involves the extent to which a reader can develop a coherent mental representation of a text through developing a coherent situation model (e.g., Graesser et al., 1994, Kintsch, 1998, McNamara et al., 1996, van den Broek et al., 2005). A situation model is comprised of the situations that take place in a text (e.g., time, space, characters, and causality) (van Dijk and Kintsch, 1983, Zwaan et al., 1995). For instance, a reader may track causality by keeping track of the goal of the text (Trabasso and van den Broek, 1985, van den Broek et al., 2003). The following example describes a causal connection: “Jimmy wanted to buy a bike. He got a job and earned enough money. He went to the store to buy the bike. Jimmy was happy.” In this example, a reader could make a causal connection by generating an inference that Jimmy was happy because he reached his goal and bought a bike.

Researchers have found that many poor comprehenders (i.e., readers with adequate word reading skills but with poor comprehension skills compared to peers with similar word reading skills) fail to make causal inferences while reading, which may stem from failure to track causal relations and goals in a text (e.g., Cain and Oakhill, 1999, Cain and Oakhill, 2006, McMaster et al., 2012, Rapp et al., 2007, van den Broek, 1997). To provide appropriate instruction to improve such inference generation, it is important that reading comprehension assessments identify the specific processes with which poor comprehenders struggle.

Researchers have assessed reading comprehension processes to understand how readers build connections (i.e., inferences) and track relations during reading to develop a coherent representation of a text, and have assessed reading comprehension products to evaluate the result of the representation of the text. The products are the ‘end result’ of reading, or what the reader learned or stored in memory from the text after reading (i.e., offline). Reading products are typically assessed using recall, questioning activities, and traditional multiple-choice assessments.

In contrast, reading processes occur during the act of reading (i.e., online) and can be somewhat more difficult to assess because the examiner must infer what is taking place during reading. Methods to assess online reading comprehension processes include eye-tracking methods, reading time measures, and think-aloud tasks (e.g., Ericsson and Simon, 1993, Kaakinen et al., 2003, Linderholm et al., 2008). Think-aloud tasks, for example, are used to identify specific reading comprehension processes (e.g., causal, bridging, elaborative inferences; paraphrases) that readers use during reading (Ericsson & Simon, 1993). Findings from think-aloud studies indicate that readers use different types of comprehension processes during reading to develop coherent situation models (e.g., Laing and Kamhi, 2002, Trabasso and Magliano, 1996a, Trabasso and Magliano, 1996b, van den Broek et al., 2001). Although think-aloud data provide fruitful information about the processes that readers use during comprehension, they are laborious, time consuming, and impractical for practitioners to use to identify reading comprehension differences among their students for instructional purposes.

Researchers who have assessed reading comprehension processes using think-aloud methods have identified individual processing differences among readers at different levels of comprehension skill (McMaster et al., 2012, Rapp et al., 2007). Specifically, McMaster et al. (2012) administered a think-aloud task to fourth grade readers at different levels of comprehension skill (i.e., good, average, and poor). They identified two types of poor comprehenders: (1) paraphrasers: poor comprehenders who mostly paraphrased during reading; and (2) elaborators: poor comprehenders who elaborated about the text, including information that was connected to background knowledge that was not always relevant to the text. These findings were consistent with previous research that found similar types of poor comprehenders, and support other researchers' conclusions that poor comprehenders may struggle with reading in different ways (Cain and Oakhill, 2006, Nation et al., 2002, Perfetti, 2007, Rapp et al., 2007).

McMaster et al. (2012) also found that the two types of poor comprehenders responded to intervention in different ways. Specifically, they compared two questioning interventions: one that prompted readers to answer causal questions (Why questions that prompted readers to make causal connections during reading), and one that prompted readers to answer general questions (questions that prompted readers to make any kind of connections during reading). The researchers found that paraphrasers benefited more from the general questioning intervention than elaborators did, whereas elaborators benefited more from the causal questioning intervention than paraphrasers did. These findings suggest that different types of poor comprehenders may respond differently to intervention.

Though researchers have employed methods to assess reading comprehension processing differences among readers (e.g., think-aloud tasks), most traditional school-based reading comprehension assessments (e.g., reading proficiency assessments, standardized measures) have not been designed to detect such processes or to identify individual comprehension differences. In addition, many of these methods assess the product of reading comprehension rather than the process, limiting the types of conclusions that can be drawn about how readers comprehend differently. For example, Keenan, Betjemann, and Olson (2008) found that commonly used standardized reading comprehension assessments measure aspects of reading such as decoding and word recognition, but not necessarily reading comprehension, and what is measured varies depending on the age of the reader. Thus, such traditional assessments may be insufficient for identifying specific reading comprehension differences; yet, educators often make instructional decisions based on their outcomes (Keenan et al., 2008).

Researchers have begun to develop other methods to help address the shortcomings of traditional reading assessments and measure how readers comprehend text rather than only assessing the product of comprehension. For instance, Magliano and colleagues developed the Reading Strategy Assessment Tool (RSAT; Magliano, Millis, Development Team, Levinstein, & Boonthum, 2011), which measures a subset of the comprehension processes found to lead to a coherent representation of a text. RSAT is an automated computer-based assessment in which readers read texts one sentence at a time, and are asked either indirect questions (i.e., “What are your thoughts regarding your understanding of the sentence in the context of the passage?”) or direct questions (i.e., Why questions related to a target sentence). Readers type their responses, which are later analyzed for types of comprehension processes (e.g., paraphrases, bridging inferences, elaborations) and content words (e.g., nouns, verbs, adjectives, adverbs) used during reading.

Magliano et al. (2011) identified unique types of comprehension processes that readers used during reading using RSAT, and also found that RSAT predicted scores on measures of reading comprehension. However, the measure is limited in several ways. First, RSAT uses an open-ended response format in which participants type their responses to questions, limiting its use to older participants who have developed adequate typing skills. Second, the linguistic algorithms used to identify the types of comprehension processes produced in responses may be limited in capturing the quality of responses and identifying individual profiles of readers. Finally, like think-alouds, the open-ended response task used in RSAT can produce a large amount of variability in how readers interpret the task instructions, especially the instructions for answering the indirect question, which could be interpreted differently from reader to reader. Thus, it seems useful to develop an assessment that capitalizes on the strengths of RSAT (e.g., identifying comprehension processes during reading), but is also familiar to readers in terms of testing format, is efficient for educators to administer and score, and can be used for making instructional decisions with children.

Other recently developed assessments, such as the Diagnostic Assessment of Reading Comprehension (DARC; August et al., 2006) and The Bridging Inferences Test, Picture Version (Bridge-IT, Picture Version; Pike, Barnes, & Barron, 2010), measure individual differences in reading comprehension processes for readers in Grades 2–6. The DARC requires readers to remember newly read text, connect to and integrate relevant background knowledge, and generate bridging inferences (August et al., 2006). Despite its usefulness for identifying certain types of comprehension processes, the DARC uses unfamiliar pseudo-word relational statements embedded in texts. Readers are only asked to judge whether such statements are true or false, and the assessment does not identify whether readers build a coherent representation of a text. The Bridge-IT, Picture Version also assesses children's ability to generate bridging inferences during reading, as well as the ability to suppress irrelevant text information (Pike et al., 2010). In addition, this assessment involves a task in which readers choose the last sentence of a narrative text, and each text is accompanied by a related picture, an inconsistent picture, or no picture. Similar to the DARC, the Bridge-IT, Picture Version is limited in its utility for distinguishing between the different comprehension processes used to develop a coherent representation of a text, and for identifying individual comprehension differences.

In sum, researchers have developed assessments that target comprehension processes; however, few reading comprehension assessments are available for educators and practitioners to easily assess differences in readers' comprehension processes among children at various levels of comprehension skill. The limitations of previously developed assessments provide a rationale for developing new assessments that address the needs of readers who struggle with reading comprehension in different ways. Furthermore, developing reading comprehension assessments that focus on efficiently identifying the specific comprehension processes used to develop a coherent representation of a text may be useful for identifying different types of comprehenders for the purposes of instruction.

In addition to variation in purpose and utility for educational decision making, reading comprehension assessments vary across many dimensions, including response format (e.g., cloze, multiple-choice, open-ended), presentation format (e.g., paper–pencil and computer-based), and the components of reading comprehension measured (e.g., literal comprehension, inferential processes, main idea identification) (Eason and Cutting, 2009, Keenan et al., 2008). Each dimension presents a challenge for assessment development.

In designing an assessment, the developer must make decisions about each dimension, which requires careful consideration of the benefits and drawbacks of the options under each dimension. For instance, multiple-choice tests are efficient to administer in group settings and are familiar to readers; however, traditional multiple-choice tests require readers to choose only one correct choice, and the alternative choices are mainly distracters without diagnostic meaning (Cutting & Scarborough, 2006). Additionally, multiple-choice questions are traditionally presented after an entire text, thus measuring the product of comprehension rather than the processes used to build a coherent representation of the text. Open-ended questions allow readers to demonstrate the comprehension processes used to build a coherent text representation; however, open-ended assessments, like think-alouds, are time consuming and difficult to score (e.g., Magliano et al., 2011). Modified cloze tests, such as the maze task in which every nth word is deleted and replaced with three options for the reader to select, are efficient to administer and score, and have been demonstrated to provide a general indicator of reading proficiency (Deno, 1985, Espin and Foegen, 1996, Fuchs and Fuchs, 1992, Wayman et al., 2007). However, maze tasks are often timed, which does not allow the reader to build a complete and coherent mental representation of the text. In fact, researchers have provided evidence that such approaches assess decoding or sentence-level comprehension, rather than discourse-level comprehension (Francis et al., 2005, Keenan et al., 2008, Nation and Snowling, 1997). Further, maze tasks were designed primarily for progress monitoring in reading rather than for assessing processes that take place during reading, and are thus limited in their diagnostic utility for comprehension (Wayman et al., 2007).
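To make the maze-task format described above concrete, the following is a minimal sketch of how such items could be generated programmatically. It is purely illustrative and not part of the original study: the function name, the choice of n, and the strategy of sampling distracters from elsewhere in the passage are all assumptions for demonstration.

```python
import random

def build_maze_items(text, n=7, seed=0):
    """Toy maze-task construction: delete every nth word and pair the
    deleted word with two distracters sampled from the rest of the passage.
    Returns (position, correct_word, shuffled_options) tuples."""
    rng = random.Random(seed)
    words = text.split()
    items = []
    for i in range(n - 1, len(words), n):
        correct = words[i]
        # Distracter pool: any other word in the passage (toy heuristic).
        pool = [w for w in words if w.lower() != correct.lower()]
        options = [correct] + rng.sample(pool, 2)
        rng.shuffle(options)
        items.append((i, correct, options))
    return items
```

A real maze measure would, of course, control distracter plausibility (part of speech, length) rather than sampling at random; the sketch only shows the mechanical structure of the task.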

In the present study, we developed and evaluated an assessment to measure the comprehension processes that readers use during reading (i.e., online), capitalizing on the benefits of existing measures (e.g., efficient and familiar presentation formats) while also addressing their shortcomings (i.e., identifying specific online reading comprehension processes and individual processing differences involved in developing a coherent representation of a text). The resulting tool is the Multiple-choice Online Cloze Comprehension Assessment (MOCCA). MOCCA is a paper-and-pencil assessment that consists of short narrative texts (seven sentences long). For each text, the sixth sentence is deleted and readers are required to choose among four multiple-choice responses to complete the sixth sentence of the text. The best response requires the reader to make a causal inference that results in a coherent representation of the text. Unlike traditional multiple-choice assessments, MOCCA was designed with alternate response types that represent specific reading comprehension processes used during reading (i.e., causal inferences, paraphrases, local bridging inferences, and lateral connections). Fig. 1 provides an item from MOCCA, with each response type labeled for the comprehension process it identifies. Instructions and additional items from MOCCA can be found in Appendix A.

The purposes of this study were to evaluate the initial technical adequacy of MOCCA and to examine its capacity to identify reading comprehension processing differences among readers. This examination also addressed whether MOCCA can be used to identify subtypes of poor comprehenders similar to those identified in previous research using think-aloud approaches (McMaster et al., 2012, Rapp et al., 2007). Our research questions were: (1) Does MOCCA produce scores that are reliable (internally consistent) depending on the amount of time provided during test administration (i.e., timed vs. untimed) and on the difficulty and discrimination levels of the items? (2) Does MOCCA produce scores that are valid (in terms of criterion validity)? and (3) To what extent does MOCCA distinguish among the comprehension processes of good, average, and poor comprehenders, including subtypes of poor comprehenders, depending on the amount of time provided during test administration?
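The reliability statistics named in the first research question (internal consistency, item difficulty, item discrimination) can be sketched with standard classical test theory formulas. The code below is an illustrative computation on a dichotomously scored item-response matrix, not the authors' actual analysis pipeline; the function names are my own.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (examinees x items) matrix of 0/1 item scores:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def item_stats(scores):
    """Classical item difficulty (proportion correct) and discrimination
    (corrected item-total correlation) for each item."""
    scores = np.asarray(scores, dtype=float)
    difficulty = scores.mean(axis=0)
    total = scores.sum(axis=1)
    discrimination = []
    for j in range(scores.shape[1]):
        rest = total - scores[:, j]  # exclude the item from its own total
        discrimination.append(np.corrcoef(scores[:, j], rest)[0, 1])
    return difficulty, np.array(discrimination)
```

Under this framing, a timed versus untimed comparison would simply run these statistics separately on the two administrations' score matrices.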

Section snippets

Participants

To address our research questions, data were collected across two years. Specifically, 92 third, fourth, and fifth grade students in Year 1 and 98 third, fourth, and fifth grade students in Year 2 completed the MOCCA (timed version) and a full battery of additional reading-related assessments (as described under Measures). In Year 2, an additional 94 third, fourth, and fifth grade students, along with the original 98 students (N = 192), were provided additional testing time (untimed version) to

Results

Data were analyzed separately for participants in Years 1 and 2 to address each of our research questions. First, separate analyses were conducted to assess the internal consistency of the MOCCA (both timed and untimed data) as well as the difficulty and discrimination levels of the items. Second, we assessed the criterion validity of the MOCCA using the Year 1 (timed) and 2 (timed and untimed) datasets. Third, separate analyses were conducted to identify different types of comprehenders during

Discussion

In this study, we examined a new reading comprehension assessment (MOCCA) to identify individual comprehension processing differences. The MOCCA was developed to assess the processes of reading comprehension used when reading narrative texts. Assessing reading comprehension processes has been useful in previous research for identifying individual comprehension differences among readers (e.g., McMaster et al., 2012, Rapp et al., 2007), which may in turn be useful for identifying appropriate

Conclusion

The purposes of this study were to examine MOCCA, a new reading comprehension assessment designed to identify specific comprehension processes used during reading, and to identify individual differences among the types of processes different comprehenders use during reading, in particular, poor comprehenders. The results from this study support our purpose for developing a reading comprehension assessment around how readers use cognitive processes to build a coherent representation of a text.

References (59)

  • R.J. De Ayala

    The theory and practice of item response theory

    (2009)
  • S.L. Deno

    Curriculum-based measurement: The emerging alternative

    Exceptional Children

    (1985)
  • S.H. Eason et al.

    Examining sources of poor comprehension in older poor readers: Preliminary findings, issues, and challenges

  • R.L. Ebel

    Procedures for the analyses of classroom tests

    Educational and Psychological Measurement

    (1954)
  • K.A. Ericsson et al.

    Protocol analysis: Verbal reports as data

    (1993)
  • C.A. Espin et al.

    Validity of general outcome measures for predicting secondary students' performance on content-area tasks

    Exceptional Children

    (1996)
  • J.M. Fletcher

    Measuring reading comprehension

    Scientific Studies of Reading

    (2006)
  • D.J. Francis et al.

    Dimensions affecting the assessment of reading comprehension

  • L.S. Fuchs et al.

    Identifying a measure for monitoring student reading progress

    School Psychology Review

    (1992)
  • A.C. Graesser et al.

    Structures and procedures of implicit knowledge

    (1985)
  • A.C. Graesser et al.

    Constructing inferences during narrative text comprehension

    Psychological Review

    (1994)
  • P.N. Johnson-Laird

    Mental models: Towards a cognitive science of language, inference, and consciousness

    (1983)
  • Iteman, Version 3.5

    Conventional item analysis program

    (1989)
  • J.K. Kaakinen et al.

    How prior knowledge, working memory capacity, and relevance of information affect eye-fixations in expository text

    Journal of Experimental Psychology: Learning, Memory, and Cognition

    (2003)
  • J.M. Keenan et al.

    Reading comprehension tests vary in the skills they assess: Differential dependence on decoding and oral comprehension

    Scientific Studies of Reading

    (2008)
  • J.P. Kincaid et al.

    Derivation of new readability formulas (Automated Readability Index, Fog Count, and Flesch Reading Ease formula) for Navy enlisted personnel (Research Branch Report 8-75). Naval Technical Training Command, Naval Air Station Memphis

    (1975)
  • W. Kintsch

    Comprehension: A paradigm for cognition

    (1998)
  • W. Kintsch et al.

    Toward a model of text comprehension and production

    Psychological Review

    (1978)

    This research was supported by Grant #R305C050059 from the Institute of Education Sciences (IES), U.S. Department of Education, to the University of Minnesota and through the Interdisciplinary Education Sciences Predoctoral Training Program, “Minnesota Interdisciplinary Training in Education Sciences (MITER)” for data collection and resources, as well as by Grant #R305b110012 from the IES, U.S. Department of Education, to the Center on Teaching and Learning at the University of Oregon, through a Postdoctoral Fellowship for writing resources. The opinions expressed are those of the authors and do not necessarily represent views of the IES or the U.S. Department of Education.
