It is no surprise that when one’s attention is diverted away from a particular task, these lapses of attention, or mind-wandering episodes, are often accompanied by measurable changes in performance. Although there has been a good deal of research documenting these effects in the laboratory using sustained attention tasks (McVay & Kane, 2009; Smallwood et al., 2004), the common experience of mind wandering while reading (i.e., having one’s eyes continue to move across a page of text while the mind is focused elsewhere) has until recently received little attention. Fortunately, considerable research has identified the critical processes necessary for successful text comprehension, such as phonological and lexical processing (e.g., Perfetti et al., 2007), working memory (see, e.g., Daneman & Carpenter, 1980), and metacognitive skills (e.g., Brown & Palincsar, 1987), and these findings can be exploited to discover how mind wandering impacts these skills. The aim of the present study is to use the systematic differences between mindful and mindless reading to predict off-task reading before it is reported by a participant.

In the studies that have investigated mindless reading, participants typically read a text and are periodically probed as to whether, at that particular moment, their minds are on or off task, followed by a reading comprehension test (Schooler, Reichle, & Halpern, 2004; Smallwood, McSpadden, & Schooler, 2008). The results from these studies have revealed that the frequency of mindless reading is strongly (and negatively) correlated with reading comprehension performance, indicating an important relationship between mind wandering and comprehension failure (Smallwood, Fishman, & Schooler, 2007).

In addition to studying how mind wandering influences overall text comprehension, researchers have also begun to focus on how mindless reading influences the processing of lexical features of the words being read. When individuals read, there is typically a strong relationship between the lexical properties of the words and the amount of time that is devoted to their processing (Rayner, 1998). Reichle, Reineberg, and Schooler (2010) measured eye movements during reading and showed that while gaze durations prior to on-task reports were sensitive to lexical features, such as word length and word frequency, these effects were attenuated in periods immediately prior to off-task reports. In addition, a recent study by Smilek, Carriere, and Cheyne (2010) showed that mindless reading is associated with increased blinking. This result could help explain the deficit in encoding the lexical features of words, since frequent blinking is associated with deactivation of cortical areas that process the external visual world (Bristow, Haynes, Sylvester, Frith, & Rees, 2005). Consistent with these findings are results indicating that reading comprehension can be compromised when critical regions of the text are poorly encoded (Christianson, Williams, Zacks, & Ferreira, 2006; Sanford & Graesser, 2006; Stine-Morrow, Noh, & Shake, 2010). Together, these findings indicate that the negative impact of mindless reading is likely a consequence of participants’ neglect of the visual, phonological, and semantic features of the words, which has been described as the “cascade model of inattention” (Smallwood, 2011).

Present study

The strong relationship between reading times and mind wandering suggests that it may be possible to detect mindless reading by discerning when reading behavior deviates from what can be considered normal. Here, we attempted to use the speed with which participants manually advanced words in a word-by-word reading paradigm to predict, prior to their being probed, whether participants would report mind wandering. The word-by-word paradigm has the notable advantage over costly eyetracking devices of working on any computer, while nevertheless closely mirroring the behavioral reading patterns observed with eyetracking (Just, Carpenter, & Woolley, 1982). Thus, identification of a behavioral signature of mind wandering with this paradigm has the potential to provide a widely applicable methodology for discerning mindless reading.

Importantly, pilot testing confirmed that participants’ behavior in a word-by-word reading paradigm varied based on whether or not they reported mind wandering. Consistent with Reichle et al. (2010), we found a number of lexical effects that were attenuated when participants reported being off task. Specifically, when on task, participants became significantly slower for many-letter, multisyllable, low-familiarity words; when off task, participants no longer slowed to the same extent for these words. Interestingly, however, whereas Reichle et al. reported increased gaze durations during periods of mind wandering, the present findings showed that, overall, participants tended to go faster when they were off task. One potential reason for this discrepancy between studies might be the different task environments: whereas in our study the words were displayed individually at the center of the screen and required a manual response to proceed to the next word, Reichle et al. had participants read their text one page at a time (see the General Discussion for further elaboration on this point).

The aim of the present study was to use differences in the patterns of reaction times associated with attentive and inattentive reading to create an algorithm that can predict in real time when participants are reading mindlessly. The online identification of mindless reading based on real-time appraisals of participants’ reaction times would offer a key advance toward the development of a pedagogical tool for minimizing the negative impact of mindless reading on reading comprehension. In addition, by running a group of participants without thought probes, we hoped to show that the number of predicted mind-wandering episodes would correlate negatively with reading comprehension. This would suggest that the algorithm could be used to covertly track mind wandering, and could therefore be a powerful tool for investigating the processes involved in mind wandering without requiring participants to explicitly report their mental states.

Method

Participants

A total of 49 participants from the University of California, Santa Barbara, were tested in the experiment (23 female, 26 male; mean age = 19.2 years). Of these participants, 28 performed the task with thought probes, and 21 received no thought probes.

Materials

Text

The text used in this experiment was a shortened version of “The Red-Headed League” (Conan-Doyle, 1892/2001), edited to approximately 5,000 words. This was the same version that had been used by Smallwood et al. (2008).

Design

The mind-wandering algorithm was based on the finding from the pilot study that when participants are paying attention, they are slowest at the points where the text is most difficult (i.e., for many-letter, multisyllable, low-familiarity words). Therefore, thought probes were initiated only during difficult portions of the text, on the basis of participants’ reaction times to the words. When the text was difficult and a participant was going fast, we predicted that the participant would be off task; when a participant was going slowly during difficult text, we predicted that he or she would be on task.

Text difficulty was calculated using the lexical variables from the pilot study. Words were categorized as long (at least four letters) or short (fewer than four letters); there were approximately equal numbers of long (2,577, or 50.24%) and short (2,552, or 49.76%) words. Words were also categorized as having many syllables (at least two) or few syllables (one). Although this measure was more biased, since there were far more few-syllable words (N = 4,241; 82.69%), there was still an adequate number of many-syllable words for a meaningful analysis (N = 888; 17.31%). The words in the text were assigned a familiarity score using the MRC psycholinguistic database (Coltheart, 1981); 78% of the text words were in the database, and the missing 22% were automatically classified as low-familiarity words. The familiarity values ranged from 100 to 700, with a mean of 488 and a standard deviation of 99. Based on the mean, words with a value less than or equal to 488 were classified as low-familiarity words (N = 1,230; 23.98%), and words with a value above 488 were classified as high-familiarity words (N = 3,899; 76.02%). Using these criteria, each word was coded for word length (long = 1/short = 0), number of syllables (many = 1/few = 0), and word familiarity (low = 1/high = 0). With these categories coded numerically, each word was assigned a mean difficulty rating ranging from 0 (easy) to 1 (hard), based on the average of the three lexical codes. A running average over 10 words was then used to provide a measure of the local difficulty of portions of the text within the story. The mean local text difficulty was .30 (SD = .09). On the basis of the pilot testing, we used a threshold of .45, with text having a value greater than this being classified as “difficult.” Therefore, thought probes could be initiated only if the local text difficulty was greater than .45.
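
To make the difficulty coding concrete, the sketch below (in Python) reproduces the scoring scheme just described. The syllable counter is a crude vowel-group heuristic, the familiarity lookup table and example words are hypothetical stand-ins for the MRC-derived values, and the .45 threshold is the pilot-based criterion reported above; the original software is not reproduced here.

```python
import re

DIFFICULTY_THRESHOLD = 0.45  # pilot-derived cutoff for "difficult" text
WINDOW = 10                  # running-average window, in words

def count_syllables(word):
    """Crude vowel-group count; a stand-in for a proper syllable dictionary."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def word_difficulty(word, familiarity):
    """Average of three binary lexical codes: length, syllables, familiarity.

    familiarity is an MRC familiarity score (100-700); words missing from
    the database are passed as None and treated as low familiarity.
    """
    long_word = 1 if len(word) >= 4 else 0
    many_syllables = 1 if count_syllables(word) >= 2 else 0
    low_familiarity = 1 if familiarity is None or familiarity <= 488 else 0
    return (long_word + many_syllables + low_familiarity) / 3.0

def local_difficulty(words, familiarity_lookup):
    """Running 10-word average of per-word difficulty across the text."""
    scores = [word_difficulty(w, familiarity_lookup.get(w.lower())) for w in words]
    averages = []
    for i in range(len(scores)):
        window = scores[max(0, i - WINDOW + 1): i + 1]
        averages.append(sum(window) / len(window))
    return averages

# Hypothetical usage (familiarity values are made up for illustration):
words = "It is no surprise that attention wanders during reading".split()
familiarity = {"attention": 520, "reading": 600, "surprise": 540}
is_difficult = [d > DIFFICULTY_THRESHOLD for d in local_difficulty(words, familiarity)]
```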

In order to determine whether participants were going fast or slow, a measure of their local reaction time (LRT, a running average over the last 10 words) was compared to a measure of their global reaction time (GRT, a cumulative running average). If the LRT was less than the GRT, the participant was classified as going fast; if the LRT was greater than the GRT, the participant was classified as going slow. To refine this speed classification, we manipulated how much faster or slower the LRT had to be, relative to the GRT, to initiate a thought probe. The following values were based on pilot data. For participants to be considered as going fast (i.e., off task), their LRT had to be less than 0.55 times the GRT. Participants were considered to be going slow (i.e., on task) if their LRT was greater than 1.3 times the GRT and less than 1.75 times the GRT. This second criterion was applied in order to avoid classifying participants as being on task when they were going extremely slowly, since mind wandering is also associated with increased reading times (Reichle et al., 2010). A more conservative threshold was used for initiating off-task probes because the pilot data suggested that more off- than on-task probes tend to occur, likely because participants tend to go faster as the task continues.
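
A minimal sketch of this probe-triggering rule follows, assuming per-word reaction times are available as the participant reads; the threshold values are those reported above, while the function and variable names are illustrative rather than taken from the original software.

```python
# Probe-trigger thresholds reported above (pilot-derived).
FAST_FACTOR = 0.55               # LRT < 0.55 * GRT             -> predict off task
SLOW_LOW, SLOW_HIGH = 1.3, 1.75  # 1.3 * GRT < LRT < 1.75 * GRT -> predict on task

def probe_opportunities(reaction_times, local_difficulty, window=10, threshold=0.45):
    """Yield (word_index, prediction) wherever a thought probe could be initiated.

    reaction_times: per-word keypress latencies, in seconds
    local_difficulty: 10-word running difficulty, as computed earlier
    """
    for i in range(window, len(reaction_times)):
        if local_difficulty[i] <= threshold:
            continue                                    # probes only in difficult text
        lrt = sum(reaction_times[i - window + 1: i + 1]) / window  # local RT
        grt = sum(reaction_times[: i + 1]) / (i + 1)               # global (cumulative) RT
        if lrt < FAST_FACTOR * grt:
            yield i, "off_task"                         # fast responding in hard text
        elif SLOW_LOW * grt < lrt < SLOW_HIGH * grt:
            yield i, "on_task"                          # slowing down appropriately
```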

Participants who were probed used a 1–5 scale to rate the extent to which they were focused on the task. Specifically, for each thought probe, participants were asked, “In the moments prior to the probe, was your attention focused: (1) completely on the task, (2) mostly on the task, (3) on both the task and unrelated concerns, (4) mostly on unrelated concerns, or (5) completely on unrelated concerns?” This was done in order to treat mind wandering as a nondiscrete state and to allow us to capture more subtle differences in the subjective reports of mind wandering (see Christoff, Gordon, Smallwood, Smith, & Schooler, 2009, for the use of a similar technique).

Procedure

The text was presented word by word in black on a white screen. Participants advanced the text by pressing the space bar. The words remained on the screen for at least 150 ms in order to make sure that the participants fixated on all of the words. For participants in the probe condition, the thought probes were presented only at times that the algorithm predicted either on- or off-task behavior. After participants had finished reading the text, they answered 23 comprehension questions. Each question had four possible answers. The entire task took approximately 50 min.
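
As a rough illustration of the procedure, the console sketch below presents words one at a time, enforces the 150-ms minimum display, and records per-word reaction times. Pressing Enter stands in for the space bar, and the display details (central presentation, black text on white) are omitted, so this is only a stand-in for the actual experimental software.

```python
import time

MIN_DISPLAY = 0.150  # each word remains visible for at least 150 ms

def present_words(words):
    """Console stand-in for the word-by-word display; returns per-word RTs."""
    reaction_times = []
    for word in words:
        onset = time.monotonic()
        print("\n" * 2 + word.center(40))
        time.sleep(MIN_DISPLAY)            # enforce the minimum display time
        input()                            # Enter stands in for the space bar
        reaction_times.append(time.monotonic() - onset)
    return reaction_times
```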

Results and discussion

Comprehension

Reading comprehension did not differ between the probed (mean accuracy = 61.23%, SD = 15%) and the nonprobed (mean accuracy = 59.01%, SD = 15%) participants. For the probed participants, comprehension accuracy correlated negatively with participants’ mean thought probe score; that is, a higher (more off-task) score was associated with lower accuracy (r = −.35, p < .05). In addition, comprehension accuracy also correlated negatively with the number of predicted off-task episodes (r = −.33, p < .05). For the nonprobed participants, comprehension accuracy likewise correlated negatively with the number of predicted off-task episodes (r = −.54, p < .01). These two correlations (probed vs. nonprobed) did not differ significantly (Fisher’s r-to-z transformation: z = .85, p = .40).
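
For readers who want to check the comparison of these two independent correlations, the standard Fisher r-to-z computation is sketched below; the group sizes (28 probed, 21 nonprobed participants) are taken from the Method section, and the routine is generic rather than a reproduction of the original analysis script.

```python
import math

def compare_correlations(r1, n1, r2, n2):
    """Two-tailed z test for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))  # 2 * (1 - Phi(|z|))
    return z, p

# Probed (r = -.33, n = 28) vs. nonprobed (r = -.54, n = 21):
z, p = compare_correlations(-0.33, 28, -0.54, 21)  # z ≈ 0.85, p ≈ .40
```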

Mind wandering based on thought probe type

The success of the algorithm was determined by comparing the mean thought probe scores obtained when participants were predicted to be on task versus off task, using a repeated measures ANOVA. Given that on- and off-task probes were only initiated if reading behavior conformed to the algorithm parameters, a few participants received only on-task (n = 2) or only off-task (n = 2) probes. In order to include these participants in the analysis (their data capture the effectiveness of the algorithm for readers who are mostly on or off task), the missing values were replaced with the mean on- or off-task thought probe score. This analysis revealed a significant difference between the mean thought probe score for predicted on-task probes (M = 2.40, SD = 1.27) and predicted off-task probes (M = 3.33, SD = 1.32) [F(1, 27) = 6.50, p = .01, d = 0.71]. Participants therefore reported more mind wandering when we predicted them to be off task, and less mind wandering when we predicted them to be on task.
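
Because the design has a single two-level within-subjects factor, the ANOVA above is equivalent to a paired t test (F = t²); the sketch below shows that check on synthetic placeholder scores, since the actual per-participant data are not reproduced here.

```python
import numpy as np
from scipy import stats

# Synthetic placeholder scores on the 1-5 probe scale (NOT the study data):
# one mean score per participant for predicted on-task and off-task probes,
# with missing cells already replaced by the corresponding mean, as described above.
rng = np.random.default_rng(0)
on_task = np.clip(rng.normal(2.4, 1.3, size=28), 1, 5)
off_task = np.clip(rng.normal(3.3, 1.3, size=28), 1, 5)

res = stats.ttest_rel(off_task, on_task)
F = res.statistic ** 2  # for a two-level within-subjects factor, F(1, N - 1) = t^2
print(F, res.pvalue)
```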

In addition, it was also possible to calculate the probability that the algorithm correctly identified whether a person was on versus off task if these were treated as binary categories (see, e.g., Smallwood et al., 2004). To do this, given the 1–5 scale used to measure mind wandering, all responses of 1 and 2 were coded as on task, and all responses of 4 and 5 were coded as off task. On average, there were 11.3 thought probes per participant for this analysis (on task = 5.0, off task = 6.3); in the nonprobed condition, the algorithm flagged a comparable number of probe opportunities (total = 10.8, predicted on task = 6.4, predicted off task = 4.4). We then used an exact binomial test to compare the success of the algorithm for each participant to what would be expected by chance. The expected chance probability was determined to be 49% through a Monte Carlo simulation (10,000 iterations) that randomly recategorized the thought probes as on or off task while maintaining the same proportion of on- versus off-task reports. This expected chance value was slightly less than 50% because, while the algorithm tended to predict participants as being off task more often than on task (57.2% of probes), participants responded with a 4 or 5 somewhat less often (47.1% of the time) than with a 1 or 2. This analysis revealed that the algorithm successfully predicted whether a participant was on versus off task 72.0% of the time (p < .0001, 95% confidence interval 66.4%–77.0%). Of course, 72% accuracy would not be that impressive if participants happened to be on task (or off task) a disproportionate amount of the time; importantly, however, participants were about equally likely to report being on versus off task.
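
The chance-level estimate and the per-participant binomial test can be reconstructed along the following lines; the prediction and report labels below are hypothetical placeholders for one participant, SciPy’s `binomtest` stands in for whatever exact binomial routine was originally used, and the one-sided alternative is an assumption.

```python
import random
from scipy.stats import binomtest

def chance_accuracy(predictions, reports, iterations=10_000, seed=1):
    """Accuracy expected by chance: shuffle the reports relative to the
    predictions, keeping the proportion of on- vs. off-task reports fixed."""
    rng = random.Random(seed)
    shuffled = list(reports)
    total = 0.0
    for _ in range(iterations):
        rng.shuffle(shuffled)
        total += sum(p == r for p, r in zip(predictions, shuffled)) / len(reports)
    return total / iterations

# Hypothetical labels for one participant (reports of 1/2 -> "on", 4/5 -> "off"):
predictions = ["off", "off", "on", "off", "on", "off", "on", "off", "on", "off", "on"]
reports     = ["off", "on",  "on", "off", "on", "off", "on", "off", "off", "off", "on"]

hits = sum(p == r for p, r in zip(predictions, reports))
p_chance = chance_accuracy(predictions, reports)
result = binomtest(hits, n=len(reports), p=p_chance, alternative="greater")
print(hits / len(reports), p_chance, result.pvalue)
```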

Together, these results suggest that it is possible to use behavioral measures to predict in real time when participants are mind wandering while reading. In addition, data from the nonprobed participants show that the probes do not fundamentally alter the nature of the reading task and, more importantly, that the algorithm can provide a covert measure of mind wandering.

General discussion

The results from this study show that the reaction times for advancing individual words in a word-by-word reading paradigm can be used as an index of mind wandering and can predict in real time whether a participant will report being on or off task when given a thought probe. These findings constitute the first demonstration that it is possible to “catch” mind wandering in real time, as it happens. The present paradigm thus provides a critical tool for assessing mind wandering, one whose efficacy can be evaluated precisely (by comparing mind-wandering rates when participants are predicted to be on vs. off task). Critically, the algorithm was capable of predicting comprehension accuracy even when participants were never probed, indicating that these results are not an artifact of the experience-sampling methodology.

We first showed that when participants were paying attention to the text, they were sensitive to lexical effects, evidencing increased reaction times for long, multisyllable, low-familiarity words; during mindless reading, these effects were dampened. While these results are consistent with work using an eyetracking methodology (Reichle et al., 2010), being able to assess mindless reading with manual reaction times is particularly advantageous, in that any computer could easily be used to recognize mind wandering while reading, which currently is not feasible with eyetracking, given the high cost of the equipment. In addition, this covert measure of mind wandering could provide new insight into the processes involved in mind wandering by avoiding the reactivity issues that potentially accompany studies in which participants are explicitly asked whether or not they are mind wandering.

Although the present results closely mirrored those of Reichle et al. (2010) with respect to the relationship between mind wandering and sensitivity to the lexical qualities of words, they diverged with respect to the overall reaction times preceding the thought probes. Whereas eyetracking revealed an overall increase in gaze durations for individual words prior to mind wandering, the present paradigm showed the opposite pattern, with participants speeding up during mind-wandering episodes. One potential explanation for this discrepancy is that there are simply paradigmatic differences between word-by-word reading and naturalistic reading (one page at a time), such that certain aspects of reading behavior change without necessarily resulting in overall changes in comprehension (Just et al., 1982). An analogue to the speeding up of responses while mind wandering in this word-by-word reading paradigm may be found in the sustained attention to response task (SART), in which participants also respond frequently to individual items appearing at the center of the screen. In the SART, responses are similarly faster and error rates increase when participants are mind wandering (Smallwood et al., 2004; Smallwood et al., 2007; Smallwood et al., 2008), mirroring the relationship between the local reaction time speed-up and reading comprehension in the present study. An alternative explanation for the discrepancy between the eyetracking and word-by-word paradigms may involve intrinsic differences in the relationship between information processing and hand versus eye movements: when a participant is mind wandering, gaze durations may naturally lengthen because of the eyes’ inherent role in information extraction, whereas finger tapping may naturally speed up because of the common link between automatized behaviors and rapid hand movements.

Future work varying both the mode of advancing the text (i.e., via the eyes vs. via the hand) and the presentation type (e.g., single words vs. sentences) could offer further insight into this issue. If, for example, eyetracking during word-by-word reading still led to longer gaze durations while mind wandering, this would support the information-extraction hypothesis proposed above. If, however, mind wandering were associated with shortened gaze durations during word-by-word reading (vs. naturalistic reading), and/or with increased reaction times when reading text a sentence at a time (or a page at a time), this would indicate that the presentation style is responsible for the discrepancy. Despite these differences, the fact that mind wandering systematically alters reading behavior in both word-by-word and naturalistic reading suggests that the present algorithm could be tailored to either context.

The most significant aspect of this study is that it is the first demonstration of a capacity to predict mind wandering in real time. In order to better understand the basic rationale of the algorithm used, consider the analogy of trying to catch a person speeding while driving. Since it is not feasible to have officers stationed at every possible location, only certain locations are chosen—for instance, the dreaded school zone coming after a 55-mph stretch of road. Likewise, in our study the sensitivity of the algorithm for catching mind wandering was maximized by assessing participants’ attention at difficult points in the text, where, as in the comparable scenario of the school zone, participants might not slow down if they were failing to pay attention. These “speed traps” were successful in discriminating on- versus off-task behavior because participants who were mind wandering tended to go fast at these difficult points in the text, while participants who were paying attention tended to slow down.

These results are important from a pedagogical perspective, because they suggest that it may be possible to decrease the amount of mind wandering that readers engage in by cuing them to reengage with the text when their reading behavior suggests they are mind wandering. Given the consequences of mind wandering for comprehension demonstrated here and in prior work (Schooler et al., 2004; Smallwood et al., 2008), it is likely that an intervention able to decrease mind wandering could have a significant impact on students’ ability to understand the text that they are reading. For example, students could first be assessed for their tendency to mind wander with a random-probe word-by-word reading paradigm; they could then receive practice with the present paradigm, in which they would receive regular feedback regarding precisely when they were mind wandering. This feedback might provide a critical training tool for enhancing individuals’ ability to catch mind-wandering episodes for themselves, thereby promoting a meta-awareness of mind wandering (Sayette, Reichle, & Schooler, 2009; Schooler, 2002) that would enable it to be kept in check. Success could be assessed by measuring comprehension accuracy and the number of predicted off-task episodes in a no-probe version. Given that the present algorithm required extensive analysis of the text’s lexical features, future work with different texts will be needed in order to see how well the present algorithm parameters generalize to other texts. For example, it may be possible, with mostly high-familiarity text (such as children’s stories), to use only the number of syllables and the number of letters in a given word and still successfully predict mind wandering.

Finally, the present approach may help overcome many of the experimental challenges of mind-wandering research. An important contribution of the present experiments is that they demonstrate that the subtle changes in behavior that accompany mind wandering can provide a more nuanced measure of the phenomenon. One advantage is that, by bypassing the requirement for self-report, it becomes possible to assess the extent to which introspection might fundamentally alter participants’ behavior, their underlying mental states, their neurocognitive activity, and/or the relationship between these components (Schooler, 2002). A key advance of the research presented in this article is that the question of whether thought probes have reactive consequences can now be addressed experimentally. Likewise, a limitation of the experience-sampling method is that the number of probes places an upper limit on the number of times that mind wandering can be reported; by contrast, in the present approach, how often probes can occur is limited only by the frequency of the “speed traps” (i.e., difficult text), which can vary much more freely. In the future, the present algorithm’s capability to covertly track mind-wandering episodes in real time will allow more powerful estimates of the occurrence of this state. Moreover, the fact that the present procedure allows mind wandering to be assessed without recourse to self-report allows the mind-wandering paradigm to be extended to groups such as children, as well as to populations with language difficulties (such as individuals with autism), for whom the veracity of self-reported information may be in doubt.