Why interleaving enhances inductive learning: The roles of discrimination and retrieval

Birnbaum, Monica S.; Kornell, Nate; Bjork, Elizabeth Ligon; Bjork, Robert A.

doi:10.3758/s13421-012-0272-7

Published: 09 November 2012

Memory & Cognition, Volume 41, pages 392–402 (2013)

Abstract

Kornell and Bjork (Psychological Science 19:585–592, 2008) found that interleaving exemplars of different categories enhanced inductive learning of the concepts based on those exemplars. They hypothesized that the benefit of mixing exemplars from different categories is that doing so highlights differences between the categories. Kang and Pashler (Applied Cognitive Psychology 26:97–103, 2012) obtained results consistent with this discriminative-contrast hypothesis: Interleaving enhanced inductive learning, but temporal spacing, which does not highlight category differences, did not. We further tested the discriminative-contrast hypothesis by examining the effects of interleaving and spacing, as well as their combined effects. In three experiments, using photographs of butterflies and birds as the stimuli, temporal spacing was harmful when it interrupted the juxtaposition of interleaved categories, even when total spacing was held constant, supporting the discriminative-contrast hypothesis. Temporal spacing also had value, however, when it did not interrupt discrimination processing.

People accumulate a great deal of knowledge via inductive learning. Children, for example, learn concepts such as boat or fruit by being exposed to exemplars of those categories and inducing the commonalities that define the concepts. Later in life, we might learn to distinguish between different species of butterflies or birds, as in the present research. Such inductive learning is critical in making sense of events, objects, and actions—and, more generally, in structuring and understanding our world. In the present research, we examined how exemplars of to-be-learned categories should be sequenced and spaced in order to optimize inductive learning.

Kornell and Bjork (2008) investigated the effect of study schedules on inductive learning—specifically, learning artists’ painting styles from exemplars of their paintings. Images of six paintings by each of 12 artists were presented for study, with the artist’s name displayed below each painting. The paintings by half of the artists were blocked (i.e., all six paintings by a given artist were shown consecutively), whereas the paintings by the other six artists were interleaved (i.e., mixed together). After the learning phase, participants were shown new paintings by each of the 12 artists and were asked to identify which artist had painted each new painting. Kornell and Bjork found that interleaving artists’ paintings led to better performance on this inductive task than did blocking—even though participants consistently believed that blocking, rather than interleaving, had been more helpful for learning the artists’ styles.

Kornell and Bjork’s (2008) findings were replicated by Kornell, Castel, Eich, and Bjork (2010) with older adults as participants. Furthermore, Zulkiply, McLean, Burt, and Bath (2012) found similar results when their participants read case studies exemplifying different psychological disorders in an inductive-learning experiment, and Vlach, Sandhofer, and Kornell (2008) found similar results when three-year-old children learned the names of novel objects on the basis of induction. Additionally, Kornell and Bjork’s results have been replicated by Kang and Pashler (2012), Zulkiply and Burt (in press), and Wahlheim, Dunlosky, and Jacoby (2011). All of these findings seem to fit within an extensive literature on the spacing effect—that is, the finding that items studied once and restudied after a delay are recalled better in the long term than are items studied and restudied in quick succession (for reviews, see Cepeda, Pashler, Vul, Wixted, & Rohrer, 2006; Dempster, 1988; Glenberg, 1979). In all of the recent studies demonstrating the benefits of spacing for inductive learning (with the exception of Vlach et al., 2008, who did not have an interleaved condition in addition to their spaced condition), however, interleaving—mixing exemplars from different categories together—was what enhanced learning, rather than temporal spacing per se—a point to which we will return shortly.

A major goal of the present research was to discover why interleaving appears to enhance inductive learning. Intuitively, it might seem that studying a single category in a block would be beneficial, as learners would notice similarities within the category. Consistent with this idea, early studies showed that mixing exemplars from different categories resulted in poorer learning than did grouping exemplars of the same category (Kurtz & Hovland, 1956; Whitman & Garner, 1963). More recently, Goldstone (1996) found better performance when categories alternated on 25% of the trials than when they alternated on 75% of the trials; that is, less frequent alternation appeared to produce more learning. (We will discuss these findings and why they may differ from the more recent results in the General Discussion.)

Discriminative-contrast hypothesis

If blocked studying facilitates noticing similarities among intracategory exemplars, interleaved studying might facilitate noticing the differences that separate one category from another. In other words, perhaps interleaving is beneficial because it juxtaposes different categories, which then highlights differences across the categories and supports discrimination learning—an idea that we refer to as the discriminative-contrast hypothesis. As Goldstone (1996) pointed out, “frequent alternation of categories has the advantage of highlighting features that serve to distinguish categories. Conversely, infrequent alternation of categories has the advantage of highlighting information that remains constant across the members within a category” (p. 615). If the ultimate goal of category learning is to be able to classify new examples into the appropriate categories, then knowing what distinguishes categories is crucial.

In a recent study, Kang and Pashler (2012) investigated the degree to which noticing differences is the driver behind the benefit of interleaving. Using three of the 12 artists employed by Kornell and Bjork (2008), they replicated the blocked and interleaved conditions. In their first experiment, they also included a condition in which the blocked items were temporally spaced (using irrelevant cartoon drawings).^{Footnote 1} They found a benefit of interleaving over blocking (replicating Kornell & Bjork, 2008), but no benefit of temporal spacing over blocking. On the basis of this finding, they concluded that the value of interleaving lies in juxtaposing different categories, a process that they referred to as discriminative contrast. Consistent with this conclusion, they also found that simultaneous presentations of multiple paintings by the same artist did not benefit learning (Exp. 1), whereas simultaneous presentations of multiple paintings by different artists were beneficial (Exp. 2)—again, presumably, because they promoted discriminative contrast. Wahlheim et al. (2011) also found benefits of simultaneous presentation of different categories for inductive learning.

Kang and Pashler’s (2012) findings are consistent with the view that the value of interleaving lies in promoting discrimination between categories. We believe, however, that their findings do not seal the case. Their argument rests on two findings. The first has to do with the effects of simultaneous presentation. These results clearly supported the value of discriminative contrast, but presenting one item at a time is different from presenting all items at once—doing the latter places greater demands on working memory, and the discrimination processes that appear to be at work during a simultaneous presentation may not happen, or may not happen as effectively, during single presentations. Their second finding was a lack of benefit of temporal spacing, but that finding also did not pin down the cause of the interleaving benefit. That is, although a spacing effect did not occur, had it occurred, such a spacing effect would not have ruled out a role for discriminative contrast when examples are interleaved.

The present experiments

The crucial question, in our view, has to do with comparing different versions of the interleaved condition, because the interleaved condition is where the discriminative-contrast hypothesis makes strong predictions. Specifically, we tested the hypothesis that—in an interleaved condition—preventing items from being juxtaposed with one another would hurt inductive learning. We tested this prediction, in Experiment 1, by presenting interleaved exemplars with, or without, unrelated trivia questions inserted between successive exemplars. The basic logic was that if inserting trivia questions did not disrupt inductive learning, it would then be difficult to conclude that discrimination processes are responsible for the benefit of interleaving.

Our study was also motivated by practical questions. Creating a condition that was both spaced and interleaved allowed us to investigate an optimal combination of both manipulations. Zulkiply and Burt (in press) tested an interleaved schedule in which each exemplar was temporally spaced out with 30 s of unrelated filler task. They found that although interleaving led to a benefit in inductive learning, as compared to blocking, temporal spacing did not affect the benefit of interleaving. In our Experiment 2, we tested a similar condition, but used species of butterflies as the to-be-learned categories and employed 10-s intervening filler tasks. In Experiments 2 and 3, we examined the possible interaction of spacing and interleaving.

Experiment 1

In Experiment 1, we presented four photos each of eight species of birds in an interleaved order during the learning phase, using the following three study conditions. In the contiguous condition, no trivia questions were interpolated between the interleaved presentations of exemplars. In the alternating-trivia condition, the presentation order of the bird exemplars was the same as in the contiguous condition, but successive exemplars were separated by a trivia question. We filled each delay with an unrelated task, rather than leaving it empty, to prevent rehearsal of the previously viewed image. In the grouped-trivia condition, single exemplars from each of the eight species of birds were presented in randomly ordered groups, with no trivia questions presented within a group, but successive groups of exemplars were separated by a series of eight trivia questions.

Importantly, between successive exemplars of a given species, exactly the same events transpired in the grouped-trivia and alternating-trivia conditions; that is, an average of eight trivia questions and seven photos intervened between successive exemplars of a given species in each condition. Thus, the total spacing was held constant in these two conditions, but the alternating-trivia condition interrupted comparison processes (by placing trivia questions between the photos), whereas the grouped-trivia condition did not. The discriminative-contrast hypothesis predicts that the alternating-trivia condition, which disrupted discrimination processes, should impair performance as compared to the grouped-trivia condition.

Method

Participants

A total of 102 participants (52 female, 50 male) were recruited via Amazon’s Mechanical Turk, a website that allows people to sign up to complete small tasks for pay. Participants were paid 80 cents for taking part in the study, which took an average of about 12 min, and they averaged 31.2 years of age (range: 18–60). Of the participants, 59 were from the United States, 23 were from India, and the remaining 20 came from 17 other countries.

The numbers of participants in the contiguous, grouped-trivia, and alternating-trivia conditions were, respectively, 35, 25, and 42, with the differing numbers of participants arising through random assignment.

Materials

The materials consisted of five different photographs of each of eight different species of birds (see Wahlheim et al., 2011). As is illustrated in Fig. 1, each bird was shown against a brown background, with its species name shown below the photo during learning. The names of the species were accurate but simplified; for example, Cave Swallow was changed to Swallow (see Fig. 1). The materials also included trivia questions (e.g., “What is the name of the spear-like object that is thrown during a track meet?”), which came from Nelson and Narens (1980).

Procedure

The study was conducted online. After reading the instructions, participants were shown 32 photographs of birds on their computer screens, one at a time, for 4 s each. The eight species were interleaved, such that each group of eight trials included one photo from each species, arranged in a random order.

In the contiguous condition, the photos were presented contiguously. In the other two conditions, 32 trivia questions were also presented for 8 s each, accompanied by the following instruction: “Try to think of the answer. We will ask you to recall it later.”

In the alternating-trivia condition, a trivia question was presented before every photo. The trivia questions were intended to keep participants occupied with a task unrelated to the bird photos. In the grouped-trivia condition, a series of eight trivia questions was presented, followed by a series of eight photos, followed by a different set of eight trivia questions, and so forth. With the trials ordered in this fashion, all three conditions ended when the last photo was presented.
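The three schedules can be illustrated with a short sketch. This is not the software used in the experiment; the function names (`photo_order`, `build_schedule`, `mean_intervening`) and the event representation are hypothetical, chosen only for illustration. The sketch builds each schedule from the description above and counts what intervenes between successive exemplars of the same species, which is the quantity the design holds constant across the two trivia conditions.

```python
import random
from statistics import mean

N_SPECIES, N_BLOCKS = 8, 4  # eight bird species, four studied photos of each


def photo_order(rng):
    """Interleaved order: every block of eight holds one photo per species, shuffled."""
    order = []
    for _ in range(N_BLOCKS):
        block = list(range(N_SPECIES))
        rng.shuffle(block)
        order.extend(block)
    return order


def build_schedule(photos, condition):
    """Turn a photo order into a list of ('photo', species) / ('trivia', None) events."""
    events = []
    for i, species in enumerate(photos):
        if condition == "alternating":
            # One trivia question before every photo.
            events.append(("trivia", None))
        elif condition == "grouped" and i % N_SPECIES == 0:
            # Eight trivia questions before each group of eight photos.
            events.extend([("trivia", None)] * N_SPECIES)
        # "contiguous" adds no trivia at all.
        events.append(("photo", species))
    return events


def mean_intervening(events):
    """Mean (trivia, photos) between successive exemplars of the same species."""
    last_seen = {}
    trivia_gaps, photo_gaps = [], []
    for idx, (kind, species) in enumerate(events):
        if kind != "photo":
            continue
        if species in last_seen:
            between = events[last_seen[species] + 1: idx]
            trivia_gaps.append(sum(k == "trivia" for k, _ in between))
            photo_gaps.append(sum(k == "photo" for k, _ in between))
        last_seen[species] = idx
    return mean(trivia_gaps), mean(photo_gaps)


if __name__ == "__main__":
    photos = photo_order(random.Random(0))  # seed is arbitrary
    for condition in ("contiguous", "alternating", "grouped"):
        print(condition, mean_intervening(build_schedule(photos, condition)))
```

Under these assumptions, the alternating-trivia and grouped-trivia schedules both yield a mean of eight trivia questions and seven photos between successive exemplars of a species, whatever the shuffled order, because every block contains each species exactly once. The two schedules differ only in where the trivia questions fall relative to the photos.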

After the last photo was presented, participants were asked to play the computer game Tetris for 3 min as a distractor task and were then given an inductive-learning test. During the test, a previously unpresented exemplar of each species was presented, along with the names of all eight species, which were shown below the photos. Participants were asked to select the name of the species represented by the presented exemplar and to type it on the computer keyboard. Participants had unlimited time to respond, and no feedback was provided.

Results

Correct performance on the final test, which is illustrated in Fig. 2, was significantly affected by participants’ learning conditions, F(2, 99) = 8.53, p < .001, \( \eta_{\mathrm{p}}^2=.15 \). Furthermore, a Tukey–Kramer post-hoc test showed that accuracy was significantly lower in the alternating-trivia condition than in the other two conditions, for which performance did not differ significantly.
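As a rough consistency check (not part of the original report), partial eta squared for a single-factor between-participants ANOVA can be recovered from the F ratio and its degrees of freedom. Applying that standard identity to the values reported above reproduces the reported effect size:

\[ \eta_{\mathrm{p}}^2 = \frac{F \cdot df_{1}}{F \cdot df_{1} + df_{2}} = \frac{8.53 \times 2}{8.53 \times 2 + 99} = \frac{17.06}{116.06} \approx .15 \]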

Experiment 1 went beyond previous findings by holding spacing constant between the grouped-trivia and alternating-trivia conditions and directly manipulating participants’ ability to engage in discrimination processes in an interleaved presentation condition. The results were consistent with prior studies (Kang & Pashler, 2012; Wahlheim et al., 2011) in supporting the discriminative-contrast hypothesis: Interrupting discrimination processing, which was accomplished in the present experiment by inserting trivia questions between interleaved exemplars, impaired inductive learning, whereas inserting groups of trivia questions between successive groups of exemplars—and, thus, separating exemplars from within a given species temporally—did not impair inductive learning significantly. One could argue that introducing another task increased interference, and thus a true comparison between the contiguous and grouped-trivia conditions could not be made. The grouped-trivia and alternating-trivia conditions, however, had comparable levels of interference, and a comparison between these two conditions reveals the importance of discriminative contrast in inductive learning. Additionally, interference and task were held constant across all three conditions in Experiment 3.

Experiment 2

In Experiment 2, we examined how interleaving and spacing might interact in producing their effects on inductive learning. The results of Experiment 1 suggested that spacing, as implemented in the grouped-trivia condition, might have impaired performance slightly relative to the contiguous condition, but Experiment 1 did not include a noninterleaved condition of the type that would permit an assessment of the value of interleaving, relative to blocking, as a function of temporal spacing. Experiment 2 was designed to permit such an assessment.

The participants in Experiment 2 studied four exemplars of each of 16 species of butterflies in the context of a 2 × 2 mixed design. The species were randomly assigned to either the interleaved or blocked condition. In addition, for half of the participants, the butterfly exemplars were presented contiguously, replicating previous research (e.g., Kornell & Bjork, 2008); for the other half of the participants, trivia questions were inserted during the 10-s intervals between the presentations of successive butterfly exemplars, which created a spaced condition.

We predicted an interaction between spacing and interleaving. When exemplars are presented contiguously, the discriminative-contrast hypothesis, as well as previous research, predicts that interleaving should be beneficial as compared to blocking. When, however, exemplars are spaced apart with trivia questions, the expected impairment of discrimination processing as a consequence of blocking might not be observed, because such processing has already been neutralized by the presence of the interpolated trivia questions. To the extent that trivia questions prevent discrimination processing across the board, the discriminative-contrast hypothesis does not predict a benefit of interleaving over blocking. (Such a finding would be consistent with Kang & Pashler’s, 2012, finding that spacing did not enhance blocked learning, but Kang & Pashler did not compare interleaved vs. blocked study when both conditions were spaced.)

We were also interested in participants’ metacognitive judgments regarding which study schedule, interleaved or blocked, was more helpful to their learning. Kornell and Bjork’s (2008) participants rated blocking as being more effective than interleaving, despite having just completed a test on which they had performed better for categories learned under interleaved rather than under blocked conditions. We expected that the present participants would make the same judgment error for both the contiguous and spaced conditions, because blocking appears to create a greater sense of fluency of induction than does interleaving.