Whenever one seeks to “find out something,” one is immediately faced with deciding upon the order in which to make one’s inquiries. It is commonplace to remark that some orders of inquiry are better than others. (Bruner, Goodnow, & Austin, 1956, p. 81)

In this quote from The Study of Thinking, Bruner et al. (1956) highlight the importance of the decisions that people make as they regulate their learning of concepts. Although self-regulated learning is ubiquitous in many contexts, relatively little is known about how people regulate their learning of concepts, because most research on concept formation has sought to discover how concepts are represented. To explore how people represent concepts, researchers have typically tested formal models by using artificial stimuli, and the presentation order of those stimuli during learning has been under the control of the experimenter (for overviews, see Goldstone, 1994; Medin & Schaffer, 1978). Experimenter control of presentation order was also employed in some early research on concept formation, which focused on how people generate and test hypotheses (or rules) about the concepts that they are learning (e.g., Halford, Cross, & Maybery, 1984). This research has been vital for discovering how people represent categories, yet it offers little insight into how people regulate their concept formation. Accordingly, we introduce procedures to explore how people regulate their learning of natural categories, which has relevance to research on self-regulated learning, concept formation, and education. Thus, understanding this aspect of self-regulation will have implications for multiple domains.

Our specific focus will be on whether people choose to block their learning of exemplars within a category or to interleave exemplars across categories. Consider how interleaving and blocking would occur when people are learning bird families, which were the categories used in the present experiments. When studying bird families (e.g., Sparrows, Finches, or Thrashers), blocking would involve studying several different sparrows (e.g., Chipping, House, and Song), followed by different finches (e.g., Purple, Gold, and House) in a separate block, and so forth. Interleaving would involve studying one exemplar from a family, followed by an exemplar from a different family, and so forth (e.g., House Sparrow, American Goldfinch, Brown Thrasher). For learning such natural categories, classification performance after experimenter-controlled study is better when the exemplars are interleaved (e.g., Kang & Pashler, 2011; Kornell & Bjork, 2008; Wahlheim, Dunlosky, & Jacoby, 2011), so our main question was, Would people interleave their study of exemplars from different categories, or instead block their practice?

To answer this question, participants were instructed to study birds so that they could classify novel (unstudied) birds into the same families (for examples, see Fig. 1). Most importantly, participants selected which family they wanted to study, and they chose bird families in any order until they were ready for the test. Each experiment used a variation of this method, which allowed us both to observe whether people blocked or interleaved their study and to evaluate two hypotheses competitively. These hypotheses were based on two main ways in which people form concepts: by finding differences between categories, or by finding similarities within them (Goldstone, 1996). According to the search-for-differences hypothesis, people develop a concept by comparing exemplars in one category with those from other categories, so that they can better discriminate between the categories. Such discriminative contrast between exemplars from different categories is presumably why interleaving helps people learn natural categories (Kang & Pashler, 2011). If participants understood the importance of discriminative contrast, they were expected to interleave their study. By comparison, the search-for-similarities hypothesis states that people develop a concept via identifying how exemplars within that category are similar. In this case, people are searching for the characteristics of birds that best define their inclusion within a particular family. If so, then they were expected to largely block their study of exemplars. Although we will discuss alternative hypotheses in the General Discussion, we focused on these two hypotheses because they emerge most directly from the research on categorization (Goldstone, 1996) and have motivated the present experiments.

Fig. 1

Sample exemplars from each of the 12 bird families used in Experiment 1. Color images are available online and from the first author

Evidence from previous studies has suggested that either hypothesis could be supported. Studies on self-regulated associative learning have demonstrated that people typically space their practice while learning word pairs, although they do not exclusively do so (Toppino, Cohen, Davis, & Moors, 2009). If the same mechanisms underlie people’s regulation of associative learning and concept formation, we would expect participants to prefer interleaving. By contrast, many college students report that blocking is better (Kornell & Bjork, 2008), in which case participants may prefer to block their study. In Kornell and Bjork’s study, however, students’ beliefs about these strategies were measured after they had performed the task, and beliefs can be substantially different when assessed prior to (or during) task completion versus after the task is complete (e.g., Hertzog et al., 2009). Thus, whether people’s study choices would reflect a preference for blocking or interleaving was an unresolved issue.

Experiment 1

During self-regulated learning, participants were presented with a selection format that contained 12 bird families and placeholders for six exemplars from each family (Fig. 2). Two different selection formats (12 × 6 and 6 × 12) were used, to ensure that restudy selection was not simply attributable to habitual responding (Dunlosky & Ariel, 2011): Namely, study choices can be biased by a left-to-right reading order, which would result in blocking in the 12 × 6 format but would result in interleaving in the 6 × 12 format.
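The logic behind using two formats can be illustrated with a minimal sketch. It assumes a hypothetical participant who clicks every button once in strict left-to-right, top-to-bottom reading order; the function names and the use of integer family indices are illustrative assumptions, not part of the experimental software.

```python
def reading_order_sequence(n_families=12, n_exemplars=6, families_in_rows=True):
    """Return the family index of each button visited in reading order.

    families_in_rows=True  -> 12 x 6 format (each row holds one family).
    families_in_rows=False -> 6 x 12 format (each column holds one family).
    """
    if families_in_rows:
        # Reading across a row visits six exemplars of the same family.
        return [fam for fam in range(n_families) for _ in range(n_exemplars)]
    # Reading across a row visits one exemplar from each of the 12 families.
    return [fam for _ in range(n_exemplars) for fam in range(n_families)]


def same_family_transitions(sequence):
    """Count consecutive selections that stay within the same family."""
    return sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev == cur)


rows_format = reading_order_sequence(families_in_rows=True)
cols_format = reading_order_sequence(families_in_rows=False)
print(same_family_transitions(rows_format), "of", len(rows_format) - 1)
print(same_family_transitions(cols_format), "of", len(cols_format) - 1)
```

Under this assumed habitual strategy, the same clicking pattern yields mostly within-family transitions in one format and none in the other, which is why a preference that holds across both formats cannot be attributed to reading order alone.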

Fig. 2

Each panel provides the selection format used for self-regulated learning in each of the reported experiments. (Top) Sample portion of the 12 × 6 selection format used in Experiment 1. Ellipses indicate where the eight additional bird families would be presented. In this example, the participant studied four flycatchers and must decide which exemplar and family to study next, or whether to terminate restudy. (Middle) Experiment 2 selection format. In this example, the participant has just studied a Warbler and must decide whether to study another Warbler, to study an exemplar from a different family, or to terminate restudy. (Bottom) Experiment 3 and 4 selection format. In this example, the participant has just studied a Jay and must decide which of the eight bird families to study next, or whether to terminate restudy

Method

Participants and design

Ninety-seven students from Kent State University (KSU) participated for course credit. Three students were excluded due to making few restudy choices (i.e., selecting 0–5 exemplars for restudy). Selection format was manipulated between participants (n = 48 in the 6 × 12 format and n = 46 in the 12 × 6 format).

Materials and procedure

The materials included 12 bird families (from Wahlheim, Teune, & Jacoby, 2011). For each family, 12 color images of perching birds of different species from that same bird family were used (see Fig. 1). Two lists containing six exemplar bird picture–family name pairs (henceforth, exemplars) from each of the 12 families were counterbalanced across participants. One list was used during familiarization, self-regulated learning, and classification of studied exemplars (72 studied exemplars), and the other list was used only during the classification test of novel exemplars (72 new exemplars).

Prior to learning, participants were told that they were in a lottery ($25) and that the better they performed, the more chances they would have of winning. During the familiarity phase, participants studied pictures of exemplars that were presented for 6 s each. Exemplars were interleaved during the familiarity phase (for details, see Wahlheim, Dunlosky, et al., 2011). During the self-regulation phase, participants had 30 min to select exemplars in any order. They were told that they could study any of the exemplars as many times as they wanted. To make their selections, participants were presented with 72 buttons (see Fig. 2, top panel), one for each exemplar that had been presented during the familiarity phase. The buttons were grouped by bird family, with families presented either in rows (12 × 6 format) or in columns (6 × 12). Participants selected an exemplar to restudy by clicking a button within a bird family. The exemplar was then presented for self-paced study on a different screen. When the participants had finished restudying, they clicked a button that returned them to the selection interface. All restudy buttons were initially superscripted with a zero, and each time that an exemplar was selected, the superscript increased by one. This phase continued until the participants were ready for the tests (no participants exceeded 30 min).

Classification was evaluated by presenting the new (unstudied) exemplars from the same families, which were presented individually along with the 12 family names. Participants were given unlimited time to select the bird family to which each exemplar belonged. The studied exemplars were then presented for classification. Note that in all experiments, participants also made some metacognitive judgments. In prior research (e.g., Benjamin & Bird, 2006), such judgments did not have reactive effects on students’ study decisions; moreover, the judgments were not relevant to the focal hypotheses, so we do not discuss them further.

Results and discussion

The participants selected to restudy about 90 exemplars, regardless of the selection format (6 × 12, M = 88.0, SE = 6.2; 12 × 6, M = 92.1, SE = 6.5), t < 1. Study times did not differ significantly between the groups (6 × 12, M = 2.0 s, SE = 0.14; 12 × 6, M = 1.7 s, SE = 0.10), t(92) = 2.02, p = .05, d = 0.42.

More importantly, we estimated how often participants blocked their study of exemplars within families, and how many participants preferred blocking over interleaving. To do so, we calculated blocking as the number of exemplars selected for restudy in which two or more exemplars from the same family were studied in succession. We calculated interleaving as the number of restudy selections in which an exemplar from one family was studied (e.g., Grosbeak), followed by an exemplar from a different family (e.g., Finch), and so on. A switch between blocks was not counted as interleaving (e.g., six grosbeaks followed by four finches were considered one block of six and one block of four, rather than one block of five, two interleaved, and one block of three). As is evident from Table 1, participants largely opted to block their study of exemplars, and blocking did not differ by group, t < 1. We also categorized participants as either blockers or interleavers on the basis of whether a greater proportion of restudy selections were blocked or interleaved. Almost all of the participants were blockers (Table 1) in both groups, χ²(1, N = 94) = .002, p = .97.
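One way to read this scoring rule is sketched below. The sketch assumes that a selection counts as blocked when it belongs to a run of two or more consecutive same-family selections and as interleaved when it forms a run of length one; the function names and the handling of ties are illustrative assumptions rather than the scoring script used in the experiments.

```python
from itertools import groupby


def score_selections(families):
    """Return (blocked, interleaved) counts for a sequence of family names."""
    blocked = 0
    interleaved = 0
    for _, group in groupby(families):
        run_length = len(list(group))
        if run_length >= 2:
            # Every exemplar in a same-family run counts as blocked, so a
            # switch between two blocks is not scored as interleaving.
            blocked += run_length
        else:
            interleaved += 1
    return blocked, interleaved


def label_participant(families):
    """Label a participant by the larger share of selections (tie is assumed)."""
    blocked, interleaved = score_selections(families)
    if blocked > interleaved:
        return "blocker"
    if interleaved > blocked:
        return "interleaver"
    return "tie"


# The example from the text: six grosbeaks followed by four finches.
print(score_selections(["Grosbeak"] * 6 + ["Finch"] * 4))
# A fully interleaved sequence.
print(score_selections(["Grosbeak", "Finch", "Thrasher"]))
```

With this reading, the text's example is scored as ten blocked selections and no interleaved ones, matching the description of one block of six and one block of four.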

Table 1 Self-regulated learning strategies

The values in Table 1 do not provide information about how many exemplars from one family were blocked within a run. One possibility is that participants chose to study only a few exemplars from a given family (e.g., two Grosbeaks) and then chose a few from another family (e.g., two Finches). This kind of strategy was not used, because the average length of the runs typically exhausted the number of exemplars (six) within a given family: The participants averaged 6.0 exemplars (SE = 0.17) per run (6 × 12, M = 6.0, SE = 0.19; 12 × 6, M = 6.1, SE = 0.26), t < 1.
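The run-length measure can be sketched in the same spirit. This minimal example assumes that only runs of two or more same-family selections enter the average; that threshold, the function name, and the value returned when there are no runs are assumptions made for illustration.

```python
from itertools import groupby


def mean_run_length(families, min_run=2):
    """Average number of exemplars per same-family run of at least min_run."""
    lengths = [len(list(group)) for _, group in groupby(families)]
    runs = [n for n in lengths if n >= min_run]
    # Return 0.0 when the participant produced no qualifying runs (assumed).
    return sum(runs) / len(runs) if runs else 0.0


# Two blocked runs (6 and 4) plus one isolated selection.
sequence = ["Grosbeak"] * 6 + ["Finch"] * 4 + ["Thrasher"]
print(mean_run_length(sequence))
```

A mean near six, as reported here, indicates that a typical run covered all of the exemplars available for a family rather than only a few.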

Correct classification did not vary by format group, F < 1, and was significantly better for studied (M = .44, SE = .01) than for novel (M = .34, SE = .01) exemplars, F(1, 91) = 115.28, p < .001, ηp² = .56. The interaction was not significant, F(1, 91) = 3.53, p = .06, ηp² = .04.

Experiment 2

One reason that most participants may have blocked restudy in Experiment 1 concerned the selection format: Because the birds for each family were presented within a row or column, participants may have simply progressed across rows or down columns (e.g., Dunlosky & Ariel, 2011). Note that interleaving would have been equally easy with this interface, but given a different format, participants might interleave their study more often.

The format used in Experiment 2 did not allow this bias (i.e., mindlessly choosing birds within a column or row) and was also designed to reflect the kinds of materials that people use to learn to classify birds—namely, bird field guides. Such guides present exemplars by family, so to study birds from different families the learner must turn to another chapter to locate the next family. If one elects to study birds from the same family, the learner only turns to the next page in a field guide. Thus, as compared to blocking, a minor cost is incurred by interleaving. To reflect this cost, after studying a given exemplar, participants were asked whether they next wanted to study an exemplar from that same family or from a different family (Fig. 2, middle panel). If participants chose the same family, only one decision had to be made, and another exemplar from that family was presented. If they chose a different family, this decision was followed by a decision about which family to study next, and then an exemplar from that family was presented.

Method

Thirty students from KSU participated for course credit. One participant selected only one exemplar for study and was excluded from the analyses.

The materials for Experiment 2 were the same as those in Experiment 1, except that only eight families were used, to make the task easier. The procedure was the same as in Experiment 1, with two exceptions. There was no lottery, and during the self-regulation phase, participants were first presented with a randomly selected exemplar from the familiarity phase. After studying this exemplar, participants were prompted with the text, “You just studied a _____. What would you like to study next?” (Fig. 2, middle panel). If participants selected to restudy an exemplar from the same family, another exemplar from that family was presented. If participants selected to restudy an exemplar from a different family, they were shown a list of the seven other family names and were asked to choose one. An exemplar from the chosen family was then randomly selected and presented (exemplars were selected within a family without replacement; if all six exemplars from a family had been studied, they were all replaced).
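The exemplar-selection rule in the parenthetical can be sketched as follows. This is a minimal illustration that assumes each family keeps its own pool, that the pool is refilled only once it is empty, and that the initial randomly selected exemplar is not tracked; the class name, method name, and exemplar labels are hypothetical.

```python
import random


class FamilySampler:
    """Draw exemplars within a family without replacement, refilling when empty."""

    def __init__(self, exemplars_by_family, rng=None):
        self._all = {fam: list(items) for fam, items in exemplars_by_family.items()}
        # Pools start empty and are filled on the first request for a family.
        self._remaining = {fam: [] for fam in self._all}
        self._rng = rng if rng is not None else random.Random()

    def next_exemplar(self, family):
        pool = self._remaining[family]
        if not pool:
            # All exemplars have been studied, so all of them are replaced.
            pool.extend(self._all[family])
        return pool.pop(self._rng.randrange(len(pool)))


# Hypothetical exemplar labels for one family.
warblers = [f"Warbler_{i}" for i in range(1, 7)]
sampler = FamilySampler({"Warbler": warblers}, rng=random.Random(0))
first_six = [sampler.next_exemplar("Warbler") for _ in range(6)]
print(sorted(first_six) == sorted(warblers))
```

Because the pool is refilled after six draws, a participant who keeps choosing the same family can produce runs longer than six, which is possible under the procedure described here.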

Results and discussion

On average, the participants selected 58.1 (SE = 6.6) exemplars during self-regulated learning, and spent on average 2.1 s (SE = 0.09) studying each exemplar. As is evident from Table 1, participants opted to block their study. They averaged 7.6 (SE = 0.5) exemplars per run (if all six exemplars from a family had been studied, they were replaced and could be studied again if needed). Correct classification tended to be better for studied exemplars (M = .45, SE = .03) than for novel exemplars (M = .41, SE = .02), t(28) = 2.08, p = .05, d = 0.32.

Experiment 3

Using a different selection format, Experiment 2 again revealed that participants typically chose to block their study of exemplars within families. Given our desire to make the selection format more representative of real-life learning of this natural category, more effort was required to switch families than to continue studying exemplars within a family. Although the extra effort needed to switch families was relatively minor (an extra keystroke), this selection format may have also biased participants to block. Thus, in Experiment 3, we used a selection format (Fig. 2, bottom panel) that minimized environmental biases to block or to interleave.

Method

Sixty students from KSU participated for course credit. Fourteen participants were excluded due to making too few restudy choices (i.e., they selected 0–6 exemplars). The materials were identical to those of Experiment 2, and the procedure differed in two ways: Participants were included in the lottery, and during self-regulated learning, they were first presented with a randomly selected exemplar, followed by the prompt, “You just studied a _____. Click on the bird family that you’d like to study next.” All eight families were presented with the prompt.

Results and discussion

On average, the participants selected 56.8 (SE = 5.8) exemplars for restudy and spent 2.4 s (SE = 0.04) studying each exemplar. Participants predominately blocked their study of exemplars (Table 1). However, some participants did interleave over half of the exemplars that they restudied. More specifically, 36 participants were designated as blockers, and ten were designated as interleavers. Fewer participants blocked study in this experiment as compared to the previous ones, and this difference was likely due to the differences in interfaces used across experiments. The participants averaged 6.1 (SE = 0.4) exemplars per run.

Classification was significantly better for studied exemplars (M = .52, SE = .03) than for novel exemplars (M = .44, SE = .02), t(45) = 4.42, p < .001, d = 0.47. Given that the number of exemplars selected for restudy was greater for blockers (M = 66.5, SE = 6.3) than for interleavers (M = 21.6, SE = 8.1), t(44) = 3.62, p = .001, d = 1.30, and that relatively few participants interleaved, any differences in performance as a function of blocking versus interleaving would not be interpretable; hence, we do not present these results.

Experiment 4

The results from Experiment 3 further supported the conclusion that most students prefer to block their study; however, their preference to block in Experiments 1, 2, and 3 might have been attributable to using an experimenter-paced familiarization trial. To investigate this possibility, the method from Experiment 3 was used, but without a familiarization trial.

Method

Twenty-three students from KSU participated for course credit. Experiment 4 was identical to Experiment 3, except that no familiarization trial was used.

Results and discussion

On average, the participants selected 106.7 (SE = 14.1) exemplars for study and spent 2.1 s (SE = 0.09) studying each exemplar. The participants predominately blocked their study (Table 1), which suggests that their preference to block was not driven by the familiarization trial used in the other experiments. Participants averaged 3.7 (SE = 0.4) exemplars per run, and classification was significantly better for studied exemplars (M = .47, SE = .05) than for novel exemplars (M = .40, SE = .04), t(22) = 3.64, p = .001, d = 0.53.

General discussion

This research is the first to investigate people’s choices about “how to order their inquiries” (as per Bruner et al., 1956) while learning a natural category. Early research exploring such choices had used (a) artificial exemplars that varied on multiple dimensions (e.g., on shape—circle, square, cross—or on color—shaded, black, unshaded) and (b) a procedure in which only the experimenter knew the rules for belonging to a well-defined and artificial concept (e.g., all shaded circles fit the concept, and all other exemplars do not). The participants’ goal was to discover those rules. Bruner et al., among many others, used variants on this procedure to explore the hypotheses that participants would evaluate while trying to discover the sought-after concept (for details, see Bruner et al., 1956, chap. 4).

The procedure used in this prior research differed substantially from the present procedure. For the present study, exemplars from bird families could be grouped by characteristic—but not defining—features, and participants knew which concepts they were trying to learn. Nevertheless, we suspect that, like the participants in the earlier studies, many of the present participants entertained hypotheses about the different families and made choices to evaluate their hypotheses. For instance, when learning “Grosbeak,” some participants may have evaluated the hypothesis that Grosbeaks tend to be smallish birds with large triangular beaks. Perhaps most importantly, the results across all four experiments were consistent with the conclusion that most participants make study choices to evaluate hypotheses about which features are shared among birds within a family; that is, our results confirmed predictions from the search-for-similarities hypothesis.

Further evidence relevant to this hypothesis was collected in Experiment 1. Namely, after completing the task, participants answered an open-ended question about how they had approached the task. Two raters independently scored their responses; the rater agreement was 86 %, and the few disagreements were resolved via discussion. Fifty-six percent of the participants explicitly stated searching for similarities; a representative quote was “I tried to find similarities in each bird family” (corrected for spelling errors). Additionally, 36 % indicated that they were focusing on bird features, and of course, some of these participants may have been searching for similarities but failed to report doing so, given the open-ended response format. Along with the data on study selections, this evidence further supports the conclusion that the majority of the participants were trying to discover similarities among birds within families.

Of course, other factors may also contribute to people’s preference to block. For instance, blocking may lead to greater processing fluency, which in turn gives students the perception that learning is easier (e.g., Kornell & Bjork, 2008). People also may prefer to block because they have learned this strategy over their lifetime, perhaps from how information is taught in school. These possibilities are not mutually exclusive, so an important direction for future research will be to evaluate the degrees to which they jointly contribute to people’s study choices when forming natural concepts. Variations on the methods introduced here could benefit such efforts, such as by collecting verbal reports as participants make study selections.

Although people’s preference to block exemplars may prevail across many natural categories, the structure of the categories themselves may also influence study choices. For instance, the exemplars from different bird families sometimes have many similarities (i.e., intercategory similarity), as illustrated in Fig. 3. Nevertheless, the exemplars also show intracategory dissimilarity; that is, some birds within the same family have few similarities (Fig. 3, bottom panel), so blocked practice to search for similarities could prove useful (cf. Carvalho & Goldstone, 2011). In fact, such intracategory dissimilarity may have led some participants to block their study, even though interleaved practice for these bird families is normatively more effective (Wahlheim, Dunlosky, et al. 2011). If so, then changing the degree of similarity of the exemplars within (and across) categories should influence people’s decisions about whether to block or interleave their study. One prediction is that when the exemplars within a category are more similar, people will be more likely to interleave exemplars across categories.

Fig. 3

(Top) Comparison of the three exemplars in this row exhibits intercategory similarity, because these exemplars look very similar to each other, even though they are each members of different bird families (i.e., Vireo, Thrush, and Sparrow). (Bottom) Comparison of the three exemplars here exhibits intracategory dissimilarity, because these exemplars look relatively different from each other, even though they are members of the same bird family (i.e., Thrashers). Color images are available online and from the first author

Finally, our main aim was to attract attention to an underinvestigated aspect of concept formation—how people regulate their study choices. Beyond demonstrating that people have a preference to block, we also have described some promising areas for future research, and no doubt manipulating other factors (e.g., retention interval, the amount of time available for study, and the costs associated with blocking vs. interleaving) will provide further insight into the subtleties of self-regulated concept formation.