Introduction

Computing the probability of an outcome in the light of new evidence (e.g., computing the probability of having cancer given a positive screening test) is a very challenging task. Indeed, very few individuals successfully revise their degree of certainty in accordance with the normative Bayes's rule (e.g., Eddy, 1982). However, Bayesian performance can be dramatically improved by adapting the presentation format of the statistical information (Barbey & Sloman, 2007a; Cosmides & Tooby, 1996; Gigerenzer & Hoffrage, 1995; Siegrist & Keller, 2011). Specifically, frequencies with natural sampling, also called natural frequencies (e.g., 10 out of 1,000 and 8 out of the 10) facilitate Bayesian reasoning, as compared with frequency or probability formats involving normalization (e.g., 1 % and 80 %; Gigerenzer & Hoffrage, 1995).
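The difference between the two formats can be made concrete with a minimal sketch. The base rate (10 out of 1,000) and hit rate (8 out of the 10) come from the example above; the false-positive count (95 out of the remaining 990) is an illustrative assumption, since the example does not state one. The function names are hypothetical.

```python
from fractions import Fraction


def posterior_from_natural_frequencies(true_positives, false_positives):
    """Natural sampling: P(H | D) is the share of true positives among all positives."""
    return Fraction(true_positives, true_positives + false_positives)


def posterior_from_probabilities(base_rate, hit_rate, false_alarm_rate):
    """Normalized format: the full Bayes's rule, which needs the base rate explicitly."""
    numerator = base_rate * hit_rate
    return numerator / (numerator + (1 - base_rate) * false_alarm_rate)


# From the text: 10 out of 1,000 have the condition, 8 of those 10 test positive.
# ASSUMPTION (not in the text): 95 of the 990 without the condition also test positive.
nf = posterior_from_natural_frequencies(true_positives=8, false_positives=95)
pf = posterior_from_probabilities(
    base_rate=Fraction(10, 1000),
    hit_rate=Fraction(8, 10),
    false_alarm_rate=Fraction(95, 990),
)

print(nf)         # 8/103
print(float(nf))  # roughly 0.078
print(nf == pf)   # True
```

Both routes give the same posterior, but the natural frequency version needs only two counts and one division, whereas the normalized version requires combining three rates.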

A considerable amount of dispute has surrounded the mechanism and processes underpinning this facilitative effect (Barbey & Sloman, 2007a; Brase, 2008; Gigerenzer & Hoffrage, 1999; Girotto & Gonzalez, 2001; Lewis & Keren, 1999; Sloman, Over, Slovak & Stibel, 2003). Two main accounts explain why different presentations of the same statistical information yield such different performances. The ecological rationality account (ERA) posits that the facilitation effect is due to our mind being evolutionarily attuned to natural frequencies, resulting in their computational simplicity (Brase, Cosmides & Tooby, 1998; Gigerenzer & Hoffrage, 1995). The nested sets account (NSA) suggests that the facilitative effect occurs because natural frequencies make the nested sets structure of the statistical information more salient (Barbey & Sloman, 2007a; Girotto & Gonzalez, 2001; Sloman et al., 2003; Tversky & Kahneman, 1983; Yamagishi, 2003).

From a process-oriented perspective, some variants of the two accounts can be placed on a continuum ranging from no cognitive involvement, as featured in some versions of the ERA (Brase, 2007; Cosmides & Tooby, 1996), to a full cognitive involvement, as represented by a dual-process version of the NSA (Barbey & Sloman, 2007a, 2007b). We refer only to these process-oriented variants of the ERA and NSA in the remainder of this article when explaining the facilitative effect of natural frequencies (see also Lesage, Navarrete & De Neys, 2013). According to proponents of the ERA, people’s general cognitive architecture consists of “functionally isolable computational systems,” which “make available specialized inferential procedures that allow certain computations to proceed automatically or ‘intuitively’ and with enhanced efficiency over what a more general reasoning process could achieve given the same input” (Cosmides & Tooby, 2008, p. 66). Automaticity is assumed to be a typical, rather than necessary, property of the underlying mechanism (see the discussions in Barrett & Kurzban, 2006; Coltheart, 1999; Lesage et al., 2013). Hence, the ERA assumes that the processing of natural frequencies is ruled by an automatically operating domain-specific computational mechanism (e.g., Brase & Barbey, 2006). In contrast to the ERA, the NSA posits that the nested sets structure of natural frequencies triggers analytical information processing, which, in turn, results in the facilitative effect. The facilitation “is a product of general-purpose reasoning processes” that employs cognitively demanding rule-based inference (Barbey & Sloman, 2007a, p. 244).

Given that the ERA and the NSA postulate different processes to account for the facilitative effect of natural frequencies in Bayesian reasoning, investigating individual differences with respect to cognitive processing will enable us to test these accounts (Barbey & Sloman, 2007a; De Neys, 2007). According to the ERA, individual differences in cognitive processing should not predict Bayesian performance with natural frequencies, because processing should occur automatically. By contrast, according to the NSA, these individual differences should predict Bayesian performance because analytical processes are involved, which depend on cognitive abilities and thinking dispositions.

Evidence of the effect of individual differences in cognitive processing on Bayesian reasoning with natural frequencies, however, is very limited. The few existing studies suggest that such individual differences could be related to normative performance in Bayesian reasoning with natural frequencies. First, normative reasoning was linked to an ability to suppress a first wrongful intuition. For example, Sirota and Juanchich (2011) found that cognitive reflection ability—that is, the ability to suppress intuitive answers (Frederick, 2005)—predicted Bayesian reasoning with natural frequencies, even when numeracy was statistically controlled. Lesage et al. (2013) additionally found that cognitive reflection had a stronger association with performance in natural frequencies tasks than in tasks featuring probabilities. Second, normative reasoning was linked to increased computational capacity: Lesage and colleagues used a dual-task paradigm (De Neys, 2006) to show that limiting the cognitive capacity of participants with lower cognitive reflection ability impeded their performance with natural frequencies.

Both findings lend some support to the NSA, but given that the evidence is very limited, further and more in-depth research is warranted. In addition, the previous research investigating the role of cognitive reflection ability did not address the question of which component of cognitive reflection ability predicts the increased Bayesian performance with natural frequencies. Recent results show that cognitive reflection consists of two main components: cognitive abilities and thinking dispositions (Frederick, 2005; Liberali, Reyna, Furlan, Stein & Pardo, 2012; Stanovich, 1999; Toplak, West & Stanovich, 2011). Both cognitive abilities and thinking dispositions may be seen as indicators of analytical processing, but they tap into two different types of processes (Baron, 2005; Stanovich, 1999; Stanovich & West, 1998; Toplak et al., 2011). According to this literature, cognitive abilities refer to processes underlying computational efficiency (e.g., perceptual speed, working memory capacity, efficiency of retrieval from long-term memory), which are not malleable. Thinking dispositions refer to the intentional level (or depth) of information processing (e.g., the tendency to engage in deliberative thinking or the disposition to change a belief in the face of new evidence) and are quite malleable. So, for example, a person cannot easily increase his or her working memory capacity (i.e., cognitive abilities) to solve a complex mathematical task but can decide to spend longer on the task to figure out the correct solution if he or she wants to (i.e., thinking dispositions).

Clearly, studying the effect of these two components (representing analytical processing) in Bayesian reasoning with natural frequencies will further differentiate between the ERA, which posits automatic processing, and the NSA, which posits analytical processing of natural frequencies. In addition, it would also enable us to understand in more detail the processes underlying the facilitation, since the two components tap into two different types of analytical processes, either of which may be responsible for facilitation. While cognitive abilities may be needed to build a problem representation from the text and perform mathematical computations, thinking dispositions may be needed to suppress non-Bayesian intuitions, detect conflicting problem models, or revise initial (wrongful) solutions. If both components correlate with performance, however, it would not be clear which part of the variance is really responsible for the facilitation, because cognitive abilities and thinking dispositions share some common variance (Stanovich, 1999). To avoid this problem, we investigated the unique effects of thinking dispositions on Bayesian performance (i.e., we adopted a standard procedure used to statistically control for variance of cognitive abilities; cf. Stanovich, 1999).

Present research

To test predictions with respect to the effect of individual differences derived from the ERA and NSA, we designed two experiments. Experiment 1 tested the respective hypotheses of the ERA and NSA by replicating the link between cognitive reflection ability and Bayesian performance with different object parsing (Lesage et al., 2013; Sirota & Juanchich, 2011). Experiment 1 can therefore be seen as a response to the recent general calls to replicate effects published in the psychological literature (e.g., Pashler & Wagenmakers, 2012). Experiment 2 had two aims: (1) to test the respective hypotheses derived from the ERA and NSA by investigating the effects of cognitive abilities and thinking dispositions on Bayesian performance separately and (2) to investigate the unique effects of thinking dispositions to further inform us about the processes underlying facilitation.

Experiment 1

In this experiment, we investigated the role of cognitive reflection in Bayesian inference over whole or arbitrarily parsed objects. Cognitive reflection is usually measured by the Cognitive Reflection Test (CRT; Frederick, 2005). A positive association of the CRT with Bayesian reasoning has recently been reported (Lesage et al., 2013; Sirota & Juanchich, 2011). Parsing can be linked to Bayesian performance by the individuation hypothesis put forward by the ERA, which posits that our mind is “better designed for operating over whole objects than arbitrary parsing of them” (Brase et al., 1998, p. 9). The present experiment used a conceptually similar manipulation to the one used in Brase et al.'s study and compared Bayesian performance in tasks featuring objects perceptually defined as whole or arbitrarily parsed objects.

The ERA predicts that (1) Bayesian performance with whole objects will be higher than that with arbitrarily parsed objects (Brase et al., 1998) and that (2) the effect of individual differences will differ for whole and arbitrarily parsed objects. Cognitive reflection should predict Bayesian performance only in tasks featuring arbitrarily parsed objects, but not with whole objects, because only whole objects are assumed to be processed automatically (Brase, 2007). The NSA predicts (1) a similar performance with whole and arbitrarily parsed objects and (2) a high predictive power of cognitive reflection in both tasks.

Method

Design and participants

Participants completed a Bayesian inference task with a nested set structure in a 2 (parsing: whole vs. arbitrarily parsed object) × 2 (cognitive reflection ability: low vs. high) between-subjects design. A total of 302 undergraduate social sciences students (232 females; age range = 18–37 years, median = 21; IQR = 2) completed the experiment via computer in a classroom.

Materials and procedure

Participants first provided a response to the three items of the CRT presented in random order (Cronbach’s α = 0.61; median = 1, M = 1.5, SD = 1.1). The median split of CRT resulted in CRT-low (n = 152) and CRT-high (n = 150) groups. Then participants read the “wheat and bread” task presented in Table 1, featuring either whole objects (i.e., “bags of wheat”; n = 155) or arbitrarily parsed objects (i.e., “cubic decimeters of wheat”; n = 147).

Table 1 Bayesian task featuring whole objects (bags) or arbitrarily parsed objects (dm3) as used in Experiment 1

Afterward, participants responded to two manipulation check questions that assessed the extent to which the object in the task was perceived as a whole or arbitrarily parsed object (e.g., they were asked the following: I imagined 20 bags [dm3] of wheat as 1: a set of physically separated parts, to 7: an inseparable unit). Results of the object’s mental representations score (Cronbach’s α = 0.80) showed that the manipulation was effective (whole M = 2.9, SD = 1.8, vs. arbitrarily parsed M = 4.8, SD = 1.9), t(300) = −8.70, p < 0.001, d = −1.03. Finally, participants completed a brief demographic questionnaire.
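As a rough check on the manipulation-check effect size, Cohen's d can be recomputed from the rounded summary statistics reported above. This is a sketch that assumes the pooled-standard-deviation formula and assumes that the condition sizes (n = 155 and n = 147) apply to the manipulation check, which is consistent with the 300 degrees of freedom. Because the means and standard deviations are rounded, a t value recomputed the same way would only approximate the reported one.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d with a pooled standard deviation (an assumed formula)."""
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_variance)


# Whole objects: M = 2.9, SD = 1.8, n = 155 (n taken from the materials section).
# Arbitrarily parsed objects: M = 4.8, SD = 1.9, n = 147.
d = cohens_d(2.9, 1.8, 155, 4.8, 1.9, 147)
print(round(d, 2))  # -1.03
```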

Results and discussion

As is shown in Fig. 1, participants performed similarly in the arbitrarily parsed and whole object conditions (59.9 % vs. 60.0 %), χ²(1) < 0.01, p = 1.000, φ < 0.01. Consistently, the object’s mental representations score (the manipulation check) was not correlated with performance, r_pbis < 0.01, p = 0.830. Furthermore, participants scoring high in the CRT performed better than those scoring low (69.3 % vs. 50.7 %), χ²(1) = 10.97, p = 0.001, φ = −0.19 (Note: the effect was similar for the CRT as a four-categorical variable). The CRT effect was significant in both the whole (68.6 % vs. 51.9 %), χ²(1) = 4.22, p = 0.045, φ = −0.17, and the arbitrarily parsed (70.0 % vs. 49.3 %), χ²(1) = 6.89, p = 0.014, φ = −0.21, object conditions. Despite the positive effect of CRT, participants with little ability to inhibit their intuition still performed quite well (i.e., 43.4 % correct overall among participants who answered no CRT item correctly).
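The overall CRT effect can be approximately recomputed from the reported percentages. The sketch below assumes cell counts inferred from those percentages and the group sizes (104 of 150 correct in the CRT-high group, 77 of 152 in the CRT-low group) and uses a Pearson chi-square without continuity correction; neither detail is stated in the text. A Pearson chi-square statistic is a sum of squared terms and is therefore non-negative by construction; the sketch reports only the magnitude of φ.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2 x 2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# ASSUMED cell counts, inferred from 69.3 % of 150 and 50.7 % of 152.
high_correct, high_wrong = 104, 46
low_correct, low_wrong = 77, 75

total = high_correct + high_wrong + low_correct + low_wrong
chi2 = chi_square_2x2(high_correct, high_wrong, low_correct, low_wrong)
phi_magnitude = math.sqrt(chi2 / total)

print(round(chi2, 2), round(phi_magnitude, 2))  # 10.97 0.19
```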

Fig. 1 Proportions of correct Bayesian answers (in percentages) as a function of the format of the object manipulation (whole vs. arbitrarily parsed object) and participants' cognitive reflection ability level (low vs. high), N = 302. Error bars represent 95 % confidence intervals

These findings are hard to reconcile with the ERA (Brase et al., 1998; Cosmides & Tooby, 1996) and support the NSA (Barbey & Sloman, 2007a) for two reasons. First, we replicated the findings that cognitive reflection predicts Bayesian performance (Lesage et al., 2013; Sirota & Juanchich, 2011), indicating the involvement of a general reasoning mechanism as postulated by the NSA, rather than the involvement of a specialized cognitive mechanism operating automatically, as posited by the ERA. Second, our results do not support the individuation hypothesis stemming from the ERA and positing a Bayesian performance increment with whole objects, as compared with arbitrarily parsed objects (Brase et al., 1998).

Experiment 2

Results of Experiment 1 demonstrate that cognitive reflection predicts Bayesian normative performance. Nevertheless, it remains unclear which component of cognitive reflection is responsible: cognitive abilities, thinking dispositions, or both. Experiment 2 addressed this issue. In addition, Experiment 2 featured various task contexts to ensure that the results were not the product of different cognitive demands connected with different verbal materials (see Siegrist & Keller, 2011, about context effects). The design included two statistical versions of a Bayesian inference task (normalized standard probability vs. natural frequencies) and two different task contexts (medical vs. children’s). According to the NSA, both cognitive abilities and thinking dispositions should predict variance in Bayesian reasoning with natural frequencies, whereas neither should play a role according to the ERA.

Method

Design and participants

Hypotheses were tested in a 2 (format: single-event probabilities vs. natural frequencies) × 2 (context: medical vs. children’s tasks) within-subjects design controlling for order effects. In addition, cognitive abilities and thinking dispositions were assessed by separate measures. A total of 151 psychology undergraduates took part (117 females; age range 18–42 years, M = 22.8, SD = 3.9). Eight people were excluded from the correlational analysis because they failed to complete the thinking dispositions questionnaires.

Materials and procedure

Four noncausal Bayesian tasks were presented, either in a single-event probability or in a natural frequency format. Two tasks were tailored for adults and described medical scenarios (i.e., mammography and hemoccult test scenarios; Gigerenzer & Hoffrage, 1995), and two were tailored for children (i.e., college student and bad teeth scenarios; Zhu & Gigerenzer, 2006). Following Gigerenzer and Hoffrage (1995), the same write-aloud protocol method and coding procedure were used to assess Bayesian performance.

Overall, participants completed eight Bayesian tasks. After solving the first set of four Bayesian tasks in one statistical format, participants filled in a socio-demographic form and the 18 items of the Raven advanced progressive matrices in a 15-min period (Raven, Court & Raven, 1977, 1991). Participants then worked on a second set of Bayesian tasks in the complementary statistical format (procedure adopted from Stanovich & West, 1998). Finally, participants completed two thinking disposition questionnaires: the 41 items of the Composite Actively Open Minded Thinking Scale (CAOMTS; Macpherson & Stanovich, 2007) and the 40 items of the Rational–Experiential Inventory (REI), composed of four subscales (i.e., rational ability, rational engagement, experiential ability, and experiential engagement; Pacini & Epstein, 1999). For both scales, participants provided their judgments on a 6-point Likert scale (1, strongly disagree; 6, strongly agree). The internal consistencies of the CAOMTS and of the REI subscales were satisfactory (Cronbach’s α = 0.75 – 0.81).

Results and discussion

Results replicated the facilitative effect of natural frequencies (e.g., Gigerenzer & Hoffrage, 1995) and the context dependence of Bayesian reasoning (Siegrist & Keller, 2011). Figure 2 shows that participants, on average, performed better in the frequency (57.0 %) than in the probability (19.9 %) condition, F(1, 150) = 127.90, p < 0.001, partial η² = 0.46, and in the children (44.5 %) than in the medical context (32.3 %), F(1, 150) = 37.33, p < 0.001, partial η² = 0.20. The context effect was more pronounced in the natural frequency condition than in the single-event probability condition, yielding a significant interaction, F(1, 150) = 4.62, p = 0.033, partial η² = 0.03.
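The reported effect sizes follow from the F ratios and their degrees of freedom through the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies that identity to the three F values above; the function and dictionary names are illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values as reported above, each with df = (1, 150).
reported_f = {"format": 127.90, "context": 37.33, "format x context": 4.62}

effect_sizes = {
    name: round(partial_eta_squared(f_value, 1, 150), 2)
    for name, f_value in reported_f.items()
}
print(effect_sizes)  # {'format': 0.46, 'context': 0.2, 'format x context': 0.03}
```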

Fig. 2 Effects of the format (single-event probability vs. natural frequencies) and the context (medical vs. children’s) on the average number of Bayesian answers in percentages (N = 151). Error bars represent 95 % confidence intervals

The zero-order correlations between cognitive abilities, thinking dispositions, and Bayesian performance are depicted in Table 2.

Table 2 Zero-order correlations between Bayesian reasoning, cognitive abilities, and thinking dispositions variables

Cognitive abilities (r from 0.28 to 0.35) and thinking dispositions (r from −0.11 to 0.31) correlated with Bayesian performance. The correlations between cognitive abilities and thinking dispositions raise the question of whether or not thinking dispositions predicted Bayesian performance because of their association with cognitive abilities. In order to disentangle these effects, we conducted a two-block multivariate hierarchical regression analysis with performance in the four conditions as the criterion variables. The cognitive abilities block included the Raven advanced matrices score, whereas the thinking dispositions block included three scores: rationality, intuition, and actively open-minded thinking. Results, depicted in Table 3, indicate that cognitive abilities predicted Bayesian performance in the four versions of the task, whereas when controlling for cognitive abilities, thinking dispositions predicted performance in the single-event probability condition, but not in the natural frequency one. Thus, thinking dispositions predicted Bayesian performance in the natural frequency condition only due to their covariance with cognitive abilities.
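The logic of a two-block hierarchical regression can be sketched as follows: fit the cognitive abilities block first, then add the thinking dispositions block and examine the change in R². The data below are synthetic, randomly generated numbers and not the study's data; the coefficients, variable names, and the plain least-squares routine are assumptions made purely for illustration, and the sketch handles a single criterion rather than the four used in the analysis.

```python
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (aug[i][size] - tail) / aug[i][i]
    return x


def r_squared(predictors, y):
    """R^2 of an ordinary least-squares fit of y on the predictors plus an intercept."""
    rows = [[1.0] + [column[i] for column in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * target for row, target in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * value for b, value in zip(beta, row)) for row in rows]
    mean_y = sum(y) / len(y)
    ss_residual = sum((target - fit) ** 2 for target, fit in zip(y, fitted))
    ss_total = sum((target - mean_y) ** 2 for target in y)
    return 1.0 - ss_residual / ss_total


# SYNTHETIC data (not the study's data): one ability score, three disposition
# scores that share variance with ability, and one performance criterion.
rng = random.Random(0)
n = 143
ability = [rng.gauss(0, 1) for _ in range(n)]
dispositions = [[0.4 * a + rng.gauss(0, 1) for a in ability] for _ in range(3)]
performance = [0.5 * a + rng.gauss(0, 1) for a in ability]

r2_block1 = r_squared([ability], performance)                 # abilities only
r2_block2 = r_squared([ability] + dispositions, performance)  # plus dispositions
delta_r2 = r2_block2 - r2_block1                              # unique contribution

# F test for the R^2 change: 3 added predictors, n - 4 - 1 residual df.
f_change = (delta_r2 / 3) / ((1 - r2_block2) / (n - 4 - 1))
print(round(r2_block1, 3), round(delta_r2, 3), round(f_change, 2))
```

A small ΔR² for the second block, as in the natural frequency condition, means that the dispositions add little beyond what they share with cognitive abilities.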

Table 3 Hierarchical regression models of cognitive abilities (i.e., Raven progressive matrices) and thinking dispositions (rationality, intuition, and actively open-minded thinking) on Bayesian performance in the four Bayesian tasks (N = 143)

In contrast with the ERA's expectation, but in agreement with the NSA, cognitive abilities and thinking dispositions predicted Bayesian performance in both natural frequency and probability conditions. Furthermore, thinking dispositions uniquely (i.e., after controlling for cognitive abilities) predicted variance in performance in the probability, but not in the natural frequency, version of the Bayesian task, indicating that slightly different processes are responsible for the performance in the probability and frequency versions of the task.

General discussion

In the two experiments presented here, we investigated the effect of individual differences on normative performance in different formats of a Bayesian reasoning task. We found that cognitive reflection, cognitive abilities, and thinking dispositions predicted Bayesian performance with natural frequencies. Thinking dispositions were not a unique predictor of such performance. These findings are consistent with those in the previous literature (Lesage et al., 2013) and may contribute to a better understanding of other effects observed in the literature that may be predicted by cognitive abilities, such as numeracy (e.g., Sirota & Juanchich, 2011).

Three aspects of our findings are theoretically significant. First, these findings indicate that a general rather than a special cognitive mechanism underlies the facilitation effect; analytical processes seem to be recruited to solve Bayesian textbook problems. These findings support the NSA (Barbey & Sloman, 2007a), which postulates a general-purpose reasoning mechanism, more so than the ERA (Brase et al., 1998; Cosmides & Tooby, 1996), which posits a specialized cognitive mechanism to process natural frequencies. Second, the null effect of the parsing manipulation does not support the individuation hypothesis (Brase et al., 1998). Possibly, the performance decrement observed by Brase et al. for arbitrarily parsed objects was due to the ad hoc nature of the parsing (ends of candy canes), whereas the arbitrary parsing used here (cubic decimeters) was familiar and did not impede performance. Third, although both components of cognitive reflection (cognitive abilities and thinking dispositions) predicted Bayesian performance, we found no unique effect of thinking dispositions in the natural frequency condition, which suggests that performance in this condition depends merely on the processes underlying computational efficiency.

Such findings are consistent with the NSA, which can explain the latter findings as follows: When the nested set structure becomes transparent due to the natural frequency format, thinking dispositions become less important, since no deep processing is required to build a nested set representation (e.g., there are no conflicting models). Performance, however, still depends on cognitive abilities, which enable processing of the given textual and mathematical information. In contrast, the probability format does not support a nested set representation. Therefore, thinking dispositions become relevant, and the tendency for deep processing (e.g., to suppress wrongful conflicting intuitions) increases the likelihood of representing the problem in terms of a relevant rule, which in turn enhances performance. Such an account explains the stronger correlation of performance and cognitive reflection ability in the natural frequency condition (Lesage et al., 2013) as a consequence of the cognitive ability component, which affects performance more strongly when thinking dispositions become less relevant. Similar ideas can also be found in the literature on arithmetical problem solving (e.g., Kintsch, 1988; Kintsch & Greeno, 1985). These problem-solving models posit that first a mental representation of a problem is constructed from the text, which then activates schemata, including problem-solving strategies stored in long-term memory. When text representations support the problem representation processes, as is the case for natural frequencies problems, additional capacity becomes available for solving the given task, resulting in enhanced performance (Kintsch & Greeno, 1985; Raghubar, Barnes & Hecht, 2010).

To conclude, the present findings support the nested sets account (NSA) of facilitative effects of natural frequencies in Bayesian reasoning more than the ecological rationality account (ERA). Furthermore, our findings also suggest that different analytical processes are involved in solving tasks with probabilities and natural frequencies. Future research should focus on these processes in more detail to develop a more specific process-oriented model of the facilitation effect.