Introduction

The concept of differential strategy use is ubiquitous in developmental psychology. A first line of developmental research focuses on whether strategy use changes with age. For example, a young child having to add three and four may count to three and then count four further, whereas an older child might retrieve the answer from memory (Ashcraft and Fierman 1982). There is thus a developmental shift in strategy use: from a counting strategy to a retrieval strategy. Such developmental shifts in strategy use have been reported in a variety of domains, for example in decision making (Aïte et al. 2012; Bereby-Meyer et al. 2004; Betsch and Lang 2013; Huizenga et al. 2007; Jacobs and Potenza 1991; Jansen et al. 2012; Kwak et al. 2015; Lang and Betsch 2018; Reyna 2008; Schlottmann 2000, 2001; Schlottmann and Anderson 1994), reasoning (Bouwmeester and Sijtsma 2007; Bouwmeester et al. 2004; Houdé et al. 2011; Jansen and van der Maas 2001, 2002; Siegler 1987, 2007; Siegler et al. 1981; van der Maas and Molenaar 1992; Wilkening 1981), reinforcement learning (Andersen et al. 2014; Decker et al. 2016; Palminteri et al. 2016; Potter et al. 2017; Schmittmann et al. 2006; Schmittmann et al. 2012), mathematics (Ashcraft and Fierman 1982; Bjorklund and Rosenblum 2001; Cho et al. 2011; Torbeyns et al. 2009), and categorization (Rabi et al. 2015; Raijmakers et al. 2004). A second line of research focuses on individual differences in strategy use within age groups (Siegler 1988), which have been shown to depend on variables such as intelligence (Bexkens et al. 2016) and inhibitory capacity (Borst et al. 2012; Poirel et al. 2012). Finally, a third line of developmental research examines the effects of interventions aimed at changing children’s or adolescents’ suboptimal strategies into optimal ones (Alibali 1999; Felton 2004; Perry et al. 2010; Stevenson et al. 2016). For these three types of developmental questions, it is important to detect differential strategy use adequately. Hence, the goal of this paper is to describe an adequate method to detect differential strategy use, one that can easily be used by developmental psychologists.

An example of a paradigm in which strategy use is crucial is the balance scale task (e.g., Siegler and Chen 2002). In this task, participants have to decide whether a balance will tip, and if so, to which side (cf. Fig. 1). Many strategies can be used to solve this task. For example, some participants, using the optimal strategy, decide based on the number of weights multiplied by their distance to the fulcrum. Other participants, using a suboptimal strategy, decide based on the number of weights only. Still other participants use another suboptimal strategy, in which they decide based on the number of weights if weight differences are present, and based on distance if weight differences are absent. Note that the particular item in Fig. 1 does not differentiate between the latter two suboptimal strategies, as they give rise to the same answer (“right”). Therefore, studies investigating strategy use typically administer not a single item, but a set of items. In a fictitious eight-item example, the response pattern of a single participant may be “right, balance, balance, right, left, right, left, right.” The response patterns of multiple participants are then input to analyses aimed at detecting strategy use.

Fig. 1

In the balance scale task, participants have to decide whether the balance will tip, and if so to which side, after the supporting blocks have been removed

In developmental psychology, one approach to detect strategy use is the rule-assessment methodology (Siegler 1976; Siegler et al. 1981). According to this method, a participant is assigned to a particular strategy (“rule”) if the proportion of observed responses consistent with the predicted responses of that strategy exceeds a cut-off value. In the balance scale example, the observed response pattern (“right, balance, balance, right, left, right, left, right”) and predicted response pattern (e.g., “left, balance, balance, right, right, right, left, right”) match on six out of eight (75%) responses. If the cut-off value is set at 70%, it is thus concluded that this participant uses this particular strategy.
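To make this classification rule concrete, the following minimal R sketch assigns a participant to the best-matching strategy if the match proportion exceeds the cut-off; the response codings and the second predicted pattern are illustrative assumptions of our own:

  # Rule-assessment classification: compute, for each candidate strategy,
  # the proportion of observed responses matching its predictions, and
  # assign the best-matching strategy if it exceeds the cut-off.
  observed  <- c("R", "B", "B", "R", "L", "R", "L", "R")
  predicted <- list(
    rule1 = c("L", "B", "B", "R", "R", "R", "L", "R"),  # matches 6/8 = 0.75
    rule2 = c("R", "R", "B", "L", "L", "R", "L", "B")   # matches 5/8 = 0.625
  )
  cutoff   <- 0.70
  match    <- sapply(predicted, function(p) mean(p == observed))
  assigned <- if (max(match) >= cutoff) names(which.max(match)) else NA
  assigned  # "rule1"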

An advantage of the rule assessment methodology is that it is a one-step approach: Strategy use is inferred immediately (Table 1). Another advantage is that it can be used for small samples; even if only one child is tested, their strategy can, in principle, be inferred. However, there are also disadvantages. First, it does not account for error responses. That is, participants may not always make the response predicted by their strategy but may sometimes deviate from it, for example due to lapses of attention or premature responding. This can make strategy assignment difficult. Second, the approach only allows for discrete strategy assignment: A participant is assigned to one strategy and not to others, which might be problematic if the fit to several strategies is approximately equal. Last but not least, the cut-off value (e.g., 70%) is arbitrary; hence, strategy assessment may change if the cut-off value is changed. For these reasons, it has been argued that this methodology may result in spurious detection of strategies (Jansen and van der Maas 2002; Thomas and Horton 1997; van der Maas and Straatemeier 2008; Wilkening 1988).

Table 1 Characteristics of existing and new approaches to detect strategy use

An alternative approach is based on latent-class analysis (also called latent-mixture analysis; Bouwmeester and Sijtsma 2007; Bouwmeester and Verkoeijen 2012; Hickendorff et al. 2018; Huizenga et al. 2007; Jansen and van der Maas 2002; Nylund et al. 2007a). Although latent-class analysis can be used in a confirmatory way, the exploratory way is more common. In this type of latent-class analysis, participants are classified into several latent groups, based on similarity in observed response patterns. The number of latent groups required to describe the data is typically determined using penalized information criteria, such as the Bayesian Information Criterion (Schwarz 1978). Once the latent groups have been determined, the researcher evaluates the average response pattern in each latent group, identifies a strategy that could explain this pattern, and assumes that all participants in that group used that strategy. If different possible strategies are specified beforehand, researchers can also compare the average response pattern in each latent group to the response pattern predicted by each of the pre-specified strategies, for example by computing the Euclidean distance between observed and predicted response patterns. The strategy that best fits the average response pattern of a latent group is then assigned to all participants in that group (Jansen et al. 2012).
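As an illustration of this exploratory procedure, the following R sketch selects the number of latent classes by the BIC. It assumes the poLCA package and a data frame responses with one column per item, with answers coded as integers (e.g., 1 = Machine A, 2 = Machine B, 3 = Does not matter); the range of class numbers is likewise an illustrative choice:

  # Exploratory latent-class analysis: fit models with 1 to 8 classes
  # and retain the solution with the lowest BIC.
  library(poLCA)
  f <- as.formula(paste("cbind(", paste(names(responses), collapse = ", "), ") ~ 1"))
  fits <- lapply(1:8, function(k)
    poLCA(f, data = responses, nclass = k, verbose = FALSE))
  bic  <- sapply(fits, function(m) m$bic)
  best <- fits[[which.min(bic)]]  # selected model
  best$probs                      # class-conditional response probabilities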

The latent-class analysis approach has several advantages. A first advantage is that strategies are identified for average response patterns within subgroups of participants. As average response patterns are affected less by error responses than individual ones, this may benefit adequate strategy assessment. A second advantage is that—in contrast to the rule-assessment methodology—strategy assignment does not depend on an arbitrary criterion. Third, although participants are generally assigned to their most likely latent group, a posteriori probabilities make it possible to quantify how likely it is that each individual belongs to each of the latent groups. Finally, as latent-class analysis does not require the specification of possible strategies beforehand, it is an ideal method when studying phenomena in which the number and nature of potential strategies are largely unknown, as in exploratory studies. There are, however, also disadvantages. First, latent-class analysis is a two-step procedure: Instead of directly defining subgroups in terms of common strategy use, it defines subgroups in terms of homogeneous response patterns, and the researcher subsequently assigns the most likely strategy to each subgroup. Second, it uses model selection criteria that have been criticized for not adequately accounting for model complexity (Burnham and Anderson 2002). Third, it does not account for individual differences in error rate. Fourth, it requires large sample sizes to guarantee stable solutions (Jaki et al. in press; Nylund et al. 2007b).

The goal of the current article is to introduce a recently developed alternative approach that combines the advantages of the rule-assessment and latent-class analysis methodologies while avoiding their disadvantages: a latent-mixture model implemented in a Bayesian framework (Lee and Sarnecka 2011; Lee 2016; Lee 2018a). Like the rule-assessment method, latent-mixture models directly define groups in terms of common strategy use and can be used for any sample size. Furthermore, similar to latent-class analysis, this approach infers the likelihood that each participant uses each of the assumed strategies. Thus, it can simultaneously infer the number and size of strategy groups, each individual participant’s group membership, and the certainty of each participant’s group membership (Bröder and Schiffer 2003; Lee 2016). In addition, latent-mixture models provide an estimate of how accurately participants follow their inferred strategy, by explicitly formalizing individual differences in error rate. Finally, all conclusions are obtained using Bayesian inference, which quantifies uncertainty about parameters in terms of probability distributions, controls for all aspects of model complexity, and handles missing data in a comprehensive model-based way (e.g., Andrews and Baguley 2013; Bayarri et al. 2016; Lee and Wagenmakers 2005; Wagenmakers 2007; Wetzels et al. 2011; for an introduction to the use and advantages of Bayesian methods in the psychological sciences, see Lee 2018b; Lee and Wagenmakers 2013; Vandekerckhove et al. 2018; Wagenmakers et al. 2016).

Here, we apply the latent-mixture approach, implemented in a Bayesian framework, to infer differential strategy use on a previously established decision-making task, the “Gambling-Machine Task” (Jansen et al. 2012). In the remainder of this article, we first describe the Gambling-Machine Task and various strategies that can be used to perform this task. Next, we describe the latent-mixture model and its implementation in a Bayesian framework, and assess its capacity to recover strategy-use parameters by applying it to generated data. Then, we apply the latent-mixture model to Gambling-Machine Task data from 210 children and adolescents (Jansen et al. 2012). In the final section, we discuss our results and their ramifications.

The Gambling-Machine Task

The Gambling-Machine Task (Jansen et al. 2012) is a paper-and-pencil decision-making task in which participants choose between two gambling machines (i.e., a specific Machine A and a specific Machine B that together represent one of the 28 items of the task; an example trial is shown in Fig. 2). Each machine contains 10 balls. Each ball is associated with a specific reward that is printed on the left corner of each machine (i.e., +2 for Machine A and +4 for Machine B in Fig. 2). In addition, some balls, the gray-shaded so-called loss balls, are associated with an additional loss—the exact amount is printed on the balls (i.e., −10 for both machines in Fig. 2). The two machines of each item differ in at least one of three overtly presented cues: the reward associated with each ball (certain gain; CG), the proportion of loss balls (frequency of loss; FL; 0.1 for Machine A and 0.5 for Machine B in Fig. 2), and the amount of loss associated with each loss ball (AL). On each trial, participants have to indicate whether Machine A or Machine B is more profitable, or whether it does not matter. The correct answer is the option with the highest expected value (with expected value = CG + AL × FL).

Fig. 2

Example trial from the Gambling-Machine Task. In this trial, there is a conflict between frequency of loss and certain gain. Here, “Machine A” is the correct answer. Figure taken from Jansen et al. (2012)
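For the example trial in Fig. 2, the expected values can be computed directly from the cue values described above; a minimal R sketch (the function name is ours):

  # Expected value of each machine: EV = CG + AL * FL, where AL is
  # negative because it is a loss.
  ev <- function(cg, al, fl) cg + al * fl
  ev(cg = 2, al = -10, fl = 0.1)  # Machine A:  1
  ev(cg = 4, al = -10, fl = 0.5)  # Machine B: -1
  # Machine A has the higher expected value, so "Machine A" is correct.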

The task is based on seven item types, each of which has four versions. The seven item types consist of three simple item types in which the machines differ in only one cue, and four complex item types in which the machines differ in at least two cues that produce a conflict. The different versions of each item type are obtained by slightly adapting the specific amounts of CG, AL, or FL, and by changing the position of Machine A and Machine B.

Decision Strategies

We assume that participants use one of 17 strategies to complete the Gambling-Machine Task (Table 2, first two columns). These strategies comprise a guessing strategy, an integrative strategy focusing on the expected value of the machines, and 15 non-compensatory strategies based on take-the-best strategies (Bröder 2000). The guessing strategy assumes that participants randomly choose one of the two machines without considering any of their cues, such that on each trial, the answers “Machine A” and “Machine B” are equally likely. It is assumed that, when guessing, participants never consider the response “Does not matter.” The integrative strategy assumes that participants correctly integrate all relevant cues by computing the expected value of each machine and subsequently choose the machine with the higher expected value; if both machines have the same expected value, the option “Does not matter” is chosen. Thus, when following the integrative strategy, the correct answer is always given.

Table 2 Overview of 17 possible decision strategies for completion of the Gambling-Machine Task, and percentage of participants assigned to each strategy according to our latent-mixture model, the latent-class analysis used by Jansen et al. (2012), and the rule-assessment method using two different cut-off values

Non-compensatory strategies, such as take-the-best strategies, on the other hand, are sequential strategies that differ in the number of cues that are considered (i.e., the complexity) and the order in which the cues are considered (i.e., the saliency of the cues). According to one-dimensional take-the-best strategies, participants compare the options on only one cue, say FL. If the machines differ on FL, the machine with the lowest FL is chosen; otherwise, the answer “Does not matter” is given. According to two-dimensional take-the-best strategies, the machines are sequentially compared on the two most salient cues, say FL and AL. The second, less salient, cue is only considered if the machines do not differ on the first cue. If the machines do not differ on either of the two cues, the answer “Does not matter” is given. The three-dimensional strategies work analogously. Considering all ordered combinations of one, two, or three of the three cues yields the 15 possible take-the-best strategies (Table 2).
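The deterministic prediction of a take-the-best strategy can be sketched in R as follows. The function name and cue codings are ours, and we assume that a lower FL, a smaller loss (AL closer to zero), and a higher CG are preferred:

  # Take-the-best prediction: compare the two machines cue by cue, in the
  # strategy's saliency order, and answer as soon as a cue discriminates.
  ttb_predict <- function(A, B, order = c("FL", "AL", "CG")) {
    for (cue in order) {
      better_A <- switch(cue,
        CG = A["CG"] > B["CG"],
        FL = A["FL"] < B["FL"],
        AL = A["AL"] > B["AL"])  # AL is negative, so larger = smaller loss
      if (A[cue] != B[cue]) return(if (better_A) "A" else "B")
    }
    "DM"  # machines do not differ on any considered cue
  }
  machineA <- c(CG = 2, AL = -10, FL = 0.1)
  machineB <- c(CG = 4, AL = -10, FL = 0.5)
  ttb_predict(machineA, machineB, order = c("FL", "AL", "CG"))  # "A"
  ttb_predict(machineA, machineB, order = c("CG"))              # "B"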

These 17 strategies predict distinctive response patterns that are deterministic for all but the guessing strategy. Thus, a participant’s answers to the Gambling-Machine Task items can be used to infer which of the 17 strategies the participant used. To accomplish this, we use a latent-mixture model implemented in a Bayesian framework.

The Latent-Mixture Model

The latent-mixture approach assumes that each individual uses one of the 17 decision strategies, such that the overall data set reflects a mixture of these specific strategies. Figure 3 shows a graphical representation of our model. The nodes stand for model variables and data, and the arrows indicate how the model variables are assumed to generate the data. Square nodes represent discrete variables and round nodes represent continuous variables. Nodes are shaded when the corresponding variables are observed but are unshaded when the variables are latent or unknown. Nodes with double borders indicate deterministic variables. The plates show independent repetitions in the graph structure across all items and participants.

Fig. 3

A graphical representation of the Bayesian latent-mixture model to infer which of 17 decision-making strategies each participant uses to complete the Gambling-Machine Task

The data yi,q are participants’ answers to the Gambling-Machine Task items, with i referring to a specific participant and q to one of the 28 items; yi,q can be either “A” (Machine A), “B” (Machine B), or “DM” (Does not matter). The data are represented by a shaded, square node because they are observed, discrete choices.

The model assumes that each participant’s data are generated by one of the 17 strategies. The specific strategy used by participant i is indicated by the latent categorical variable zi, which can take one out of 17 values. Note that zi is represented by an unshaded, square node because it is latent and discrete. All of the strategies, except for the guessing strategy, generate deterministic predictions of participants’ answers to each item. These predictions are denoted by ti,q. The predictions depend on the cues of item q (i.e., the machines’ certain gain, amount of loss, and frequency of loss), which are labeled sq. To account for possible deviations from the deterministic predictions, the model assumes that each participant has an individual error rate εi (Rieskamp 2008) that is used to convert the predictions into answer probabilities pi,q. Specifically, each participant chooses the option predicted by his/her strategy with probability 1 − 2εi, and each of the two remaining options with probability εi. Thus, the higher the error rate, the more likely decisions are to deviate from the deterministic predictions. We did not implement an error rate for participants who use the guessing strategy but assumed that these participants are equally likely to choose either machine on each trial (i.e., pA,i,q = pB,i,q = 1/2, pDM,i,q = 0). Finally, the obtained probabilities pi,q specify the categorical distribution that generates the observed data, that is, yi,q ∼ Categorical(pA,i,q, pB,i,q, pDM,i,q).

Bayesian Implementation

We inferred the posterior distributions for each participant’s model parameters, zi and εi (Lee and Wagenmakers 2013; Vandekerckhove et al. 2018), using the following priors. We assumed that all 17 strategies are a priori equally likely; hence, we specified a categorical prior on zi that can take values 1 to 17 to indicate which strategy participant i used for all items. For the individual error rate εi, we chose a Uniform(0, 0.25) prior. An error rate of zero indicates that strategies are followed faultlessly, and an error rate of 0.25 indicates that there is a 50% chance of choosing the option predicted by the strategy and a 25% chance of choosing either of the remaining two options.
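The following JAGS sketch summarizes this model. The variable names and data coding are ours (the authors’ actual code is available from the OSF repository mentioned below): pred is a 17 × 28 matrix of deterministic predictions (coded 1 = Machine A, 2 = Machine B, 3 = Does not matter), with the row for the guessing strategy (strategy 1) an unused placeholder.

  # A JAGS sketch of the latent-mixture model. Data passed to JAGS:
  # y (participants x items), pred, priorZ = rep(1/17, 17), and guessP.
  model_string <- "
  model {
    for (i in 1:nSubj) {
      z[i]   ~ dcat(priorZ[])   # strategy indicator; 1 = guessing
      eps[i] ~ dunif(0, 0.25)   # individual error rate (unused if guessing)
      for (q in 1:nItems) {
        for (k in 1:3) {
          # Guessing (z = 1): A and B equally likely, DM never chosen.
          # Otherwise: predicted option with probability 1 - 2*eps,
          # each remaining option with probability eps.
          p[i, q, k] <- equals(z[i], 1) * guessP[k] +
                        (1 - equals(z[i], 1)) *
                          (equals(pred[z[i], q], k) * (1 - 2 * eps[i]) +
                           (1 - equals(pred[z[i], q], k)) * eps[i])
        }
        y[i, q] ~ dcat(p[i, q, 1:3])
      }
    }
  }"
  guessP <- c(0.5, 0.5, 0)  # passed to JAGS as data, together with y, pred, priorZ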

Applying the model to the observed data yields, for each participant, the posterior probabilities that each of the 17 strategies was used. In addition, for each participant, we can infer how closely the assigned strategy is followed.

Methods

Simulation Study

To examine our model’s ability to recover data-generating parameters, we first applied the model to generated data and compared the inferred parameters to the data-generating values (see the “Results” section). For each of the 17 decision strategies, we generated data for 15 synthetic participants, resulting in 255 synthetic participants. For all participants, except for those using the guessing strategy, we generated an error rate εi by drawing from a Uniform(0, 0.25) distribution. Each synthetic participant completed all 28 items of the Gambling-Machine Task.
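A sketch of this data-generating procedure in R, assuming a 17 × 28 matrix predictions of deterministic strategy predictions coded as above (built, e.g., from the ev() and ttb_predict() sketches, with row 1 an unused placeholder for guessing):

  # Generate synthetic data: 15 participants per strategy, each with a
  # random error rate, answering all 28 items.
  set.seed(123)
  nPerStrategy <- 15; nItems <- 28; nStrategies <- 17
  y       <- matrix(NA, nPerStrategy * nStrategies, nItems)
  zTrue   <- rep(1:nStrategies, each = nPerStrategy)
  epsTrue <- runif(length(zTrue), 0, 0.25)
  for (i in seq_along(zTrue)) {
    for (q in 1:nItems) {
      if (zTrue[i] == 1) {                  # guessing: A or B, never DM
        y[i, q] <- sample(1:2, 1)
      } else {                              # strategy followed with prob 1 - 2*eps
        probs <- rep(epsTrue[i], 3)
        probs[predictions[zTrue[i], q]] <- 1 - 2 * epsTrue[i]
        y[i, q] <- sample(1:3, 1, prob = probs)
      }
    }
  }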

Real Data Set

We then reanalyzed the Gambling-Machine Task data of 230 participants previously published by Jansen et al. (2012). The data set comprises four age groups: 8–11 (n = 48), 11–12 (n = 51), 12–15 (n = 67), and 14–17 years (n = 64; see Jansen et al. 2012, for more details). After removing 20 participants because of missing data entries (n = 9, 5, 5, and 1 in age groups 1, 2, 3, and 4, respectively), we applied the model to the remaining 210 participants. Ethics Committee approval was not required because we reanalyzed previously published data.

Model Fitting Procedure

We inferred posterior distributions for each parameter for each participant using Markov chain Monte Carlo (MCMC) sampling, which directly draws sequences of values from the posterior distribution of each model parameter (Gilks et al. 1996; Lunn et al. 2012), as implemented in JAGS (Plummer 2003) and R (R Development Core Team 2008). We ran 10 independent MCMC chains and initialized all chains with random values. For the real data set, we collected 45,000 samples per chain and discarded the first 5000 samples as burn-in. In addition, we retained only every fifth iteration to reduce autocorrelation. Consequently, we obtained 80,000 representative samples per parameter (8000 per chain) per participant. In the simulation study, we collected 20,000 samples per chain and discarded the first 10,000 as burn-in. As for the real data set, we used a thinning rate of 5, resulting in a total of 20,000 representative samples per parameter per synthetic participant.
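A sketch of this fitting procedure using the rjags package, mirroring the settings for the real data set; the object names refer to the sketches above:

  # Fit the latent-mixture model: 10 chains, 5,000 burn-in iterations,
  # 40,000 retained iterations thinned by 5 (8,000 samples per chain).
  library(rjags)
  data_list <- list(y = y, pred = predictions, priorZ = rep(1/17, 17),
                    guessP = c(0.5, 0.5, 0),
                    nSubj = nrow(y), nItems = ncol(y))
  jags <- jags.model(textConnection(model_string), data = data_list,
                     n.chains = 10)
  update(jags, n.iter = 5000)                        # burn-in
  samples <- coda.samples(jags, variable.names = c("z", "eps"),
                          n.iter = 40000, thin = 5)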

All chains showed convergence (see the online supplementary material for an assessment of the convergence and stability of our results). R and JAGS code, and the data, can be downloaded from our Open Science Framework repository (goo.gl/tkh4gE). In addition, to facilitate the use of this method by others, all steps of the modeling analysis are summarized in the online supplementary material.

Strategy Inference

We computed the mode of the posterior distribution of zi to infer, for each participant, which strategy they most likely used. In addition, to quantify the certainty of the strategy assignment, we computed the proportion of zi samples that equal the mode of the posterior distribution of zi. We computed the mode of the posterior distribution of εi to infer each participant’s error-rate parameter.
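In R, these summaries can be computed from the coda output of the fitting step above; a sketch (variable names are ours):

  # Posterior mode of z[i]: the most frequently sampled strategy; the
  # certainty is the proportion of samples equal to that mode. The mode
  # of the continuous error rate is approximated via a kernel density.
  m       <- as.matrix(samples)  # stack the 10 chains
  zSamp   <- m[, grep("^z\\[",   colnames(m)), drop = FALSE]
  epsSamp <- m[, grep("^eps\\[", colnames(m)), drop = FALSE]
  zMode <- apply(zSamp, 2, function(s) as.numeric(names(which.max(table(s)))))
  certainty <- sapply(seq_along(zMode), function(i) mean(zSamp[, i] == zMode[i]))
  epsMode <- apply(epsSamp, 2, function(s) { d <- density(s); d$x[which.max(d$y)] })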

Assessment of Model Fit

To assess the descriptive adequacy of the model—that is, its ability to “fit” the observed choice data—we conducted a standard Bayesian posterior predictive check. Specifically, we generated data for each participant using the modes of the posterior distributions of their zi and εi parameters, and subsequently compared the generated and observed data.
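A sketch of this check in R, reusing the posterior modes computed above:

  # Posterior predictive check: regenerate each participant's choices from
  # the posterior modes of z and eps, then compute the prediction error
  # (proportion of items on which generated and observed choices differ).
  yRep <- matrix(NA, nrow(y), ncol(y))
  for (i in 1:nrow(y)) {
    for (q in 1:ncol(y)) {
      if (zMode[i] == 1) {                   # guessing: A or B, never DM
        yRep[i, q] <- sample(1:2, 1)
      } else {
        probs <- rep(epsMode[i], 3)
        probs[predictions[zMode[i], q]] <- 1 - 2 * epsMode[i]
        yRep[i, q] <- sample(1:3, 1, prob = probs)
      }
    }
  }
  predictionError <- rowMeans(yRep != y)     # one value per participant
  mean(predictionError)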

Results

Parameter Recovery

Figure 4 shows the strategy assigned to each synthetic participant in our simulation study. The synthetic participants are ordered such that the data of the first 15 participants were generated with the guessing strategy, the data of the second 15 with the integrative strategy (expected value; EV), etc. Recovery would be perfect if the first 15 bars were all of height “guessing,” the second 15 bars all of height “EV,” etc. Figure 4 clearly shows this expected step-wise pattern. Specifically, 81.6% of the synthetic participants were classified correctly. Note that recovery is perfect if we use a small error rate to generate the data (e.g., εi = .05 for all participants), clearly indicating the correctness of our implementation of the model. However, we prefer discussing a more realistic scenario with different error rates for each participant that vary between 0 and .25.

Fig. 4

Strategy assignment for the 255 synthetic participants. The first 15 participants were generated according to the guessing strategy, the second 15 participants according to the integrative (EV) strategy, etc. With ideal recovery, the y value would increase by one unit after every 15 participants (i.e., after every color change).

Figure S1 in the online supplementary material shows the posterior distributions of the error-rate parameter as well as the true data-generating value for 15 randomly selected synthetic participants. The mode of the posterior distribution is close to the data-generating value for some participants, indicating adequate recovery. However, for other participants, there is a noticeable difference between the posterior mode and the data-generating value, indicating inadequate recovery. There was no systematic bias in the recovery of the individual error rates, nor a systematic relation between the bias and the standard deviation of the posterior distribution of the individual error rates (see online supplementary material; Fig. S2). Since there is no systematic bias, but a noticeable difference between the true and recovered values of the individual error rate for many synthetic participants, we recommend interpreting the estimated error rate at the group level and considering individual error rates with caution.

Real Data Set: Inferred Strategy Use

The third column of Table 2 reports the percentage of the 210 real participants assigned to each of the 17 strategies according to our model. The majority of the participants used the integrative strategy, correctly integrating all three relevant cues to compute the expected value of the machines, followed by the FL > AL > CG and FL > CG > AL take-the-best strategies. In general, FL seems to be a very important cue among the take-the-best strategies: Of the participants assigned to a take-the-best strategy, 70.1% used a strategy that considers FL as the first cue. It is also evident that none of the participants used the CG > AL > FL or CG > FL strategies.

Figure 5 shows the posterior inferences for strategy use, organized by the most likely strategy, for the six largest groups; the remaining 11 strategies were used by at most three participants. The header of each sub-panel contains the most likely strategy and the number of participants assigned to that strategy. The area of the circles represents the certainty of the assignment (i.e., the proportion of zi samples that equal the mode of the zi posterior): The larger the circle, the higher the certainty of the corresponding strategy assignment. It is evident that most participants were assigned with high certainty, but for some participants, related strategies are also plausible. For example, for the first participant who was assigned to the FL > AL group, the FL strategy is also very plausible.

Fig. 5

Posterior inferences for strategy use for the majority of the participants (n = 195) obtained from Jansen et al. (2012, n = 210). Results are only presented for the six largest groups; the remaining nine groups contain one, two, or three participants. The area of each circle corresponds to the posterior probability that a participant used a particular strategy

Figure 6 shows the developmental trajectory of the six most often used strategies. With increasing age, integrative strategy use decreased, and FL > AL > CG strategy use increased. No clear developmental changes are evident for the four remaining strategies.

Fig. 6

Proportion of participants who used specific strategies as a function of age group. Results are only shown for the six most often used strategies

Inferred Error Rates

The average inferred error rate was 0.08 (SD = 0.06), suggesting that, on average, participants chose the option predicted by their specific strategy on 84% of the trials and chose each of the two remaining options on 8% of the trials. Inferred error rates for individual participants varied between 0.002 and 0.23. However, as our simulation study showed that many individual error rates were not recovered accurately, these should be considered with caution.

Posterior Predictive Check

Figure 7 shows the posterior predictive check for four randomly selected participants. In this figure, observed choices are represented by solid lines and generated choices by dotted lines. Filled dots indicate that the generated and observed choices coincide. The header of each subplot shows the assigned strategy and the certainty of this assignment.

Fig. 7

Posterior predictive check for four randomly selected participants. The dotted and solid lines represent the generated and observed choices, respectively. Filled dots indicate that the generated and observed choices are identical. The header indicates the most likely strategy and the posterior probability that the participant used that strategy. DM stands for the answer option “Does not matter”

To summarize the descriptive adequacy of the model across all participants, we computed for each participant the prediction error (i.e., the proportion of items for which the generated and observed choice differ). Figure 8 shows the distribution of the prediction error across all participants. The mean prediction error is 0.135, and the prediction error is below 0.20 for most participants. This suggests that, in general, the model adequately accounts for the data and that we can draw meaningful conclusions from the model’s strategy indicator parameter.

Fig. 8

Summary of the posterior predictive check of all participants. The figure shows the prediction error of all participants. The prediction error is defined as the proportion of items for which the predictions differ from the observed choices

Comparison of Results from the Bayesian Latent-Mixture Model and Existing Methods

The fourth, fifth, and sixth columns of Table 2 show the percentage of participants assigned to each strategy by the latent-class analysis reported by Jansen et al. (2012) and by the rule-assessment method using two different cut-off values. The two largest inferred strategy groups are the same for all three methods. However, the results obtained by the different methods also differ in some crucial ways. First, we detected 15 strategy groups, whereas Jansen et al.’s (2012) latent-class analysis detected only six groups (i.e., a model with six latent classes provided the best fit to the data according to the Bayesian Information Criterion), and the number of groups detected by the rule-assessment method depends on the cut-off value that is used. Second, the percentage of participants assigned to each group differs across the three approaches. The discrepancies between our method and the latent-class analysis must stem from differences between the two modeling approaches: Our method uses a Bayesian rather than a frequentist framework; it explicitly formalizes individual differences in error rates, whereas latent-class analysis does not; and it assumes that each participant uses one of 17 decision strategies and deviates from that strategy according to an individual error rate, whereas latent-class analysis assumes that participants with the same response pattern use the same strategy. Finally, our model outperforms the latent-class analysis on the posterior predictive check: Our model yields posterior predictions that differ from the observed data for, on average, 13.5% of the items, compared to 16.7% of the items for the model of Jansen et al. (2012). This suggests that, in addition to yielding more informative results, the Bayesian latent-mixture model is more descriptively accurate.

Discussion

In this paper, we presented a Bayesian latent-mixture model approach to infer strategy use during a decision-making task. Our approach circumvents shortcomings of the traditional rule-assessment methodology (Siegler 1976; Siegler et al. 1981) and the latent-class analysis approach (Clogg 1995; Dolan et al. 2004; Heinen 1996; Huizenga et al. 2007; Jansen and van der Maas 2002; Jansen et al. 2012; McCutcheon 1987; Rindskopf 1983, 1987) by allowing for the simultaneous inference of the number and size of strategy groups, as well as the strategy that each individual participant most likely used. In addition, our approach includes individual error rates that systematically account for individual inconsistencies in following the assigned decision strategy. Finally, it does not require large sample sizes.

We illustrated our latent-mixture model approach by applying it to a data set comprising 210 children and adolescents completing the Gambling-Machine Task (Jansen et al. 2012). Our latent-mixture model combined 17 different accounts of decision making (i.e., a guessing strategy, an integrative strategy, and 15 different take-the-best strategies) and allowed us to infer which of these 17 strategies each participant used. We found that most participants used the integrative strategy, followed by three-dimensional take-the-best strategies that have frequency of loss (FL) and amount of loss (AL) as their most salient cues. In addition, our results suggest that with increasing age, integrative strategy use decreases, whereas use of the FL > AL > CG strategy increases. These findings are consistent with the results of the original analysis by Jansen et al. (2012), and seemingly different from typical findings on proportional-reasoning tasks with two attributes, such as the balance scale task. As discussed by Jansen et al. (2012), the decrease in integrative strategy use with age supports fuzzy-trace theory (Reyna and Ellis 1994), which posits a developmental increase in the dominance of gist (fuzzy) over verbatim (detailed) processing.

A major advantage of our latent-mixture model approach is its flexibility. The code provided online can be modified in several ways. For example, readers can include additional decision strategies (e.g., the weighted additive model; Gigerenzer et al. 1999; Rieskamp and Otto 2006) or exclude some of the strategies. It is also straightforward to modify the prior distributions and to test the impact of such modifications as a form of sensitivity analysis. Other plausible modifications are adaptations of the model to other developmental domains and tasks, for example focusing on differential strategy use when solving categorization, logical-reasoning, or mathematical problems. In addition, the model could be applied in longitudinal studies to investigate developmental changes in strategy use within individual participants.

More challenging modifications and suggestions for future research involve incorporating a more theoretically motivated model of probabilistic responding given deterministic choice rules than the simple error-rate mechanism used here (Regenwetter et al. 2011). In future work, the model could also be fitted hierarchically, such that the model parameters of each participant are sampled from a group-level distribution (Farrell and Lewandowsky 2018; Lee and Wagenmakers 2013; Rouder and Lu 2005; Shiffrin et al. 2008). In a Bayesian framework, this entails that each individual-level parameter is assigned a group-level prior distribution, whose parameters (hyperparameters) are estimated as well. One advantage of a hierarchical Bayesian implementation would be that it allows for the inclusion of covariates in the model (e.g., age, gender, intelligence) such that the effects of relevant individual-difference variables on each parameter can be directly tested. In addition, it is desirable to modify our model to account for the fact that children might learn during the task; for example, just by considering the different test items, children may discover new features and therefore change their strategy (e.g., Boncoddo et al. 2010; Jansen and van der Maas 2002). This could be done by using standard Bayesian learning applied to the valence of each cue (Mistry et al. 2016). We also note that a generative assumption of our model is that each participant uses the same strategy throughout the task. This assumption is violated if strategy use depends on specific item characteristics. The method can, however, easily be extended to include such item-specific strategy use.
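As an illustration of the hierarchical extension mentioned above, the following JAGS fragment would replace the uniform prior on the error rates in the model sketch presented earlier. It is our own sketch: the Beta parameterization of the group-level distribution and the Gamma hyperpriors are assumptions, not the authors’ specification.

  # Hierarchical error rates: individual rates are drawn from a group-level
  # Beta distribution rescaled to [0, 0.25]; its hyperparameters a and b
  # are estimated from the data.
  hier_fragment <- "
    for (i in 1:nSubj) {
      theta[i] ~ dbeta(a, b)
      eps[i]  <- 0.25 * theta[i]
    }
    a ~ dgamma(1, 0.1)
    b ~ dgamma(1, 0.1)
  "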

Finally, it should be noted that despite the advantages of the proposed mixture model and the straightforward adaptation of the code to other application areas, caution is needed when inferences are drawn. It is important to verify that the model provides an adequate account of the data and that the results are stable, and in case of doubt, this should be made explicit. For example, participants’ responses to all items are weighted equally in the model, whereas in actual data sets, some items may be more diagnostic for distinguishing strategies than others. At the same time, the reader should be aware that a latent-mixture approach will always yield a mixture of partially overlapping distributions. Therefore, although each participant is assigned to the most likely group, in some cases another group is nearly as likely. For these reasons, a mixture analysis should always be performed and interpreted with caution (Huizenga et al. 2007).

Developmental change in strategy use is a popular topic in the developmental sciences. Inferences have been based on application of the rule-assessment methodology proposed by Siegler (Siegler 1976; Siegler et al. 1981) or latent-class analysis (Clogg 1995; Heinen 1996; McCutcheon 1987; Rindskopf 1983, 1987). Here, we proposed a model-based approach using latent-mixture models and Bayesian inference. Given the advantages of our approach—its highly informative inference and flexibility—we hope it will be applied in future studies to advance research areas covering the development of decision making, categorization, logical reasoning, and mathematical concepts.