Speed–accuracy manipulations and diffusion modeling: Lack of discriminant validity of the manipulation or of the parameter estimates?

Lerche, Veronika; Voss, Andreas

doi:10.3758/s13428-018-1034-7

Speed–accuracy manipulations and diffusion modeling: Lack of discriminant validity of the manipulation or of the parameter estimates?

Published: 14 March 2018

Volume 50, pages 2568–2585, (2018)
Cite this article

Download PDF

Behavior Research Methods Aims and scope Submit manuscript

Speed–accuracy manipulations and diffusion modeling: Lack of discriminant validity of the manipulation or of the parameter estimates?

Download PDF

Veronika Lerche¹ &
Andreas Voss¹

2741 Accesses
21 Citations
8 Altmetric
1 Mention
Explore all metrics

Abstract

The diffusion model (Ratcliff, 1978) is a mathematical model theorized to untangle different cognitive processes involved in binary decision tasks. To test the validity of the diffusion model parameters, several experimental validation studies have been conducted. In these studies, the validity of the threshold separation parameter was tested with speed–accuracy manipulations. Typically, this manipulation not only results in the expected effect on the threshold separation parameter but it also impacts nondecision time: Nondecision time is longer in the accuracy than in the speed condition. There are two possible interpretations of the finding: On the one hand, it could indicate that speed versus accuracy instructions really have an impact on the duration of extradecisional processes. On the other hand, the effect on the measure for nondecision time could be spurious—that is, based on a problem in the parameter estimation procedures. In simulation studies—with the parameter sets based on typical values from experimental validation studies—we checked for possible biases in the parameter estimation. Our analyses strongly suggest that the observed pattern (i.e., slower nondecision processes under accuracy instructions) is attributable to a lack of discriminant validity of the manipulation rather than to trade-offs in the parameter estimations.

A Generalized Speed–Accuracy Response Model for Dichotomous Items

Article 21 November 2017

Peter W. van Rijn & Usama S. Ali

Qualitative speed-accuracy tradeoff effects that cannot be explained by the diffusion model under the selective influence assumption

Article Open access 08 January 2021

Farshad Rafiei & Dobromir Rahnev

Combining speed and accuracy to control for speed-accuracy trade-offs(?)

Article 18 July 2018

Heinrich René Liesefeld & Markus Janczyk

When performing a speeded decision task, individuals can prioritize either accurate or fast responses. Emphasis in one domain will result in worse performance in the other domain, producing the ubiquitous speed–accuracy trade-off (SAT; Heitz, 2014). The SAT can complicate data interpretations. For example, older adults tend to produce longer response times (RTs; e.g., Ratcliff, Thapar, & McKoon, 2001; Spaniol, Voss, & Grady, 2008; Thapar, Ratcliff, & McKoon, 2003). One explanation for this finding is that older adults are slower at accumulating the information that they need to make a decision. On the other hand, it is possible that these differences in RTs are attributable to differences in speed–accuracy settings, with older adults favoring accuracy over speed. Simply analyzing mean RTs or accuracy rates alone can lead to misinterpretations, because it is uncertain whether the results reflect genuine task difficulty or ability to perform the task, or some tendency toward favoring speed or accuracy. Several integrated measures of RT and accuracy rate have been designed to provide solutions to this problem (e.g., Bruyer & Brysbaert, 2011; Hughes, Linck, Bowles, Koeth, & Bunting, 2014; see also Vandierendonck, 2017). A different approach to disentangling speed–accuracy settings from the speed of information accumulation, and also from other processes (e.g., motor responses and decision biases), is the diffusion model.

The diffusion model (Ratcliff, 1978) is a mathematical model that allows the disentangling of different processes involved in binary decision tasks. Studies employing the diffusion model have revealed differences in speed–accuracy settings between younger and older adults (e.g., Ratcliff et al., 2001; Spaniol et al., 2008; Thapar et al., 2003). An analysis of the behavioral data alone could lead to the assumption that older adults are inferior in their information-processing capability, due to their longer RTs. However, for many tasks no differences in speed of information accumulation are reported between the different age groups (Mulder et al., 2010; Ratcliff et al., 2001; Ratcliff, Thapar, & McKoon, 2003; but see Thapar et al., 2003). Instead, older adults set more conservative decision thresholds, favoring accuracy over speed, and also display longer durations of nondecision processes such as motor response execution (e.g., Ratcliff et al., 2001; Spaniol et al., 2008; Thapar et al., 2003).

To test whether the diffusion model is able to disentangle the specific processes (such as speed of information accumulation and speed–accuracy settings), several experimental validation studies have been conducted (e.g., Arnold, Bröder, & Bayen, 2015; Voss, Rothermund, & Voss, 2004). For example, difficulty manipulations have been employed to analyze the convergent validity of the model parameter assumed to measure the speed of information accumulation. This manipulation revealed that, as expected, the parameter differed between easier and more difficult trials, providing strong support that the parameter measures what it was designed to measure. Less clear are the findings regarding the parameter that measures speed–accuracy settings. This parameter has been investigated by encouraging participants to prioritize speed or accuracy, respectively, in two blocks of the same task. Providing evidence for convergent validity, speed–accuracy instructions influenced the correct model parameter in the expected direction (e.g., Ratcliff & Rouder, 1998; Wagenmakers, Ratcliff, Gomez, & McKoon, 2008). However, in several studies, if other diffusion model parameters were also allowed to vary between the two conditions, the manipulation also unexpectedly affected the estimate for nondecision time (e.g., Rinkenauer, Osman, Ulrich, Müller-Gethmann, & Mattes, 2004; Voss et al., 2004). Nondecision time comprises the time needed for encoding information and for executing the response. In these studies, the nondecision time was higher in the accuracy than in the speed condition.

Two different explanations have been proposed for this finding: (1) The parameter estimation procedures might have found it difficult to disentangle decision-based processes from nondecision processes, or (2) the speed–accuracy manipulations might lack discriminant validity, such that instructions to emphasize speed or accuracy might influence not only the speed–accuracy settings but also the speed of the motoric response or the speed to encode information. Even though the possibility of these two different accounts has been recognized (e.g., Dutilh et al., 2018), so far, to the best of our knowledge, no study has tried to uncover which account is more likely. To address this question, this article reports two simulation studies differing in their parameter values and data generation processes.

Note that even though the studies in this article are based on the diffusion model (Ratcliff, 1978), the findings are similarly relevant for the assessment of experimental studies using other sequential-sampling models, such as the popular linear ballistic accumulator model (LBA; Brown & Heathcote, 2008). The LBA model, amongst others, also includes a nondecision time estimate and a parameter measuring decision settings, which are both conceptually similar to aspects of the diffusion model context (Donkin, Brown, Heathcote, & Wagenmakers, 2011).

In the following sections, we first give a short introduction to the diffusion model, followed by a summary of diffusion model studies that have analyzed the data of speed–accuracy manipulations. Finally, we present the method and results of our two simulation studies.

Introduction to diffusion modeling

The diffusion model (Ratcliff, 1978) is a mathematical model applicable to RT data from binary decision tasks. An example of such a binary task is a color discrimination task with the two stimuli “orange” and “blue” (Voss et al., 2004). In each trial, a square composed of pixels of the two colors is presented, and participants have to assess whether orange or blue dominates. The diffusion model is illustrated in Fig. 1. In this plot, a corridor is shown that is limited by two thresholds. The thresholds are associated with two colors (in the example, “orange” and “blue,” respectively). Underlying the diffusion model is the assumption that in tasks like this simple color discrimination task, participants accumulate information continuously until one of the two thresholds is reached. In the example trial illustrated in the plot, the process ends at the upper threshold associated with “orange.” Accordingly, after the end of the decisional process, the participant will execute the motoric response associated with the answer “orange” (e.g., left key press).

The basic diffusion model is made up of four main parameters. The threshold separation (a) defines the amount of information that is required for the participant to make a decision. If participants adopt a conservative response strategy, the threshold separation is increased. This results in a longer time course to reach one of the two thresholds, but will produce more accurate responses. Threshold separation has been found to be higher under accuracy than under speed instructions (e.g., Ratcliff & Rouder, 1998; Voss et al., 2004) and for older than for younger individuals (e.g., Ratcliff et al., 2001; Spaniol et al., 2008; Thapar et al., 2003).

In Fig. 1, the accumulation process is directed to the upper threshold, as illustrated by the arrow symbolizing the drift rate (ν). Drift rate measures the direction and speed of information accumulation. If the drift is stronger (i.e., the arrow is steeper), the process is more likely to reach this threshold in a shorter time. Higher drift values, thus, indicate that the process is faster and ends more frequently at the correct threshold. It has been shown that easier trials feature higher drift rates (e.g., Ratcliff, 2014; Voss et al., 2004), as do more intelligent individuals (Ratcliff, Thapar, & McKoon, 2010; Schmiedek, Oberauer, Wilhelm, Süß, & Wittmann, 2007; Schubert, Hagemann, Voss, Schankin, & Bergmann, 2015; Schulz-Zhecheva, Voelkle, Beauducel, Biscaldi, & Klein, 2016). The decision process is influenced not only by the drift rate but also by additive Gaussian noise (the standard deviation of this noise is termed the diffusion constant and is often described as a scaling parameter of the diffusion model). Due to this noise, process durations will vary from trial to trial and processes will not all necessarily end at the threshold that the drift points toward. The figure also depicts a density distribution at each threshold indicating how often, and after how much time, a process will end at one of the two thresholds, given that the correct response is orange.

The accumulation process begins at the starting point z (or, the relative starting point z_r = z/a). The decision process is unbiased if the starting point is centered between the two thresholds, as in Fig. 1. However, if the starting point is closer to, for example, the upper threshold, it is more likely that the process will end at this threshold. In this case, the mean RTs at the threshold closer to the starting point will be shorter than mean RTs at the opposite threshold. The starting point has been the focus of a number of diffusion model studies as a measure of motivational biases in perception (e.g., Germar, Schlemmer, Krug, Voss, & Mojzisch, 2014; Voss, Rothermund, & Brandtstädter, 2008).

Finally, the duration of nondecisional processes (t₀), such as the encoding of information and the motoric response execution, is added to the duration of the decision process. In addition to the four main parameters of the simple diffusion model, the full diffusion model includes three more parameters: the intertrial variabilities of drift rate (s_ν), starting point (s_zr), and nondecision time (s_t0; e.g., Ratcliff & Rouder, 1998; Ratcliff & Tuerlinckx, 2002).

Effects of speed–accuracy manipulations

Speed–accuracy manipulations have been employed frequently in diffusion model studies (e.g., Ratcliff & Rouder, 1998) and in studies based on other sequential sampling models (e.g., Forstmann et al., 2011). In the speed conditions, participants are told to respond as fast as possible (even if this leads to an increased number of errors). Often, for slow responses (e.g., RT > 550 ms; Ratcliff & Rouder, 1998), participants get the feedback that their response was not sufficiently fast (e.g., feedback message “too slow”). In the accuracy condition, on the other hand, participants are instructed to be as accurate as possible (even if this leads to slower responses). In this condition, error feedback is typically provided.

In many diffusion model studies that used speed–accuracy manipulations, only the threshold separation (sometimes also the starting point) was allowed to vary between the conditions (Ratcliff & Rouder, 1998; Ratcliff et al., 2001, 2003; Ratcliff, Thapar, & McKoon, 2004; Thapar et al., 2003; Wagenmakers et al., 2008). These studies have consistently observed the expected effect of the speed–accuracy manipulation on the threshold separation parameter: The threshold separation was higher in the accuracy than in the speed condition.

Critically, constraints in the parameter estimation approach can distort results. Any existing differences between conditions regarding a specific parameter (e.g., nondecision time) cannot be detected if this parameter is not allowed to vary between conditions. In this case, the effects of condition will be spuriously picked up by other parameters that are estimated separately for each condition. Thus, of particular interest are studies in which not only the threshold separation but also other model parameters are allowed to vary between the speed and the accuracy conditions. In these studies, the speed–accuracy manipulations have usually influenced not only the threshold separation, but also the nondecision time estimate (e.g., Arnold et al., 2015; Dutilh et al., 2018; Voss et al., 2004). For example, Voss et al. (2004) conducted an experimental validation study using a color discrimination task in a within-subjects design. In the accuracy condition, participants were instructed to work particularly carefully trying to avoid errors. The RT limit was set at 3 s, in comparison to 1.5 s in the baseline condition. As expected, participants exhibited a higher threshold separation in the accuracy than in the baseline condition. However, additionally, nondecision time was increased by 50 ms.

Similarly, in a study based on a between-subjects design and a recognition memory paradigm, Arnold et al. (2015, Study 2) provided participants with negative feedback for errors in the accuracy group. In the speed group, on the other hand, negative feedback was given for responses slower than 1 s. Arnold et al. estimated the diffusion model parameters using several different methods: fast-dm (Voss & Voss, 2007, 2008), DMAT (Vandekerckhove & Tuerlinckx, 2008), and EZ (Wagenmakers, van der Maas, & Grasman, 2007). All three methods found the expected difference between groups in the threshold separation parameter, but the analyses with fast-dm and DMAT revealed a significant difference between groups in nondecision time, as well. Specifically, the nondecision time estimated by fast-dm was 130 ms longer in the accuracy than in the speed condition, and DMAT resulted in a difference in nondecision time of 170 ms. With the EZ method, no significant effect on nondecision time was observed (difference in t₀ of about 25 ms).

In some studies, speed–accuracy manipulations have influenced other parameters in addition to nondecision time. Lerche and Voss (2017a, Study 1) observed effects of a speed–accuracy manipulation on threshold separation, nondecision time, and the intertrial variability of nondecision time for a slow binary decision task (with mean RTs of about 7 s). Furthermore, in a study by Rae, Heathcote, Donkin, Averell, and Brown (2014), the speed–accuracy manipulation not only affected the threshold separation and nondecision time, but also the drift rate, with a higher drift value in the accuracy than in the speed condition (see also, e.g., Starns, Ratcliff, & McKoon, 2012). The authors assumed that under accuracy instructions, participants put more effort into the task, resulting in the higher drift rate in this condition. Interestingly, the opposite effect was reported by Arnold et al. (2015). The EZ parameter estimation revealed a higher drift rate in the speed than in the accuracy condition. The same effect in drift rate was reported for the fast-dm estimation, but only for the new items of the recognition memory task.

The finding that speed–accuracy manipulations often lead to effects in several parameters, and not only—as expected—in threshold separation, has also been discussed in the context of a large-scale validation project (Dutilh et al., 2018). Gilles Dutilh and Chris Donkin had a total of 17 research teams analyze the data from 14 pseudo-experiments.^{Footnote 1} The teams were provided data in which zero, one, two, or three of the psychological constructs had been experimentally manipulated. These constructs comprised (1) the ease of processing of information (i.e., the drift rate in the diffusion model framework), (2) decisional caution (threshold separation), and (3) decisional bias (starting point). No manipulation was implemented to selectively affect nondecision time. The research teams were blind with regard to the manipulations and could use an information-sampling model of their own choice to find out which of the four psychological constructs (ease of processing information, decisional caution, decisional bias, or nondecision time) was manipulated in each experiment. Ten teams opted for a diffusion model analysis (using either the simple or the full diffusion model), five teams applied the LBA model, and two teams used a model-free, heuristic approach.

In the nine studies with a speed–accuracy manipulation, on average 52.9% of the teams observed the typical effect in nondecision time (i.e., a longer nondecision time in the accuracy than in the speed condition). This percentage was even higher (72.2%) if only the ten teams are considered that based their decisions on a diffusion model analysis. Furthermore, in the three experiments with a manipulation of threshold separation and no simultaneous manipulation of drift rate, 62.7% of the teams found higher drift rates under accuracy than under speed conditions. This percentage is smaller (50.0%) if only the diffusion model teams are considered.

To account for the possibility that speed–accuracy manipulations might also influence nondecision time and drift rate, Dutilh et al. (2018) used different scoring keys. In the original scoring key, the speed–accuracy manipulation was expected to affect only the threshold separation. Thus, any other effect was coded as a false alarm. In another scoring key, an effect on nondecision time (longer nondecision times in the accuracy condition) was coded as a hit, and thus the research teams who did not report this effect had a miss. Finally, in a third scoring key, an effect on drift rate (higher drift rate in the accuracy condition) was regarded as correct, but effects on nondecision time were false alarms.

As the mere existence of these different scoring keys illustrates, there is great uncertainty regarding the true effects of speed–accuracy manipulations. Importantly, there were major differences in both the performance and the rank order of the different estimation approaches, depending on the scoring key that was used. For example, EZ2 (Grasman, Wagenmakers, & van der Maas, 2009)^{Footnote 2} performed best of all methods, with 84% correct classifications if the standard scoring key was applied. If, however, the scoring key with the assumption of an effect on nondecision time was used, EZ2 had only 68% correct classifications, which was worse than most other diffusion model methods. Accordingly, for the correct interpretation of the results obtained in this and other experimental studies based on speed–accuracy manipulations, it is of utmost importance to analyze the validity of the speed–accuracy manipulations to find out which scoring key is appropriate.

By means of analyses of model parameters gained in experimental validation studies alone, it is not possible to untangle the two possible explanations. One interesting approach to disentangling the processes involved in speed–accuracy trade-offs has been taken by Rinkenauer et al. (2004; see also Osman et al., 2000; van der Lubbe, Jaśkowski, Wauschkuhn, & Verleger, 2001). Rinkenauer et al. analyzed an event-related potential component, the lateralized readiness potential, to separate motor processes from premotor processes in tasks with different levels of speed stress. They found that speed–accuracy trade-offs are captured by both premotor and motor processes. Another promising approach that can be used to disentangle the different components involved in speed–accuracy trade-offs is simulation studies. This approach was taken in the present studies.

The present studies

We conducted two simulation studies based on typical parameter constellations observed in empirical, experimental validation studies that have used speed–accuracy manipulations. In the data generation process, we varied either the threshold separation or the nondecision time between two conditions. Basing the simulations on the values observed in specific experimental validation studies allows a comparison of the simulation results with the results of the empirical studies.

After the generation of data sets, we reestimated the parameters using the Kolmogorov–Smirnov optimization criterion implemented in fast-dm-30 (Voss, Voss, & Lerche, 2015).^{Footnote 3} Next, we examined which parameters varied between the two conditions. Thus, in the simulation studies we simulated an experimental manipulation with perfect validity. Accordingly, any effect on nondecision time (or any other diffusion model parameter) observed in the condition with manipulation of the threshold separation must necessarily result from problems in the parameter estimation process. If there is no effect on nondecision time, or an effect that is clearly smaller than in the experimental validation study, it is likely that the effects reported in empirical studies are based (mainly) on a valid assessment of psychological processes (i.e., that speed–accuracy manipulations truly do influence nondecisional processes). If, on the other hand, a substantial effect on nondecision time emerged in the simulation study, we would suppose that the lack of validity was traceable to the parameter estimation procedure.

In contrast to the approach by Rinkenauer et al. (2004), this procedure allowed us to assess the size of estimation errors and made it possible to uncover whether the effects on nondecision time typically observed in empirical studies are at least partly caused by estimation problems. Furthermore, in contrast to Rinkenauer et al., we also examined varying numbers of trials, to specify how many trials are required for a clear separation of effects on threshold separation and nondecision time.

Certainly, biases in parameter estimation have been examined in several previous simulation studies (e.g., Lerche, Voss, & Nagler, 2017; Ratcliff & Tuerlinckx, 2002; van Ravenzwaaij & Oberauer, 2009; Vandekerckhove & Tuerlinckx, 2007). However, few studies have generated differences in threshold separation between conditions, and none, as far as we know, have generated a difference in nondecision time. Moreover, these previous studies were usually not based on the specific parameter sets observed in empirical studies with speed–accuracy manipulations, or they relied on parameter estimates obtained in a fitting process with only one single nondecision time. Furthermore, importantly, these previous studies were usually based on very high trial numbers. For example, Ratcliff and Tuerlinckx (2002) used a minimum of 1,000 trials per data set. In the experimental validation studies, however, trial numbers were much smaller. For example, Voss et al. (2004) used only 40 trials, which made difficulties with disentangling different parameters more likely. We were interested in whether, even for such small trial numbers, effects on threshold separation can be separated clearly from effects on nondecision time. We therefore conducted simulation studies that were explicitly based on the parameter sets observed in experimental validation studies in order to compare the effects observed in the simulation studies with the effects in empirical studies.

In the following sections, we first present a simulation study based on parameter estimates of the experimental validation study by Voss et al. (2004, Study 1). In this study, we used several different parameter sets and generated a large number of data sets for each parameter set. In Study 2, the data were generated on the basis of parameter estimates from Experiment 3 by Dutilh et al. (2018). In contrast to Study 1, we created parameter sets based on a multivariate normal distribution. Thus, in the two studies we employed different strategies of data simulation. The main aim was the same: an examination of the causes underlying the effects of speed–accuracy manipulations on nondecision time.

Study 1: Simulation study based on Voss et al. (2004)

Simulation Study 1 was based on parameter estimates reported by Voss et al. (2004, Study 1). If the diffusion model is capable of disentangling threshold parameter settings from nondecision time effects, an effect should appear only for the specific parameter that was varied in the generation of the data sets. Thus, if two different threshold parameters were used for the data generation, differences in the reestimated parameters should be observed only in this parameter. Likewise, if the two conditions differed only in nondecision time, the effect should solely manifest in the nondecision time estimate and in no other parameter.

Method

Data generation

The data generation was based on the parameter values reported in the study by Voss et al. (2004, Study 1). The authors estimated parameters separately for the “standard” (in the following termed “speed”) and “accuracy” conditions. For our simulation, we assumed a within-subjects design with two conditions. Either the threshold separation or the nondecision time was varied between the two conditions. For all parameters that were not varied, we used the mean over the two conditions.

For the manipulation of the threshold separation, one parameter set included the threshold separation estimates by Voss et al. (2004) for the two conditions (i.e., a_spd = 1.18, a_acc = 1.37). We further generated two more parameter sets: one parameter set with a smaller difference in the threshold separation between the two conditions, namely half of the original difference; and one parameter set with a larger difference, namely a doubled difference^{Footnote 4} (see Table 1 for the exact parameter values). For the manipulation of nondecision time, one parameter set included the nondecision time estimates by Voss et al. (2004) for the two conditions (i.e., t_0,spd = 0.47, t_0,acc = 0.52). In addition—as for the manipulation of the threshold separation—we used one parameter set with a smaller and one with a larger difference (see note 4). In sum, Study 1 was based on a 2 (manipulated parameter: a or t₀) × 3 (effect size: small, medium, or large) design.

Table 1 Study 1: Parameter sets for the data generation

Full size table

The thresholds were associated with the two response alternatives from the color discrimination task. Specifically, the drift rate ν₀ refers to the blue stimulus, associated with the lower threshold, and the drift rate ν₁ to the orange stimulus, associated with the upper threshold. We set the intertrial variability of nondecision time (not estimated in the study by Voss et al., 2004) to 0. One thousand data sets were generated for each parameter set and each of six different numbers of trials (40, 100, 200, 400, 1,000, and 5,000). In half of the simulated trials, the stimulus associated with the upper versus the lower threshold was the correct response.

Parameter estimation

For the parameter estimation, we used the same procedure as in the empirical study by Voss et al. (2004): We applied the Kolmogorov–Smirnov optimization criterion implemented in fast-dm-30 (Voss et al., 2015) and reestimated the parameters separately for each data set and each condition. More specifically, we estimated the threshold separation, starting point, two drift rates (one for each response alternative), nondecision time, and the intertrial variabilities of drift rate and starting point. The intertrial variability of nondecision time was set to 0, as in the generation of the data sets.

Results

For each condition and each of the four main diffusion model parameters^{Footnote 5}, we computed the difference between the parameter estimates of the accuracy and speed conditions. Boxplots of these differences are depicted in Fig. 2 for the manipulation of the threshold separation, and in Fig. 3 for the manipulation of the nondecision time. Positive values indicate that the parameter estimates of the accuracy condition are larger than the estimates obtained for the speed condition. The boxplots show the first quartile, the median, and the third quartile. Additionally, the gray lines display the means of the different conditions. The black horizontal lines indicate the true differences between the two conditions (e.g., 0.19 for a medium-sized difference in threshold separation). Finally, the red lines show the differences reported by Voss et al. (2004) for the parameters that have not been (intentionally) manipulated. The red lines are plotted at the number of trials (n = 40) that was also employed by Voss et al. (2004), to enhance comparability with their study.

Figure 2 shows that the differences in threshold separation between the two conditions were recovered very well. Unsurprisingly, with an increasing number of trials, the estimation grows more precise. However, even for the lower trial numbers the estimations are unbiased, with both the medians and means of the differences very close to the true differences. Importantly, the nondecision times were also estimated very well. Only in the condition with the largest effect size and the smallest trial number (i.e., 40 trials) does a noticeable positive difference between the two conditions show up: In the accuracy condition, the mean nondecision times were longer than in the speed condition, even though the data were generated on the assumption of no difference in this parameter.

Notably, the mean difference found in the simulation study for the medium effect size is smaller than the mean difference observed by Voss et al. (2004)—indicated by the red line—as a one-sample t test revealed (p < .001, d_z = 0.48). Moreover, the mean difference in the simulation study deviated from 0 to only a small degree (p < .001, d_z = 0.12). Accordingly, the effect on nondecision time reported by Voss et al. (2004) is probably mainly attributable to the manipulation and not to the estimation procedure.

In terms of drift rates, there were no systematic differences between the two conditions. Note that also in the study by Voss et al. (2004), no significant differences in drift rates between the two conditions appeared. Finally, regarding the starting point, no differences between the accuracy and speed conditions were present, in the data from either our simulation study or the study by Voss et al. (2004).

The parameter estimation was also very effective if nondecision time was varied between conditions. Figure 3 reveals that the effect on nondecision time was captured exclusively by the nondecision time parameter. Even for small trial numbers and for small to large effects, the effect was attributed to the correct parameter.

Discussion

In the experimental validation study by Voss et al. (2004), the speed–accuracy manipulation influenced both threshold separation and nondecision time. As expected, the threshold separation was higher in the accuracy than in the speed condition. Additionally, however, the nondecision time was higher in the accuracy than in the speed condition. We wanted to investigate whether this effect on nondecision time was mainly attributable to trade-offs in parameter estimation. If there are no severe trade-offs, the effect is likely to be attributable to a lack of discriminant validity of the experimental manipulation. The speed–accuracy manipulation might influence not only the accumulation of information, but also nondecisional components such as motoric or encoding processes. Thus, in our simulation study we aimed to disentangle these two possible accounts.

We generated data based on the assumption that differences between the two conditions were based exclusively on either threshold separation or nondecision time. The parameter sets for the data simulation were based on the estimates reported by Voss et al. (2004). In addition to the trial number and effect size from the empirical study, we further examined larger trial numbers and smaller and larger effect sizes. Notably, the parameter estimation worked well: In almost all conditions, the true differences showed up only in the parameter actually manipulated, and in none of the other parameters.

A comparison of the mean difference reported by Voss et al. (2004) with the one in our study further revealed that it is very unlikely that the difference in nondecision time reported by Voss et al. (2004) is attributable to trade-offs in the parameter estimation procedure. Only in the condition with the largest effect and the smallest trial number did the nondecision time estimate increase notably in the accuracy condition. Conversely, we observed no difference in the condition with a medium-sized effect that was based on the effect size observed in the empirical data. Accordingly, we assume that the effect in the empirical study was due to true differences in nondecision time between the speed and accuracy conditions.

Note that in the interpretation of our results, we make the assumption that the empirical data on which we based our simulation study were generated by diffusion processes. Theoretically, differences in nondecision time between the speed and accuracy conditions might have resulted from contaminated data—that is, from trials in which decisions were not (purely) based on a diffusion process but, for example, on fast guesses or attention lapses. If there were more fast contaminants in the speed condition and more slow contaminants in the accuracy condition, this could explain the differences in nondecision time. Thus, effects on nondecision time could emerge, even though there were no real differences in either motoric response execution or the encoding of information, but rather in the contamination between the two conditions.

To examine whether contaminants were responsible for the effect on nondecision time, we “contaminated” 4% of trials (a typical value that has also been used, e.g., by Ratcliff & Childers, 2015; Ratcliff & Tuerlinckx, 2002) by changing the simulated data. We assumed an extreme case in which, in the low-threshold condition 100% of the contaminants were fast, and in the high-threshold condition 100% of the contaminants were slow. Given this extreme case, the contaminants should be most likely to impact nondecision time. For the generation of contaminants, we used the same strategy as in Lerche et al. (2017). Fast contaminants simulated fast guesses with a random response (either upper or lower threshold) and the RTs randomly drawn from a uniform distribution ranging from t₀ – 100 ms to t₀ + 100 ms. Thus, the responses in these fast-guess trials were located at the lower edge of the RT distribution. For the slow contaminant trials, the simulated response latencies were replaced by higher values without changing the actual response. The slow RTs were drawn from a uniform distribution ranging from 1.5 to 5 interquartile ranges above the third quartile of the RT distribution. The results of the analyses with contaminated data are presented in Fig. 4. Importantly, despite the presence of contaminants, condition still had no effect on nondecision time. Rather, the presence of contaminants further increased the effect on threshold separation. Thus, the presence of contaminants can result in an overestimation of the effect on threshold separation. In a further set of analyses, we added fast contaminants to the condition with low t₀ and slow contaminants to the condition with high t₀. Again, the contaminants affected threshold separation rather than nondecision time (Fig. 5). In sum, our additional analyses support the view that the effect on nondecision time typically observed in empirical validation studies cannot be explained by estimation problems, but rather is attributable to true effects of the speed–accuracy manipulation on nondecisional components.

In the following section, we present a further simulation study that was conducted to test the generalizability of our results to a different method of data generation (using a multivariate normal distribution) and different underlying parameter values.

Study 2: Simulation study based on Dutilh et al. (2018)

For Study 2, we relied on data from Experiment 3 of Dutilh et al. (2018).^{Footnote 6} Whereas some experiments from this large-scale validation project contained several manipulations, the data of Experiment 3 were based solely on a speed–accuracy manipulation. In terms of convergent validity, the results from the different research teams were promising: 82% (i.e., 14 out of 17 teams) detected the expected effect in the threshold separation parameter. In addition, all teams that employed a diffusion model (whether the simple or the full diffusion model) found the effect. Additionally, 41% (seven out of 17 teams) detected an effect on nondecision time, with a higher estimate in the accuracy than in the speed condition. More specifically, 60% (six out of ten groups) of the diffusion model teams observed this effect, in contrast to only one out of five LBA teams.

Again, essentially two main explanations for the unexpected effect on nondecision time can be put forward: The effect might be due to (1) trade-offs in the parameter estimation or (2) the insufficient validity of the experimental manipulation. It is possible that the participants in the speed condition not only reduced the time spent on accumulating information, but also hastened their motoric responses and information encoding. In our simulation study, we aimed to disentangle the two possible explanations.