## Introduction

## How to disentangle task-selection failures and task-execution failures?

### The distinction between cognitive and observed error types

### Methodology I: univalent task-response mapping

### Methodology II: capitalizing on stimulus congruency

### Methodology III: using more response keys than levels of stimulus dimension

## The role of inhibition and associative learning in task switching

### Evidence for a slow error-correction mechanism in task switching

## The present study

## General method

### MPT modelling

_{s} whose value is constrained to fall between 0 and 1. An observable event (e.g. an error that is empirically categorized as task-confusion error) is considered to be the result of the success (or failure) of those unobservable cognitive events. Crucially, the structure of a model depends on the cognitive processes that the experimenter assumes to take place in the paradigm. For this reason, the equations describing an MPT model depend on how a response is thought to arise from the specific cognitive mechanisms involved in the paradigm. Having set the equations for mapping the observable response categories to the cognitive processes, and knowing the frequencies with which each response category is observed, it is possible to estimate the probabilities of the latent cognitive events, which are represented by the model’s parameters (Hu & Batchelder, 1994). After this, the probability of occurrence of an observable event can be computed by first multiplying the model’s parameters along a branch, and second, adding the resulting probabilities for all branches that lead to this event.
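As a minimal sketch of this multiply-then-add rule, the following Python snippet evaluates a small hypothetical processing tree. The parameter names (`p_select`, `p_execute`, `p_guess`), the tree structure, and the category labels are illustrative assumptions and are not the model fitted in the present study.

```python
from math import prod

def category_probabilities(p_select, p_execute, p_guess):
    """Return the probability of each observable category for a toy tree.

    Hypothetical tree (an assumption for illustration only):
    - task selection succeeds (p_select), then execution succeeds or fails;
    - task selection fails (1 - p_select), then a guess is correct or not.
    """
    # Each branch: (parameters multiplied along the branch, observable category).
    branches = [
        ([p_select, p_execute], "correct"),
        ([p_select, 1 - p_execute], "execution_error"),
        ([1 - p_select, p_guess], "correct"),
        ([1 - p_select, 1 - p_guess], "selection_error"),
    ]
    probabilities = {}
    for factors, category in branches:
        # Multiply along the branch, then add over branches ending in the same category.
        probabilities[category] = probabilities.get(category, 0.0) + prod(factors)
    return probabilities

probs = category_probabilities(p_select=0.9, p_execute=0.8, p_guess=0.5)
```

Here two branches end in the "correct" category, so its probability is the sum of both branch products, and the category probabilities add up to 1 because every branch of the tree is accounted for.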

### Analysis of N-2 repetition costs

\(d_{z}\) is reported for paired t-tests, computed by dividing the mean of the difference scores by their standard deviation (Brysbaert, 2019; Lakens, 2013).
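The effect size computation described here can be sketched in a few lines of Python. The function name `cohens_dz` and the response times below are made up for illustration and are not data from the study.

```python
from statistics import mean, stdev

def cohens_dz(condition_a, condition_b):
    """Mean of the paired difference scores divided by their standard deviation."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    return mean(diffs) / stdev(diffs)  # stdev uses the sample (n - 1) formula

# Made-up per-participant mean RTs (ms) in two within-subject conditions.
rt_condition_a = [650, 700, 620, 710, 680]
rt_condition_b = [610, 690, 600, 650, 660]

dz = cohens_dz(rt_condition_a, rt_condition_b)
```

With these illustrative values the difference scores are 40, 10, 20, 60 and 20 ms (mean 30, standard deviation 20), so `dz` evaluates to 1.5.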

### Computation of Bayes factors (BFs)

(e.g., for a design with the independent variables A and B, the following models were built: a null model without any effects, a model with only the main effect of A, a model with only the main effect of B, a model with the main effects of A and B but no interaction, and a full model with the main effects of A and B and the interaction AxB). After the models were built, and the likelihood of each model given the data was computed, BFs were calculated as the ratio between this likelihood and that of a null model containing only the grand average and participants as a random effect. Afterwards, inference on a particular effect of interest was achieved by comparing the BF of the best fitting model (i.e. that with the highest BF compared to the null) with that of an otherwise identical model that differs only in the presence/absence of the effect of interest. For example, if the best fitting model contains the main effect of A and an AxB interaction, evidence in favour of the interaction was computed as the ratio between this model’s BF and the BF of a model only containing the main effect of A (Rouder et al., 2017).
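The model-comparison step can be illustrated with a short Python sketch. The Bayes factors below are invented values, and the model labels and the helper `evidence_for_effect` are hypothetical; the snippet only shows the ratio logic, not how the BFs themselves are estimated.

```python
# Made-up Bayes factors of each candidate model against the null model.
bf_vs_null = {
    "null": 1.0,
    "A": 12.0,
    "B": 0.5,
    "A + B": 8.0,
    "A + A:B": 36.0,
    "A + B + A:B": 20.0,
}

def evidence_for_effect(bfs, model_with, model_without):
    """BF for an effect: ratio of the BFs (both vs. null) of two models
    that differ only in the presence of that effect."""
    return bfs[model_with] / bfs[model_without]

# Step 1: the best fitting model is the one with the highest BF against the null.
best_model = max(bf_vs_null, key=bf_vs_null.get)

# Step 2: compare it with the same model minus the effect of interest (here A:B).
bf_interaction = evidence_for_effect(bf_vs_null, best_model, "A")
```

Because both BFs share the same null model in the denominator, their ratio is the Bayes factor of the model with the interaction against the model without it (36 / 12 = 3 in this toy case).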

### Data trimming

## Experiment 1

### Methods

#### Pre-registration

#### Participants

#### Stimuli

#### Trial procedure

#### Experimental procedure

#### Design

### Results

#### Data trimming


#### Frequency of different error types

#### Proportion of task-selection failures and task-execution failures

#### Analysis of N-2 repetition costs

\(\mathrm{BF}_{10}\) = 5.75. Furthermore, there was a main effect of N-1 Speed, F(1,23) = 5.33, p = 0.030, \({\eta }_{p}^{2}\) = 0.19, \({\eta }_{G}^{2}\) = 0.010, \(\mathrm{BF}_{10}\) = 2.90, indicating that trials following a slow response were slower compared to trials following a fast response. N-2 Accuracy significantly interacted with N-1 Speed, F(1,23) = 11.02, p = 0.003, \({\eta }_{p}^{2}\) = 0.32, \({\eta }_{G}^{2}\) = 0.012, \(\mathrm{BF}_{10}\) = 6.33. Most importantly, the three-way interaction between Task Sequence, N-2 Accuracy and N-1 Speed was not significant, F < 1, \(\mathrm{BF}_{10}\) = 0.31. N-2 repetition costs following an N-2 correct response were 52 ms for the N-1 fast condition and 37 ms for the N-1 slow condition. If there was a response-confusion error in N-2, N-2 repetition costs were identical irrespective of N-1 Speed (13 ms in both N-1 fast condition and N-1 slow condition).
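As a quick consistency check on the effect sizes reported above, partial eta squared can be recovered from an F value and its degrees of freedom via the standard identity \({\eta }_{p}^{2} = F \cdot df_{1} / (F \cdot df_{1} + df_{2})\). The function name below is a hypothetical helper, not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F values and degrees of freedom as reported in the text.
eta_n1_speed = partial_eta_squared(5.33, 1, 23)       # main effect of N-1 Speed
eta_interaction = partial_eta_squared(11.02, 1, 23)   # N-2 Accuracy x N-1 Speed

rounded = (round(eta_n1_speed, 2), round(eta_interaction, 2))
```

Rounded to two decimals, these give 0.19 and 0.32, matching the reported values.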

### Discussion

\(\mathrm{BF}_{10}\) = 0.31). N-2 repetition costs after N-2 response-confusion errors were numerically identical after fast and slow N-1 trials, suggesting that N-2 repetition costs after N-2 response-confusion errors were not modulated by N-1 speed (for further analysis excluding the N-1 speed factor, see the Online Appendix).

## Experiment 2

### Methods

#### Pre-registration

#### Participants

#### Stimuli

#### Responses

#### Procedure

#### Design

#### Data trimming

### Results

#### Frequency of different error types

#### Proportion of task-selection failures and task-execution failures

#### Analysis of N-2 repetition costs

\(\mathrm{BF}_{10}\) = 0.73. Numerically, the data pattern was as follows: in N-1 fast trials, N-2 repetition costs were present following a correct response in N-2 (38 ms), t(20) = 3.42, p = 0.003, \(d_{z}\) = 0.75, \(\mathrm{BF}_{10}\) = 15.26, but turned into a numerical facilitation in the N-2 error condition (−28 ms), t > −1, \(\mathrm{BF}_{10}\) = 0.31. In contrast, in N-1 slow trials, N-2 repetition costs were present both after N-2 correct (44 ms), t(20) = 3.90, p < 0.001, \(d_{z}\) = 0.85, \(\mathrm{BF}_{10}\) = 40.10, and after N-2 error trials (55 ms), t(20) = 2.14, p = 0.044, \(d_{z}\) = 0.47, \(\mathrm{BF}_{10}\) = 1.49. In the N-2 incongruent dataset, on the other hand, the three-way interaction was not significant, F < 1, \(\mathrm{BF}_{10}\) = 0.29. N-2 repetition costs after N-1 fast trials were 61 ms after N-2 correct and 37 ms after N-2 error; N-2 repetition costs after N-1 slow trials were 37 ms after N-2 correct and 30 ms after N-2 error.
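For a paired t-test, the effect size \(d_{z}\) and the t statistic are linked by \(d_{z} = t / \sqrt{n}\), where n is the number of participants (degrees of freedom plus one). The sketch below uses this identity to cross-check the values reported above; the helper name `dz_from_t` is an assumption introduced for illustration.

```python
from math import sqrt

def dz_from_t(t_value, df):
    """Cohen's d_z implied by a paired t statistic with df = n - 1."""
    n = df + 1
    return t_value / sqrt(n)

# t values as reported in the text, each with 20 degrees of freedom.
reported_t = {"fast_correct": 3.42, "slow_correct": 3.90, "slow_error": 2.14}
implied_dz = {label: round(dz_from_t(t, 20), 2) for label, t in reported_t.items()}
```

The implied values (0.75, 0.85 and 0.47) agree with the reported effect sizes.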

### Discussion

## Experiment 3

### Methods

#### Participants

#### Stimuli

#### Procedure

#### Design

### Results

#### Data trimming

#### Frequency of different error types

#### Proportion of task-selection failures and task-execution failures

#### Analysis of N-2 repetition costs

\(\mathrm{BF}_{10}\) = 1.23. The interaction indicates that N-2 repetition costs were present following a correct response in N-2 (39 ms), t(88) = 8.91, p < 0.001, \(d_{z}\) = 0.95, \(\mathrm{BF}_{10}\) > 100, but were absent following a task-confusion error (5 ms), t < 1, \(\mathrm{BF}_{10}\) = 0.12. In N-1 slow trials, the same 2 × 2 interaction was instead absent, F < 1, \(\mathrm{BF}_{10}\) = 0.26. Within this subset, N-2 repetition costs were again found following a correct response in N-2 (26 ms), t(88) = 4.97, p < 0.001, \(d_{z}\) = 0.53, \(\mathrm{BF}_{10}\) > 100, as well as in the N-2 error condition (40 ms), t(88) = 3.28, p = 0.001, \(d_{z}\) = 0.35, \(\mathrm{BF}_{10}\) = 16.24.

### Discussion

\(\mathrm{BF}_{10}\) = 0.12) following task-confusion errors in trial N-2, but as expected, this was the case only when the N-1 post-error trial was a fast one (note that this three-way interaction was significant in the frequentist ANOVA, but was not strongly supported by the more conservative Bayes factor analysis). Instead, when the N-1 post-error trial was slow, there was no evidence for any difference in N-2 repetition costs following N-2 correct or N-2 error trials (F < 1, \(\mathrm{BF}_{10}\) = 0.26, for the interaction). N-2 repetition costs were observed in both conditions.