Results
Participants were highly accurate (M = 83%) in auditory-only trials and almost reached ceiling performance (M = 96%) in congruent trials (see Fig. 2). To test whether accuracy levels differed statistically in the auditory-only trials and congruent trials, we performed a one-way ANOVA with the within-subjects factor Eyes (Auditory only, Eyes closed (congruent), Eyes open (congruent)). As before, there was a main effect of Eyes (F(2,138) = 74.86, p < 0.001, η²G = 0.343), showing that participants had a significantly lower accuracy in the auditory-only trials compared to the other two trial types (both corrected ps < 0.001; averaged Cohen’s d = 1.13). These differences in participants’ accuracy levels were also reflected in their response times (F(2,138) = 3.66, p = 0.030, η²G = 0.011), with slower responses in the auditory-only trials.
As before, participants chose the fusion response in a high proportion of trials (M = 55%), indicating that they experienced the McGurk illusion (see Fig. 3). The proportion of fusion responses was descriptively, yet not significantly, smaller for Eyes closed compared to Eyes open, as shown by a paired t-test (t(69) = 1.57, p = 0.122, Cohen’s d = 0.19). Again, this difference was reflected in a mirror-inverted pattern for the auditory syllables, which were selected significantly more often in the Eyes closed compared to the Eyes open condition, as shown by a paired t-test (t(69) = 2.10, p = 0.039, Cohen’s d = 0.23).
When integrating the latter result into the overall context, it seems that the effect of Eyes (i.e., a reduced McGurk illusion when the speaker’s eyes are closed) is significant in the Static condition only, yet fails to reach significance in the Static matched condition (see above) and Dynamic condition (see Sect. "Incongruent trials - Response choices"). However, when looking at the data more closely, it turns out that the effect of Eyes depends on the basic size of the McGurk illusion (i.e., the percentage of perceived fused responses). In particular, the reason for the absence of the effect of Eyes in the Static matched and Dynamic conditions might be the generally smaller McGurk illusion in these conditions compared to the Static condition (Static matched: 55%; Dynamic: 62%; Static: 74%; see Fig. 3).
As we noticed that our data for the Static matched and Dynamic conditions seemed to be bimodally distributed, we considered performing a Median split to gain a better understanding of participants' behavior. To this end, we first assessed the degree of bimodality by calculating a bimodality coefficient (Pfister et al., 2013). In line with Pfister and colleagues, we considered a coefficient larger than 0.55 as an indication of bimodality. For both conditions, the computed coefficients surpassed this reference value (Static matched: 0.58; Dynamic: 0.59), suggesting that a Median split is a reasonable approach.
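To make the bimodality check concrete, the following is a minimal sketch of a bimodality coefficient computed from sample skewness and sample excess kurtosis. It assumes the commonly used formula with bias-corrected moments; the function name `bimodality_coefficient` and the example data are hypothetical and are not taken from the study.

```python
from math import sqrt


def bimodality_coefficient(values):
    """Bimodality coefficient from sample skewness and excess kurtosis.

    Assumed formula: BC = (G1^2 + 1) / (G2 + 3(n-1)^2 / ((n-2)(n-3))),
    where G1 and G2 are the bias-corrected sample skewness and
    sample excess kurtosis.
    """
    n = len(values)
    if n < 4:
        raise ValueError("at least 4 observations are required")

    mean = sum(values) / n
    # Central moments (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("values must not all be identical")

    g1 = m3 / m2 ** 1.5        # uncorrected skewness
    g2 = m4 / m2 ** 2 - 3.0    # uncorrected excess kurtosis

    # Bias-corrected sample estimates.
    skew = sqrt(n * (n - 1)) / (n - 2) * g1
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)

    return (skew ** 2 + 1.0) / (kurt + 3.0 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


# Hypothetical fusion proportions: two clusters vs. a single peak.
two_clusters = [0.1] * 10 + [0.9] * 10
single_peak = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]

REFERENCE_VALUE = 0.55  # threshold used in the text
print(bimodality_coefficient(two_clusters) > REFERENCE_VALUE)
print(bimodality_coefficient(single_peak) > REFERENCE_VALUE)
```

In this sketch, per-participant fusion proportions for one condition would be passed as `values`, and the result compared against the reference value of 0.55.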
We first conducted a Median split for the Static matched condition and analyzed the above-Median and below-Median data sets separately. For the above-Median data set, the size of the McGurk illusion was 82% when the speaker’s eyes were open and 78% when the speaker’s eyes were closed, a significant difference (t(34) = 3.16, p = 0.003, Cohen’s d = 0.31). Note that this effect size is comparable to the effect size of the Static condition. In contrast, for the below-Median data set, the size of the McGurk illusion was 29% when the speaker’s eyes were open and 30% when the speaker’s eyes were closed, showing no significant difference (t(34) = −0.61, p = 0.546, Cohen’s d = −0.05). The same Median split for the Dynamic condition yielded the same pattern for eyes open vs. closed (above-Median: 85% vs. 82%; below-Median: 40% vs. 41%). The difference between eyes open vs. closed was close to significant for the above-Median data set (t(34) = 1.86, p = 0.069, Cohen’s d = 0.29), yet it was not significant for the below-Median data set (t(34) = −0.58, p = 0.564, Cohen’s d = −0.05).
To sum up, the effect of Eyes can be detected only if participants reliably perceive the McGurk illusion. Thus, the effect of Eyes can only be seen in those participants showing a large McGurk illusion (i.e., in the above-Median data set) but not in those showing a small McGurk illusion (i.e., in the below-Median data set).
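The Median split analysis described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis script used in the study: it assumes that participants are split on their mean fusion proportion across both Eyes levels and that values equal to the median go to the lower half. The function names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, median, stdev


def median_split(pairs):
    """Split (eyes_open, eyes_closed) pairs at the median of their mean.

    Assumption: the split criterion is each participant's mean fusion
    proportion across both Eyes levels; ties go to the lower half.
    """
    overall = [(o + c) / 2 for o, c in pairs]
    cut = median(overall)
    above = [p for p, m in zip(pairs, overall) if m > cut]
    below = [p for p, m in zip(pairs, overall) if m <= cut]
    return above, below


def paired_t(pairs):
    """Paired t statistic (open minus closed) and its degrees of freedom."""
    diffs = [o - c for o, c in pairs]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical per-participant fusion proportions (eyes open, eyes closed).
data = [(0.90, 0.80), (0.80, 0.75), (0.85, 0.80),
        (0.30, 0.30), (0.20, 0.25), (0.40, 0.35)]

above, below = median_split(data)
t_above, df_above = paired_t(above)
t_below, df_below = paired_t(below)
print(len(above), len(below), df_above, df_below)
```

With 70 participants per condition, a split of this kind gives 35 participants per half, which matches the 34 degrees of freedom reported for the paired t-tests.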
Critically, we also compared the proportion of fusion responses in the control condition (Static matched) with the Static and the Dynamic conditions from the main experiment by conducting a 2 × 3 ANOVA with the within-subjects factor Eyes (Open, Closed), the between-subjects factor Motion (Static, Dynamic, Static matched), and with the proportion of fusion responses as a dependent variable. We found a significant main effect of Eyes (F(1,207) = 7.51, p = 0.006, η²G = 0.001), indicating that participants selected fewer fusion responses when the speaker’s eyes were closed (vs. open). There was also a significant main effect of Motion (F(2,207) = 9.08, p < 0.001, η²G = 0.079). The interaction effect was not significant (F(2,207) = 0.54, p = 0.542, η²G < 0.001). We followed up the main effect of Motion with pairwise comparisons. There was a significant difference between Static and Static matched (t(138) = 4.22, p < 0.001, Cohen’s d = 0.71), with a higher proportion of fusion responses in Static. There was no significant difference between Dynamic and Static matched (t(138) = 1.49, p = 0.138, Cohen’s d = 0.25). On a descriptive level, however, there was a higher proportion of fusion responses in Dynamic. This result indicates that the extent to which participants in our control condition (which was identical to the Static condition yet matched in length to the Dynamic condition) experienced the McGurk illusion was more similar to the Dynamic condition than to the Static condition (see Fig. 3).
We performed the same 2 × 3 ANOVA with response times as the dependent variable. The pattern of results mirrored the analysis for fusion responses (see Fig. 4). There was a significant main effect of Eyes (F(1,207) = 5.35, p = 0.022, η²G = 0.001), indicating that participants were faster to select a response when the speaker’s eyes were closed (vs. open). This suggests that when the speaker’s eyes were closed, participants were more likely to choose the accurate auditory response (instead of the fusion response) and to make this response faster compared to when the speaker’s eyes were open. There was also a significant main effect of Motion (F(2,207) = 3.90, p = 0.022, η²G = 0.034). The interaction effect was not significant (F(2,207) = 1.30, p = 0.275, η²G < 0.001). We followed up the main effect of Motion with pairwise comparisons. There was a significant difference between Static and Static matched (t(138) = 2.90, p = 0.004, Cohen’s d = 0.49), with slower responses in Static matched. There was no significant difference between Dynamic and Static matched (t(138) = 1.40, p = 0.164, Cohen’s d = 0.24). On a descriptive level, however, responses were slower in Static matched. This result indicates that participants were fastest to respond in the Static condition, distinctly slower in the Dynamic condition, and again slightly slower in the Static matched control condition (see Fig. 4).
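As a consistency check on the between-subjects pairwise comparisons, Cohen’s d can be recovered from an independent-samples t statistic via d = t · sqrt(1/n1 + 1/n2). The sketch below assumes 70 participants per Motion group, which is inferred from the reported degrees of freedom (t(138), F(2,207)) rather than stated in this section; the function name is hypothetical. Under that assumption, the computed values round to the reported effect sizes.

```python
from math import sqrt


def cohens_d_from_t(t, n1, n2):
    """Cohen's d implied by an independent-samples t statistic."""
    return t * sqrt(1.0 / n1 + 1.0 / n2)


# Assumption: 70 participants per group, inferred from df = 138 = 70 + 70 - 2.
N_PER_GROUP = 70

# Reported (t, d) pairs for the pairwise comparisons in the text.
reported = {
    "fusion: Static vs. Static matched": (4.22, 0.71),
    "fusion: Dynamic vs. Static matched": (1.49, 0.25),
    "RT: Static vs. Static matched": (2.90, 0.49),
    "RT: Dynamic vs. Static matched": (1.40, 0.24),
}

recomputed = {
    label: round(cohens_d_from_t(t, N_PER_GROUP, N_PER_GROUP), 2)
    for label, (t, _) in reported.items()
}

for label, d in recomputed.items():
    print(label, d, "reported:", reported[label][1])
```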