Validity of pupil size measurements
Before addressing the hypotheses, we would like to elaborate on the temporal analysis of the pupil size means. Visual inspection of the time course of pupil dilation indicated possible carry-over effects. Across all trials, pupil size decreased for the first 500 ms after trial start and began to increase thereafter. The one-second fixation point duration was therefore apparently not long enough for the pupil diameter to fully return to its baseline value. Inspection of the end of trials indicated that the pupils were still dilating before, during, and after the task response (mouse click), leading to higher values during the subsequent fixation point. This is a possible concern for the validity of the measurements, as the exact extent and duration of the observed effects are unknown. One possible explanation is that shorter trials were systematically affected more, as the recording was cut off earlier and the trailing pupil dilation would carry into the following fixation point and trial. On longer trials, there was more time for the pupil size to plateau, which could be a reason why our results showed higher values for higher angular disparity. This explanation is unlikely, however, as it should have been compensated for by including the effect of reaction time on pupil dilation in the model. Nevertheless, assuming that carry-over effects depend only on the difficulty of the previous trial (with trials in random order) and are somewhat random in magnitude and duration, they at least introduce additional variance into the measurements, which in turn reduces the power of the design.
Despite these issues, we chose not to alter the measurements. Regarding the baseline measurement, we cannot isolate a baseline because the observed decrease in pupil size may overlap in time with the expected increase due to cognitive effort. However, we also conducted the pupil diameter analysis using baseline values for each stimulus, which were recorded at the beginning of the experiment. This form of analysis has its own problems, chiefly that it does not control for the random trial-by-trial pupil fluctuations. Interestingly, the results of this analysis and of the analysis described above did not differ in the significance of any effect in question. This could indicate that the starting values were shifted only by a constant amount, for example with baseline measurements at the start of each trial always being 0.5 units larger than the true baseline. Regarding the measurement of maximal pupil dilation, one could include the following fixation point in the analysis of each trial. This, however, would introduce other influences on the data, e.g., new visual input and the pupillary light reflex. In addition, altering the analysis would not resolve the problem of the carry-over effects. Consequently, the results for the pupillometric hypotheses must be considered with caution, regardless of a possible change of measurements. Nevertheless, these issues are relevant for both past and future studies of pupil dilation during mental rotation, as well as for other cognitive tasks where similar problems might arise. In experiments where the trial was cut off directly after the task response, the results have to be interpreted cautiously, as carry-over effects might have a similar impact there (e.g., Campbell et al., 2018). In a recent study using another approach, Bochynska et al. (2021) showed the stimulus for 4 s, independent of the task response. However, since trials with response times longer than 4 s (141 of 1064 trials) were excluded from the analysis, tasks of higher angular disparity might have been only partially included, and the ones included could also have been affected by delayed pupil dilation.
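The argument about a constant baseline shift can be illustrated with a minimal sketch. It assumes subtractive baseline correction and uses made-up pupil values in arbitrary units; the function name `baseline_correct`, the traces, and the 0.5 offset are illustrative assumptions and are not taken from our analysis pipeline.

```python
import numpy as np

def baseline_correct(trace, baseline):
    """Subtractive baseline correction: pupil size relative to a baseline value."""
    return np.asarray(trace, dtype=float) - baseline

# Hypothetical mean pupil traces (arbitrary units) for an easy and a hard condition.
easy_trace = np.array([4.0, 4.2, 4.5])
hard_trace = np.array([4.0, 4.6, 5.1])

true_baseline = 4.0
offset = 0.5  # assumed constant inflation of the measured baseline by carry-over

# Condition difference in peak dilation with the true baseline.
diff_true = (baseline_correct(hard_trace, true_baseline).max()
             - baseline_correct(easy_trace, true_baseline).max())

# Same difference when every baseline is too large by the same constant.
diff_shifted = (baseline_correct(hard_trace, true_baseline + offset).max()
                - baseline_correct(easy_trace, true_baseline + offset).max())

print(np.isclose(diff_true, diff_shifted))
```

Under these assumptions the constant cancels in the condition difference, which is consistent with the two analyses agreeing on significance. A shift that varies from trial to trial would not cancel in this way and would instead add variance.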
Based on the observed time course of pupil dilation, future studies should (1) keep showing the task and measuring pupil dilation even after the response for at least 500 ms, and (2) increase the break between trials to at least 2 s.
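Recommendation (1) can be sketched as follows. The example assumes a single trial sampled every 100 ms with a synthetic pupil trace that peaks after the response; the function `peak_dilation`, its parameters, and all values are hypothetical and only illustrate how extending the analysis window changes the measured maximum.

```python
import numpy as np

def peak_dilation(time_ms, pupil, response_ms, post_response_ms=500):
    """Maximum pupil size from stimulus onset (t = 0) until
    post_response_ms after the response."""
    time_ms = np.asarray(time_ms, dtype=float)
    pupil = np.asarray(pupil, dtype=float)
    window = (time_ms >= 0) & (time_ms <= response_ms + post_response_ms)
    return pupil[window].max()

# Hypothetical trial: response at 1000 ms, pupil peaks at 1300 ms.
time_ms = np.arange(0, 2000, 100)
pupil = 4.0 + np.exp(-((time_ms - 1300) / 400.0) ** 2)

# Recording cut off at the response versus extended by 500 ms.
cut_at_response = peak_dilation(time_ms, pupil, response_ms=1000, post_response_ms=0)
extended = peak_dilation(time_ms, pupil, response_ms=1000)

print(cut_at_response < extended)
```

In this synthetic trial the window that ends at the response misses the later peak, whereas the extended window captures it.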
General discussion
The results show no sex differences in behavioral performance. Neither the main effects nor the interactions in the models for accuracy and reaction time show any influence of sex. This is in line with other chronometric mental rotation studies. For instance, Voyer et al. (2006) also report no sex differences in mental rotation performance with 3D cube figures. Jansen-Osmann and Heil (2007) investigated sex differences in mental rotation tasks with five different stimulus types and reported sex differences for only one of them (polygons), but not for the others (3D cube figures, letters, stimuli from primary mental abilities, and animal pictures). However, our results contrast with the study of Voyer and Jansen (2016), which used stimuli that were partially the same as in this study and pointed out that, although a performance improvement was apparent for both sexes, males might benefit more from the embodiment advantage. One reason for this discrepancy may be the different use of the human stimuli in the two studies. Voyer and Jansen (2016) presented head cubes (cubes with the addition of a head), while we investigated human postures.
In accordance with Amorim et al. (2006), our results confirm our second hypothesis that task performance would be better for both embodied figures than for cube figures in both sexes. Our experiment shows a significant embodiment effect. Both embodied figures in this object-based transformation task were processed more easily on a behavioral level (shorter reaction time and higher accuracy). This is consistent with other studies (e.g., Amorim et al., 2006; Campbell et al., 2018; Voyer & Jansen, 2016).
In line with the paradigm for chronometric mental rotation tasks, changes in angular disparity significantly influenced all dependent variables in all models. Higher angular disparity between the two pictures resulted in longer reaction times and lower accuracy. These effects are larger for the abstract than for the embodied figures. In particular, the interaction of cube figures and angular disparity showed a stronger negative impact on performance than the corresponding interactions for body postures and human figures.
In terms of cognitive load (hypothesis 3), cube figures show the highest values in pupil diameter, followed by body postures and human figures. Here, both embodied figure types differ significantly from the cube figures. Therefore, the highest cognitive load manifests in cube figures, indicating that these tasks are more difficult to solve. This finding is congruent with our results for behavioral performance.
An additional possible explanation for the lower pupil diameter for both embodied figures compared to cube figures might be a congruency effect. In a pupillometry experiment on the Stroop task, Hershman and Henik (2019) report lower pupil diameter values for neutral (colored letters with no meaning) than for color-congruent (word and word color align) tasks, indicating higher cognitive load in the latter due to a task conflict between reading the word and naming the color of the word. This is based on the effect that stimuli evoke the tasks that are strongly associated with them (Rogers & Monsell, 1995; Waszak et al., 2003). That is, the neutral stimulus has more task congruency because it only elicits naming the color, whereas the color-congruent word elicits reading of the word, creating a conflict with the response. These findings concur with the description of motoric embodiment (i.e., imagination and execution of actions addressing the same motor representations) by Amorim et al. (2006), and might also apply to this study. The task response was given using a desktop mouse. With the hand being an important and salient feature of human figures and body postures, a congruency effect with the hand response is possible. That is, higher congruency could also lead to lower cognitive load and might thus be an additional factor to consider regarding motoric embodiment effects in this experiment.
Overall, the first part of our hypothesis 3 was confirmed by the modeling results, with the main effects of cube figures and higher angular disparity increasing pupil dilation the most. Also as predicted, the two embodied figure types did not differ significantly. Interestingly, the interaction of stimulus type and angular disparity showed the highest increases for human figures. Because reaction time was included in the model, the results illustrate the additional effect of this interaction on top of the effect of reaction time. Here, the pupil sizes for cube figures are higher because the task is more difficult to solve. Thus, the range available for a further increase was smaller than for the embodied figures, resulting in a less steep slope across angular disparity, which might indicate a ceiling effect. However, no significant differences emerged between body postures and either cube or human figures, with the values of body postures lying between those two. Consequently, body postures cannot be placed clearly in this regard, which should be investigated further in future experiments. Additionally, changes in pupil dilation were not fully explained by variations in angular disparity. Reaction time itself still predicted a significant portion of the pupil diameter in the statistical model. That is, both reaction time and angular disparity had an impact on cognitive load, but neither alone seemed sufficient to describe their relationship to each other or to account for task difficulty. As a consequence, it is possible that the relationship between difficulty and cognitive load is not linear.
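The point that reaction time and angular disparity each carry their own share of the pupil response can be illustrated with a simplified sketch. It uses ordinary least squares on noise-free synthetic data, which is a deliberate simplification and not the model fitted in this study; all values and coefficients are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical trials: angular disparity (degrees) and reaction time (s).
disparity = np.array([0, 45, 90, 135, 180, 0, 45, 90, 135, 180], dtype=float)
reaction_time = np.array([1.0, 1.6, 1.9, 2.8, 3.0, 1.4, 1.2, 2.5, 2.2, 3.6])

# Synthetic pupil sizes generated from BOTH predictors (assumed coefficients, no noise).
pupil = 4.0 + 0.002 * disparity + 0.10 * reaction_time

# Design matrix with intercept, disparity, and reaction time as a covariate.
X = np.column_stack([np.ones_like(disparity), disparity, reaction_time])
coef, *_ = np.linalg.lstsq(X, pupil, rcond=None)
intercept, b_disparity, b_rt = coef

print(intercept, b_disparity, b_rt)
```

Because reaction time is only partly determined by disparity in these synthetic trials, both slopes remain non-zero when the predictors are entered together, mirroring the pattern described above. A nonlinear relationship between difficulty and load would require additional terms that this sketch does not include.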
Our expectations regarding sex differences in pupil dilation were based on the findings of Campbell et al. (2018), but our results did not confirm these predictions, as no differences emerged. In conclusion, both sexes showed similar performance in the mental rotation tasks and exerted the same levels of cognitive effort for each stimulus type. This also means that all embodiment facilitations provoked similar changes in cognitive load and task performance for both sexes. This suggests that both males and females are able to solve mental rotation tasks with the same amount of spatial embodiment with the same effort (Amorim et al., 2006). Spatial embodiment includes the mapping of a body-relevant coordinate system that facilitates the mental rotation process. In our case, having controlled for object-based transformations through the instructions, spatial embodiment seems to partially explain the differences in behavioral performance. With the results for pupil dilation matching those for the behavioral data, spatial embodiment can explain the reduced pupil size for embodied figures, as these figures have a logical alignment in geometric space.
Our findings did not show any differences between the two embodied stimulus types. This is interesting, as they differ from each other in their geometric form. Body postures and cube figures are s-shaped, comprising three bends, with body postures imitating the shape of the abstract figures. Human figures instead comprise only one or two bends and are more i- or t-shaped. Thus, the stimulus types differ in spatial complexity, and human figures could theoretically be assumed to facilitate visual absorption and processing. However, the difference in geometric form had no impact in this experiment. Overall, our results provide, for the first time, evidence that males and females need the same cognitive effort to solve this task.