Introduction

Time and space are tightly linked not only in the physical world but also in psychological experience. A number of theoretical and empirical studies (Bender & Beller, 2014; Casasanto & Boroditsky, 2008; Bonato et al., 2012; Fuhrman et al., 2011; Núñez & Cooperrider, 2013; Núñez & Sweetser, 2006) indicate that people may rely on space to think about time. For example, accumulating evidence suggests that people represent elapsing time by mapping it onto a linear spatial layout (Santiago et al., 2010; Weger & Pratt, 2008). In the literature, the term mental time line (MTL) has been adopted as a standard and direct way to account for such space-time interactions in the mind (Bonato et al., 2012). To date, interesting issues concerning the MTL include, but are not limited to: cross-linguistic differences in temporal cognition (Boroditsky, 2001; Boroditsky et al., 2011; Fuhrman et al., 2011; Núñez & Sweetser, 2006), factors that may shape the construct of the MTL (Bergen & Chan Lau, 2012; Fuhrman et al., 2011; Vallesi et al., 2014), directionalities of the MTL (Fuhrman & Boroditsky, 2010; Ding et al., 2015), and the number of MTLs that a person can possess (Miles et al., 2011; Sinha et al., 2011). These questions have attracted much attention and controversy.

The study at issue

In one of the influential studies on the MTL, Miles et al. (2011) noted that temporal relations are cross-linguistically expressed using spatiotemporal metaphors. While spatiotemporal metaphors in English predominantly depict time as flowing along a horizontal plane (e.g., the day before yesterday, after graduation), Mandarin also employs an additional vertical dimension, i.e., shàng (“up,” referring to an earlier event or time-point) and xià (“down,” referring to a later event or time-point). Miles et al. (2011) reasoned that sociolinguistic conventions would shape temporal cognition and that the frequent use of L1 Mandarin and L2 English spatiotemporal metaphors might lead Mandarin-English (ME) bilinguals to maintain both horizontal (i.e., L2 English) and vertical (i.e., L1 Mandarin) representations of temporal information.

To examine this possibility, Miles et al. (2011) designed a temporal judgment task (Experiment 1) in which ME bilinguals saw pictures that appeared one after another on the computer screen and depicted buildings/cities representing the past (e.g., ancient ruins) or the future (e.g., science fiction scenes). Participants had to judge if the image stood for the past or future time-point by pressing one of two keys on a keyboard. In the horizontal compatible condition, the left key was designated as “past” and the right key “future,” whereas in the horizontal incompatible condition the key assignment was reversed. Likewise, in the vertical compatible condition the top key was designated as “past” and the bottom key “future,” whereas in the vertical incompatible condition this mapping was reversed. Results showed that participants were faster to make a decision in the compatible condition than in the incompatible condition, regardless of whether they did so along a horizontal or a vertical axis. This was taken as evidence demonstrating both a horizontal MTL, consistent with L2 English, and a vertical MTL, congruent with L1 Mandarin, in ME bilinguals’ cognitive domains of time.

Miles et al. (2011) further conducted Experiment 2, in which ME bilinguals were asked to arrange in order two sets of cards depicting temporal sequences of natural events. The results confirmed and extended those of Experiment 1, suggesting that ME bilinguals’ horizontal preference for temporal reasoning was very likely prompted by English sociolinguistic conventions and their vertical bias by Mandarin linguistic and cultural elements.

Controversies and the present study

Miles et al.’s (2011) findings were very interesting. Nevertheless, their theoretical assumption and interpretation of the data, i.e., that Mandarin speakers’ horizontal mental representation of time accords with L2 English, seem somewhat problematic. According to search results from the CCL corpus (a corpus developed by the Center for Chinese Linguistics, the largest Mandarin Chinese corpus in the world), 80.35 % of spatiotemporal metaphors in Mandarin were horizontal and 19.65 % were vertical (Xiao, 2012). Search results from some smaller Mandarin corpora (i.e., Yahoo News Taiwan and Google News Taiwan) also confirmed that horizontal expressions of time were used far more frequently than vertical terms (Chen, 2007; Chen & O’Seaghdha, 2013). If a person’s native language serves as the primary determinant of habitual thought, a Mandarin speaker should have two MTLs, i.e., one horizontal line and one vertical line, with the horizontal axis being the relatively dominant one. Therefore, a Mandarin speaker is expected to have two MTLs (the horizontal time line in particular) not because he/she has acquired L2 English, but because there are horizontal and vertical expressions of time in Mandarin per se. In other words, both the horizontal and the vertical MTL that a Mandarin speaker possesses correspond to the two time lines in Mandarin spatiotemporal metaphors.

We performed two experiments with Mandarin monolinguals as participants to test our hypothesis. In Experiment 1, the computer screen presented a horizontal or a vertical array of pictorial stimuli depicting a temporal sequence of natural events, and participants were asked to verify whether the temporal sequence shown in the pictures was in the correct order. Through this experiment, we sought to establish that Mandarin monolinguals, as a group, possess both a horizontal and a vertical MTL, which can be predicted by patterns in Mandarin spatiotemporal metaphors. Experiment 2, in which participants were asked to arrange in order a series of cards depicting temporal sequences of natural events, aimed to confirm and extend the findings from Experiment 1 by examining more closely whether there are individual variations among Mandarin monolinguals in their constructs of the two MTLs.

Experiment 1

Participants

Sixty Mandarin monolinguals (26 females, Mage = 45.9, SDage = 1.75) from mainland China took part in this study in exchange for payment or gifts. Prior to the experiment, all participants completed an L2 English proficiency/experience questionnaire, reporting their L2 proficiency level on a scale from 0 to 4 [0 = (almost) no knowledge of English, 4 = advanced]. They also reported their L2 experience (e.g., the frequency of their exposure to or use of L2 in daily life). All participants reported that their proficiency in English was 0 and that they had almost no exposure to English in daily life. All of them had normal or corrected-to-normal vision and had already obtained a degree in tertiary education, thus having reached a high level of literacy.

Materials

Materials comprised 128 triplets of pictures, all describing themes of temporal progression. Each triplet of pictorial stimuli (21.5 × 11.8 cm or 11.8 × 21.5 cm) showed a natural event at three different temporal stages. The 128 triplets of pictures covered 64 specific themes of temporal sequences (e.g., a famous film star aging, a watermelon being eaten, an orange tree growing).

Each participant completed four testing blocks, each consisting of 32 trials. The four blocks were: a horizontal canonical block in which each triplet of pictorial stimuli was arranged from left to right, as indicated by an arrow alongside the stimulus; a horizontal non-canonical block in which each triplet was arranged from right to left; a vertical canonical block in which each triplet was arranged from top to bottom; and a vertical non-canonical block in which each triplet was arranged from bottom to top (see Figs. 1a, b and 2a, b for examples of the stimuli). The block order was counterbalanced across participants. Each temporal theme appeared twice: once in the canonical block and once in the non-canonical block of the same axis. Stimuli were equally often true and false, and the order of true and false trials was randomized. Each block started with four practice trials, and the items used in the practice trials were not used subsequently in the testing blocks.
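
The block structure described above can be illustrated with a short sketch. It is not the software used in the experiment. The even split of the 64 themes between the two axes, the cycling through all 24 possible block orders, and the names build_block and build_session are assumptions made purely for illustration; the text states only that block order was counterbalanced and that true and false trials were equally frequent and randomized.

```python
import random
from itertools import permutations

# The four block types named in the text.
BLOCKS = [
    ("horizontal", "canonical"),
    ("horizontal", "non-canonical"),
    ("vertical", "canonical"),
    ("vertical", "non-canonical"),
]

def build_block(themes, rng):
    """Return one shuffled block with half true and half false trials."""
    half = len(themes) // 2
    truth = [True] * half + [False] * (len(themes) - half)
    rng.shuffle(truth)
    trials = [{"theme": t, "correct_order": ok} for t, ok in zip(themes, truth)]
    rng.shuffle(trials)
    return trials

def build_session(participant_index, n_themes=64, seed=0):
    """Return a list of (axis, canonicality, trials) for one participant."""
    rng = random.Random(seed + participant_index)
    themes = list(range(n_themes))
    rng.shuffle(themes)
    # Assumption: half of the themes are assigned to each axis, so that a
    # theme appears in the canonical and non-canonical block of one axis only.
    per_axis = {
        "horizontal": themes[: n_themes // 2],
        "vertical": themes[n_themes // 2:],
    }
    # Assumption: counterbalancing cycles through all 24 block orders.
    orders = list(permutations(BLOCKS))
    order = orders[participant_index % len(orders)]
    return [(axis, canon, build_block(per_axis[axis], rng)) for axis, canon in order]
```

Calling build_session(0) yields four blocks of 32 trials each, with 16 true and 16 false trials per block.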

Fig. 1

a Example of a horizontal canonical stimulus. b Example of a horizontal non-canonical stimulus

Fig. 2

a Example of a vertical canonical stimulus. b Example of a vertical non-canonical stimulus

Procedures

All participants were tested individually in a quiet room, and all used the same desktop computer. On each trial, a fixation cross was presented in the center of the screen for 600 ms. Then, the stimulus appeared in the middle of the screen for 5,000 ms. Participants were asked to judge if the temporal sequence described in the stimulus was in the correct order according to the direction of the arrow, and they needed to respond as quickly and accurately as possible by pressing one of the two keys (the key “D,” marked with a blue sticker, represented “false”; the key “K,” marked with an orange sticker, represented “true”) on a standard keyboard. Upon entry of a response, a blank screen of 100 ms replaced the stimulus and a new trial began.

Results

Results from six participants whose accuracy rates were lower than 85 % were considered invalid and excluded from the dataset. Moreover, trials with a response latency more than 3 SD from the mean (5.92 %) and trials in which participants made errors (8.43 %) were omitted from the RT analyses.
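
The exclusion steps above can be sketched as follows. This is a minimal illustration, not the analysis script used in the study. The trial format (dictionaries with the keys pid, rt, and correct) and the function names are hypothetical, and the sketch assumes that the mean and SD are computed per participant over all of that participant's trials, which the text does not specify.

```python
from statistics import mean, stdev

def accuracy_by_participant(trials):
    """Map each participant id to the proportion of correct responses."""
    flags = {}
    for t in trials:
        flags.setdefault(t["pid"], []).append(t["correct"])
    return {pid: sum(f) / len(f) for pid, f in flags.items()}

def clean(trials, min_accuracy=0.85, n_sd=3.0):
    """Apply the three exclusion steps and return the trials kept for RT analysis."""
    kept = []
    for pid, acc in accuracy_by_participant(trials).items():
        # Step 1: drop participants whose accuracy is below the cutoff.
        if acc < min_accuracy:
            continue
        own = [t for t in trials if t["pid"] == pid]
        rts = [t["rt"] for t in own]
        # Assumption: the reference mean and SD are per participant.
        m, sd = mean(rts), stdev(rts)
        # Steps 2 and 3: drop error trials and RTs beyond n_sd SDs of the mean.
        kept += [t for t in own if t["correct"] and abs(t["rt"] - m) <= n_sd * sd]
    return kept
```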

The remaining response data were submitted to 2 (spatial axis: horizontal vs. vertical) × 2 (canonicality of stimulus type: canonical vs. non-canonical) repeated measures ANOVAs. The results revealed main effects of spatial axis [F1 (1, 53) = 5.90, p < 0.05, partial η² = 0.10; F2 (1, 31) = 23.18, p < 0.001, partial η² = 0.428] and canonicality of stimulus type [F1 (1, 53) = 121.18, p < 0.001, partial η² = 0.696; F2 (1, 31) = 34.58, p < 0.001, partial η² = 0.527]. However, the interaction between spatial axis and canonicality of stimulus type was nonsignificant [F1 < 1, p = 0.504, partial η² = 0.008; F2 < 1, p = 0.424, partial η² = 0.021]. As shown in Fig. 3, participants responded faster to canonical than to non-canonical stimuli, irrespective of the spatial axis. Figure 3 also shows that the magnitude of the MTL effect (i.e., RTs in the non-canonical condition minus RTs in the canonical condition) was larger in the horizontal axis (236 ms) than in the vertical axis (133 ms). These results were not due to speed-accuracy trade-offs, because participants’ response accuracy did not differ significantly [F < 1, p = 0.458, partial η² = 0.01] across the four testing blocks (91.73 %, 91.12 %, 92.03 %, and 91.38 %, respectively).
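
As an illustration of how the MTL effect and the by-participant F values could be computed, the sketch below relies on a general property of a fully within-participant 2 × 2 design: each effect has one degree of freedom, so its F(1, n − 1) equals the squared one-sample t statistic of the corresponding per-participant contrast. The data layout (one dictionary of condition means per participant, keyed hc, hn, vc, and vn for horizontal/vertical and canonical/non-canonical) and the function names are hypothetical, and this is not the software used in the study.

```python
from math import sqrt
from statistics import mean, stdev

def contrast_f(scores):
    """F(1, n - 1) of a 1-df within-participant contrast (squared one-sample t)."""
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t

def anova_2x2_within(cells):
    """Return F values for both main effects and the interaction."""
    axis = [(p["hc"] + p["hn"]) / 2 - (p["vc"] + p["vn"]) / 2 for p in cells]
    canon = [(p["hn"] + p["vn"]) / 2 - (p["hc"] + p["vc"]) / 2 for p in cells]
    inter = [(p["hn"] - p["hc"]) - (p["vn"] - p["vc"]) for p in cells]
    return {
        "axis": contrast_f(axis),
        "canonicality": contrast_f(canon),
        "interaction": contrast_f(inter),
    }

def mtl_effect(cells, axis):
    """Mean non-canonical minus canonical RT for axis 'h' or 'v'."""
    return mean([p[axis + "n"] - p[axis + "c"] for p in cells])

# Hypothetical condition means (ms) for three participants, for illustration only.
example = [
    {"hc": 1000, "hn": 1200, "vc": 1100, "vn": 1250},
    {"hc": 900, "hn": 1150, "vc": 950, "vn": 1050},
    {"hc": 1100, "hn": 1400, "vc": 1200, "vn": 1300},
]
```

The same functions could be applied to by-item means to obtain values analogous to F2, under the same assumptions.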

Fig. 3

Mean RTs for canonical and non-canonical stimuli along the horizontal and the vertical axis for Mandarin monolinguals. The figure plots by-participant mean RTs. Error bars indicate standard errors of the mean

Discussion

Participants showed both a horizontal and a vertical canonicality effect, suggesting that they do access both a left-to-right and a top-to-bottom representation of temporal information. Moreover, the faster RTs in the horizontal canonical condition than in the vertical canonical condition, together with the greater magnitude of MTL effect in the horizontal axis than in the vertical axis, reveal that the horizontal axis occupies a relatively dominant role between the two MTLs. Overall, results of Experiment 1 demonstrate that Mandarin monolinguals’ horizontal and vertical space-time mappings were commensurate with patterns in spatiotemporal metaphors in Mandarin.

It should be noted that the stimuli and procedures employed in the present research are not entirely the same as those of Miles et al.’s (2011) Experiment 1. Miles et al. (2011) measured participants’ space-time mappings via the arrangement of response keys: the stimuli did not convey any spatiotemporal information, but participants were asked to press horizontally or vertically arranged keys (one key representing “past” and the other “future”), which designated the directions of time flow (see also Boroditsky et al., 2011; Fuhrman et al., 2011). By contrast, our research examined participants’ space-time associations through the layout of the stimuli: the response keys did not carry any spatiotemporal information (one key representing “true” and the other “false”), but the stimuli were arranged horizontally or vertically to indicate the linear path of elapsing time (for relevant studies, see Boroditsky, 2001; Chen, 2007; Tse & Altarriba, 2008). Which of the two experimental paradigms better captures the cognitive mechanisms of temporal processing remains an unexplored issue, as different investigators may favor one paradigm over the other. What is clear thus far, however, is that the discrepancy in stimuli and procedures between the present study and Miles et al. (2011) does not constitute a factor that affects the MTL effects observed.

Experiment 2

Mandarin monolinguals, as a group, encode the passage of time into both horizontal and vertical linear spatial paths, which is indicative of a link between language and cognition. However, some previous research has noted that there may be individual variations within the Mandarin population, given that some Mandarin speakers favor a horizontal representation of time and others tend to conceive of time as vertically oriented (Bergen & Chan Lau, 2012). Thus, the results of Experiment 1 give rise to an important question: Does a Mandarin monolingual individual simultaneously possess two MTLs, or is one specific space-time mapping routinely preferred over the other across individuals?

To address this question, we adopted a simple card arrangement task similar in design to several previous studies (Bergen & Chan Lau, 2012; Miles et al., 2011; Tversky et al., 1991). The card arrangement task can explicitly elicit how an individual tends to represent time (Jin & Huang, 2012). Moreover, the experimental setup abided by an open-design principle (Bender & Beller, 2014): there were neither spatial prompts (e.g., participants were not instructed to arrange the cards horizontally or vertically) nor spatial restrictions on participants’ layout of the cards (e.g., participants were not confined to arranging the cards horizontally or vertically). Participants could lay out the cards however they wished according to their own cognitive representations of temporal sequences.

Participants

Forty-four Mandarin monolinguals (19 females, Mage = 50.33, SDage = 4.19) from mainland China took part in this study in exchange for payment or gifts. The participants had almost no exposure to English in daily life (mean English proficiency = 0, SD = 0). All of them had normal or corrected-to-normal vision and had already obtained a degree in tertiary education, thus having reached a high level of literacy.

Materials

Materials comprised six sets of printed photograph cards, and each of them described a distinctive temporal progression (e.g., an orange being eaten, Bill Gates aging). Each set consisted of four pictures describing different stages of a temporal sequence (e.g., whole orange, half-peeled orange, half-eaten orange, peel), and each picture was a round piece of paper with a diameter of 3 cm.

Procedures

A round piece of white cardboard with a diameter of 12.50 cm was placed in front of participants. Participants were told that they would be given six sets of cards, one set at a time. They were instructed to look through the four pictures of each set and arrange them on the cardboard in the correct temporal order, from the earliest to the latest state.

After receiving instructions, participants were handed the six sets of cards one by one in random order, and each set of shuffled pictures was presented separately in a stack. Upon finishing the arrangement of one set, participants were given the next one. The experimenter recorded participants’ arrangement patterns.

Results and discussion

We observed two arrangement orientations of cards by participants: horizontally from left to right (abbreviated as HLR) and vertically from top to bottom (VTB). As shown in Fig. 4, most of the Mandarin monolingual individuals (39/44, 88.64 %) adopted both the HLR and the VTB approach, and the remainder (5/44, 11.36 %) used the same orientation (i.e., either HLR or VTB) for each of the six sets of pictures they ordered. Results further indicate that a majority of the participants who used both arrangement orientations (32/39, 82.05 %) displayed a stronger propensity to lay out cards HLR. A paired t test by participants on the proportions of HLR and VTB arrangements confirmed these participants’ significant horizontal bias [t (31) = 15.87, p < 0.001, d = 2.81]. Meanwhile, a minority of the participants who used both approaches showed no preference (5/39, 12.82 %) or a VTB preference (2/39, 5.13 %; see Footnote 1). Overall, results of Experiment 2 reveal that a horizontal and a vertical time line coexist independently in a Mandarin monolingual’s mind, although some subtle variations were identified across individuals.
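
The grouping of participants and the paired t test reported above can be illustrated with a short sketch. It assumes that each participant is summarized by the number of sets (out of six) arranged HLR, with the remaining sets arranged VTB; the function names and the example counts are hypothetical. As a consistency check on the reported values, the effect size for paired data can be taken as t/√n, and 15.87/√32 ≈ 2.81 matches the reported d.

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev

N_SETS = 6  # six sets of cards per participant

def classify(hlr_count):
    """Label a participant by the number of sets arranged HLR."""
    if hlr_count in (0, N_SETS):
        return "single orientation"
    if 2 * hlr_count > N_SETS:
        return "HLR preference"
    if 2 * hlr_count < N_SETS:
        return "VTB preference"
    return "no preference"

def paired_t(xs, ys):
    """Return (t, df, d) for a paired comparison of xs and ys."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    d = mean(diffs) / stdev(diffs)  # effect size for paired data
    return d * sqrt(n), n - 1, d

# Hypothetical HLR counts for eight participants, for illustration only.
counts = [5, 4, 5, 4, 3, 2, 6, 0]
groups = Counter(classify(c) for c in counts)

# Paired t test on HLR vs. VTB proportions within the HLR-preference group.
hlr_group = [c for c in counts if classify(c) == "HLR preference"]
hlr_props = [c / N_SETS for c in hlr_group]
vtb_props = [1 - p for p in hlr_props]
t, df, d = paired_t(hlr_props, vtb_props)
```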

Fig. 4

Arrangement patterns and the number of participants who created these patterns. X-axis plots the number of participants who created the distinct patterns as indicated by each stacked bar. Y-axis plots the proportion of HLR and VTB arranged sequences within the six sets of cards. The numbers inside each stacked bar indicate the number of HLR and VTB arranged sequences within the six sets of cards

Miles et al. (2011) maintained that the cultural shade of the task materials would trigger the operation of two distinct MTLs, given that the Chinese target printed on the card (i.e., well-known Chinese film star, Jet Li) and the Western target (i.e., famous American film star, Brad Pitt) prompted participants to use a vertical space-time mapping and a horizontal spatial representation of time, respectively. It should be noted that we did manipulate cultural elements in our materials, because two sets of cards displayed culture-specific content (i.e., an American entrepreneur, Bill Gates, and a famous Chinese pop star, Huan Liu). However, the findings of Experiment 2 provide counterevidence against Miles et al.’s (2011) claim, as the different cultural identities of the targets did not bias participants to arrange temporal sequences HLR or VTB (Fig. 5). To confirm that participants did not arrange the photographs differently as a function of the cultural identity of the target, we performed a Cochran’s Q test in which the proportion of horizontally and vertically arranged sequences was compared between the Huan Liu and Bill Gates trials. The results revealed a nonsignificant effect [Q (1) = 0.758, p = 0.487].
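
For readers unfamiliar with Cochran’s Q, the sketch below shows how the statistic is computed from binary within-participant data. It assumes a coding of 1 for an HLR arrangement and 0 for a VTB arrangement, with one pair of values (Huan Liu trial, Bill Gates trial) per participant; the function names and example data are hypothetical. With two conditions, Q reduces to the McNemar statistic (b − c)²/(b + c), where b and c are the counts of the two kinds of discordant pairs. The p value computed here is the asymptotic chi-square value for one degree of freedom, which can differ from an exact p value in small samples.

```python
from math import erfc, sqrt

def cochran_q(rows):
    """Cochran's Q for rows of 0/1 outcomes (one row per participant)."""
    k = len(rows[0])
    col_totals = [sum(r[j] for r in rows) for j in range(k)]
    row_totals = [sum(r) for r in rows]
    grand = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - grand ** 2)
    denominator = k * grand - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when no participant differs across conditions")
    return numerator / denominator

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return erfc(sqrt(x / 2))

# Hypothetical data: (Huan Liu trial, Bill Gates trial), 1 = HLR, 0 = VTB.
example_rows = [(1, 0), (1, 0), (1, 0), (0, 1), (1, 1), (1, 1), (0, 0)]
q = cochran_q(example_rows)
p = chi2_sf_df1(q)
```

In this made-up example there are three (1, 0) pairs and one (0, 1) pair, so Q = (3 − 1)²/(3 + 1) = 1.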

Fig. 5

The proportion of HLR and VTB arranged temporal sequences as a function of the cultural identity of the target

General discussion

Across two experiments, Mandarin monolinguals were shown to rely on both a horizontal and a vertical spatial axis to reason about time, with the horizontal axis being the relatively dominant one. The present study further demonstrated that a Mandarin monolingual individual’s mind accommodates two MTLs, irrespective of some minor interindividual differences. These space-time mappings in cognition can be approximately predicted by patterns in Mandarin spatiotemporal metaphors. Given these observations, we conclude that a Mandarin speaker possesses both a horizontal and a vertical MTL, not because he or she has acquired L2 English, but because there are two time lines in Mandarin linguistic structures. Specifically, this study highlights the fact that a horizontal time line does exist in a Mandarin speaker’s cognition, even if he/she is a Mandarin monolingual rather than an ME bilingual.

By establishing that Mandarin monolinguals simultaneously access horizontal and vertical temporal reasoning, this study may shed light on the ongoing debate over Mandarin speakers’ mental representations of time (Boroditsky, 2001; Boroditsky et al., 2011; Chen, 2007; Chen & O’Seaghdha, 2013; January & Kako, 2007; Tse & Altarriba, 2008). In particular, this study provides some counterevidence against the argument that Mandarin speakers tend to think about time vertically (Boroditsky, 2001).

The findings of the present study may bolster (though not conclusively) the Linguistic Relativity Hypothesis, which suggests that one’s native language can influence habitual thought (Whorf, 1956). Nevertheless, it is noteworthy that the Linguistic Relativity Hypothesis might not fully accommodate the subtle individual variations among Mandarin speakers in their temporal thinking. In addition, several studies have argued that writing direction plays a potent role in how people represent time spatially (Bergen & Chan Lau, 2012; Casasanto & Bottini, 2014; Fuhrman & Boroditsky, 2010; Vallesi et al., 2014). However, we concur with Miles et al.’s (2011) view that orthography cannot adequately account for Mandarin speakers’ mental construals of the vertical time line, because modern Mandarin is written HLR in mainland China.

Further interpretive issues of Miles et al.’s (2011) stance towards bilingualism and cognition

Finally, we will briefly dwell on the impact of bilingualism on temporal thinking. As elucidated above, English speakers talk about time almost exclusively via horizontal spatial terms (see Footnote 2), whereas Mandarin speakers depend on both horizontal and vertical spatiotemporal metaphors. The cross-linguistic commonalities in horizontal expressions of temporal information unsurprisingly complicate the question of the relative contributions of L1 Mandarin and L2 English to ME bilinguals’ horizontal MTL. We envisage three possibilities. First, L2 acquisition does not affect habitual thought (i.e., thinking patterns exhibited by monolingual speakers of L1) at all. In this case, the influence of L2 English may not extend to Mandarin speakers’ conceptual system. The horizontal MTL therefore accords more with L1 Mandarin than with L2 English. Second, L2 exerts a moderate influence on habitual thought. In this case, L1 Mandarin and L2 English would be equally responsible for ME bilinguals’ horizontal MTL, given the significant overlap between the two languages in the horizontal expressions of time. In other words, the horizontal representation of time by ME bilinguals is derived from a symmetric mixture of L1 Mandarin and L2 English. Last but not least, habitual modes of thought are subject to profound L2 influence or are systematically restructured by L2 linguistic forces. In this case, L2 would pervade cognition so that the horizontal MTL would be more compatible with L2 English than with L1 Mandarin. Additionally, the vertical MTL of L1 Mandarin may become inactive or even be erased as a consequence of this cognitive restructuring.

To sum up, the evidence at hand is far from sufficient to support Miles et al.’s (2011) conclusion that ME bilinguals’ horizontal concept of time is shaped by English. Indeed, many previous studies have noted the putative discrepancies between Mandarin and English in spatiotemporal metaphors, but the salient cross-linguistic similarities between the two languages have received little attention. This may partially explain why ME bilingualism is a complex linguistic and psychological phenomenon. Therefore, the relationship between bilingualism and cognition in general, and the issue of ME bilinguals’ mental representations of time in particular, still call for further examination and clarification.