Elsevier

Cognitive Development

Volume 46, April–June 2018, Pages 97-111

How robust are anticipatory looking measures of Theory of Mind? Replication attempts across the life span

https://doi.org/10.1016/j.cogdev.2017.09.001

Highlights

  • The paper summarizes replications of implicit ToM tasks from two independent labs.

  • Both labs failed to replicate anticipatory looking false belief tasks in children.

  • The replicated effects were only marginally significant in adults, and were not found in children or elderly adults.

  • Performance did not correlate across tasks, so there was no evidence for their convergent validity.

Abstract

Recent findings from new implicit looking time tasks indicate that children show anticipatory looking patterns suggesting false belief processing from very early on; however, systematic and independent tests of their replicability and their convergent validity are still outstanding. The current paper reports three studies from two independent research labs that attempted to test the replicability and convergent validity (using correlation analyses) of the Southgate et al. (2007) and the Surian and Geraci (2012) paradigms. Results showed that the original findings could be replicated neither in children nor in elderly adults, and only partially in adults. Furthermore, performance on the two paradigms did not correlate, which calls into question the convergent validity of these tasks as tapping the same capacity of an implicit Theory of Mind. In conclusion, the present studies suggest that the results from implicit Theory of Mind tasks should be treated with caution.

Introduction

Theory of Mind (ToM), the ability to attribute subjective mental states such as beliefs, desires and intentions, is a capacity of fundamental importance to many aspects of our social lives. Explicit false belief (FB) tasks have been used as litmus tests for such an understanding. In particular, in so-called change-of-location FB tasks, participants hear a vignette in which an object changes its location, a change that is either witnessed by the protagonist (true belief [TB] control condition) or not (FB condition), and participants are then asked where the protagonist will look for the object (Wimmer & Perner, 1983). Decades of studies with such tasks have shown that children come to solve them around age 4 (Baron-Cohen, Leslie, & Frith, 1985; Wimmer & Perner, 1983). Furthermore, superficially very different tasks systematically converge and correlate (Perner & Roessler, 2012; Wellman, Cross, & Watson, 2001), suggesting that they tap a common cognitive capacity, namely meta-representation (the capacity to represent others’ representational states). This body of evidence was the basis for the traditional consensus that the development between 3 and 5 years marks a fundamental conceptual transition or even revolution.

However, this consensus has been challenged recently by a growing body of evidence from implicit ToM tasks. Since the pioneering work by Clements and Perner (1994), more and more studies have demonstrated implicit sensitivity to another agent’s beliefs in younger children who do not yet master explicit tasks (e.g., Clements & Perner, 1994; Kovács, Téglás, & Endress, 2010; Low & Watts, 2013; Southgate, Chevallier, & Csibra, 2010; Southgate, Senju, & Csibra, 2007; Surian, Caldi, & Sperber, 2007; Surian & Geraci, 2012). Recent work with adults also suggests that these implicit and automatic capacities may remain intact and stable across the life span (Schneider, Bayliss, Becker, & Dux, 2012; Schneider, Lam, Bayliss, & Dux, 2012; Schneider, Slaughter, Bayliss, & Dux, 2013). Different kinds of implicit ToM tasks have been used, including violation of expectation (VoE) paradigms, interaction behavior (e.g., Buttelmann, Carpenter, & Tomasello, 2009; Buttelmann, Suhrke, & Buttelmann, 2015; Fizke, Butterfill, Van de Loo, Reindl, & Rakoczy, 2014; Knudsen & Liszkowski, 2012; Southgate et al., 2010), and anticipatory looking (AL) tasks (e.g., Clements & Perner, 1994; Low & Watts, 2013; Schneider, Bayliss et al., 2012; Senju, Southgate, White, & Frith, 2009; Southgate et al., 2007; Surian & Geraci, 2012). Of these, AL tasks are particularly interesting: in contrast to measures that tap differential retrodictive responses such as VoE, AL requires prediction; and in contrast to VoE and interaction measures, AL tasks can be (and have been) used in exactly the same way across the life span (see e.g., Senju et al., 2009; Southgate et al., 2007).

Findings from AL measures have been part of the basis for ambitious and far-reaching theoretical conclusions to the effect that standard explicit tasks mask a true and early ToM competence that may even be innate (Baillargeon et al., 2015; Carruthers, 2013; Leslie, 2005). It is a very interesting question whether such positive findings, if they turned out to be robust, license such strong conclusions. It is another and much more fundamental question whether these findings actually are robust and replicable. In light of the general replication crisis (see e.g., Bakker, van Dijk, & Wicherts, 2012; Button et al., 2013; Simmons, Nelson, & Simonsohn, 2011), questions about the reliability of these findings arise and need to be taken seriously. In particular, relatively few AL studies have been published to date, and most of them have not been replicated outside the original lab, could not be replicated (Grosse Wiesmann, Steinbeis, Friederici, & Singer, 2017), or used small sample sizes (Senju et al., 2009; Senju et al., 2010; Southgate et al., 2007).

Implicit ToM tasks involving AL measures were first used by Clements and Perner (1994), who studied anticipatory looking in response to a verbal prompt (“I wonder where she will …”). Subsequent studies then used spontaneous AL measures without any verbal prompting (e.g., Low & Watts, 2013; Schneider, Bayliss et al., 2012; Senju et al., 2009; Southgate et al., 2007; Surian & Geraci, 2012). Two AL tasks, in particular, have been used with children from very young ages on. Firstly, the Southgate/Senju paradigm has been used most widely in different populations, including infants (Southgate et al., 2007), children (Senju et al., 2010), adults (Senju et al., 2009) and participants with autism spectrum disorder (Senju et al., 2009; Senju et al., 2010). It is structured like standard change-of-location FB tasks (Clements & Perner, 1994), but the object is removed rather than transferred and there is no TB condition. Secondly, Surian and Geraci (2012) developed a standard change-of-location task with animated figures in which two figures chase each other and the protagonist forms a true or false belief about the other agent’s location.

Both of these implicit ToM tasks are, in a broad sense, change-of-location FB tasks measuring anticipatory looking, but they differ in two crucial aspects. Strictly speaking, only the Surian & Geraci task is a proper change-of-location task. Firstly, the object is transferred between locations in the Surian & Geraci task and stays in the new location, as in standard change-of-location FB tasks, while it is relocated and then removed from the scene in the Southgate/Senju paradigm. Secondly, the Surian & Geraci task includes a TB control condition, as standard change-of-location tasks do, which is missing in the Southgate/Senju task.

Additionally, very little is known about whether different AL measures actually tap the same underlying construct, which should result in high correlations between different tasks (convergent validity) (Heyes, 2014). Decades of studies have shown systematic, strongly converging and correlated performance in various superficially diverse explicit FB tasks and have thus supplied ample evidence for their convergent validity. In contrast, hardly any tests of the convergent validity of implicit tasks have been published so far. Rather, most studies tested only a single task (e.g., Clements & Perner, 1994; Low & Watts, 2013; Schneider, Bayliss et al., 2012; Senju et al., 2009; Southgate et al., 2007; Surian & Geraci, 2012). And those few studies that have used several tasks have failed to find any evidence for correlations (Yott & Poulin-Dubois, 2016).

Therefore, the current paper reports two sets of studies that systematically investigated the reliability and convergent validity of AL tasks independently in two labs. The first study describes findings from the first lab, attempting replication of one AL ToM task (Southgate et al., 2007) with a large sample of 2- to 6-year-old children. The second set of studies describes two replication experiments that stem from a second, independent lab. Study 2a tests an opportunity sample of children in the Southgate et al. task, including a comparably large age range. Based on previous research using the original paradigm, age should not affect our findings, as belief-congruent AL has previously been demonstrated in different age ranges: in infants (Southgate et al., 2007), children (Senju et al., 2010), and adults (Senju et al., 2009). However, to ensure that the broad age range does not affect the findings, Study 2b tests narrow age ranges of children, adults and elderly adults to investigate developmental changes across the life span. It furthermore combines two implicit ToM paradigms (Senju et al., 2009; Southgate et al., 2007; Surian & Geraci, 2012) to investigate convergent validity.

Study 1

In this first study, the anticipatory-looking paradigm from Southgate et al. (2007) and Senju et al. (2010) was employed in an attempt to replicate the original findings and to trace the developmental course of (implicit) false-belief understanding in children aged 2–6 years. In close correspondence to Southgate et al. (2007), we presented children with a change-of-location task in which an actor reached through one of two windows to retrieve an object hidden in one of two containers. In a

Study 2

Studies 2a and 2b describe replication attempts from a second, independent lab that tested the Southgate/Senju paradigm (Studies 2a and 2b) and the Surian & Geraci paradigm (Study 2b) in children and adults.

General discussion

The current paper reports two sets of studies that attempted to replicate implicit ToM tasks and correlated the outcomes to test their convergent validity. Study 1 tested 460 children between two and six years of age in the Southgate/Senju paradigm and could only reliably replicate the findings from the FB1 but not the FB2 condition of the paradigm. Study 2a tested children between two and eleven years on the Southgate/Senju task and also showed that earlier findings from the FB1 condition

Conclusion and future directions

The present findings show a consistent lack of replicability and convergent validity of AL measures of implicit ToM across studies, methods and labs. Whether this suggests that implicit ToM is not as robust a phenomenon as previously assumed, or robust yet difficult to detect, cannot be determined from the present findings alone. More systematic future research, ideally involving multiple labs, will be needed to settle this question. In the meantime, though, the robustness of implicit ToM should be

Acknowledgements

We would like to thank Luca Surian, Victoria Southgate, and Atsushi Senju for sharing their original stimuli with us. Thanks to the students and research assistants involved in the testing and data processing for this project, particularly Virginie Bihari, Jacqueline Ewert, Josefine Grzesko, Julia Henke, Kathrin Heyn, Josefin Johannsen, Jonas Koch, Annika Nöhring, Julia Schmuggerow, Friederike Schreiber, Lisa Wenzel, and Marieke Wübker, and to Wolfgang Bartels for technical assistance. We also

References (34)

  • D.M. Bernstein et al. (2011). Theory of mind through the ages: Older and middle-aged adults exhibit more errors than do younger adults on a continuous false belief task. Experimental Aging Research.

  • K.S. Button et al. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience.

  • P. Carruthers (2013). Mindreading in infancy. Mind & Language.

  • S.-C. Chow et al. (2007). Large sample tests for proportions sample size calculations in clinical research.

  • E. Fizke et al. (2014). Signature limits in early theory of mind: Toddlers spontaneously take into account false beliefs about an object’s location but not about its identity.

  • C. Grosse Wiesmann et al.

  • J.D. Henry et al. (2013). A meta-analytic review of age differences in theory of mind. Psychology and Aging.