How robust are anticipatory looking measures of Theory of Mind? Replication attempts across the life span

doi:10.1016/j.cogdev.2017.09.001

Cognitive Development

Volume 46, April–June 2018, Pages 97-111

https://doi.org/10.1016/j.cogdev.2017.09.001

Highlights

• The paper summarizes replications of implicit ToM tasks from two independent labs.
• Both labs failed to replicate anticipatory looking false belief tasks in children.
• The replicated effects were only marginally significant in adults, and were not found in children or elderly adults.
• There was no evidence for correlations in performance across tasks, and thus none for their convergent validity.

Abstract

Recent findings from new implicit looking time tasks indicate that children show anticipatory looking patterns suggesting false belief processing from very early on; however, systematic and independent tests of their replicability and their convergent validity are still outstanding. The current paper reports three studies from two independent research labs that attempted to test the replicability and convergent validity (using correlation analyses) of the Southgate et al. (2007) and the Surian and Geraci (2012) paradigms. Results showed that the original findings could be replicated neither in children nor in elderly adults, and only partially in adults. Furthermore, performance on the two paradigms did not correlate, which calls into question the convergent validity of these tasks as tapping the same capacity of an implicit Theory of Mind. In conclusion, the present studies suggest that the results from implicit Theory of Mind tasks should be treated with caution.


Introduction

Theory of Mind (ToM), the ability to attribute subjective mental states such as beliefs, desires and intentions, is a capacity of fundamental importance to many aspects of our social lives. Explicit false belief (FB) tasks have been used as litmus tests for such an understanding. In particular, in so-called change-of-location FB tasks, participants hear a vignette in which an object changes its location, an event that is either witnessed by the protagonist (true belief [TB] control condition) or not (FB condition), and participants are then asked where the protagonist will look for the object (Wimmer & Perner, 1983). Decades of studies with such tasks have shown that children come to solve them around age 4 (Baron-Cohen, Leslie, & Frith, 1985; Wimmer & Perner, 1983). Furthermore, superficially very different tasks systematically converge and correlate (Perner & Roessler, 2012; Wellman, Cross, & Watson, 2001), suggesting that they tap a common cognitive capacity, namely meta-representation (the capacity to represent others’ representational states). This body of evidence was the basis for the traditional consensus that development between 3 and 5 years marks a fundamental conceptual transition or even revolution.

However, this consensus has been challenged recently by a growing body of evidence from implicit ToM tasks. Since the pioneering work by Clements and Perner (1994), more and more studies have demonstrated implicit sensitivity to another agent’s beliefs in younger children who do not yet master explicit tasks (e.g., Clements & Perner, 1994; Kovács, Téglás, & Endress, 2010; Low & Watts, 2013; Southgate, Chevallier, & Csibra, 2010; Southgate, Senju, & Csibra, 2007; Surian, Caldi, & Sperber, 2007; Surian & Geraci, 2012). Recent work with adults also suggests that these implicit and automatic capacities may remain intact and stable across the life span (Schneider, Bayliss, Becker, & Dux, 2012; Schneider, Lam, Bayliss, & Dux, 2012; Schneider, Slaughter, Bayliss, & Dux, 2013). Different kinds of implicit ToM tasks have been used, including violation of expectation (VoE) paradigms, interaction behavior (e.g., Buttelmann, Carpenter, & Tomasello, 2009; Buttelmann, Suhrke, & Buttelmann, 2015; Fizke, Butterfill, Van de Loo, Reindl, & Rakoczy, 2014; Knudsen & Liszkowski, 2012; Southgate et al., 2010), and anticipatory looking (AL) tasks (e.g., Clements & Perner, 1994; Low & Watts, 2013; Schneider, Bayliss et al., 2012; Senju, Southgate, White, & Frith, 2009; Southgate et al., 2007; Surian & Geraci, 2012). Of these, AL tasks are particularly interesting: in contrast to measures that tap differential retrodictive responses such as VoE, AL requires prediction; and in contrast to VoE and interaction measures, AL tasks can be (and have been) used in exactly the same way across the life span (see, e.g., Senju et al., 2009; Southgate et al., 2007).

Findings from AL measures have been part of the basis for ambitious and far-reaching theoretical conclusions to the effect that standard explicit tasks mask a true and early ToM competence that may even be innate (Baillargeon et al., 2015; Carruthers, 2013; Leslie, 2005). It is a very interesting question whether such positive findings, if they turned out to be robust, would license such strong conclusions. It is another and much more fundamental question whether these findings actually are robust and replicable. In light of the general replication crisis (see, e.g., Bakker, van Dijk, & Wicherts, 2012; Button et al., 2013; Simmons, Nelson, & Simonsohn, 2011), questions about the reliability of these findings arise and need to be taken seriously. In particular, relatively few AL studies have been published to date, and most of them have not been replicated outside of the original lab, could not be replicated (Grosse Wiesmann, Steinbeis, Friederici, & Singer, 2017), or have used small sample sizes (Senju et al., 2009; Senju et al., 2010; Southgate et al., 2007).

Implicit ToM tasks involving AL measures were first used by Clements and Perner (1994), who studied anticipatory looking in response to a verbal prompt (“I wonder where she will …”). Subsequent studies then used spontaneous AL measures without any verbal prompting (e.g., Low & Watts, 2013; Schneider, Bayliss et al., 2012; Senju et al., 2009; Southgate et al., 2007; Surian & Geraci, 2012). Two AL tasks, in particular, have been used with children from a very young age. Firstly, the Southgate/Senju paradigm has been used most widely in different populations, including infants (Southgate et al., 2007), children (Senju et al., 2010), adults (Senju et al., 2009) and participants with autism spectrum disorder (Senju et al., 2009; Senju et al., 2010). It is structured like standard change-of-location FB tasks (Clements & Perner, 1994), but the object is removed rather than transferred and there is no TB condition. Secondly, Surian and Geraci (2012) developed a standard change-of-location task with animated figures in which two figures chase each other and the protagonist forms a true or false belief about the other agent’s location.

Both of these implicit ToM tasks are, in a broad sense, change-of-location FB tasks measuring anticipatory looking, but they differ in two crucial respects, such that, strictly speaking, only the Surian & Geraci task is a proper change-of-location task. Firstly, the object is transferred between locations in the Surian & Geraci task and stays in the new location, as in standard change-of-location FB tasks, while it is relocated and then removed from the scene in the Southgate/Senju paradigm. Secondly, the Surian & Geraci task includes a TB control condition, as standard change-of-location tasks do, which is missing in the Southgate/Senju task.

Additionally, very little is known about whether different AL measures actually tap the same underlying construct, which should result in high correlations between different tasks (convergent validity) (Heyes, 2014). Decades of studies have shown systematic, strongly converging and correlated performance in various superficially diverse explicit FB tasks and have thus supplied ample evidence for their convergent validity. In contrast, so far hardly any tests of the convergent validity of implicit tasks have been published. Rather, most studies only tested one local task (e.g., Clements & Perner, 1994; Low & Watts, 2013; Schneider, Bayliss et al., 2012; Senju et al., 2009; Southgate et al., 2007; Surian & Geraci, 2012). And those few studies that have used several tasks have failed to find any evidence for correlations (Yott & Poulin-Dubois, 2016).
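
As a minimal illustration of the convergent-validity logic described above, the following sketch computes a Pearson correlation between per-participant scores on two tasks. It is not the analysis code of the present studies: the function name `pearson_r`, the variable names, and the binary scores (1 = belief-congruent first look, 0 = otherwise) are hypothetical and invented purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two score lists of equal length (n >= 3)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one task has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical, invented scores for six participants (NOT data from the studies):
# 1 = first look was belief-congruent, 0 = it was not.
task_a = [1, 0, 1, 1, 0, 0]  # e.g., an object-removal AL task
task_b = [0, 1, 1, 0, 0, 1]  # e.g., a chasing-figures AL task

r = pearson_r(task_a, task_b)
print(f"correlation between tasks: r = {r:.2f}")
```

If two tasks tap the same capacity, the coefficient should be reliably positive across participants; a value near zero would give no evidence of convergent validity. With binary scores such as these, the Pearson coefficient is equivalent to the phi coefficient.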

Therefore, the current paper reports two sets of studies that systematically investigated the reliability and convergent validity of AL tasks independently in two labs. The first study describes findings from the first lab, attempting replication of one AL ToM task (Southgate et al., 2007) with a large sample of 2- to 6-year-old children. The second set of studies describes two replication experiments that stem from a second, independent lab. Study 2a tests an opportunity sample of children in the Southgate et al. task, including a comparably large age range. Based on previous research using the original paradigm, age should not affect our findings, as belief-congruent AL has previously been demonstrated in different age ranges for infants (Southgate et al., 2007), children (Senju et al., 2010), and adults (Senju et al., 2009). However, to ensure that the broad age range does not affect the findings, Study 2b tests narrow age ranges of children, adults and elderly adults to investigate developmental changes across the life span. It furthermore combines two implicit ToM paradigms (Senju et al., 2009; Southgate et al., 2007; Surian & Geraci, 2012) to investigate convergent validity.

Section snippets

Study 1

In this first study, the anticipatory-looking paradigm from Southgate et al. (2007) and Senju et al. (2010) was employed in an attempt to replicate the original findings and to trace the developmental course of (implicit) false-belief understanding in children aged 2–6 years. In close correspondence to Southgate et al. (2007), we presented children with a change-of-location task in which an actor reached through one of two windows to retrieve an object hidden in one of two containers. In a

Study 2

Studies 2a and 2b describe replication attempts from a second, independent lab that tested the Southgate/Senju paradigm (Studies 2a and 2b) and the Surian & Geraci paradigm (Study 2b) in children and adults.

General discussion

The current paper reports two sets of studies that attempted to replicate implicit ToM tasks and correlated the outcomes to test their convergent validity. Study 1 tested 460 children between two and six years of age in the Southgate/Senju paradigm and could only reliably replicate the findings from the FB1 but not the FB2 condition of the paradigm. Study 2a tested children between two and eleven years on the Southgate/Senju task and also showed that earlier findings from the FB1 condition

Conclusion and future directions

The present findings show a consistent lack of replicability and convergent validity of AL measures of implicit ToM across studies, methods and labs. Whether this suggests that implicit ToM is not as robust a phenomenon as previously assumed, or robust yet difficult to detect, one cannot tell from the present findings alone. More systematic future research, ideally involving multiple labs, will be needed to settle this question. In the meantime, though, the robustness of implicit ToM should be

Acknowledgements

We would like to thank Luca Surian, Victoria Southgate, and Atsushi Senju for sharing their original stimuli with us. Thanks to the students and research assistants involved in the testing and data processing for this project, particularly Virginie Bihari, Jacqueline Ewert, Josefine Grzesko, Julia Henke, Kathrin Heyn, Josefin Johannsen, Jonas Koch, Annika Nöhring, Julia Schmuggerow, Friederike Schreiber, Lisa Wenzel, and Marieke Wübker, and to Wolfgang Bartels for technical assistance. We also

References (34)

S. Baron-Cohen et al.
Does the autistic child have a theory of mind?
Cognition
(1985)
D. Buttelmann et al.
Eighteen-month-old infants show false belief understanding in an active helping paradigm
Cognition
(2009)
F. Buttelmann et al.
What you get is what you believe: Eighteen-month-olds demonstrate belief understanding in an unexpected-identity task
Journal of Experimental Child Psychology
(2015)
W.A. Clements et al.
Implicit understanding of belief
Cognitive Development
(1994)
A.M. Leslie
Developmental parallels in understanding minds and bodies
Trends in Cognitive Sciences
(2005)
J. Perner et al.
From infants’ to children's appreciation of belief
Trends in Cognitive Sciences
(2012)
D. Schneider et al.
A temporally sustained implicit theory of mind deficit in autism spectrum disorders
Cognition
(2013)
H. Wimmer et al.
Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception
Cognition
(1983)
R. Baillargeon et al.
Psychological and sociomoral reasoning in infancy
(2015)
M. Bakker et al.
The rules of the game called psychological science
Perspectives on Psychological Science
(2012)
D.M. Bernstein et al.
Theory of mind through the ages: Older and middle-aged adults exhibit more errors than do younger adults on a continuous false belief task
Experimental Aging Research
(2011)
K.S. Button et al.
Power failure: Why small sample size undermines the reliability of neuroscience
Nature Reviews Neuroscience
(2013)
P. Carruthers
Mindreading in infancy
Mind & Language
(2013)
S.-C. Chow et al.
Large sample tests for proportions
Sample size calculations in clinical research
(2007)
E. Fizke et al.
Signature limits in early theory of mind: Toddlers spontaneously take into account false beliefs about an object’s location but not about its identity
(2014)
C. Grosse Wiesmann et al.
J.D. Henry et al.
A meta-analytic review of age differences in theory of mind
Psychology and Aging
(2013)

