Elsevier

NeuroImage

Volume 30, Issue 3, 15 April 2006, Pages 1059-1068

Do visual perspective tasks need theory of mind?

https://doi.org/10.1016/j.neuroimage.2005.10.026

Abstract

Reviews [Frith, U., Frith, C.D., 2003. Development and neurophysiology of mentalising. Philos. Trans. R. Soc., B 358, 685–694.] of several imaging studies report robust involvement of medial prefrontal cortex (MPFC) in “theory of mind” (ToM) tasks. Surprisingly, this activation is notably absent when judging another person's visual perspective [Vogeley, K., May, M., Ritzl, A., Falkai, P., Zilles, K., Fink, G.R., 2004. Neural correlates of first-person perspective as one constituent of human self-consciousness. J. Cogn. Neurosci. 16, 817–827.]. The objective of our study was to see whether this activation can be recovered when the difference between what observers see is clearly one of perspectives (in front vs. behind) and not potentially a difference in what observers are looking at. Despite this change, there was still no apparent activation of MPFC. We did find activation in the temporo-parietal junction (TPJ), a region recently emphasized as centrally involved in processing false belief stories [Saxe, R., Kanwisher, N., 2003. People thinking about thinking people: The role of the temporo-parietal junction in “theory of mind”. NeuroImage 19, 1835–1842.], which also create a stark contrast of perspectives. By integrating extant neurophysiological evidence on theory of mind processing, we suggest that the dorsal part of the TPJ region is responsible for representing perspective differences and making behavioral predictions, while the more ventral part of the TPJ and the MPFC region are responsible for predicting behavioral consequences of mental states, with the MPFC also predicting their emotional consequences.

Introduction

The concept of perspective has become a frequent object of investigation in neural imaging studies. Its popularity is grounded in the central role it plays in psychology and philosophy. With Nagel's (1974) famous paper, it has become the hallmark of consciousness within philosophy (Block, 1995, Chalmers, 1996) and, more recently, neuroimaging of consciousness (Ruby and Decety, 2003, Ruby and Decety, 2004, Vogeley et al., 2001, Vogeley et al., 2004). These studies use imaging techniques to localize perspective-relevant processes in the brain under the title of perspective, while another group of studies does so under the rubric of “theory of mind”, contrasting a wrong view of the world (false belief) with reality (Fletcher et al., 1995, Gallagher et al., 2000, Goel et al., 1995, Grèzes et al., 2004, Happé et al., 1996, Saxe and Kanwisher, 2003). The intuitive notion of “perspective” has, however, a very broad and changeable meaning, which makes it difficult to compare the different studies in this regard. In particular, we would like to draw attention to a distinction between different levels of perspective taking skills identified in developmental psychology.

Starting with Piaget and Inhelder's (1948) Three-Mountains problem, a focal finding was the discovery by Flavell et al. (1981) and Masangkay et al. (1974) of the distinction between Level 1 and Level 2 Perspective Taking. Level 1 refers to the ability to distinguish between what people can and cannot see, e.g., that people who look at different sides of a piece of paper see different things: a picture of a cat on the one and a picture of a dog on the other side. Level 2 refers to the understanding that, when people look at the same drawing or scene from different angles, they arrive at different and contradictory descriptions. Fig. 1a gives an example (not the one that Flavell and colleagues had used but material from our study). From our vantage point as readers, we see the pole behind the block, while the little doll in the picture sees the pole to the left of the block. There is clearly a difference of how we and the doll should describe what we see (the speech bubble illustrates that, if the doll gives the description that fits our view, “the block is in front of the pole”, his statement is wrong as a description from his point of view), even though we both look at the same scene and describe the spatial relationship between the two objects.
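The dependence of such descriptions on the vantage point can be made concrete with a small geometric sketch. It is only an illustration: the top-down 2D coordinates, the function name `pole_relative_to_block` and the rule of classifying by the dominant axis are assumptions made here, not part of the study's materials or procedure.

```python
def pole_relative_to_block(observer, block, pole):
    """Describe where the pole lies relative to the block, as seen by an
    observer who looks toward the block (top-down 2D coordinates).

    Hypothetical helper for illustration only.
    """
    # Viewing direction: from the observer toward the block.
    vx, vy = block[0] - observer[0], block[1] - observer[1]
    # Displacement of the pole relative to the block.
    dx, dy = pole[0] - block[0], pole[1] - block[1]

    depth = vx * dx + vy * dy    # > 0: pole lies farther away than the block
    lateral = vx * dy - vy * dx  # > 0: pole lies to the observer's left

    # Classify by the dominant component of the displacement.
    if abs(depth) >= abs(lateral):
        return "behind" if depth > 0 else "in front"
    return "left" if lateral > 0 else "right"


# Same scene, two vantage points (assumed coordinates).
block, pole = (0, 0), (0, 1)
reader_view = pole_relative_to_block((0, -5), block, pole)
doll_view = pole_relative_to_block((-5, 0), block, pole)
print(reader_view, doll_view)
```

With these assumed positions, the same pair of objects receives two different descriptions, so a statement such as “the block is in front of the pole” can be true from one vantage point and false from the other.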

Level 1 emerges well before Level 2, which, interestingly, is not mastered before about 4 years of age, when other perspective problems are also understood. In particular, children become able to understand the false belief problem (Wimmer and Perner, 1983), for example, that a person who fails to witness an unexpected transfer of an object to a new location will mistakenly believe that the object is still in its original location. False belief problems have become the central measure of children's understanding of the mind (“theory of mind”). They also play a central role in neural imaging of theory of mind in adults.

Flavell's distinction between two levels of perspective taking illustrates an important ambiguity in the concept of perspective. The word perspective has its origins in the visual arts (Thompson, 1995), where it refers to giving the correct visual impression of spatial relationships among objects. Importantly, one talks of people “giving or having a different perspective” only when a particular object or scene is depicted or perceived differently from different points of view. We do not speak of a difference in perspective if different objects or scenes are depicted or perceived. Ambiguity arises because it is, to some degree, at the beholder's discretion to stipulate whether two pictures depict the same scene (or object) from different points of view or whether they depict different scenes or different parts of a scene.

This ambiguity is very patent in the Level 1 task: one can say that the observers look at the same piece of paper but have different visual impressions (different mental depictions), which justifies calling it a “perspective task”. However, one can with equal justification say that observers look at different sides of the piece of paper (Observer 1 looks at the side with the dog, Observer 2 looks at the side with the cat), hence their differing visual inputs can be explained by a difference in what they are looking at. In contrast, in the Level 2 task, it is difficult to argue that the people are looking at different things. The only difference is the way in which they look at it, which must be responsible for the different interpretations they give: Observer 1 (we as readers) sees the pole behind the block, while Observer 2 (the doll) sees the pole to the left of the block.

An obvious explanation for why children find the Level 2 task more difficult is that this task cannot be understood without an understanding of perspective differences, whereas the Level 1 task can be mastered without this understanding. In other words, some “perspective tasks” involve an analysis of a difference in perspective only optionally (depending on how one conceives of the unity of objects or scenes under consideration), while others (perspective tasks in the strict sense) definitely require an understanding of perspective, in the sense of seeing or depicting, describing one and the same thing differently. Following Perner et al. (2003), we call only the latter perspective tasks in the sense of unambiguously requiring some analysis of perspective, which remains optional in the Level 1 tasks.

Clearly, false belief tasks require an understanding of perspective. For instance, when an object is in a new location but the other person believes that the object is still in its original location, then, evidently, the child herself and the other person are not related to different parts of the world (objects, scenes, facts) but conceive of one and the same state of affairs differently: the child conceives of the object as being in location 2, while the other person conceives of it as being in location 1.

If we ask whether there is a brain region associated with registering perspective differences, we might look at the several imaging studies reporting specific areas being activated during theory of mind tasks, especially those with stories involving false beliefs or ignorance. Meta-analyses of theory of mind imaging studies (Frith and Frith, 2003, Gallagher and Frith, 2003) show consistent involvement of three regions in theory of mind tasks: (1) the anterior cingulate/paracingulate cortex as part of the medial prefrontal cortex, (2) the posterior superior temporal sulcus (pSTS) and (3) the temporal poles. These authors single out the medial prefrontal paracingulate area as particularly responsible for the “decoupling” mechanism (Leslie, 1987) that quarantines representations of imaginary circumstances from straight representations of reality. Such decoupling is, of course, required for representing mental states since the representations of what people believe or want to be the case need to be kept separate from what one knows to be the case. Naturally, this mechanism must be central to representing any perspective that differs from one's own representation of reality, among them, differences in visual perspective. The importance of this prefrontal region for theory of mind is also underlined by studies of patients with lesions in prefrontal areas mostly including medial areas, who had severe problems attributing first- and second-order false beliefs (Rowe et al., 2001), and problems detecting simple deceptions, especially when medial prefrontal areas are affected bilaterally and on the left (Stuss et al., 2001). Furthermore, the amount of ventromedial frontal atrophy in patients with frontotemporal dementia predicts impairment on theory of mind tests (Gregory et al., 2002).

Recent findings on theory of mind competence from patients with lesions pose, however, some problems for this theory of the anterior paracingulate area. Patient G.T. (Bird et al., 2004) has no demonstrable theory of mind deficit despite extensive damage to the medial frontal lobes bilaterally including the paracingulate area. Apperly et al. (2004) and Samson et al. (2004) report that a group of patients with lesions in the left temporo-parietal junction (TPJ) show specific theory of mind deficits, while patients with medial frontal damage show a mixture of theory of mind and other cognitive and executive impairments. Although these theory of mind impairments cannot be explained as consequences of executive problems caused by prefrontal damage (Rowe et al., 2001), these findings, nevertheless, give some support to the claim by Saxe and Kanwisher (2003) that theory of mind tasks specifically activate the TPJ at the border between the superior temporal and angular gyrus, which is in close vicinity but somewhat dorsal from the pSTS. In summary, there seems to be consensus that the temporal pole is involved in theory of mind processing for other reasons (e.g., social script knowledge, Gallagher and Frith, 2003) but controversy as to whether it is the prefrontal paracingulate area or the posterior STS/temporo-parietal junction area that is centrally involved in theory of mind.

Little noticed, studies of visual perspective taking show no systematic activation of the paracingulate area. To our knowledge, there are only two imaging studies of direct relevance. Vogeley et al. (2004) looked at how many marks in a room another person (avatar) standing in that room can see in contrast to how many the participants can see looking at the room from outside. One would think that, in particular, judgments about how many marks the avatar can see (3rd person judgment) involve a theory of mind and engage representations of a perspective difference. In this study, the paracingulate part of the medial prefrontal area bilaterally showed specific activations only for 1st person over 3rd person judgments, but no activations for 3rd person judgments either in contrast to 1st person judgments or to the baseline activation. In the vicinity of the posterior STS bilaterally (close to the part of the temporo-parietal junction where Saxe and Kanwisher reported specific ToM activation), we likewise find only activations of 1st over 3rd person judgments (we will attempt an explanation of this seemingly contradictory finding in Discussion) or deactivations during 3rd as well as 1st person judgments compared to baseline.

These findings are puzzling for both claims of where theory of mind is computed. In particular, the 3rd person judgments seem to clearly require attribution of the mental state of seeing and computation of the content of this state. Yet, neither the medial prefrontal paracingulate region nor pSTS/TPJ region seems to be involved, despite the impressive consistency of their involvement in other theory of mind tasks involving false belief or ignorance.

One explanation for why neither of these regions is activated by 3rd person judgments is that the perspective problem used in this study is a Level 1 problem, which leaves it open as to whether participants did or did not compute a difference of perspective. The task can be solved without computing perspective differences because the problem can be analyzed in terms of what the avatar is looking at–the two marks in front of him–as opposed to what the participant is looking at and need not be analyzed in terms of avatar and participant looking at the same part of the room from different vantage points leading to different visual impressions of what is in that part of the room. In that case, the theory of mind requirements become minimal because, without representation of visual impressions (perspectives), the seeing can be understood as a spatial relationship between eyes and targets, and this may be the reason why these Level 1 perspective tasks do not activate the “theory of mind” areas in the brain. These regions have been found activated in other theory of mind studies because the problems posed in these studies require a deeper understanding of the mind (see Discussion).

The other imaging study potentially involving visual perspectives is a mental rotation experiment that systematically contrasted viewer rotation with object rotation (Zacks et al., 2003). We focus on viewer rotation because object rotation instructions (as used also in other mental rotation imaging studies) do not create a perspective difference since participants are asked to imagine the array in a different position at a different time. In contrast, viewer rotation instructions ask participant viewers to imagine moving themselves to another vantage point and how the array would look from there, in this particular case, judge whether a particular element in an array would be to the right or left of the viewer's imagined position. Provided that the participants do compute a view of the static array from a different vantage point, it clearly requires representation of the same scene from a different perspective and ought to engage theory of mind centers dealing with perspective differences.

Nevertheless, viewer rotation (but also object rotation) resulted in bilateral deactivation (against baseline) of the anterior medial prefrontal cortex. Of particular interest are the activation differences between viewer and object rotation found in two areas (although in one case it was in terms of less deactivation under viewer than object rotation), and both of them were within (or close by) the posterior STS area indicated by Frith and Frith's review of ToM studies.

However, there is no guarantee that participants solved the task by imagining a different view of the array. In fact, this was not even asked for in the instructions. Participants only had to judge whether a target object would be to the left or right of their imagined position. Thus, it is perfectly possible that this task was approached as a pure spatial transformation task without considering any difference in views. The lack of activation in the medial prefrontal paracingulate area, therefore, fails to provide strong evidence against the theory that this area is responsible for theory of mind.

In order to settle such uncertainties about whether alternative visual perspectives have to be computed or not, our study aimed at making representation of contrasting perspectives a necessary feature of the task. In a sentence verification task, participants were shown a visual display of two objects, a squat block and a tall pole, together with a little doll as an observer figure. In a speech bubble, the observer makes a statement about the spatial relation of the two objects using the perspective-relative expressions “in front” and “behind”, e.g., “The block is in front of the pole,” and participants had to judge whether this statement is correct from the observer's point of view. Since the observer is positioned at different angles, his perspective differs from that of the participants. This condition was contrasted with three comparison conditions, resulting in four conditions (shown in Figs. 1a to d) arranged in a two-factor design: Point of View (Self vs. Other) and Perspective Dependence (a perspective-relative spatial relation between objects vs. a perspective-independent comparison of object properties).

The prime objective is to see whether the tasks in which a difference in perspective has to be represented activate at least one of the areas activated by false belief tasks, in particular, the anterior paracingulate region and the pSTS/TPJ. There are two possible activation patterns that speak for the representation of a perspective difference. (1) Condition (a: spatial relation for other) requires representation of a perspective difference. Hence, if condition (a) activates a particular region more strongly than the other three conditions, then this region is associated with mandatory computation of a perspective difference. (2) Although not necessary, conditions (b) and (c) are likely to evoke thoughts about different perspectives. Condition (b: spatial relation for self) does so because the spatial descriptions “in front” and “behind” immediately make one aware that these are perspective relative descriptions that raise the danger of a clash of perspectives. Condition (c: property comparison for other) makes perspectival considerations likely because participants are instructed to judge the doll's statement from the doll's point of view (even though ignoring the perspective difference would still give the right answer to the question). Only condition (d: property comparison for self) makes any concerns about perspective differences unlikely. Hence, if conditions (a + b + c) activate a particular region more strongly than condition (d), then this region is associated with the computation of likely perspective differences.
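The two predicted activation patterns can be written as contrast weights over the four conditions. The following is a minimal sketch under stated assumptions: the weights (3, -1, -1, -1) and (1, 1, 1, -3) are the conventional zero-sum coding of “one condition against the other three”, and the names `MANDATORY`, `LIKELY` and `contrast_value` as well as the example numbers are hypothetical. The statistical model actually used in the study is not given in this excerpt.

```python
# Condition labels follow the text:
# (a) spatial relation / other, (b) spatial relation / self,
# (c) property comparison / other, (d) property comparison / self.
CONDITIONS = ("a", "b", "c", "d")

# Pattern (1): (a) stronger than (b), (c) and (d) -> mandatory perspective difference.
MANDATORY = {"a": 3, "b": -1, "c": -1, "d": -1}

# Pattern (2): (a), (b) and (c) stronger than (d) -> likely perspective difference.
LIKELY = {"a": 1, "b": 1, "c": 1, "d": -3}


def contrast_value(weights, condition_means):
    """Weighted sum of per-condition response estimates for one region."""
    return sum(weights[c] * condition_means[c] for c in CONDITIONS)


# Made-up response estimates for a region that responds only to condition (a).
example_region = {"a": 2.0, "b": 1.0, "c": 1.0, "d": 1.0}
print(contrast_value(MANDATORY, example_region))
print(contrast_value(LIKELY, example_region))
```

Both weight sets sum to zero, so a region that responds equally to all four conditions yields a contrast value of zero under either pattern.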

In case one of the ToM regions showed activation pattern (1), i.e., condition (a) activates more strongly than conditions (b + c + d), we added a camera condition (Fig. 1e) in which participants had to judge whether the photo shown at the top of the display could have been taken by the camera from its position in the display. This is to test whether the activation by condition (a) concerns perspective differences in a theory of mind (triggered by the presence of the doll) or whether it is due to perspective differences without involvement of an animate agent and, presumably, not specific to theory of mind. Condition (f: comparison of scenes) served as an additional control for condition (e) because the duplication of scenes (actual scene, photo of scene) in (e) was not part of conditions (a to d).

Participants

Eighteen volunteers (10 women) were recruited at the University of Salzburg by advertising and were paid €10 for their time. Participants were screened for neurological disorders and contraindications for MRI scanning. The average age was 28.5 years, ranging from 21 to 55 years.

Task procedure

Before the scanning took place, the different problem types were explained in detail. Volunteers went through the entire experimental procedure to ensure that, during the presentation in the scanner, no unforeseen problems

Reaction times

There were no significant differences in reaction time among the four sentence verification tasks: F(3,45) = 1.238, P > 0.05; (a) mean = 3.58 s, SD = 0.67 s; (b) mean = 3.43 s, SD = 0.61 s; (c) mean = 3.51 s, SD = 0.65 s; (d) mean = 3.44 s, SD = 0.49 s. Although we expect these conditions to lead to different brain processes, this lack of differences in reaction time is not surprising since there are no expectations of how long these different processes should take. Furthermore, reaction times

Discussion

The puzzle we started with was why visual perspective tasks, in which participants have to figure out what another person sees, do not seem to activate reliably the cerebral areas deemed necessary for theory of mind. In reviews of theory of mind experiments, Frith and Frith (1999, 2003) and Gallagher and Frith (2003) singled out the anterior medial prefrontal paracingulate area as centrally involved in theory of mind tasks. This area failed to be activated when judging what another

Acknowledgments

The authors are grateful to the members of the Department of Radiology for assistance and thank Chris Frith for his encouragement at the early stages of this research.

References (48)

  • R. Saxe et al., Making sense of another mind: the role of the right temporo-parietal junction, Neuropsychologia (2005)
  • K. Vogeley et al., Mind reading: neural mechanisms of theory of mind and self-perspective, NeuroImage (2001)
  • T.D. Wager et al., Optimization of experimental design in fMRI: a general framework using a genetic algorithm, NeuroImage (2003)
  • H. Wimmer et al., Beliefs about beliefs: representation and constraining function of wrong beliefs in young children's understanding of deception, Cognition (1983)
  • I.A. Apperly et al., Frontal and temporo-parietal lobe contributions to theory of mind: neuropsychological evidence from a false-belief task with reduced language and executive demands, J. Cogn. Neurosci. (2004)
  • S. Berthoz et al., An fMRI study of intentional and unintentional (embarrassing) violations of social norms, Brain (2002)
  • C.M. Bird et al., The impact of extensive medial frontal lobe damage on ‘Theory of Mind’ and cognition, Brain (2004)
  • N. Block, On a confusion about a function of consciousness, Behav. Brain Sci. (1995)
  • D.J. Chalmers, The Conscious Mind: In Search of a Fundamental Theory (1996)
  • G. Csibra et al., The teleological origins of mentalistic action explanations: a developmental hypothesis, Dev. Sci. (1998)
  • J.H. Flavell et al., Young children's knowledge about visual perception: further evidence for the Level 1–Level 2 distinction, Dev. Psychol. (1981)
  • C.D. Frith et al., Interacting minds: a biological basis, Science (1999)
  • U. Frith et al., Development and neurophysiology of mentalising, Philos. Trans. R. Soc., B (2003)
  • T.P. German et al., Neural correlates of detecting pretense: automatic engagement of the intentional stance under covert conditions, J. Cogn. Neurosci. (2004)