When behavior is studied in a rich environment full of possibilities for action (affordances; J. J. Gibson, 1979), rather than in a restrictive and confining laboratory, it becomes clear that the currency of perception is action. People experience the world in terms of how they can act—in terms of their effectivities or action capabilities (Shaw & McIntyre, 1974; Shaw, Turvey, & Mace, 1982). For example, perceived size of a softball is a function of the batter’s hitting success (Gray, 2013; Witt & Proffitt, 2005), and perceived speed of a tennis ball is a function of the player’s success at returning the ball (Witt & Sugovic, 2010). These findings suggest that perception is action-specific (e.g., Witt, 2011a). At first glance, the action-specific perception account seems compatible with J. J. Gibson’s ecological approach to perception–action (J. J. Gibson, 1966, 1979) because both emphasize action. However, the fact that the same environment phenomenally looks different depending on the perceiver’s effectivities and intentions seems to challenge the idea that perception is direct rather than mediated—one of the core principles of the ecological approach (Michaels & Carello, 1981; Richardson, Shockley, Fajen, Riley, & Turvey, 2008; Turvey, Shaw, Reed, & Mace, 1981). This has led some ecological psychologists to be skeptical of this work and some action-specific perception proponents to be skeptical of direct perception.

Our goals in this article are twofold: (1) to convey how the action-specific account of perception appears to be incompatible with J. J. Gibson’s theory of direct perception, and (2) to elaborate on some aspects of the theory of direct perception to show that action-specific effects are compatible with direct perception after all. In doing this, we also highlight important areas for future research. We argue that a reconciliation between action-specific and ecological approaches to perception–action will be beneficial for the action-specific account, because it provides a mechanism for these effects, and that this reconciliation will also be beneficial for the ecological approach, because it provides a rich, new source of empirical support for the second major conceptual cornerstone of ecological psychology (direct perception being the first)—J. J. Gibson’s (1979) theory of affordances. Reconciliation thus would benefit both approaches and would foster future theoretical development.

Action-specific perception

According to the action-specific account of perception, people see the environment in terms of their ability to act in it (Witt, 2011a). More specifically, perceived properties of a target object such as its size are a function of both the physical properties of the object and factors related to the perceiver’s ability to perform an action on the object. For example, golf holes look bigger to golfers who are playing better than others (Witt, Linkenauger, Bakdash, & Proffitt, 2008). The action-specific account asserts that spatial perception is relational; perception of the spatial layout of the environment is perception of the physical environment in relation to the perceiver’s capability to perform the intended action.

Action-specific effects have been observed in perception of a range of dimensions including distance, slant, height, width, shape, parallelism, weight, and speed. Spatial perception has been assessed using explicit, implicit, and indirect measures and evaluated in a number of ways including verbal reports, visual matching tasks, and action-based measures. That a person’s ability to act is expressed in spatial perception has been shown with a variety of actions including reaching, grasping, walking, throwing, jumping, falling, climbing, hitting, and kicking. Experiments demonstrating these effects have used both within-subjects and between-subjects measures and manipulations.

The common thread throughout all of the studies is that perceived dimensions are a function of the perceiver’s abilities (or anticipated abilities) to act. A perceiver’s action capabilities—so-called effectivities (e.g., Shaw et al., 1982)—depend upon many factors, including anatomical makeup (e.g., having opposable thumbs, or long arms, or a flexible spinal column), learned behavior patterns (e.g., a child must learn to walk), bodily states such as strength, fitness, and fatigue, and the availability of artifacts or tools. The effects can be grouped into three overlapping, and perhaps converging, categories of scaling (Proffitt & Linkenauger, 2013): body morphology (i.e., body-scaled effects), behavior (i.e., action-scaled or ability-scaled effects), and physiology (i.e., energy-scaled effects). Examples of each will be reviewed throughout the various sections below.

A challenge to direct perception?

At first glance, action-specific effects seem compatible with J. J. Gibson’s (1979) ecological approach to perception given that both perspectives emphasize the relation between perception and action and posit that the main function of perception is the control of action. Indeed, ecological psychologists have often considered the role of action, particularly in the context of action-calibration, in spatial perception (e.g., Bingham & Pagano, 1998; Rieser, Pick, Ashmead, & Garing, 1995). As was stated by Bingham and colleagues, “If action is what perception is for, then space perception should be tested in the context of relevant action” (Pan, Coats, & Bingham, 2014, p. 404). This is precisely the framework within which action-specific researchers have examined spatial perception (e.g., Proffitt, 2006; Witt, 2011a). However, some ecological psychologists have not embraced action-specific findings, for a number of reasons.

One potential challenge is that for ecological psychologists the fundamentally relational nature of perception is an issue of ontology—that is, affordances are defined in terms of (real) relations between an animal and the environment. The ontologically real status of affordances is why it is possible for them (and perceiver–environment relations, more generally) to be specified by information. The action-specific approach is not incompatible with that ontological claim but neither does it explicitly embrace it. Ecological psychologists would be less skeptical of the action-specific approach if it did embrace the claim, because accepting that assumption de-motivates the idea that animals perceive the environment and then cognitively factor themselves into the scenario, as might be implied by findings suggesting that perceivers interpret the same information differently in different circumstances.

Perhaps the most important reason action-specific effects have not been embraced is that many of these results seem to challenge J. J. Gibson’s (1979) theory of direct perception because the effects demonstrate that different perceptual experiences arise when the optical information available to the perceiver is the same. This seems to suggest that the same optical information is open to interpretation, and that people with different abilities (i.e., effectivities), or the same person at different points in time, depending on how they are performing, whether they are tired, or some other factor, are interpreting the same information differently, thus yielding different perceptual experiences (qua perceptual beliefs; Turvey et al., 1981). In other words, if perception of spatial properties of the environment appears to be mediated by internal factors related to action, then this threatens the idea that perception is direct.

According to the theory of direct perception, the spatial layout of the environment is specified by information in arrays of ambient energy, such as the optic array (J. J. Gibson, 1966, 1979; Michaels & Carello, 1981). Specification refers to the lawful, 1:1 mapping between the structure and dynamics of the environment and events, on the one hand, and, on the other hand, the patterns present in the ambient optical (or acoustic, mechanical, chemical, etc.) array (Stoffregen & Bardy, 2001). Patterns or variables in stimulus arrays that specify some fact or facts of the animal–environment system are considered to serve as information about the animal–environment system. Because the optic array specifies the layout of the environment—that is, the stimulation available to the visual system is not ambiguous, contingent, or arbitrary in its relation to the environment, but rather is unambiguous, noncontingent, and lawful—perception does not require inferences based on prior experience in order to deduce this layout (for an in-depth discussion of this contrast, see Fodor & Pylyshyn, 1981, and the reply by Turvey et al., 1981). In other words, perception is direct, rather than mediated by constructive or inferential processes, when animals possess a sensitivity to information and engage in the activities necessary to pick up that information from ambient stimulus arrays.

Given that it seems internal factors such as whether one intends to reach or throw or walk can lead to different perceptual experiences based on the same optical information (e.g., Witt, Proffitt, & Epstein, 2004, 2005, 2010), the evidence for the action-specific perception account seems to challenge the idea that perception is unmediated and thus direct. As a result of this potential discrepancy, proponents of the ecological approach can take one of a number of paths. They could reject the claim that action-specific effects are “real” perceptual effects, and instead attribute the responses to effects in post-perceptual processes. Although this approach has been taken by some researchers (e.g., Woods et al., 2009), a growing body of research suggests the effects are perceptual (e.g., Schnall, Zadra, & Proffitt, 2010; Witt, 2011b; Witt & Sugovic, 2012, 2013a, 2013b).

A second approach could be to reject action-specific effects because of the methods used to demonstrate them. These methods typically involve asking perceivers to make estimates of spatial dimensions such as size, distance, or slant. Although people clearly perceive these properties of spatial layout, according to the ecological approach, the primary objects of perception are affordances, or possibilities for action (J. J. Gibson, 1979—e.g., p. 260). Consequently, many proponents of the ecological approach typically study affordances directly by asking perceivers to assess if a step is climb-able or a chair is sit-able (e.g., Mark, 1987; Warren, 1984) or by asking perceivers to perform the action if and when the action seems possible (e.g., Adolph, Eppler, & Gibson, 1993; Ishak, Adolph, & Lin, 2008). Further, when spatial properties are examined, they are typically examined using an action-based measure such as walking (e.g., Rieser et al., 1995) or reaching (e.g., Bingham & Pagano, 1998). Rejecting action-specific effects because they ask the wrong experimental question or ask it using the wrong measures (e.g., “use of verbal judgments to test action specificity of perception is inappropriate”; Pan et al., 2014), even if true, prevents the ecological approach from having any explanatory power for these effects.

Here, we advocate a third approach. Instead of taking the typical either–or path of promoting one approach versus another, we argue that the two approaches can be reconciled in a way that benefits and strengthens both. This reconciliation requires that action-related information not be conceived of as an internal store of knowledge used to supplement ambiguous sensory data, as in accounts on which perception relies on internal inferences, is logical, and results in a propositional belief about the world external to the perceiver (cf. Helmholtz, 1925/2001; Rock, 1983). Instead, information related to action as it factors into spatial perception is not stored in a body of knowledge but rather is detected online and at the time the action is being anticipated (Witt & Proffitt, 2008). With this conceptualization, there may be ways to account for action-specific effects in a direct-perception framework. Indeed, as discussed below, several ecological psychologists have already begun empirical investigations along these lines (e.g., Harrison & Turvey, 2009; Y. Lee, Lee, Carello, & Turvey, 2012; White, Shockley, & Riley, 2013).

Reconciling the approaches

Ecological psychology has traditionally been resistant to implicating a role of internal (particularly cognitive) states in perception. The approach we propose here does not jettison the skepticism held by ecological psychologists about unprincipled appeals to arbitrary internal mechanisms that are supposed to play a causal, constructive role in perception. Action abilities, including those dependent on internal bodily states, could modulate perception in ways entirely consistent with the ecological approach. To preview our proposal, we accept, first of all, that perception is based on a global array of information (Stoffregen & Bardy, 2001) and that this global array includes information about the perceiver’s ability to act. Consequently, “visual” perception is not simply detecting information from the optic array but detecting information in the global array that is defined in terms of optical variables in relation to (i.e., scaled by) variables that specify physiological states and action abilities (Heft, 1989). Perception is about detecting information from the global array—patterns defined across different energetic media—that specifies the environment in relation to the perceiver’s ability to act, as emphasized by J. J. Gibson (1966). As a result, in the absence of a change in the optic array, a perceiver can nevertheless have perceptual experiences that vary in conjunction with information about the perceiver’s effectivities. We additionally propose that some of the results from the action-specific perception account can be explained by considering the effects of intention, attention, and information selection, as they are considered in ecological psychology (e.g., J. J. Gibson, 1966; Michaels & Carello, 1981; Turvey, Carello, & Kim, 1990).

The global array

The first step in considering how action-specific perception effects are compatible with the theory of direct perception is to assume that perception is based on the global array (Stoffregen & Bardy, 2001), rather than on just the optical (or acoustical, etc.) array. The global array, which was originally proposed in order to account for multimodal effects such as the McGurk effect (McGurk & MacDonald, 1976) and the swinging room effect (D. N. Lee & Aronson, 1974; D. N. Lee & Lishman, 1975), is a higher-order array that spans single-energy arrays such as the optic array or the acoustic array. The global array contains patterns defined across the various single-energy arrays, and those patterns, rather than patterns contained within any given single-energy array, are what specify facts about the animal–environment system. Although the global array concept was motivated and strongly influenced by J. J. Gibson’s (1966) emphases on specificity and multimodal perception, the latter feature is what distinguishes the global array from previous ecological theories of multimodal perception. For example, information available in the global array is not ambiguous with regard to whether you are moving forward or the rest of the environment is moving toward you, in contrast to information in the optic array alone; rather, the initiation of forward movement through the environment is specified in terms of a pattern consisting of optical outflow, vestibular stimulation associated with accelerations of the head, proprioceptive information about changing limb positions, and other tactile and acoustic variables associated with bodily movement.

Ecological psychology has long recognized that stimulus arrays may contain information that is exteroceptive (about the external environment) and proprioceptive (about the position and motion of body segments), to use Sherrington’s (1906) terminology, as well as information that is exproprioceptive (about the body relative to the environment; D. N. Lee, 1978). However, as J. J. Gibson (1966, 1979) noted, information that is exteroceptive simultaneously implicates the animal and thus is always actually exproprioceptive. Consider that the structure of the optic array is dependent on the location of the point of observation. Although it is possible to define a potential point of observation that is unoccupied by an observer, any actual point of observation is a space occupied by a real agent, and that agent’s position and movement are specified by the optical structure defined with respect to the point of observation.

Bingham and Stassen (1994) demonstrated how a higher-order, exproprioceptive variable in the global array could specify distance for an active observer who is producing head movements (which always naturally occur because of postural sway). This multimodal variable that specifies distance can be obtained by considering the relation between the optical variable τ, which specifies time-to-contact (e.g., D. N. Lee, 1974, 1980) between the head and a target location or object, and proprioceptive feedback about head movements. Bingham and Stassen showed that τ at the peak velocity of head oscillation divided by the oscillation period is proportional to distance divided by the amplitude of head oscillation. This multimodal variable in the global array thus provides an action-scaled distance metric.
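
The relation described above can be checked numerically. The following is a minimal sketch and not Bingham and Stassen's own formulation: it assumes that the head sways sinusoidally toward and away from the target, that "amplitude" means the half-range of that sway, and that τ can be approximated as distance divided by approach speed. The function names and sample values are hypothetical. Under those assumptions the constant of proportionality works out to 1/(2π).

```python
import math

def tau_at_peak_velocity(distance, amplitude, period):
    """Time-to-contact (tau) at the moment of peak head velocity.

    Assumes sinusoidal sway x(t) = amplitude * sin(2*pi*t / period), so the
    peak speed is 2*pi*amplitude / period, and tau = distance / speed.
    """
    peak_speed = 2.0 * math.pi * amplitude / period
    return distance / peak_speed

def distance_in_sway_amplitudes(tau_peak, period):
    """Recover distance / amplitude from tau and the oscillation period."""
    return 2.0 * math.pi * tau_peak / period

# Hypothetical values: target 2.0 m away, 2 cm sway amplitude, 4 s period.
distance, amplitude, period = 2.0, 0.02, 4.0
tau_peak = tau_at_peak_velocity(distance, amplitude, period)
scaled = distance_in_sway_amplitudes(tau_peak, period)
```

In this sketch `scaled` is the target distance expressed in units of sway amplitude. Neither τ nor the oscillation period carries distance on its own, but their ratio does, which is the sense in which the variable is action-scaled.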

We propose that action-specific effects that relate to body morphology can be readily explained by direct perception and the global array. We review those findings in the following subsection. Action-specific effects that relate to physiological potential (i.e., energetics) can also be explained in this framework, but doing so requires a slightly broader conception of the global array so that it also includes stimulus arrays for interoception, which may entail structured chemical energy arrays inside the body. Recent research on interoception suggests it plays an important role in a variety of emotional, cognitive, and perception–action processes (e.g., Herbert & Pollatos, 2012). We review findings that motivate this extension of the global array concept to better account for how an animal’s effectivities influence perception without recourse to inference or other forms of cognitive processing, and then describe interoception and the extended global array. We also discuss some recent evidence and a newly proposed theoretical model of an informational variable defined in the extended global array that can account for certain physiologically scaled effects on distance perception.

Morphology: Body-scaled action-specific effects

One category of action-specific effects is related to the body’s morphology (Proffitt & Linkenauger, 2013). For instance, arm length determines the extent to which one can reach. Wielding a rod increases the functional length of the arm, thus allowing the perceiver to reach farther. In one experiment, target objects were presented beyond arm’s reach, and participants estimated the distance to the targets by completing a visual matching task: They positioned two external objects presented in the fronto-horizontal plane to be the same distance from each other as the distance from the perceiver to the target. Participants estimated the distance and reached to objects while using a reach-extending tool and while reaching without the tool. The targets appeared closer when participants used the tool, and thus could reach the targets, than when participants reached without the tool (Witt et al., 2005).

This action-specific expression of reachability in perceived distance could be based on multimodal variables in the global array. The length of tools can be perceived haptically (for a review, see Carello & Turvey, 2000). Consequently, a multimodal variable that specifies distance could be obtained by considering the relation between an optical variable (that specifies the distance to the target) and a haptic variable (that specifies the length of the tool plus arm). In this case, perception of distance in the context of the intention to reach could still be considered direct.
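
One way to picture such a multimodal variable is as a ratio of the two extents. The text does not commit to a functional form, so the ratio below is purely an illustrative assumption, and the function name and sample values are hypothetical. The sketch only shows how the same optically specified distance could map onto different reach-scaled values when the haptically specified reach changes.

```python
def reach_scaled_distance(target_distance, arm_length, tool_length=0.0):
    """Target distance in units of maximum reach (hypothetical ratio form).

    target_distance stands in for the optically specified extent;
    arm_length + tool_length stands in for the haptically specified reach.
    Values below 1.0 mean the target is within reach.
    """
    reach = arm_length + tool_length
    if reach <= 0:
        raise ValueError("reach must be positive")
    return target_distance / reach

# Hypothetical values in metres: target 1.0 m away, 0.7 m arm, 0.4 m tool.
without_tool = reach_scaled_distance(1.0, 0.7)
with_tool = reach_scaled_distance(1.0, 0.7, 0.4)
```

With these values the target lies beyond reach without the tool (ratio above 1) and within reach with it (ratio below 1), even though the optical extent is identical in both cases.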

The idea of a multimodal variable specifying distance to a to-be-reached object is consistent with several findings. Target objects appear closer when participants hold a long tool that would effectively extend reach to the object, but not when they hold a short tool that would be ineffective for reaching the object (Osiurak, Morgado, & Palluel-Germain, 2012). In this case, the haptic information specifying tool length would differ across the tool conditions, so a multimodal variable that takes into account tool length could specify different distances. When perceivers hold the tool but do not use it to reach, target objects do not look closer than when the tool is not held (Witt et al., 2005). In this case, when the perceiver’s motivation or intention is different, visual perception may rely on an optical-only variable rather than on a multimodal variable (below we discuss how intention might affect a perceiver’s choice of informational variables).

However, a haptic–optical multimodal variable cannot account for other findings. For instance, when perceivers do not hold the tool but intend to pick it up and use it to reach, target objects look closer than for perceivers who do not use or anticipate using the tool (Witt & Proffitt, 2008). Similarly, when perceivers do not use the tool but imagine using it to reach, targets look closer than for perceivers who do not use or imagine using the tool (Davoli, Brockmole, & Witt, 2012; Witt & Proffitt, 2008). In these cases, however, optical information is available that specifies the length of the tool. Thus, a higher-order optical variable (i.e., a relation between two optically specified extents) could still specify distance.

This proposal is similar to that of eye-height scaling. Sedgwick (1973, 1980, 1986) demonstrated that eye height can be used to scale distances to and the heights of objects. The combination of eye height and angle of gaze to the target specifies distance, and in fact several smartphone applications use this exact method as a distance estimator. Similarly, the combination of eye height and the ratio of the proportion of the object above versus below the horizon specifies object height. Eye-height scaling has been proposed as a natural metric for perception of affordances of stand-on-ability and sit-on-ability (Mark, 1987) as well as pass-through-ability of apertures (Warren & Whang, 1987).
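
The two eye-height relations just described reduce to simple trigonometry. The sketch below is an illustration under stated assumptions rather than Sedgwick's own formulation: it assumes a flat ground plane, a target resting on that plane, and extents above and below the horizon measured on a frontal image plane (with visual angles the height relation is only approximate). Function names and sample values are hypothetical.

```python
import math

def distance_from_eye_height(eye_height, declination_deg):
    """Ground distance to a target from eye height and angle of declination.

    declination_deg is the angle of gaze below the horizon, in degrees.
    Assumes a flat ground plane and a target resting on it.
    """
    if not 0.0 < declination_deg < 90.0:
        raise ValueError("declination must be between 0 and 90 degrees")
    return eye_height / math.tan(math.radians(declination_deg))

def height_from_horizon_ratio(eye_height, above, below):
    """Object height from the horizon ratio.

    above and below are the projected extents of the object above and below
    the horizon, in any common unit. The horizon cuts the object at eye
    height, so the part below the horizon corresponds to eye_height.
    """
    if below <= 0:
        raise ValueError("the object must extend below the horizon")
    return eye_height * (above + below) / below

# Hypothetical values: 1.6 m eye height, gaze 45 degrees below the horizon,
# and an object with half as much extent above the horizon as below it.
d = distance_from_eye_height(1.6, 45.0)
h = height_from_horizon_ratio(1.6, above=1.0, below=2.0)
```

Both results come out in units of the perceiver's own eye height, so two perceivers of different stature viewing the same object would obtain different angles and ratios that nonetheless specify the same physical layout.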

Just as eye height can be used to recover the physical distance to and height of an object, studies in action-specific perception reveal that many aspects of the body provide their own metrics with which to measure the environment (Proffitt & Linkenauger, 2013), as was suggested by D. N. Lee (1980). As a consequence of having different bodies, these metrics differ from person to person. Thus, the exact same environment looks different across individuals and looks different to the same individual depending on which part of the body is being used as a metric.

Consistent with this proposal, visual manipulations of body size influence perceptual judgments of size and distance. Visual manipulations have been accomplished using optical magnifying lenses and using virtual reality. Optical magnification of the right hand leads people to perceive nearby objects as smaller than when the hand is not magnified (Haggard & Jundi, 2009; Linkenauger, Ramenzoni, & Proffitt, 2010). Similarly, when the hand is visually presented as bigger in a virtual reality environment, nearby objects look smaller than when the hand is visually presented as its actual size or as smaller (Linkenauger, Leyrer, Bülthoff, & Mohler, 2013).

Body-based effects on distance and size estimates extend beyond those related to the arm and hand. Physical differences across individuals in shoulder width influence perceptual judgments of aperture width (Stefanucci & Geuss, 2009; Sugovic, 2013). Visual presentation of the entire body as bigger or smaller (as presented in a virtual environment) influences perceptual judgments of the size of a nearby box (van der Hoort, Guterstam, & Ehrsson, 2011).

Importantly, these effects of visual magnification indicate the “online,” embodied–embedded, and fundamentally relational nature of body-scaled effects. Perceptual estimates are not based on a comparison between absolute object size and an off-line, cognitive representation of body size. They are “online” because what is relevant is not remembered hand size or body size, or some other manifestation of a stored body schema, but rather the perceptually apparent size. As changes to the body occur, neural bodily representations are updated (see Maravita & Iriki, 2004). Furthermore, if online detection of, for example, tool length is blocked (e.g., by haptically perceiving a different object), tool-related information is not expressed in perceptual distance judgments (Witt & Proffitt, 2008). Body scaling appears to be rooted in concurrent information that relates the environment to the perceiver.

Higher-order variables within and, more generally, across single-energy arrays might be able to explain many of the body-scaled effects on perception (e.g., Mantel, Bardy, & Stoffregen, 2010). Fath and Fajen (2011) identified two body-scaled (head-sway scaled and stride-length scaled) variables that specify the width of an aperture with respect to width of the shoulder (Warren & Whang, 1987). These variables, which are dynamic (i.e., defined only for locomoting observers), along with static eye-height-scaled information about aperture widths, might serve as informational bases for perceiving affordances for pass-through-ability and might also be utilized when perceivers are asked to report aperture widths in extrinsic units (Stefanucci & Geuss, 2009).

Physiology: Energy-scaled action-specific effects

The initial discovery of action-specific effects related to energetic expenditure. Hills appeared steeper to perceivers who were fatigued from having just completed a long run compared with perceivers who had not yet started their run (Proffitt, Bhalla, Gossweiler, & Midgett, 1995). Similarly, hills appeared steeper and distances appeared farther to perceivers who wore a heavy backpack, and thus would have to exert more energy to traverse the terrain, than to perceivers who did not wear the backpack (Bhalla & Proffitt, 1999; Proffitt, Stefanucci, Banton, & Epstein, 2003). Perceivers also report a distance traveled while wearing a heavy load as farther than one traveled when not wearing the load (Harrison & Turvey, 2009).

Long-term changes to energetic potential also reveal effects in spatial perception. Physically fit perceivers see hills as less steep than perceivers who are out of shape (Bhalla & Proffitt, 1999). Older adults see hills as steeper and distances as farther than younger adults (Bhalla & Proffitt, 1999; Bian & Andersen, 2013; Sugovic & Witt, 2013). Additionally, distances appeared farther and hills appeared steeper to observers who weighed more than others (Sugovic, 2013; Sugovic & Witt, 2013). Interestingly, it was physical weight—and not beliefs about weight—that was the relevant factor for perception.

Energy is also dependent on food consumption and its effects on the perceiver’s energetic potential. In one experiment, participants did not eat any food for 3 hours prior to the experiment. They were given juice that was sweetened with sugar, and thus contained energy, or artificial sweetener, which contained no energy. Participants could not differentiate the drinks on the basis of taste, yet those who consumed the sugar perceived hills to be less steep than did those who drank juice containing the artificial sweetener (Schnall et al., 2010).

Energetic potential can be limited to specific actions. For example, a heavy ball requires more effort to throw than a light ball. As a result, targets look farther away to people who intend to throw a heavy ball to them than to people who intend to throw a light ball (Witt, Proffitt, & Epstein, 2004). Similarly, visuomotor adaptations can be used to induce an increase in anticipated effort to walk to a target. For example, walking on a treadmill causes recalibration such that more effort is anticipated to walk a prescribed distance. After experiencing this adaptation, targets on the ground look farther away (Proffitt et al., 2003; Witt et al., 2004, 2010). Wearing ankle weights increases the energetic cost of leg-based actions such as jumping. People who wore ankle weights estimated lower maximum jumping-reach heights for other people than did perceivers who were not wearing ankle weights (Ramenzoni, Riley, Shockley, & Davis, 2008). Gaps that require jumping across appear farther when a perceiver wears ankle weights (Lessard, Linkenauger, & Proffitt, 2009). Importantly, wearing ankle weights only influenced perceived distance across a gap for gaps that afforded jumping. Perception of gaps that were too big to jump across was not influenced by wearing ankle weights (Lessard et al., 2009). Thus, energy-based scaling may only be apparent in perception of objects that afford the intended action.

Energetic demands also influence perception of the weight of objects. In a series of experiments, participants estimated the weight of a bag of golf balls (Doerrfeld, Sebanz, & Shiffrar, 2012). For some participants, their task was to lift the bag by themselves whereas for others, the task was to lift the bag with the help of another person. When anticipating lifting the bag by themselves, participants perceived the bag to be heavier than when they anticipated the help of another person. Interestingly, if the other person was clearly injured and thus likely to be of little help in carrying the bag, the bag was perceived to be heavier.

The effects of effort for a specific action only influence perceived distance when perceivers anticipate performing that action. In one study, participants threw a heavy ball and then verbally estimated the distance to targets. Then one group of participants threw the heavy ball again, and another group closed their eyes and walked to the target. Those who intended to throw again perceived the targets to be farther away than did those who intended to walk, even though both groups had just thrown the heavy ball (Witt et al., 2004). Similarly, after recalibration from walking on a treadmill, a target appeared farther away to participants who intended to walk to the target than to perceivers who intended to throw a beanbag to the target even though both groups had just walked on the treadmill. This result was found when participants verbally estimated the distance to the targets and when they were asked to blindwalk to the target, which reveals effects in both verbal and action-based measures (Witt et al., 2004, 2010).

As with the body-based action-specific effects, these energy-based action-specific effects can be interpreted as challenging the notion of direct perception as being based solely on information in an optic array. In addition, some of these energetic effects may not be explainable on the basis of previously discussed energetic media such as optic, haptic, and mechanical (proprioceptive) arrays. In the next section, we propose that it may be possible to extend the concept of a global array to account for other action-specific perception effects by including within the global array stimulus arrays that specify an animal’s internal states (i.e., adding an interoceptive dimension to the global array). If underlying physiological states are specified by patterns in arrays of stimulation (chemical or mechanical, for example), and if animals possess sensitivity to those patterns or to higher-order patterns that span the optic and interoceptive arrays, then perhaps those internal states can influence perception in a manner compatible with direct perception. We propose that the extended global array, in principle, may be able to specify the environment in relation to a perceiver’s effectivities. As such, there may be an informational basis for action-specific perceptual effects (cf. Firestone, 2013).

The global array, extended

We hypothesize that interoception—perception of internal bodily or physiological states such as hunger, effort, or fatigue (Cameron, 2002; Craig, 2002, 2003; Dworkin, 2007; Sherrington, 1906)—may be supported by patterns of stimulation that specify underlying physiological states. Direct perception (which does not necessarily entail conscious awareness) of internal states may be possible if information about those states is available in arrays of stimulus energy, including, potentially, chemical energy arrays internal to the body. Presently, the state of knowledge about interoception does not allow us to conclude that structured, informative interoceptive arrays exist. However, research is progressing at a rapid pace, and several findings described below are broadly consistent with our proposal.

Interoception

Interoception is a broad and extensive category of experience that can be achieved using multiple mechanisms and pathways. For example, Khalsa, Rudrauf, Feinstein, and Tranel (2009) identified a role of mechanoreceptors and the somatosensory cortex in detecting interoceptive information about the heartbeat (see also Dworkin, 2007). Interoceptive information may also be available in the acoustic array (e.g., sounds associated with increased respiration; Pennebaker & Lightner, 1980).

Although it may be the case that the traditional stimulus energy arrays of perceptual psychology (e.g., mechanical/proprioceptive, optical, or acoustical arrays) often contain important sources of information about internal states, it is possible that only a limited range of internal states could be specified in that way. However, almost all bodily tissues, including the muscles and viscera, are innervated with small-diameter Aδ and C primary afferent fibers that terminate in the lamina I spinothalamocortical pathway, which in turn projects to brainstem mechanisms involved in basic physiological processes (including respiration and cardiac functioning), and which are connected to the dorsal insular cortex (Craig, 2002, 2003). These receptors play a role in detecting the internal, physiological states of the body and are involved in experiencing visceral sensations. Interoception may also be achieved by direct neural registration of chemical compounds in the bloodstream (Damasio, 1999), or by chemoreceptors elsewhere in the body. For example, the carotid body, a sensory receptor found in the carotid artery, is sensitive to O2 and CO2 levels in the blood (Gonzalez, Almaraz, Obeso, & Rigual, 1994; Nurse, 2010; Prabhakar, 2006, 2013), and along with the aortic body, is sensitive to changes in blood oxygen resulting from exercise (Prabhakar & Peng, 2004). Baroreceptors, which register changes in blood pressure, are another type of interoceptor (Chernigovskiy, 1967). 
Muscle metaboreceptors or ergoreceptors (and their associated neural pathways) are sensitive to muscular contractions and concentrations of metabolic compounds associated with muscle activity such as lactate, and are hypothesized to play a role in perception of effort, exertion, or fatigue (Adriani & Kaufman, 1998; Kaufman & Forster, 1996; Kaufman, Longhurst, Rybicki, Wallach, & Mitchell, 1983; Kniffeki, Mense, & Schmidt, 1981; McCloskey & Mitchell, 1972; Mense & Stahnke, 1983; Mitchell, Kaufman, & Iwamoto, 1983; Wilson, Andrew, & Craig, 2002). Sensitivity to interoceptive patterns (rather than, e.g., sensitivity to absolute levels of some metabolic by-product) is suggested by recent findings that cardiovascular reflexes in response to exercise are triggered by synergistic patterns of different metabolic compounds that differentially activate several types of receptors (McCord, Tsuchimochi, & Kaufman, 2009).

Although relatively little is known at the present time about interoceptors and the information to which they might be sensitive, considerably more data are available about the brain structures involved in interoception. The ventromedial frontal cortex, dorsal insular cortex, opercular cortex, and other nearby brain regions form a distributed interoceptive center in the brain (Critchley, 2005; Critchley, Wiens, Rotshtein, Öhman, & Dolan, 2004). Several of these brain regions play important roles in other mental functions, linking interoception to cognitive capacities such as decision making (Bechara, 2004; Bechara, Damasio, Damasio, & Anderson, 1994; Werner, Jung, Duschek, & Schandry, 2009). The insular cortex, in particular, is related to a broad range of behaviors. It is known to play important and potentially overlapping roles in emotion (Critchley, 2005), agency (Farrer & Frith, 2002), sense of body ownership (Tsakiris, Hesse, Boy, Haggard, & Fink, 2007), attention (Eckert et al., 2009), and motor behavior (Fink, Frackowiak, Pietrzyk, & Passingham, 1997). Activation level of the right insular cortex is sensitive to effort and the intensity of physical exercise (Williamson, McColl, Mathews, Ginsburg, & Mitchell, 1999). Critchley et al. (2004) found that activation levels in the right anterior insula and opercular cortex correlated with interoceptive awareness measured by heart rate sensitivity and by questionnaires about subjective visceral sensations. They also found evidence that the right anterior insula functions to integrate interoceptive and exteroceptive information. The insular cortex has also been implicated in cognitive control, anticipating future outcomes, and planning and optimizing future behaviors (Critchley, 2005).

Several behavioral and psychophysical studies have focused on how interoceptive sensitivity relates to other cognitive and perception–action processes. As was noted by Sparrow and Newell (1998), it is known that interoception can be conditioned to environmental stimuli (Razran, 1961) and that external feedback can be used to control metabolic processes during exercise (Goldstein, Ross, & Brady, 1977; Lo & Johnston, 1984a, 1984b; Perski & Engel, 1980). Herbert, Ulbrich, and Schandry (2007) found that good heartbeat perceivers are better at modulating physical effort to match their physical abilities than are bad heartbeat perceivers. Matthias, Schandry, Duschek, and Pollatos (2009) found that people with greater interoceptive awareness (operationalized in terms of heartbeat detection ability) performed better in a set of visual attention tasks than people with poor interoceptive awareness. Moreover, interoceptive awareness predicts body representation malleability as determined by susceptibility to the rubber-hand illusion (Tsakiris, Tajadura-Jiménez, & Costantini, 2011). In this multimodal integration phenomenon, the participant watches a rubber hand being stroked synchronously with the participant’s real, unseen hand, and as a consequence comes to identify the rubber hand as her own (Botvinick & Cohen, 1998). Individuals with low interoceptive awareness were more susceptible to the illusion, possibly because they tend to weight exteroceptive information much more than interoceptive information. Interoceptive awareness thus appears to interact with vision and proprioception in determining a perceiver’s sense of bodily self.

To summarize, several findings about interoception are consistent with our proposal of the extended global array and a role for informational variables in this array in perception–action. These findings include: the identification of links between interoceptive brain centers and exteroceptive perceptual processes and attention; findings that interoceptive brain center activation is dependent on real or imagined (Williamson et al., 2001) physical effort; evidence for a role for interoceptive brain centers in decision making, planning future actions, and motor control; and data relating interoception to body image. Nonetheless, the current state of understanding of interoception is limited, and fleshing out many of the details of our proposal requires a much more developed empirical and conceptual understanding of interoception than is currently available. A comprehensive theory of perception–action may benefit from a more developed account of internal receptors and, especially, the information to which perceivers are sensitive (cf. Sparrow & Newell, 1998).

What could be specified in the extended global array?

Typically interoception is thought of as playing a background role in helping regulate a variety of physiological processes. Often interoceptor activity does not lead directly to a conscious or phenomenal experience. Our proposal is that interoceptor activity can modulate an agent’s perceptual experience of the environment by virtue of factoring into multimodal variables defined in the extended global array. These hypothesized informational variables relate the world to the body and its effectivities. Information contained in the global array thus considered does not absolutely specify environmental properties such as “distance” or “size” independently of the perceiver but rather specifies the environment as it relates to the perceiver’s ability to act in it (Y. Lee et al., 2012). This is, we hypothesize, why it is possible for perceptual reports of distance to change when the objectively measured distance has not changed, but a factor such as fatigue has been experimentally manipulated. In that case, the variation in the perceptual reports of distance may be anchored in corresponding changes in the value of a variable in the global array—a higher-order, multimodal variable formed from lower-order optical and interoceptive variables. If this information is available to the perceiver, the perceiver can see the world through the lens of her effectivities, and this can be explained in terms of direct perception rather than requiring recourse to cognitive mediation and mental representations.

Suppose that the form of this higher-order variable was a ratio of an optical variable related to distance and an interoceptive variable related to fatigue, such as the concentration of byproducts of muscular activity. This ratio is specific to the traversability of that distance by the animal, given the animal’s momentary physiological state—the variable specifies an affordance, not an environmental property considered in isolation of the animal. The manipulation of fatigue leaves the optical variable unchanged, but the concentration of metabolites increases. If the perceptual system is sensitive to that ratio (i.e., the variable that specifies the affordance) rather than to the optical variable alone, then a perceptual report of distance could change in the absence of a change in the optical variable. Perception of the underlying affordance—the intrinsically meaningful and more fundamental object of perception—manifests in variable perceptual reports of extrinsic metrics such as distance, size, or slant—variables that have little meaning beyond the animal’s capacity to act (cf. Chrastil & Warren, 2014).
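The logic of this hypothetical ratio can be illustrated with a minimal sketch. The function name, the units, the numerical values, and the orientation of the ratio (optical variable divided by interoceptive variable) are all assumptions made for illustration; the text does not commit to a specific form.

```python
def higher_order_ratio(optical_distance: float, metabolite_concentration: float) -> float:
    """Hypothetical higher-order variable: an optical variable related to
    distance divided by an interoceptive variable related to fatigue.
    The orientation of the ratio is an assumption, not taken from the text."""
    if metabolite_concentration <= 0:
        raise ValueError("metabolite_concentration must be positive")
    return optical_distance / metabolite_concentration


# The optical variable is held constant across conditions (arbitrary units).
OPTICAL_DISTANCE = 10.0

# The fatigue manipulation changes only the interoceptive variable.
rested = higher_order_ratio(OPTICAL_DISTANCE, metabolite_concentration=1.0)
fatigued = higher_order_ratio(OPTICAL_DISTANCE, metabolite_concentration=2.0)

print(f"rested: {rested}, fatigued: {fatigued}")
```

Under these assumptions, the value of the ratio differs between the rested and fatigued conditions even though the optical variable is identical, which is the sense in which a perceptual report could change without any optical change.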

If the global array is extended to include interoceptive information, then variables in the extended global array defined across interoceptive and exteroceptive media might specify the relation between the world and the animal’s effectivities. Relations between interoceptive and exteroceptive variables were anticipated by J. J. Gibson in an unpublished manuscript (1975):

What about interoception? . . . Does the awareness of breathing and the pumping of the heart have no reference to anything external? Is the awareness of the activity of the stomach purely internal? Does the rule that environment perception goes along with self-awareness fail in these cases? (p. 1)

White, Shockley, and Riley (2013; see also White, 2012) hypothesized that perception of distance (at least when reported by walking to reproduce a target distance) is based on such a multimodal informational variable, termed multimodally specified energy expenditure (MSEE). MSEE captures the relation between the metabolic cost of locomotion (i.e., energy expenditure) and the coincident optical information that accompanies locomotion (i.e., optically specified distance), and is defined as

$$ \mathrm{MSEE}=\frac{\mathrm{Energy\ Expenditure}}{\mathrm{Optically\ Specified\ Distance}} \tag{1} $$

The energy expended to walk a given distance can be quantified by the amount of O2 consumed to walk the distance (McArdle, Katch, & Katch, 2008). MSEE is therefore the O2 consumed per unit of optically specified distance walked; it relates the energetics of locomotion to the optical consequences of the energy expenditure. Several experiments, summarized below, determined that distance reports were constrained in precise ways by MSEE. These findings are important in the present context because MSEE is precisely the kind of multimodal informational variable (interoceptive and/or proprioceptive information scaled by optical information) that we hypothesize can account for energy-scaled, action-specific effects on perception.

Effects of MSEE on distance perception have been tested using a range of experimental manipulations. White et al. (2013) manipulated MSEE by raising or lowering the grade of a treadmill on which participants walked, by manipulating walking speed, or by manipulating optic flow rate. Increasing treadmill grade or speed increases the metabolic cost of walking and thus increases MSEE. MSEE increases when optic flow rate is reduced relative to walking speed (i.e., more effort is required to walk an apparent or optically specified distance) and MSEE decreases when optic flow rate is higher than walking speed (the same amount of energy expenditure produces a greater optical distance change). For each of two levels of MSEE, White et al. achieved the MSEE value by changing grade, speed, or optic flow rate in such a way as to produce equivalent MSEE levels (they created MSEE metamers). Perceived distance was directly related to MSEE, and, importantly, was transparent to the mode of manipulating MSEE. White (2012) additionally found that distance perception was affected in the predicted direction by manipulations of MSEE achieved by adding mass to the participants to increase the energy required to walk (cf. Bhalla & Proffitt, 1999; Harrison & Turvey, 2009; Lessard et al., 2009; Proffitt et al., 2003), by having participants walk in an energetically inefficient gait (gallop walk), by simulating the energetically costly effects of walking on sand or other loose terrain, and again by manipulating treadmill grade and optic flow rate, among other manipulations. In each case, perceived distance was significantly affected by MSEE in the predicted manner.
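The idea of MSEE metamers can be sketched numerically. The linear cost model, its coefficients, and all function and parameter names below are illustrative assumptions, not the procedure or values used by White et al. (2013). Only the definition of MSEE as energy expenditure divided by optically specified distance comes from the text.

```python
# Toy linear model of O2 consumption per minute of walking.
# The coefficients are illustrative placeholders (assumptions).
REST_COST = 3.5        # O2 per minute at rest
HORIZONTAL_COST = 0.1  # O2 per unit of walking speed
VERTICAL_COST = 1.8    # O2 per unit of (speed * fractional grade)


def o2_rate(speed: float, grade: float) -> float:
    """O2 consumed per minute at a given walking speed and treadmill grade."""
    return REST_COST + HORIZONTAL_COST * speed + VERTICAL_COST * speed * grade


def msee(speed: float, grade: float, optic_gain: float = 1.0) -> float:
    """O2 consumed per unit of optically specified distance.

    optic_gain is optic flow rate divided by walking speed, so the optically
    specified distance per minute is optic_gain * speed. Walking duration
    cancels out of the ratio and is therefore omitted.
    """
    if speed <= 0 or optic_gain <= 0:
        raise ValueError("speed and optic_gain must be positive")
    return o2_rate(speed, grade) / (optic_gain * speed)


def optic_gain_for_msee(target_msee: float, speed: float, grade: float) -> float:
    """Optic flow gain that yields target_msee at the given speed and grade."""
    return o2_rate(speed, grade) / (target_msee * speed)


baseline = msee(speed=80.0, grade=0.0)
via_grade = msee(speed=80.0, grade=0.05)  # raise MSEE by raising the grade
matching_gain = optic_gain_for_msee(via_grade, speed=80.0, grade=0.0)
via_flow = msee(speed=80.0, grade=0.0, optic_gain=matching_gain)  # same MSEE via slower optic flow

print(baseline, via_grade, via_flow, matching_gain)
```

In this sketch, level walking with reduced optic flow and graded walking with matched optic flow yield the same MSEE value, so the two conditions are metamers with respect to that variable. The hypothesis predicts equivalent distance reports across such conditions.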

These findings, in particular the weight manipulation effects in White (2012), suggest that MSEE or similar variables might be able to explain some of the findings that have emerged from the action-specific perception approach in which extents look farther when the perceiver is weighed down (e.g., Bhalla & Proffitt, 1999; Lessard et al., 2009; Proffitt et al., 2003; Ramenzoni et al., 2008; Witt et al., 2004). As we have noted, however, the current state of knowledge of how interoception works is limited, and thus certain details of the MSEE hypothesis—most prominently, how energy expenditure relative to optically specified distance can be specified and perceptually detected—remain incomplete at this time.

Another possible way to test hypotheses about the extended global array and the role of interoception, in the absence of a detailed understanding of the information available to specify internal states, is to exploit documented individual differences in interoceptive sensitivity (Herbert & Pollatos, 2012). For example, people vary in the extent to which they are good at detecting their own heartbeat (Katkin, 1985; Pollatos & Schandry, 2004). Good heartbeat perceivers are better able than bad heartbeat perceivers to appropriately modulate their physical effort in a way that matches their capabilities (Herbert et al., 2007). In addition, individuals with anorexia nervosa are known to exhibit interoceptive deficits (Pollatos et al., 2008) and hyperactivity of the right insular cortex (Friederich et al., 2010), so they may be differentially susceptible to the kinds of action-specific effects that we hypothesize depend on sensitivity to higher-order relations of optical to interoceptive variables. Likewise, the perceptual data from the action-specific perception accounts also show a considerable range of individual differences (see, e.g., Fig. 1 from Witt & Proffitt, 2005). Following this logic, it might be expected that people with high interoceptive sensitivity are more strongly affected by manipulations of action-relevant variables, such as fatigue, that have been shown to induce changes in perception of environmental properties such as distance. That is, people with high interoceptive sensitivity may be more likely to perceive the environment in terms of their action capabilities. This hypothesis is indirectly supported by the finding of Tsakiris et al. (2011) that individuals with low interoceptive awareness were more susceptible to the rubber-hand illusion—these individuals may rely more on purely exteroceptive information than on exteroceptive information in relation to interoceptive (or proprioceptive) information. 
It may therefore be useful to include measures of heartbeat detection accuracy or questionnaire-based measures of interoceptive sensitivity (e.g., Mehling et al., 2012) in studies on action-specific perception. Another way to test this hypothesis would be to train participants to improve interoception (cf. Herbert et al., 2007; Schandry & Weitkunat, 1990) and determine if this results in larger effects of action-relevant variables on perception of environmental properties.
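As one possible way to quantify heartbeat detection accuracy, the following sketch computes a heartbeat-counting score of a kind that is common in this literature: the mean, across counting intervals, of one minus the proportional counting error. The text does not specify a scoring formula, so the formula, the function name, and the example counts are assumptions for illustration only.

```python
from typing import Sequence


def heartbeat_counting_score(recorded: Sequence[int], counted: Sequence[int]) -> float:
    """Mean over intervals of 1 - |recorded - counted| / recorded.

    recorded: heartbeats actually measured in each interval (must be > 0).
    counted:  heartbeats the participant reported for the same intervals.
    A score of 1.0 indicates perfect counting accuracy.
    """
    if len(recorded) == 0 or len(recorded) != len(counted):
        raise ValueError("recorded and counted must be non-empty and equal in length")
    if any(r <= 0 for r in recorded):
        raise ValueError("recorded heartbeat counts must be positive")
    per_interval = [1 - abs(r - c) / r for r, c in zip(recorded, counted)]
    return sum(per_interval) / len(per_interval)


# Hypothetical data for one participant across three counting intervals.
score = heartbeat_counting_score(recorded=[30, 40, 50], counted=[27, 40, 45])
print(round(score, 3))
```

A score of this kind could be entered as a covariate or grouping variable when testing whether interoceptive sensitivity moderates the size of action-specific effects.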

Behavior: Action-scaled effects

In the previous sections we presented our hypothesis of the extended global array and showed that this framework can account for body-scaled and physiological-scaled effects demonstrated in the action-specific perception literature. A third category of effects—termed action-scaled effects—can also be explained in that framework by considering the role of perceiver activity in detecting and selecting informational variables from the extended global array.

In addition to differences in body size, perceivers’ abilities to use, control, and coordinate their bodies will have consequences for spatial perception. A clear illustration comes from people skilled at parkour. Parkour, also known as urban climbing or free-running, is a sport in which people jump on top of seemingly impossibly high obstacles, zip through small openings, or bounce from narrow handrails to walls and back. People who do parkour do not have noticeably different bodies, yet what they can do with their bodies clearly differs from what the average person can do. Consequently, those skilled in parkour see the world differently. One study compared perception of wall height in people trained at parkour (called traceurs) with that of height- and sex-matched novices. The participants were approximately matched for structural aspects of the body, yet the traceurs were more skilled at controlling their bodies to, in this example, execute a move known as the wall jump. A wall jump is a move in which a person jumps toward a wall and kicks his or her foot along the wall to propel the body up even higher. Those trained in parkour perceived the walls to be shorter than did novices (Taylor, Witt, & Sugovic, 2011). Thus, even with the same size body, the ability to use the body (i.e., effectivities) influences spatial perception. Similarly, training to learn a skill such as swimming or golf also influences perception. More skilled swimmers perceived underwater targets as closer than did less skilled swimmers (Witt, Schuck, & Taylor, 2011), and more skilled golfers perceived holes to be bigger than did less skilled golfers (Kwon & Kim, 2012).

The ability to control and coordinate the body is developed with skill, but this ability also varies from day to day. For example, some days a batter might be hitting well, but on other days he or she cannot make contact with the ball. On a given day, batters who are hitting better than others perceive the ball as bigger (Gray, 2013; Witt & Proffitt, 2005). Similarly, golfers who are playing better than others see the hole as bigger (Kwon & Kim, 2012; Witt et al., 2008). In addition, archers who are shooting better than others see the target as bigger (Y. Lee et al., 2012). In these experiments, the athletes were shown a poster with different sized circles and asked to select the circle that best matched the size of the ball, hole, or target circle. Their selections correlated with their performance. A similar design was used to demonstrate that children who had more success throwing a ball to the target perceived the target as bigger (Cañal-Bruland & van der Kamp, 2009).

The effect of performance on perception is also apparent in perceived speed. Tennis players perceived the ball to be moving slower when they successfully returned the ball, as compared to when they hit the ball out of bounds (Witt & Sugovic, 2010). In this experiment, participants attempted to return a ball, then estimated the time it took for the ball to travel from the feeder machine to when they hit the ball. They estimated shorter travel times, which correspond with perceiving the ball as moving faster, on trials for which they were unsuccessful at returning the ball compared with trials for which they returned the ball successfully. To further examine this effect, Witt and Sugovic (2010) created a virtual tennis game in which participants played a modified version of the classic computer game Pong. On each trial, participants played with a paddle that was small, medium, or big. They attempted to stop a ball that traveled across the screen, then verbally estimated ball speed. Participants judged the speed as faster when they played with the smaller paddle, and thus had less success at blocking the ball, than when they played with the big paddle. Follow-up experiments revealed the same pattern of effects when participants judged the speed of the ball while the ball was still visibly moving, which suggests these effects occur in perception and not in their memory of the ball speed (Witt & Sugovic, 2012). In addition, the same pattern of effects was observed in an implicit action-based measure of perceived speed. In a modified version of the task, participants had to press a single button to send the paddle on a path to block the ball (in the actual experiment, it was called a net and was released in an attempt to catch a fish). When the net was bigger, participants waited longer to release the net, indicative that the fish appeared slower than when the net was small (Witt & Sugovic, 2013a). 
These studies show that across a variety of contexts, the ability to successfully perform an action influences perception of the target.
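The relation between estimated travel time and implied ball speed in the tennis study is simple arithmetic, sketched below. The travel distance and the two time estimates are hypothetical values chosen for illustration; they are not data from Witt and Sugovic (2010).

```python
def implied_speed(distance_m: float, estimated_travel_time_s: float) -> float:
    """Speed implied by a travel-time estimate over a fixed distance."""
    if estimated_travel_time_s <= 0:
        raise ValueError("estimated_travel_time_s must be positive")
    return distance_m / estimated_travel_time_s


DISTANCE_M = 20.0  # hypothetical distance from feeder machine to contact point

# Hypothetical time estimates: shorter on unsuccessful returns.
speed_after_miss = implied_speed(DISTANCE_M, estimated_travel_time_s=0.8)
speed_after_hit = implied_speed(DISTANCE_M, estimated_travel_time_s=1.0)

print(speed_after_miss > speed_after_hit)
```

Because the distance is fixed, a shorter estimated travel time necessarily implies a faster ball, which is why the time estimates can serve as a measure of perceived speed.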

Some of these performance-based action-specific effects could be explained by appealing to multimodal, higher-order variables such as those discussed above. For instance, athletic form and body coordination could be perceived haptically, and a multimodal variable relating athletic form to optically specified size could account for differences in perceived size across performance (cf. Y. Lee et al., 2012). Athletic form is relevant because better form often leads to better performance, so athletes can anticipate better performance as a result of having and perceiving better form.

Another way in which performance-based action-specific effects can be compatible with the theory of direct perception is that differences in skill or performance levels can change the control of information detection. Thus, even if the information in the global array is held constant, differences in the control of the detection of this information could account for differences in perception. For instance, those who are capable of climbing onto tall walls or those who are better at hitting softballs might rely on different higher-order variables than those who are less skilled, possibly as a result of perceptual learning.

The control of information detection

The concept of a global array that includes information about the perceiver’s ability to act may explain many action-specific effects (particularly body- and energy-scaled effects). However, we must also consider cases in which optical information and underlying abilities are constant across perceivers, yet they perceive the environment differently depending on their intention to act or on how well they are performing. For example, after throwing a heavy ball, targets look farther away to participants who intend to throw again compared with participants who are then going to walk to the target (Witt et al., 2004). In this case, the information available in the global array is similar for both groups of participants, so why did one group perceive the target as farther?

Intention determines what information in the global array is relevant for the perceiver. When the perceiver intends to throw, the relevant information is that which relates to throwing. When the perceiver intends to walk, the relevant information is that which relates to walking and not that which relates to throwing. The perceiver needs to select the information relevant to the intended action, and to ignore information that is not relevant. In some cases, this may require that the perceiver learns which variable or variables to attend to—a perceptual learning process termed the education of attention by ecological psychologists including both James Gibson (1966) and Eleanor Gibson (1963, 1969; E. J. Gibson & Pick, 2000), who conducted pioneering research in perceptual learning and development (see also J. J. Gibson & Gibson, 1955; Michaels & Carello, 1981).

From the ecological approach, an intention to act in a particular way serves to guide attention, which is defined as the control of information detection (Arzamarski, Isenhower, Kay, Turvey, & Michaels, 2010; Michaels & Carello, 1981; Turvey et al., 1990). Under ordinary circumstances, an organism may be presented with a wide range of informational variables that suffice to guide behavior within some tolerated level of success or accuracy (i.e., if a behavior must merely be sufficient rather than optimal). The organism is faced with a choice of which information to detect. The choice is constrained somewhat because the intention to act in a certain way makes some of those variables more relevant and useful, and others less relevant and useful (cf. Jacobs & Michaels, 2007), and moreover, the organism’s prior experience shapes the extent to which its attention has been properly educated to the relevant variables. The intention to act in a certain way should therefore be accompanied by a change in information detection so that the relevant information is detected and the irrelevant information is ignored. Intention results in a bias toward some informational variables at the expense of others.

If perception is a function of the information that is detected, as claimed by the ecological approach, then the changes in perceptual reports that result from the intention to perform a particular action in the future could reflect a change in the use of different informational variables. Methods for identifying variable usage and its change over time are available (e.g., Jacobs & Michaels, 2007), as are methods for quantifying the exploratory, information-detecting behaviors in which perceivers differentially engage under changing intentions (e.g., Riley, Wagman, Santana, Carello, & Turvey, 2002; Stoffregen, Yang, & Bardy, 2005; Yu, Bardy, & Stoffregen, 2011). Many studies have provided evidence that the intention to act in a certain way or perceive a certain variable leads to reliance on some informational variables over other, competing variables, and that this intention-guided variable selection process is accompanied by changes in exploratory, information-seeking behaviors (e.g., Arzamarski et al., 2010; Harrison, Hajnal, Lopresti-Goodman, Isenhower, & Kinsella-Shaw, 2011; Jacobs & Michaels, 2006; Michaels & Isenhower, 2011; Riley et al., 2002). Perceptual learning and skill learning also lead to changes in variable use that are consistent with the idea of the “education of attention” toward relevant informational variables (J. J. Gibson, 1966; Michaels & Carello, 1981).

Intention and attention and the resulting information-detection behaviors may also be influenced by the perceiver’s internal states relative to the goal behavior. Internal states such as hunger or fatigue may bias the animal’s selection of information so that when the animal is in one physiological state, it attends to one particular variable, whereas when in another physiological state, it attends to a different variable. This hypothesis suggests an empirical strategy of quantifying the relevant internal states and associating these with the use of different informational variables—a challenging research agenda, but one that is possible, in principle.

Similarly, control of the detection of information could also explain motivation-based effects in perception. Motivation-based effects put an emphasis on emotional states rather than on action potential. For instance, desirable objects look closer than less-desirable objects (Balcetis & Dunning, 2010). Feared objects such as spiders look closer, bigger, and faster (Cole, Balcetis, & Dunning, 2013; Harber, Yeung, & Iacovelli, 2011; Vasey et al., 2012; Witt & Sugovic, 2013c). Effects such as these can be understood within the direct-perception framework by appealing to the idea that various emotional states could lead to differences in the control of information detection. Moreover, interoceptive brain centers are known to play a role in emotional processing (e.g., Critchley, 2005), so these findings could also reflect a reliance on multimodal, higher-order variables.

The effects of recent performance on perceived size (Cañal-Bruland & van der Kamp, 2009; Gray, 2013; Kwon & Kim, 2012; Y. Lee et al., 2012; Witt & Dorsch, 2009; Witt et al., 2008; Witt & Proffitt, 2005; Witt & Sugovic, 2010) might likewise relate to attentional control of the selection of higher-order variables spanning interoceptive and exteroceptive energetic media. Interoception is improved during and following physical exercise (Bestler, Schandry, Weitkunat, & Alt, 1990; Montgomery & Jones, 1984; Schandry & Bestler, 1995; Schandry, Bestler, & Montoya, 1993). This suggests that engaging in physical activity might make the relevant interoceptive variables more salient or otherwise bias attention and the selection of information. A shift in reliance from one optical–interoceptive higher-order variable to a different one could, in principle, result in a change in perceived object size when the optical variables related to size remain unchanged.

Direct action-specific perception: Summary

We propose that if the relation of the animal’s action abilities to the environment is considered to be specified in the global array, and if the animal possesses sensitivity to this information, then this information may drive perception of the spatial layout of the environment. Higher-order patterns of stimulus energy (e.g., patterns that span the optic array and internal chemical or mechanical arrays) may specify relations between the environment and the bodily states of the animal, such as fatigue, energetic capacity, or, in an area yet to be investigated, perhaps even the states of enhanced experience, intense focus, and elevated performance termed flow (e.g., Jackson & Csíkszentmihályi, 1999). These relations can, in principle, be directly perceived, and it may be perception of these relations that is responsible for variations in perceptual reports of environmental properties. Our proposal is speculative, and requires a more detailed understanding of interoception (particularly the lawful specification of internal states; i.e., what constitutes information about factors such as energetic capacity or fatigue) than is currently available, but nevertheless yields testable, falsifiable predictions about manipulations of interoceptive variables, and is supported by the work of White et al. (2013) and White (2012) with regard to energy-scaled influences on perceived distance. We have also proposed testable predictions relating to intention, attention, and the control of information detection. A more fully developed account of the extended global array and the processes by which perceivers extract information from the global array will be useful outcomes of this research regardless of whether or not the specific hypotheses advanced here are supported.

Strengthening action-specific and ecological claims

In this article, rather than provide support for one account over another, we have attempted to reconcile the action-specific account of perception with the ecological approach to perception. Through reconciliation, we believe that the two accounts strengthen each other. An advantage for the action-specific account is that the notion of the extended global array provides a mechanism for these effects. Whereas there has been a plethora of demonstrations of action-specific effects, few mechanisms have been proposed. Here, we propose that action-specific effects are the result of detection of multimodal, higher-order variables within the global array. This proposal is important for the action-specific account because it provides a direction for future research beyond mere demonstrations. In particular, research should focus on identifying specific higher-order variables and determining whether perceivers are sensitive to this information. This kind of investigation will also help make the action-specific account more precise, especially with respect to the kinds of action-related information that are relevant for perception.

An advantage of reconciliation for the ecological approach to perception–action is that findings that appear to challenge direct perception turn out to be consistent with it. Once this apparent challenge is resolved, it can be seen that the action-specific perception account further strengthens one of the main claims of ecological psychologists regarding affordances. Affordances are the possibilities for action in the environment (J. J. Gibson, 1979). For example, a Frisbee affords throwing. The Frisbee also affords holding food for use as a plate, or holding water for a dog. It also affords protecting one’s bum from the wet grass by serving as a seat. Affordances capture the mutual relationship between the environment and a particular perceiver. A Frisbee affords throwing and catching to a human but only catching to a dog.

According to J. J. Gibson (1979), affordances are the primary objects of perception, and measures such as metric distance are “a far extreme from direct perception of the affordance dimensions of the environment. Nevertheless, they are both cut from the same cloth” (p. 260). Ecological researchers typically measure prospective judgments of whether an action is possible (e.g., Warren, 1984) or code behaviors to deduce perceived affordances such as whether a perceiver made an attempt to perform an action or refused to try (e.g., Adolph et al., 1993). In studies on affordance perception, ecological researchers rarely ask participants to report their perceptual experience in terms of seemingly action-neutral spatial dimensions such as distance or size, per se. And when such questions are asked, they are typically assessed using behaviors such as walking and reaching as opposed to verbal reports or visual matching tasks because action-based measures are thought to be in tune with action-based calibrations (e.g., Bingham & Pagano, 1998; Pan et al., 2014; Rieser et al., 1995).

However, the action-specific perception account reveals that affordances are apparent even in reports about these seemingly action-neutral dimensions of the environment. Distance to reachable objects is perceived in terms of whether the object affords reaching, or how efficiently the object could be reached (Davoli et al., 2012; Kirsch, Herbort, Butz, & Kunde, 2012; Kirsch & Kunde, 2013; Linkenauger, Witt, Stefanucci, Bakdash, & Proffitt, 2009a, b; Morgado, Gentaz, Guinet, Osiurak, & Palluel-Germain, 2013; Osiurak et al., 2012; Witt, 2011b; Witt & Proffitt, 2008; Witt et al., 2005). Size of graspable objects is perceived in terms of their affordance for grasping (Linkenauger et al., 2013; Linkenauger et al., 2010; Linkenauger, Witt, & Proffitt, 2011). Size of hittable targets is perceived in terms of their affordance for hitting (Cañal-Bruland & van der Kamp, 2009; Gray, 2013; Y. Lee et al., 2012; Witt & Proffitt, 2005). Distance to walkable objects is perceived in terms of their affordance for walking (Bian & Andersen, 2013; Harrison & Turvey, 2009; Proffitt et al., 2003; Stefanucci, Proffitt, Banton, & Epstein, 2005; Sugovic & Witt, 2013; Witt et al., 2004, 2010), and distance across gaps is perceived in terms of their affordance for jumping (Lessard et al., 2009) or falling (Jiang & Mark, 1994; but see also Mark, Jiang, King, & Paasche, 1999). Speed of blockable objects is perceived in terms of their affordance for blocking (Witt & Sugovic, 2010, 2012, 2013a, 2013b, 2013c). Height of jumpable walls is perceived in terms of their affordance for jumping (Taylor et al., 2011). Thus the perceived dimensions of the environment are a function of its affordances or, stated differently, of the perceiver’s effectivities.

Y. Lee and colleagues (2012) hypothesized that reports of perceived size of target circles in an archery task might be conceptualized as “elliptical reports of affordances” (p. 1130). That is, in studies on action-related perception that involve reports of perceived environmental dimensions, participants’ reports reflect their capacity to act with respect to those environmental dimensions. In Y. Lee et al.’s study, participants might have tacitly reported their sense of the targets’ hittableness rather than absolute size of the target circles, per se. Hittableness takes into account the size of the target relative to the skill of the archer and the stability of gaze and aim during the activity. Y. Lee et al. found that when the archer’s arm was externally stabilized (thus enhancing target hittableness), participants reported the target sizes as larger. This finding invites additional research that measures and quantifies perceiver activity and relates the degree of coordination and control of activity to perceptual reports of action-relevant environmental properties. Given that what the environment affords a perceiver is partly determined by the coordinative states of the perceiver’s body, it is important to elucidate how changes in these states, whether on a moment-to-moment basis or over the course of long-term training in a specific activity (cf. Fajen, Riley, & Turvey, 2009), influence a perceiver’s sensitivity to action-related environmental properties.

Interest in boundaries separating afforded from nonafforded actions has been a primary emphasis in research on affordances by ecological psychologists. Less research has focused on the efficiency or optimality of afforded actions in terms of energy expenditure or other action-relevant variables (however, see Warren, 1984, Exps. 2 and 3; see also Bingham, Schmidt, & Rosenblum, 1989; Gardner, Mark, Ward, & Edkins, 2001; Jayne & Riley, 2007; Konczak, Meeuwsen, & Cress, 1992; Wagman & Malek, 2009; Zhu & Bingham, 2009). This relatively underdeveloped aspect of the theory of affordances is another way in which action-specific findings can help advance ecological research and theory.

Summary

In summary, we began with the dilemma that although the action-specific account of perception seems to emphasize the role of action in perception in a way similar to J. J. Gibson’s (1979) ecological approach to perception–action, the two approaches appeared to differ critically regarding whether perception is direct. According to the theory of direct perception, the same information should always give rise to the same perceptual experience, whereas the action-specific account demonstrated that the world can look different depending on the perceiver’s ability to act. However, a law-based, direct-perception theory can hold, in principle, if “visual” perception is based on the active selection of information from the extended global array, which we propose spans not only the familiar energy arrays associated with exteroception and proprioception (e.g., optical, acoustic, and mechanical), but potentially also other patterned arrays that specify internal states such as fatigue (i.e., an “inner” Gibsonian array). Higher-order variables within the global array could be a mechanism underlying action-specific effects. Additionally, the action-specific perception account further extends the claims of the ecological account by revealing the penetration of affordances in seemingly abstract dimensions of the environment and by emphasizing the energetic aspects of perception–action.