Elsevier

Journal of Memory and Language

Volume 92, February 2017, Pages 1-13
Is the Levels of Processing effect language-limited?

https://doi.org/10.1016/j.jml.2016.05.001

Highlights

  • The levels of processing effect is largely based on words.

  • We demonstrate consistent but modest effects using visual stimuli.

  • In contrast, verbal effects vary widely in magnitude.

  • We apply Gibson’s concept of affordance to both visual and verbal stimuli.

  • This results in a levels of processing framework that is applicable to both.

Abstract

The concept of Levels of Processing (LOP), proposing that deep coding enhances retention, has played a central role in the study of episodic memory. Evidence has however been based almost entirely on retention of individual words. Across five experiments, we compare LOP effects between visual and verbal stimuli, using judgments of pleasantness as a method of inducing deep encoding and a range of shallow encoding judgments selected so as to be applicable to both verbal and visual stimuli. LOP effects were consistent but modest across the visual stimuli (mean effect size 0.5). In contrast, LOP effects for verbal stimuli varied widely, from modest for people’s names and unfamiliar animals (mean effect size 0.6) to large for familiar animals and household items (mean effect size 1.4), typical of the dramatic LOP effects that characterize the existing verbal literature. We interpret our data through the Gibsonian concept of “affordance”, proposing that visual and verbal stimuli vary in the number and richness of features they afford, and that access to such features will in turn depend on encoding strategy. Our hypothesis links readily with Nairne’s feature model of long-term memory.

Introduction

Craik and Lockhart (1972) proposed that memory is a by-product of processing, the deeper the processing the better the retention. Their paper is one of the most highly cited in the history of cognitive psychology (Roediger & Gallo, 2001), and “one of the most influential systematic conceptual frameworks within which problems of memory can be raised and investigated” (Tulving, 2001, p. 24). While the assumption of a series of levels leading from perceptual to semantic was subsequently abandoned (Craik & Tulving, 1975), Levels of Processing (LOP) has continued to serve as a broad theoretical framework, accounting for a wide range of data within the field of human memory and potentially providing a fruitful basis for further investigation (Conway, 2002). Furthermore, the principle underlying the levels approach is of considerable practical relevance, providing an important and valuable means of improving learning, in contrast to the common tendency for learners to rely on rote rehearsal.

On the other hand, despite many replications and the magnitude of the effects shown (a series of studies by Hyde and Jenkins (1969) and Walsh and Jenkins (1973) yielded an average effect size based on Cohen’s d of 2.27), the use of the framework to broaden our knowledge of human memory has been somewhat limited. One exception to this comparative lack of development comes from the demonstration by Tulving and Thomson (1973) of the importance of the match between encoding and retrieval in determining memory performance. This point was further developed with the introduction of the concept of Transfer Appropriate Processing (TAP), as proposed by Morris, Bransford, and Franks (1977). They showed that shallow phonological coding led to better performance than deeper semantic coding when rhyming words were used as retrieval cues for the items to be recalled, again demonstrating that memory performance depends crucially on conditions at retrieval, as well as at encoding. The concept of TAP is an important reminder that retrieval needs to be considered, but leaves open the question of how to determine transfer appropriateness.
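As a rough illustration of the effect-size measure mentioned above, the following minimal sketch computes Cohen's d for two independent groups using the pooled standard deviation. The function name `cohens_d` and the scores are hypothetical and invented for illustration; the source does not state which variant of d the cited studies used.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(deep, shallow):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n1, n2 = len(deep), len(shallow)
    s1, s2 = stdev(deep), stdev(shallow)  # sample standard deviations (n - 1)
    pooled = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(deep) - mean(shallow)) / pooled


# Hypothetical recall scores, invented purely for illustration.
deep_scores = [16, 18, 15, 17, 19, 16]
shallow_scores = [11, 13, 10, 12, 14, 12]
print(round(cohens_d(deep_scores, shallow_scores), 2))
```

On this scale, a value of 2.27 means that the deep and shallow group means lie more than two pooled standard deviations apart.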

In an attempt to develop the concept of TAP, Roediger (Roediger & Blaxton, 1987; Roediger et al., 1989) proposed to link it to the distinction between explicit episodic memory and more automatic implicit memory. Most explicit memory tasks involve processing in terms of meaning, hence benefiting from deeper encoding, while implicit tasks tend to be perceptually based, depending more on the exact replication of shallower encoding cues. However, although there were many examples in the literature that fitted this pattern, it is not always possible to make a clear distinction between perceptual data-driven levels of analysis and analysis at a more conceptual or semantic level. Roediger, Srinivas, and Weldon (1989) proposed that any given situation could have components involving both levels of analysis, which might or might not trade off against each other. While plausible, this compounds the problem of measuring transfer appropriateness. Furthermore, data began to appear suggesting that dissociations occurred within the proposed perceptual and conceptual paradigms (Hunt & Toth, 1990), presenting further difficulties in using TAP as a way of developing the original LOP approach, and leading Roediger (2002, p. 321) to conclude “we suggest that the field in general has not yet been able to develop an adequate characterization of procedures that account for memory phenomena despite efforts in this direction”.

One important question to be asked of any theoretical framework concerns its breadth of application. As Roediger and Gallo (2001, p. 42) observe, LOP can be regarded as “a special case of transfer-appropriate processing that applies to memory for words in meaning-based tests”. However, although language is clearly important, it is only part of our capacity to experience and remember the world, suggesting a need for LOP studies of non-verbal memory. We describe a series of experiments that began with the question of whether reliable LOP effects could be demonstrated using visual material. As relatively little is known, we adopted an exploratory approach of comparing LOP effects for a range of visual and verbal materials. Our results show that different types of visual materials all yield modest LOP effects, whereas verbal materials give a wider range, such that the dramatic advantage to deep encoding typically found depends crucially on the nature of the material. These findings led us to propose a modified explanation of LOP effects that takes into account the “affordances” of a stimulus (Gibson, 1977) and applies to both verbal and non-verbal material.

An early critique of the LOP concept (Baddeley, 1978) noted the lack of evidence for LOP effects using visual stimuli. Although subsequent research on LOP has also been dominated by use of verbal stimuli, a number of studies have been performed across a range of other modalities, though largely using implicit memory measures, for which LOP effects were, unsurprisingly, found not to apply (Graf & Mandler, 1984; Jacoby & Dallas, 1981). There appears to be very little investigation of the LOP effect in studies of explicit episodic memory using nonverbal stimuli. Some exceptions to this generalization do, however, occur.

In the case of music, Halpern and Bartlett (2010) comment on a paucity of LOP studies in the literature, reporting only one positive result: Peretz, Gaudreau, and Bonnel (1998) found that judgments of the familiarity of a tune led to better subsequent recognition than judging the instrument playing the tune. Halpern and Bartlett add, however, that “the current authors failed to find LOP effects for unfamiliar music on numerous occasions (some published, some languishing in bottom drawers)” (Halpern & Bartlett, 2010, p. 234).

Attempts have also been made to study LOP effects in olfactory memory. Lyman and McDaniel (1986) varied encoding instructions in a study involving recognition of 30 odors after a 1 week delay. No difference in hit rate was found, but an advantage on a d′ measure suggested that attempting to name and define each odor or linking it to a life episode led to better performance than forming a visual image or simply trying to memorize each stimulus. A subsequent replication by Zucco (2003) again found a significant effect for d′ but not hit rate, with only the life episode condition showing a significant advantage. These results suggest a modest overall effect of deeper processing, operating mainly through reducing false alarm rate, far from the robust effects typical of verbal material.
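The pattern described above, equal hit rates but a d′ advantage, can be sketched with the standard signal-detection formula d′ = z(hits) − z(false alarms). This is a minimal illustration only: the function name `d_prime` and the rates are hypothetical and are not taken from Lyman and McDaniel (1986) or Zucco (2003).

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; inv_cdf raises an error otherwise.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rates: identical hit rates, different false-alarm rates.
deeper_encoding = d_prime(0.80, 0.10)
shallower_encoding = d_prime(0.80, 0.30)
print(deeper_encoding > shallower_encoding)
```

With the hit rate held constant, lowering the false-alarm rate is enough to raise d′, which is how an effect can appear on d′ while being absent from hit rate alone.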

There have been rather more attempts to detect LOP effects in visual memory, reflected largely in studies of memory for faces. Warrington and Ackroyd (1975) report better face recognition following pleasantness judgments than from estimation of the person’s height, a somewhat challenging task from a portrait photograph. A much easier “shallow” task was used by Bower and Karlin (1974), judging the sex of the person portrayed. This proved less effective in facilitating subsequent recognition than did judgments of likeableness or honesty. This could, however, simply reflect the need to scan the face more intently in order to make these “deeper” judgments, as proposed by Winograd (1981), who found that an instruction to identify the most distinctive facial feature of a given face was more effective than the apparently deeper task of making a personality judgment. On the other hand, a study by Patterson and Baddeley (1977), which compared categorization on physical dimensions such as nose size and thickness of lips, found these to be slightly less effective than judgments of pleasantness or intelligence. An attempt to increase depth of processing by providing a semantic context for each face, adding a description of the unfamiliar person’s occupation, background and habits, however, proved ineffectual (Baddeley, 1982; Baddeley & Woodhead, 1982). An attempt to maximize TAP by presenting the contextual information at both encoding and recognition did increase rate of detection, but this proved to be entirely attributable to inducing a positive response bias (Baddeley & Woodhead, 1982), with participants also more likely to erroneously say yes to a novel face if it was accompanied by a previously presented description. Once again, therefore, although it would be unwise to rule out the possibility of an LOP effect for faces, any such effects are clearly far weaker than those routinely found for verbal materials.

It could be argued, of course, that despite the obvious importance of faces, they are a rather special form of visual stimulus, with their own specific anatomical processing area (Kanwisher, McDermott, & Chun, 1997), possibly also associated with a relatively automatic link to emotional coding (Öhman, 2009). For that reason, it is important to extend the study of LOP effects in visual memory to other stimuli. Unfortunately, the small number of studies that have attempted this previously have used different methods and given contradictory results, with D’Agostino, O’Neill, and Paivio (1977) finding a positive effect using readily nameable line drawings, while Intraub and Nicklos (1985) found a negative effect for some of their cued recall conditions, suggesting the need for a more systematic approach.

The starting point for our investigation was the observation that any study of the role of LOP must deal with three variables: the nature of the initial encoding (deep versus shallow); the nature of the retrieval test, bearing in mind the importance of TAP; and the characteristics of the material to be remembered. Neither the method of encoding nor the range of materials involves a simple binary choice, hence the range of possible experiments becomes very large indeed. For that reason we fixed our deep encoding method, basing it on judgments of pleasantness, and always used a four-alternative forced-choice recognition retrieval measure. Holding constant the method of ensuring deep encoding and the testing procedure then allowed us to manipulate the variable central to our enquiry, the nature of the material, allowing comparison between visual and verbal memory and, importantly, of variations in material within each modality. This approach raises a number of further issues, which will be discussed next.

The first concerns our selection of judgments of pleasantness as our deep encoding procedure. We did this because we needed a semantic judgment that is readily applicable to a wide range of materials. In his attempt to develop a measure of meaning that extended beyond verbal material, Osgood developed a complex rating scale, the semantic differential, which factor analysis suggested yielded three factors, of which the strongest was consistently the hedonically evaluative good–bad dimension (Osgood, May, & Miron, 1975). In the case of words, encoding on this dimension has been shown to produce a particularly powerful LOP effect (Hyde & Jenkins, 1969), and indeed Packman and Battig (1978) found that judgments of pleasantness were substantially more effective than other “deep” judgments such as concreteness or meaningfulness. Furthermore, the widespread use of pleasantness judgments in clinical assessments, such as the Warrington (1984) recognition test involving words and faces, reflects the fact that it is a task that participants find natural and relatively easy to use for both verbal and visual stimuli.

Choosing shallow encoding tasks is less straightforward given that they need to be applicable to both visual and verbal material and to ensure that participants process the stimuli at the required level. Finally, to avoid the risk of basing our conclusions on a single atypical task, we use a range of different “shallow” processing instructions. Our earlier research concerned with developing a clinical test of visual memory opted to use door scenes as they are familiar, allowing a range of degrees of similarity and resulting difficulty. Two lists of 12 doors tested using four-alternative forced choice proved both sensitive to memory deficit and patient friendly (Baddeley, Emslie, & Nimmo-Smith, 1994).

Photographing doors subsequently proved addictive to A.B., resulting in a database of over 2000 visual stimuli. In order to increase their experimental usability, we classified each item along a range of dimensions, thus making it relatively easy to select sets of differing levels of inter-item similarity (Baddeley, Hitch, Quinlan, Bowes, & Stone, in press). In addition to our having a very large, readily available set, doors have the advantage that, unlike faces, they almost certainly do not have a specific brain area devoted to their processing and are unlikely to have atypically strong links to emotional and social processing (Öhman, 2009).

Having established in pilot work that a LOP effect can be obtained using door scenes, we continued to include door stimuli as a baseline against which other types of visual and verbal stimulus material could be compared. This led to the question of what other types of material to use. In this essentially exploratory study, rather than setting up and testing precise hypotheses, we used pragmatic constraints to select our material. We opted for lists of 24–30 items per condition; choosing four-choice recognition rather than two-alternative or yes/no recognition reduced baseline guessing, obviating the need for longer lists. We wanted to maintain certain characteristics of our doors test, namely that the items should come from a single broad semantic category, and that there should be sufficient similarity between items to allow a level of recognition approximately equal across materials. It is worth noting at this point that simply selecting visual recognition items from a wide range of categories, with distractors chosen at random, tends to lead to levels of performance of 90% or more, even with very long lists (Brady et al., 2008; Konkle et al., 2010a; Nickerson, 1965; Standing et al., 1970). The experiments that follow reflect these constraints.
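The point about baseline guessing can be made concrete with a small sketch. It assumes pure random guessing among equally likely alternatives and uses the conventional correction-for-guessing formula; neither the helper names nor the formula come from the article itself.

```python
def chance_level(n_alternatives):
    """Proportion correct expected from random guessing among equally likely options."""
    return 1.0 / n_alternatives


def corrected_for_guessing(p_correct, n_alternatives):
    """Rescale proportion correct so that chance maps to 0 and perfect performance to 1."""
    g = chance_level(n_alternatives)
    return (p_correct - g) / (1.0 - g)


# Guessing baseline for two-alternative versus four-alternative forced choice.
print(chance_level(2), chance_level(4))
```

Under these assumptions, four-alternative forced choice leaves a wider range between chance and ceiling than a two-alternative test, so each item is less contaminated by lucky guesses.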

Experiment 1 therefore compares recognition memory for door scenes or concrete words processed either “deeply” in terms of pleasantness, or more shallowly in terms of stimulus color. Experiment 2 attempts to replicate this with different sets of stimuli and a different shallow processing task, while Experiments 3a, b and c explore the generality of our initial findings by extending them to a broader range of visual and verbal materials, using the method of converging operations to determine which aspects of the material are crucial.

Section snippets

Design and procedure

A 2 × 2 within-participants design combined two types of material, doors and words, and two types of encoding instruction, involving judgments of pleasantness and color. All participants were tested on each of the four conditions in counterbalanced order. A total of 20 student volunteers were tested. They and all participants in the remaining studies were

Design

A total of 24 participants were each tested on three types of stimuli (doors, names and occupations), in each case processed at two levels, shallow and deep. Each encoded list was followed by an immediate four-alternative forced-choice test. All participants completed all six conditions, half beginning with the shallow condition and half with deep. The order of stimulus presentation within each encoding condition was counterbalanced using a 3 × 3 Latin square. Half began with the three deep
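For readers unfamiliar with Latin-square counterbalancing, a 3 × 3 square for the three stimulus types named above can be sketched as follows. This is a cyclic construction chosen for illustration; the source does not say which particular square was used, and the function name `latin_square` is hypothetical.

```python
def latin_square(items):
    """Cyclic Latin square: each item appears once in every row and every column."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


# Each row is one presentation order, to be assigned to a subgroup of participants.
orders = latin_square(["doors", "names", "occupations"])
for order in orders:
    print(order)
```

Across the three orders, every stimulus type occupies each serial position exactly once, so position effects are balanced across materials.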

Experiment 3a

The materials selected were as follows:

  • (1) The 240 doors used in Experiment 2.

  • (2) A total of 240 clocks. These were selected from the internet using Google Search under five subcategories: circle clocks, square clocks, pendulum clocks, alarm clocks, and street clocks. All words on the pictures were removed using Adobe Photoshop CS 2.256.

  • (3) A total of 240 verbal items came from food menus, again selected from the internet using search terms: Chinese food menu, English food menu, Japanese food menu, Dessert

Experiment 3b

The overall design is identical to 3a with the exception that different materials were used, and different shallow processing judgments required. In this case the shallow judgment was to report the dominant color of each stimulus.

Experiment 3c

This used the same overall design as 3a and 3b, using a pleasantness judgment for deep processing, this time compared with a shallow judgment of whether the stimulus had one dominant color or was multi-colored. Three types of material were used, one comprising the same 30 doors, a second stimulus set comprising scenes as used by Konkle, Brady, Alvarez, and Oliva (2010b) at http://cvcl.mit.edu/MM/sceneCategories.html (Appendix). We used 240 scene images from 10 categories (streams, libraries,

General discussion

We will begin by summarizing our results before going on to suggest an interpretation. This will then be applied to the studies of the effect of LOP on verbal and nonverbal material more generally, as summarized in the introduction, before concluding with a discussion of the potential significance of our results for other recent studies of visual LTM.

We set out with a broad question: is the positive effect of deep processing limited to language-based materials? We compared the effect of

Author note

We are grateful to Philip Quinlan for his help and advice; to Stephanie Chung, Po Fu, Stephanie Motley, Susan Oei, Stephen Rhodes, Natalie Whitehead and Suet Wong for their contribution to the development of material and to testing; and to Fergus Craik and James Nairne for their constructive comments on an earlier draft.

References (54)

  • G.H. Bower et al. Depth of processing pictures of faces and recognition memory. Journal of Experimental Psychology (1974)

  • T.F. Brady et al. Visual long-term memory has a massive storage capacity for object details. Proceedings of the National Academy of Sciences of the United States of America (2008)

  • J.D. Bransford et al. Some general constraints on learning and memory research

  • M.A. Conway. Sensory-perceptual episodic memory and its context: Autobiographical memory

  • F.I.M. Craik. Levels of processing: Past, present … and future? Memory (2002)

  • F.I.M. Craik et al. Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology: General (1975)

  • G.S. Cree et al. Analyzing the factors underlying the structure and computation of the meaning of chipmunk, cherry, chisel, cheese, and cello (and many other such concrete nouns). Journal of Experimental Psychology: General (2003)

  • P.R. D’Agostino et al. Memory for pictures and words as a function of level of processing: Depth or dual coding? Memory & Cognition (1977)

  • A.D. De Groot. Thought and choice in chess (1965)

  • J.J. Gibson. The theory of affordances

  • A.R. Halpern et al. Memory for melodies

  • R.R. Hunt et al. Perceptual identification, fragment completion, and free recall: Concepts and data. Journal of Experimental Psychology: Learning, Memory, and Cognition (1990)

  • T.S. Hyde et al. Differential effects of incidental tasks on the organization of recall of a list of highly associated words. Journal of Experimental Psychology (1969)

  • H. Intraub et al. Levels of processing and picture memory: The physical superiority effect. Journal of Experimental Psychology: Learning, Memory, and Cognition (1985)

  • L.L. Jacoby et al. On the relationship between autobiographical memory and perceptual learning. Journal of Experimental Psychology: General (1981)

  • N. Kanwisher et al. The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience (1997)

  • T. Konkle et al. Conceptual distinctiveness supports detailed visual long-term memory for real-world objects. Journal of Experimental Psychology: General (2010)