Although there is little doubt that literacy has profoundly transformed human societies, we are only beginning to understand its consequences for cognition. Because cognitive models are usually developed from the performance of schooled Western children and highly educated literate adults such as undergraduate students, our knowledge of cognition actually reflects what occurs in brains profoundly shaped by formal education and literacy, and hence cannot be generalized to all human beings (e.g., Henrich, Heine, & Norenzayan, 2010). However, as pointed out by Maryanne Wolf (2007, p. 3), “we were never born to read. Human beings invented reading only a few thousand years ago. And with this invention, we rearranged the very organization of our brain, which in turn expanded the ways we were able to think, which altered the intellectual evolution of our species.” For a better understanding of both cognitive development and reading disabilities, it is thus crucial to know more precisely to what extent and how exactly learning to read shapes cognition.

In this paper, we argue for a significant role of literacy in the development of working memory (WM), the set of mental processes holding limited information in a temporarily accessible state in service of cognition (Cowan et al., 2005). We will focus on literacy as the ability to read and write, namely as “the ensemble of representations and processes that an individual acquires as an obligatory and direct consequence of learning to read and write” (Morais & Kolinsky, 2005, p. 188),Footnote 1 and will argue that its acquisition modifies the functioning of and representations used in verbal WM.

The idea that literacy shapes immediate memory is not new. In fact, 25 years ago, Nick Ellis (1990) proposed that, as reading makes extensive use of WM and involves naming visually presented items, articulatory rehearsal, and ordered retention of information, it may in turn enhance memory skills. Under this view, the first decoding activities, due to their sequential nature, provide opportunities for training memory skills and strategies used to refresh the decaying phonological representations, such as cumulative rehearsal, the process of repeating to oneself (either overtly or covertly) the material to be remembered in serial order. However, to the best of our knowledge, no recent work has reviewed the relationship between reading acquisition and WM, looking at both sides of their association. Indeed, the vast majority of studies on this issue (e.g., Alloway & Alloway, 2010; Gathercole, Alloway, Willis, & Adams, 2006; Nevo & Bar-Kochva, 2015; Nevo & Breznitz, 2011, 2013) were unidirectional, predicting reading abilities from earlier measures of memory without considering the converse relationship, that is, the potential (direct or indirect) contribution of progress in reading to the development of processes and representations used in verbal WM.

A feedback effect of literacy acquisition on verbal WM would be consistent with the fact that learning to read and write leads to deep cognitive and brain changes in the processing of spoken language (for reviews, see Dehaene, Cohen, Morais, & Kolinsky, 2015; Huettig & Mishra, 2014; Kolinsky, 2015): It boosts the acquisition of a system of explicit representations of speech, establishes interconnections between phonological and orthographic representations, and may even change the nature of phonological representations. As such, one might expect literacy acquisition to also shape the encoding of verbal stimuli in memory and/or their retrieval.

In the present paper, after a brief review of some theoretical models of WM and a description of the reading acquisition process, we gather the relevant data suggesting that learning to read influences verbal WM and discuss the various, non-mutually exclusive, mechanisms underlying these literacy effects.

The working memory system

Brief theoretical overview

Although cognitive psychologists agree that the processes attributed to WM are fundamental in human cognition, there are still debates regarding how WM is limited and how it operates. For most theorists in the field, the WM system includes multiple interacting components. The most influential multi-component model comes from Baddeley and Hitch (1974) who proposed that WM comprises an attentionally limited control system, the central executive, and two slave systems, the phonological loop, holding verbal and acoustic information, and the visuospatial sketchpad, holding visuospatial information. The phonological loop is assumed to include a temporary phonological store in which auditory memory traces decay over a period of a few seconds, unless restored by articulatory rehearsal. In 2000, Baddeley added a fourth component controlled by the central executive: the episodic buffer, a limited and temporary interface between the two slave systems and episodic long-term memory, capable of integrating information from a variety of sources into coherent episodes.

The distinction between the phonological loop and the central executive system parallels the one often used between verbal short-term memory (STM) and WM: The former refers to the simple temporary storage of information, whereas the latter implies a combination of storage and manipulation (Baddeley, 2012; Swanson, Zheng, & Jerman, 2009). Verbal STM is frequently assessed by immediate serial recall tasks, in which participants listen to (or see) a sequence of items and have to recall them in their order of presentation (as in the forward digit span and word span), free recall, so named because participants are free to recall the presented string of items in any order, and nonword repetition, in which participants listen to and have to repeat pronounceable nonsense words of increasing length. Verbal WM capacities are assessed through a wide range of tasks, from the backward digit span, that involves recalling lists of digits in the reverse order of presentation, to more complex span tasks, such as the counting span (Case, Kurland, & Goldberg, 1982), involving counting shapes while remembering the count totals for later recall, the operation span (Turner & Engle, 1989), requiring solving mathematical operations while trying to remember words, the reading span (Daneman & Carpenter, 1980), requiring reading series of unconnected sentences and verifying their logical accuracy while trying to remember words (one for each sentence presented) for later recall, or its auditory variant, the listening span. For the sake of clarity, we will refer to these tasks as verbal STM and WM tasks.
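The difference between the scoring rules of these tasks can be made concrete with a minimal sketch. The function names and the example list below are illustrative assumptions, not taken from any of the cited studies, and actual span procedures add further rules (e.g., stopping criteria and list-length increments) that are omitted here.

```python
# Hypothetical scoring helpers; names and data are illustrative only.

def serial_recall_score(presented, recalled):
    """Count items recalled in their original serial position."""
    return sum(1 for p, r in zip(presented, recalled) if p == r)

def free_recall_score(presented, recalled):
    """Count distinct presented items recalled, regardless of order."""
    return len(set(presented) & set(recalled))

def backward_span_score(presented, recalled):
    """Score recall against the reversed presentation order."""
    return serial_recall_score(list(reversed(presented)), recalled)

presented = ["7", "2", "9", "4"]
recalled = ["7", "9", "2", "4"]  # two middle items transposed

print(serial_recall_score(presented, recalled))  # 2
print(free_recall_score(presented, recalled))    # 4
```

In this example, the same response is fully correct under free recall scoring but only half correct under serial recall scoring, which illustrates why the two task types weight order information differently.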

It is, however, worth noting that these measurements of the immediate memory constructs are imprecise and multidetermined, as they might reflect the contributions of speech perception, articulatory rehearsal, speech production, coding strategies, quality and type of storage, processing skills, attentional processes, etc. (see discussions in e.g., Coady & Evans, 2008; Conway et al., 2005; Savage, Lavers, & Pillay, 2007). Furthermore, several authors reject the idea that STM and WM are separate constructs and argue instead that all tasks largely tap the same processes, albeit to varying degrees (e.g., Cowan, 1999; Majerus, Heiligenstein, Gautherot, Poncelet, & Van der Linden, 2009; Unsworth & Engle, 2007). Sharing this view, in this review we use the term verbal WM in a broad sense, which, following the definition of Cowan et al. (2005), is not restricted to WM tasks, but also concerns performance in STM tasks.

The temporary activation of long-term language representations

In addition to integrating verbal WM and STM, several models of immediate memory extend this system to include long-term memory (LTM) representations as well. For instance, Cowan (1999) proposed that immediate memory comes from a hierarchy of embedded processes and representations comprising LTM, the subset of LTM representations currently activated, and the subset of activated memory that is the focus of attention and awareness, which is itself controlled jointly by voluntary processes (a central executive system) and involuntary processes (automatic recruitment of attention by e.g., physically changed stimuli).

In fact, Baddeley (2012) himself acknowledged that the phonological loop is likely to depend on phonological and lexical representations within LTM. Indeed, several types of evidence in the language domain support the contribution of long-term knowledge to performance in immediate memory tasks. Research on aphasic patients (e.g., Martin & Saffran, 1997) has shown that qualitative aspects of verbal (auditory) span performance (effects of item characteristics, such as frequency and imageability, and serial position) directly relate to the extent to which semantic and/or phonological processing is compromised in those patients. In healthy children and adults, performance in immediate serial recall is also closely related to LTM representations, as reflected by better recall of words over nonwords (or lexicality effect, e.g., Hulme, Maughan, & Brown, 1991; Saint-Aubin & Poirier, 2000), of words of high lexical frequency over words of lower lexical frequency (e.g., Roodenrys, Hulme, Alban, Ellis, & Brown, 1994), and of concrete over abstract words (e.g., Allen & Hulme, 2006; Caza & Belleville, 1999). Sublexical representations also influence performance: Nonwords of high phonotactic frequencyFootnote 2 are better recalled than nonwords of low phonotactic frequency (e.g., Gathercole, Frankish, Pickering, & Peaker, 1999; Majerus, Van der Linden, Mulder, Meulemans, & Peters, 2004). It has been proposed that these LTM contributions to immediate memory arise from redintegration processes wherein the partially decayed memory traces are reconstructed by sampling appropriate candidates from LTM (e.g., Hulme et al., 1997; Hulme, Newton, Cowan, Stuart, & Brown, 1999; Nairne, 2002; Schweickert, 1993). 
Under this view, the short-term phonological record itself is independent of lexical and semantic knowledge: LTM representations support recall only at the moment of retrieval by “cleaning up” degraded phonological traces in memory through a process that matches partial traces and performs pattern completion (Miller & Roodenrys, 2009). Lexical effects would occur because nonwords, contrary to words, have no matching lexical representations in the language networks. Redintegration processes would also be easier for high frequency words, as their lexical representations can be retrieved more quickly (e.g., Saint-Aubin & Poirier, 2005) and for concrete words with high imageability, as they have more unique and stable semantic representations (e.g., Walker & Hulme, 1999). An alternative account (e.g., Majerus, 2009; Martin & Saffran, 1997) considers that the activation of language representations occurs, not only at retrieval, but also at encoding: Verbal immediate memory would be directly supported by the temporary activation of language representations. However, as emphasized by Majerus (2009), verbal immediate memory cannot be reduced to the temporary activation of language representations. Indeed, most verbal memory tasks involve not only the recall of several verbal items, but also the recall of their sequential order of presentation.
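The redintegration account described above can be illustrated with a toy pattern-completion sketch. This is not the implementation of any cited model: the lexicon, the use of letters as stand-ins for phonemes, the "?" symbol for a decayed segment, and the function name are all assumptions made for illustration.

```python
# Toy redintegration: a degraded trace is "cleaned up" only if exactly one
# long-term lexical entry is consistent with its surviving segments.
# Letters stand in for phonemes; "?" marks a decayed segment (assumptions).

def redintegrate(trace, lexicon):
    """Return the unique lexical match for a degraded trace, else the trace."""
    candidates = [
        word for word in lexicon
        if len(word) == len(trace)
        and all(t in ("?", c) for t, c in zip(trace, word))
    ]
    return candidates[0] if len(candidates) == 1 else trace

lexicon = ["table", "cable", "horse"]

print(redintegrate("ta?le", lexicon))  # table  (word: one matching entry)
print(redintegrate("zo?se", lexicon))  # zo?se  (nonword: no matching entry)
print(redintegrate("?able", lexicon))  # ?able  (ambiguous: two candidates)
```

The second call mirrors the lexicality effect as this account explains it: a degraded nonword has no lexical entry to support its reconstruction, so the trace stays degraded.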

The distinction between memory for order and memory for item

Experimental, neuropsychological, and neuroimaging data indicate that the maintenance of the identity of the items and of their serial order relies on distinct capacities and neuro-anatomic substrates. More specifically, the to-be-remembered items would be stored via temporary activation of the language network in the temporal lobes, whereas a more specialized system in the right fronto-parieto-cerebellar areas would deal with their order (Majerus et al., 2010). Additionally, some domain-general attentional capacities associated with the left intraparietal sulcus seem to be involved in both item and order memory (Majerus et al., 2010). Under this view, verbal immediate memory would be an emergent function resulting from synchronized and flexible recruitment of the dorsal and ventral language networks, as well as of the fronto-parietal attentional and serial order processing systems.

Several computational models of verbal immediate memory also distinguish memory for the identity versus serial order of the items (e.g., Brown, Preece, & Hulme, 2000; Burgess & Hitch, 1999; Gupta, 2003). For instance, Burgess and Hitch (1999) have presented a simplified neural network containing representations of three types of information: lexical items, phonemes, and a context/timing signal. According to this model, item information is directly coded within the language network via associations (nodes) between lexical and phonemic representations. These nodes are connected to a context/timing system whose state changes over time and which is responsible for coding serial order information.
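The general idea of coding order through a changing context signal can be sketched as follows. This is a deliberately reduced toy in the spirit of such models, not the Burgess and Hitch (1999) network itself: it has no phoneme layer, no decay, and no noise, and the sliding-window context, the window size, and all function names are assumptions made for illustration. Items are assumed to be distinct.

```python
# Toy context-signal model of serial order (illustrative assumptions only).
# The context is a window of active units that slides by one unit per
# position, so neighbouring positions share more active units.

def context(position, list_length, window=3):
    size = list_length + window - 1
    return [1 if position <= i < position + window else 0 for i in range(size)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def learn(items):
    """Associate each (distinct) item with the context state at its position."""
    n = len(items)
    return {item: context(pos, n) for pos, item in enumerate(items)}

def recall(weights, list_length):
    """Replay the context signal; pick the best-matching unsuppressed item."""
    suppressed, output = set(), []
    for pos in range(list_length):
        cue = context(pos, list_length)
        best = max(
            (item for item in weights if item not in suppressed),
            key=lambda item: dot(weights[item], cue),
        )
        output.append(best)
        suppressed.add(best)  # a recalled item is not selected again
    return output

items = ["cat", "dog", "sun", "pen"]
print(recall(learn(items), len(items)))  # ['cat', 'dog', 'sun', 'pen']
print(dot(context(0, 4), context(1, 4)))  # 2 (adjacent positions overlap more)
print(dot(context(0, 4), context(2, 4)))  # 1
```

Because item identity lives in the dictionary keys and order lives only in the item-context associations, the sketch keeps the two kinds of information separate; the larger overlap between adjacent context states suggests why, once noise is added, order errors in such models tend to involve neighbouring positions.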

The development of memory processes: a brief sketch

Verbal WM performance increases steeply from the preschool years up to 8 years of age, and more gradually afterward (e.g., Gathercole, 1999). This is reflected by the age-related growth in performance on free recall (e.g., Jarrold et al., 2015; Ornstein & Naus, 1978) as well as on simple and complex span tasks (e.g., Case et al., 1982; Gathercole, Pickering, Ambridge, & Wearing, 2004). This substantial improvement in verbal STM and WM capacities over the childhood years seems to be driven by multiple factors (for reviews see e.g., Camos & Barrouillet, 2011; Gathercole, 1999; Jarrold & Bayliss, 2007).

One of the potential sources of immediate memory improvement appears to be closely linked to the use of verbal rehearsal (e.g., Jarrold & Tam, 2011, but see Jarrold & Hall, 2013). Its use is revealed for instance by overt rehearsal (Ornstein & Naus, 1978) or lip movements indicative of subvocal rehearsal during maintenance intervals (e.g., Flavell, Beach, & Chinsky, 1966) and by a positive correlation between speech rate and memory span (as faster articulation rate allows more items to be rehearsed; e.g., Cowan et al., 1998; Standing & Curtis, 1989). Although it had been reported that this strategy emerges around 7 years of age, its onset seems less discrete than previously supposed: Rehearsal would develop gradually and continually between 5 and 9 years of age (e.g., Jarrold & Tam, 2011; Tam, Jarrold, Baddeley, & Sabatos-DeVito, 2010). Younger children present less overt or covert rehearsal (e.g., Flavell et al., 1966; Ornstein & Naus, 1978; Palmer, 2000a) and, instead, use more rudimentary strategies, such as repeating the presented word that they have just heard (Yuzawa, 2001). Several training studies in children (e.g., Cox, Ornstein, Naus, Maxfield, & Zimler, 1989) and young individuals with Down syndrome (e.g., Comblain, 1994) have provided evidence that active rehearsal benefits memory performance. In addition, when active rehearsal is prevented by articulatory suppressionFootnote 3 serial recall performance drops (e.g., Baddeley, Lewis, & Vallar, 1984; Cowan, Cartwright, Winterowd, & Sherk, 1987; Murray, 1967). In fact, rehearsal seems particularly crucial for the encoding and maintaining of serial order information, as reflected, for instance, by the greater detrimental effect of irrelevant speech and articulatory suppression on a list probe task, which is primarily a test of STM for order information, than on an item probe task, which is primarily a test of STM for item informationFootnote 4 (Henson, Hartley, Burgess, Hitch, & Flude, 2003).

Another critical strategy, which emerges around 5–6 years of age, is the ability to phonologically recode visual materials with verbal labels. According to Baddeley (1986, 1990), whereas auditorily presented material has automatic access to the phonological store in the form of an abstract phonological code, visually presented stimuli (written words, drawings, or pictures) access the visuospatial sketchpad. However, visual stimuli such as pictures of familiar objects can also be recoded into verbal form, and hence stored in the phonological loop. Phonological recoding of visual material is classically reflected by the phonological similarity effect (i.e., the observation that phonologically confusable items – usually rhyming ones – are harder to recall than phonologically nonconfusable items; Conrad & Hull, 1964). Contrary to most older children and adults, children younger than 5–6 years of age often fail to show a significant phonological similarity effect with visually presented items that have similar names (e.g., Halliday, Hitch, Lennon, & Pettipher, 1990; Henry, Messer, Luger-Klein, & Crane, 2012; Hitch, Woodin, & Baker, 1989). Some data suggest that the use of a predominantly visual strategy is followed by a period of dual coding (both visual and phonological) before the adult-like strategy of verbal coding finally emerges around 8 years of age (Palmer, 2000a). However, according to a recent study in children aged 6–12 years, the child’s age is not necessarily the best indicator of phonological recoding development and the developmental increase in memory performance does not systematically depend on the use of verbal or visual processing per se (Koppenol-Gonzalez, Bouwmeester, & Vermunt, 2014).

In summary, the underlying processes of verbal WM development are still debated and we are still far from a unified theory of WM development. In addition, many studies that have investigated this issue did not consider the children’s reading skills (e.g., Koppenol-Gonzalez et al., 2014). However, as we will discuss further, the fact that the most important developmental qualitative and quantitative changes in verbal WM occur at the time during which children typically learn to read and write might suggest that this cultural acquisition somehow influences memory development.

Learning to read: A task engaging working memory

The alphabetic writing system is based on the representation of phonemes by individual letters or groups of letters, the graphemes (e.g., “sh” in “she”; “oo” in “blood”). This is called the alphabetic principle. Thus, the first step in becoming literate is to develop a system allowing mapping of graphemes and phonemes, i.e., phonological recoding, which allows retrieval of the pronunciation of a printed string (decoding) or spelling a word (encoding). To this aim, it is necessary to understand that graphemes stand for phonemes. This is a difficult task because phonemes are not sounds or fragments of sounds, they are abstract phonological units (Liberman, 1957). It is now largely demonstrated that preliterate children (e.g., Liberman, Shankweiler, Fischer, & Carter, 1974) and analphabet adults (either fully illiterate, see Morais, Cary, Alegria, & Bertelson, 1979, or literates in a non-alphabetic system, see Read, Zhang, Nie, & Ding, 1986) are not aware of phonemes unless they have begun to acquire the phonological correspondences of some letters and to use them in decoding attempts (Lukatela, Carello, Shankweiler, & Liberman, 1995). This is attested to by their very poor performance on phonemic awarenessFootnote 5 tasks such as deleting the initial consonant of a verbal item (e.g., “mosa” – “osa”) or adding one at the onset. Although the home literacy environment may favor early literacy skills (for a review, see Sénéchal, 2015), most of the time, the acquisition of phonological correspondences and decoding attempts occurs when people begin reading lessons, either in childhood at school, or later on, in adulthood. This is the case for ex-illiterate people, who had no opportunity to attend school but learned to read in special literacy classes (for a review, see Kolinsky, 2015).

A further source of difficulty in learning to read is the necessity to acquire the orthographic code of the language. Whereas the alphabetic principle is “universal,” as it applies to all languages written alphabetically, the orthographic code is relative to each particular language. This code is formed by the set of grapheme-phoneme correspondences (relevant for reading) and by the set of phoneme-grapheme correspondences (relevant for writing), which constitute a system of rules, either simple or complex.Footnote 6 Some of these rules may be explicitly taught at school, others learned in an implicit way through reading practice. All these rules are language-dependent and, depending on language too, there may be more or fewer exceptions to the rules. In transparent orthographic codes such as those of Italian, Finnish, or German, the correspondences between graphemes and phonemes are consistent. In opaque orthographic codes such as those of English or (although to a lesser extent) French, the same grapheme (e.g., <i> in English) may represent different phonemes in different words (as in e.g., “pint” and “mint”) and the same phoneme (e.g., /s/ in English) may be represented by different graphemes in different words (as in e.g., <soft> and <city>), for contextual, morphological, or historical reasons (see discussion in Kessler & Treiman, 2015). These spelling-to-sound and sound-to-spelling inconsistencies affect both the rate of learning (e.g., Seymour, Aro, & Erskine, 2003) and the size of the sub-lexical units used to identify written words, with smaller units (graphemes) used in transparent codes and larger units (e.g., rimes) used in opaque ones (Ziegler, Perry, Jacobs, & Braun, 2001).

Learning to read is thus a complex task engaging verbal WM: The beginner reader has to sequentially translate the graphemes into their spoken forms, temporarily store them in their exact order until the last one has been decoded, and finally combine the whole ordered sequence into a word. In addition, in opaque orthographic codes, decoders have to deal with numerous irregular written words. Hence, not surprisingly, children’s verbal WM capacities are closely associated with word decoding (e.g., Arrington, Kulesz, Francis, Fletcher, & Barnes, 2014; Christopher et al., 2012). Verbal WM is also frequently correlated with reading comprehension (e.g., Arrington et al., 2014; Christopher et al., 2012; Oakhill, Cain, & Bryant, 2003; Yuill, Oakhill, & Parkin, 1989, but see Kush, Johns, & Van Dyke, 2015; Van Dyke, Johns, & Kukona, 2014). However, in this paper, we will focus on the links between verbal immediate memory and the acquisition of phonological recoding and of orthographic representations.

The causal nature of the relationship between decoding skills and verbal memory, as well as its specificity, are still a matter of debate (for critical reviews, see e.g., de Jong, 2006; Savage et al., 2007). It was first proposed that phonological STM, the verbal subcomponent of WM, plays an important role in literacy acquisition (e.g., Baddeley, Gathercole, & Papagno, 1998; de Jong, 1998; Gathercole & Baddeley, 1993; Jorm, 1983; Perfetti, 1985; Siegel & Ryan, 1989). However, recent data indicate that when various measures of verbal STM are taken into account, the relation between phonological STM and decoding skills is better explained in terms of shared variance with phonemic awareness, the latter being the only independent predictor of decoding (see Melby-Lervåg, Lyster, & Hulme, 2012, for a recent meta-analysis). Consistently, verbal STM is not always impaired in children with reading disabilities and an isolated impairment in verbal STM does not necessarily lead to persistent learning difficulties (e.g., Gathercole, Alloway, et al., 2006; Gathercole, Tiffany, Briscoe, & Thorn, 2005).

Currently, an influential view considers that more general deficits in WMFootnote 7 might be at the core of reading acquisition difficulties (e.g., Alloway & Alloway, 2010; Beneventi, Tønnessen, & Ersland, 2009; Beneventi, Tønnessen, Ersland, & Hugdahl, 2010; Gathercole, Alloway, et al., 2006; Gathercole, Lamont, & Alloway, 2006; Swanson, 2006). However, there is no evidence that memory training (which often relies on computer-based verbal and visuospatial STM and WM tasks) produces reliable immediate or delayed improvements in word decoding (e.g., Banales, Kohnen, & McArthur, 2015; Melby-Lervåg & Hulme, 2015; for a recent meta-analysis, see Melby-Lervåg & Hulme, 2013). In addition, when other well-established predictors of word reading are controlled for, there is little evidence that WM or phonological STM measures have unique power in predicting later word reading acquisition (see Savage et al., 2007).

More recently, some researchers have highlighted the contribution to decoding skills of either serial order memory capacity (e.g., Majerus, Poncelet, Greffe, & Van der Linden, 2006; Martinez Perez, Majerus, & Poncelet, 2012a) or the consolidation (or transfer) of serial-order information into a stable LTM trace (Szmalec, Loncke, Page, & Duyck, 2011; but see Staels & Van den Broeck, 2014a). For instance, in a 1-year longitudinal study starting in kindergarten, phonemic awareness (assessed by a phoneme identification task) and serial order STM (measured by a serial order reconstruction taskFootnote 8), but not item STM (measured by monosyllabic nonword repetition under articulatory suppression), predicted independent variance in decoding abilities in first grade, even after controlling for nonverbal reasoning, vocabulary, and initial letter knowledge (Martinez Perez, Majerus, Mahot, & Poncelet, 2012b; see also Nithart et al., 2011).

The impact of serial order memory capacity on decoding abilities might be, at least partly, mediated by the use of rehearsal, which is particularly crucial for encoding serial order information (Henson et al., 2003). However, this possibility remains to be tested, as the use of this strategy was not controlled for in the studies that examined the relation between serial order memory and decoding (e.g., Martinez Perez et al., 2012b; Nithart et al., 2011), probably because young children are often assumed not to rehearse, an assumption that may not hold true for all kindergarten children (e.g., Tam et al., 2010).

As (serial) phonological recoding constitutes a central means by which orthographic representations are acquired (e.g., Share, 1995), Martinez Perez et al. (2012b) further suggested that, besides its specific contribution to decoding processes, memory for serial order information is also important for the acquisition of new long-term orthographic representations. However, a recent study on primary-school readers (Staels & Van den Broeck, 2014b) showed that serial-order memory (measured by a serial order reconstruction task similar to that of Martinez Perez et al., 2012b) did not share any specific relationship with the entire range of reading abilities (word reading was more strongly related to item STM than to serial order STM) or with orthographic learning, as assessed using Share’s (1999) self-teaching paradigm. In fact, in the latter situation, only item STM was significantly related to orthographic learning. Nevertheless, as acknowledged by Staels and Van Den Broeck, the role of serial order STM in orthographic learning might be less pronounced in literate children than in beginner readers, because with increasing reading ability orthographic learning may depend more on already existing orthographic structures than on serial order STM.

There is still much to be understood about the relationship between reading and memory. A common limitation of most studies that have investigated this association is that they have neglected to consider the potential (direct or indirect) contributions of progress in reading skills to the development of verbal memory processes and representations. Even in a longitudinal design, the finding that STM and/or WM predict literacy outcomes does not preclude a reciprocal relation. A similar criticism was already raised by Wagner, Torgesen, and Rashotte (1994) in a longitudinal study from kindergarten through second grade on the relations between phonological processing skills (including phonological coding in WM) and reading-related knowledge (word decoding, letter knowledge), which they showed to be reciprocal.

As we will discuss in the next section, in typical beginner readers, we might expect that the intensive practice of decoding boosts some strategies used in verbal memory, such as cumulative rehearsal, which in turn would lead to better serial order memory performance. In addition, the emergence of phonemic awareness and of orthographic representations might enhance the quality and precision of the language representations, which, in turn, would improve encoding and retrieval of item information.

The role of literacy in WM and its verbal short-term component

Several authors have already suggested that the developmental changes in memory may be a direct or indirect result of formal education (i.e., schooling) and related experiences, such as learning to read. Comparing children of similar age but different school levels,Footnote 9 Morrison, Smith, and Dow-Ehrensberger (1995) showed that the growth of immediate memory skills and the emergence of strategies depend more on schooling than on maturation: Whereas kindergarten children presented almost no improvement in verbal STM (measured in a free recall task in which children had to verbally recall the names of pictures) over the year-long period, age-matched first graders displayed memory enhancement. In contrast, within the same grade, children’s age (older vs. younger first graders) did not affect memory performance. Similarly, compared to their literate counterparts, illiterate unschooled Mexican children aged 6–13 years showed weaker performance in all cognitive domains, including digit forward and backward tasks (Matute et al., 2012). Weak memory performance in forward and backward digit spans as well as backward spatial span has also been observed in illiterate unschooled adults, whereas the forward spatial span appeared less affected (Silva, Faísca, Ingvar, Petersson, & Reis, 2012). Illiterate adults are also poorer than literate adults in word-list learning and recall (e.g., Ardila, Rosselli, & Rosas, 1989), in word pair association learning (e.g., Reis & Castro-Caldas, 1997), and in repeating nonwords or low-frequency words (Castro-Caldas, Petersson, Reis, Stone-Elander, & Ingvar, 1998; Kosmidis, Tsapkini, & Folia, 2006; Petersson, Reis, Askelöf, Castro-Caldas, & Ingvar, 2000; Reis & Castro-Caldas, 1997).

In all these studies, however, it is difficult to conclude that growth of immediate memory skills is primarily a function of reading acquisition rather than of other experiences linked to formal schooling. As commented on by Morrison et al. (1995), memory skills may be directly enhanced by the teacher’s behavior, classroom activities (e.g., Rogoff, 1981), or direct instruction. School-related activities require reliance on memorization and abstract, verbal modes of communication dependent on memory processes, and hence might increase the tendency to activate processes aimed at enhancing storage and retrieval of information.

Other child data support more directly the idea that reading acquisition modulates verbal memory performance. In a longitudinal study, Ellis (1990) showed that, from age 5 to 6 and from age 6 to 7, reading skills (word and pseudoword reading and sentence comprehension) contributed more to later proficiency in auditory verbal STM (assessed by auditory digit, word, and sentence spans) than the reverse, as only the former relation was significant. Another longitudinal study showed that single word reading proficiency at 6 years of age predicts growth in nonword repetition between 6 and 7 years of age, even after controlling for the effects of phonemic awareness and of general aural language skills. In contrast, nonword repetition was not a longitudinal predictor of progress in reading (Nation & Hulme, 2011).

One way to isolate the impact of literacy is to compare adults who remained illiterate for socioeconomic reasons not only to literate adults, but also to ex-illiterate adults. Contrary to most literate adults (who learned to read as children and attended school for several years), illiterates and ex-illiterates never attended school in childhood. Ex-illiterates are from the same socioeconomic background as illiterates and first learned to read as adults in special literacy classes organized by the government, the army, or an industry, many having been encouraged to do so by their employer or supervisor. The comparison of these two groups thus allows an estimation of the specific effects of reading acquisition, uncontaminated by maturation, formal education, and sociocultural differences.Footnote 10 In addition, a comparison of ex-illiterates with schooled literates allows an estimation of the effect of formal education. Thus, if literacy per se boosted short-term retention, adults who remained illiterate for socioeconomic reasons would present poorer capacities in some memory components compared not only to schooled literates but also to ex-illiterates.

The very few studies that have used this comparison in the memory domain suggest a slight but specific impact of literacy on performance in some memory tasks, in particular with verbal material. In immediate serial recall of pictures, illiterate adults display a lower span compared to ex-illiterates who learned to read as adults in special literacy classes (Morais, Bertelson, Cary, & Alegria, 1986). Note that in that situation, the literacy effect was not due to verbal output constraints (e.g., the interference that might result, on each trial, from the production of an item on the production of the other items), as participants answered by placing the presented pictures on a strip. Nevertheless, schooling seems to also facilitate verbal memory, beyond the specific effect of literacy, as the results for both illiterate and ex-illiterate adults in Morais et al. (1986) were quite poor in comparison to the ones usually observed in schooled literates. In a more recent study, Kosmidis, Zafiri, and Politimou (2011) showed that “self-educated” ex-illiterates (who learned to read at home with their schooled children) perform better in forward digit-span and listening span than illiterates and similarly to schooled literates. In contrast, no effect of schooling or of literacy was observed in forward spatial span. However, it was schooling rather than literacy per se that gave an advantage on the digit span backward condition (Kosmidis et al., 2011).Footnote 11

It is worth noting that the impact of literacy and/or schooling may depend on qualitative aspects of the instruction. In the famous study run in Liberia by Scribner and Cole (1981), adults who were literate in Arabic (a consonantal alphabetic system) because they had attended Quranic schools in their youth, were found to have better memory for ordered recall of word lists in comparison to Vai (tribal) literates who acquired their indigenous (syllabic) script at home through individual tuition. As there were no group differences for free recall of words and no general memory advantage for the Quranic group, this effect probably reflects the influence of the incremental method of learning the Qur’an.Footnote 12 A study on children attending Quranic preschool in MoroccoFootnote 13 supported the idea that this type of schooling has a significant, positive effect on serial memory (with positive effects on digit span, name span, and incremental versions of these tests) that does not generalize to other kinds of memory skills, such as discourse memory and ordered memory of pictures (Wagner & Spratt, 1987; see also Wagner, 1993).

In any case, the fact that illiterate adults display quite poor verbal memory is by itself remarkable, as, intuitively, one might expect that the lack of written external memory aids (e.g., shopping lists, phone books, computers) would force them to rely more on aural memory in everyday life, which may, in turn, boost their memory capacities or strategies. This naïve idea corresponds in fact to a common view of human cultural evolution: Auditory memory would have been traded off against literacy (e.g., Cole, Gay, Glick, & Sharp, 1971), an opinion first articulated by the character of Socrates in Plato’s Phaedrus, who expressed concern about the “inhuman” nature of writing, stating that written words have a destructive effect on human memory (cf. Ong, 1982; Wolf, 2007). In fact, much to the contrary, literacy seems to strengthen verbal memory performance. This conclusion is in line with the suggestion that the use of external symbolic storage systems induces the need to manage multiple memory stores (both internal and external) and multiple knowledge codes (phonemic, orthographic, metalinguistic), which may modify memory as well as executive functions (e.g., Donald, 1993). In the next sections, we will examine in more detail which potential mechanisms may be responsible for the effects of literacy on memory.

Verbal recoding and rehearsal strategies

As we have seen, the use of a verbal code to memorize images emerges gradually in typical children from about 6 years of age (e.g., Halliday et al., 1990; Henry et al., 2012; Tam et al., 2010), the age at which reading instruction usually starts. As reading acquisition boosts the emergence of explicit speech representations, this could in turn make speech-mediated retention possible, or at least facilitate the strategic use of a phonological code in memory. However, there is little evidence that the use of phonological codes for short-term retention of pictures requires explicit phonemic analysis capacities. Indeed, in Morais et al. (1986), illiterate adults displayed much poorer phonemic awareness performance than ex-illiterates, but showed a phonological similarity effect of similar size when remembering the names of series of pictures, with poorer performance on rhyming lists than on non-rhyming ones. Consistently, no correlation was observed in that study between explicit speech segmentation abilities (either for phonemes or for rhymes) and the ability to phonologically recode visually presented material in an STM task. The same holds true in young beginning readers and in dyslexic children (see the analyses presented by Content, Morais, Kolinsky, Bertelson, & Alegria, 1986). These results are thus not consistent with the notion that phonemic awareness is necessary for or facilitates the use of a verbal code, at least to memorize images.

However, in that situation, possible visual strategies or dual coding had not been evaluated. Using an STM task in which either the visual or the phonological similarity of the visual stimuli was manipulated, Palmer (2000b) reported that 7-year-old children who are still relying on visual strategies (i.e., showing a visual similarity effect for lists including items such as brush, comb, pen, fork, etc.) have significantly lower literacy attainment levels than their same-age counterparts who use a verbal code (showing a phonological similarity effect). Dyslexic teenagers also present signs of dual coding for pictures, whereas both their chronologically and reading age-matched controls have abandoned a visual strategy in favor of a pure phonological strategy (Palmer, 2000c). Consistently, Bourdin (2007) showed that, whereas preliterate kindergarteners use a visual code all along the year, age-matched beginning readers attending the first grade progressively begin using a verbal code. Hence, although learning to read is not strictly necessary for verbal recoding of visual material, it seems to bolster the inhibition of a visual strategy in favor of a phonological one.

As already noted, in addition to the use of a verbal code to memorize visually presented items, 6- to 7-year-old children start to more actively use one of the most critical verbal memory strategies: cumulative rehearsal. The first reading activities (in particular, decoding activities, by their sequential nature) might promote changes in the quality of rehearsal as well as in the intensity of its use. Indeed, when reading, the beginning reader has to internally produce the letter (and word) “sounds” and serially rehearse these to form a word (or a sentence). In line with this idea, Wagner (1974) observed that unschooled people from rural Yucatan, Mexico, did not show the primacy effect in ordered memory of pictures (i.e., better recall of initial list compared to middle list items, usually observed in literate adults; Atkinson & Shiffrin, 1968). Although its interpretation has been highly debated (e.g., Brown et al., 2000; Nairne, Neath, Serra, & Byun, 1997; Oberauer, 2003), the primacy effect in serial position curves is sometimes assumed to reflect greater or more elaborative rehearsal of earlier list items, facilitating transfer of items to LTM and subsequent retrieval (e.g., Flavell, 1970; Hagen, 1971; Tulving & Craik, 2000; Watkins & Watkins, 1977). Hence, Wagner concluded that rural, unschooled people are not using verbal rehearsal strategies, which may explain the lack of increase in developmental performance (with participants aged from 7 to 35 years) he also observed in that population. These data, however, do not allow disentangling the effect of literacy and of formal education on verbal rehearsal.

According to the phonological loop model, another indirect marker of rehearsal is the word length effect, the finding that lists of short words are better recalled than lists of longer words (e.g., Baddeley, Thomson, & Buchanan, 1975). This effect would occur because it takes more time to refresh by articulatory rehearsal a list of long words than a list of short words. Consistently, when articulatory rehearsal is disrupted by articulatory suppression (Baddeley et al., 1984) or by specific neuropsychological damage (Jacquemot, Dupoux, & Bachoud-Lévi, 2011), the word length effect disappears regardless of how stimuli are presented. In addition, with spoken material, the word length effect emerges when children begin to use rehearsal, around 7 years of age (see data and discussion in Henry, Turner, Smith, & Leather, 2000). However, several pieces of data also argue against the notion that rehearsal alone can explain length effects (e.g., Campoy, 2008; Jalbert, Neath, Bireta, & Surprenant, 2011; Lewandowsky & Oberauer, 2008; Nairne et al., 1997; Neath & Nairne, 1995; Neath, 2000; Romani, McAlpine, Olson, Tsouknida, & Martin, 2005). For instance, under articulatory suppression, the length effect is reversed for words but remains robust for nonwords (e.g., Romani et al., 2005). A reverse word length effect would reflect stronger reliance on semantic codes, which are length insensitive, and on lexical codes, when phonological coding is disrupted by the acoustic input produced under articulatory suppression (Romani et al., 2005). Indeed, under the redintegration hypothesis, two competing influences of length are expected in recall: Although longer words are at a disadvantage because more pre-lexical phonological units have to be retained, they benefit more from the reconstructive mechanisms because they offer more residual phonological information from which to attempt reconstruction (e.g., Brown & Hulme, 1995). In accordance with this interpretation, a strong length effect remains under articulatory suppression with nonwords (which by definition do not have lexical-semantic representations) presumably because the adverse effect of having more units to recall is not counterbalanced by having a better chance at lexical reconstruction (Romani et al., 2005).

Interestingly, recent data by our group suggest that rehearsal and redintegration processes are influenced by literacy (Kolinsky, Demoulin, Mengarda, & Morais, submitted). We examined adults with varying degrees of rudimentary literacy (as most were attending the first class of a literacy course) and carefully estimated their reading and writing levels (assessed through single word and nonword reading and writing, plus reading comprehension tests). Through four immediate serial recall tasks, we estimated both the lexicality effect and the length effect. Illiterates and quasi-illiterates presented a lexicality effect and a trend toward a length effect for nonwords. However, for words, they displayed a reverse length effect (with a significant advantage of trisyllables over monosyllables), which has also been reported in young, preliterate children (Henry et al., 2000). Notably, in our study, Grade 4 literate children presented with the same material as the adults did not display a reverse word length effect. These findings suggest that learning to read reinforces sequential processes in verbal immediate memory, with (quasi-)illiterates not efficiently refreshing the decaying information via cumulative rehearsal.

As already commented on, rehearsal is related to the encoding and maintaining of serial order information (Henson et al., 2003). One study has suggested that sequential encoding strategies are also modulated by literacy: Presented with a picture array, Arabic literates adopt a right-to-left strategy not only of recalling but also of naming the pictures. This effect, which corresponds to the directionality of their script, is absent in illiterates (Padakannaya, Devi, Zaveria, Chengappa, & Vaid, 2002). However, whether it is related to rehearsal remains to be investigated.

Another aspect of our data (Kolinsky et al., submitted) supports the idea that illiterates strongly rely on lexical representations in trying to maintain verbal information in memory: In agreement with former data on nonword repetition (e.g., Castro-Caldas et al., 1998; Kosmidis et al., 2006; Petersson et al., 2000; Reis & Castro-Caldas, 1997), the illiterate and quasi-illiterate adults presented many lexicalizations of nonwords, i.e., transformed the nonwords into real words. As we will discuss in the next section, the reason for this might be related to the fact that the illiterates’ phonological sublexical representations are not as explicit and/or as detailed as those of literates, and hence would be of less support to reconstruct the traces in memory (see Turner, Henry, Smith, & Brown, 2004, for a similar suggestion as regards young vs. older children).

The support of detailed phonological representations and phonemic awareness

It is commonly assumed that memory for spoken material involves phonological coding, that is, coding information in a sound-based representation system (Baddeley, 1986; Wagner et al., 1994) without additional precision on the exact nature of these phonological representations or phonological “codes.” However, as suggested by the neuropsychological literature on selective impairments, there exist multiple potential phonological codes susceptible to involvement in immediate memory that can vary by degree of abstractness (acoustic or echoic, articulatory, auditory imagery, abstract, see Friedrich, 1990) and structural level (from phonetic features or phonemes to syllables and larger units). In addition, whatever their degree of abstractness and structural level, phonological representations are not necessarily explicit representations that can be segregated and manipulated. Although there is evidence that phonological coding in verbal immediate memory highly depends on the long-term phonological representations (see Majerus, 2009, for a review), to our knowledge no direct evidence is available as regards the specific characteristics of the phonological codes involved in WM for spoken items.

Presumably, verbal memory tasks with spoken material involve an implicit phonological processing: Phonological coding is automatically engaged without the requirement of explicit reflection on, or awareness of, the sound structure of the to-be-remembered words (e.g., Melby-Lervåg et al., 2012). It is possible, however, that the emergence of explicit phonological representations somehow facilitates performance on these tasks, either directly or indirectly, e.g., by facilitating access to the phonological representations.

Whereas phonological representations are being established from the earliest stages of infancy, phoneme awareness is known to develop when children or adults learn to read in an alphabetic code, as already commented on (e.g., Morais et al., 1979; Read et al., 1986). Literacy acquisition seems to affect the implicit phonological representations and processes used during perception much less than the explicit representations and processes used in phonological awareness tasks. For instance, although they lack phoneme awareness, illiterates and preliterate children use implicit phonemic codes in both word recognition (for a discussion, see Morais & Kolinsky, 1994) and production (Ventura, Kolinsky, Querido, Fernandes, & Morais, 2007). Nevertheless, literacy seems to help in finely tuning phonemic boundaries and hence in increasing the precision of phoneme identification in literates compared to illiterates (for a review, see Kolinsky, 2015). This could, in turn, increase the quality of phonological coding in memory tasks with spoken items, particularly when there is no lexical support, as is the case for nonwords.

Consistent with this idea, nonword repetition is clearly a difficult task for illiterate adults. In comparison to (schooled) literates, they show more difficulties when repeating nonwords and low-frequency words, but not when repeating frequent words (e.g., Kosmidis et al., 2006; Petersson et al., 2000; Reis & Castro-Caldas, 1997; Rosselli, Ardila, & Rosas, 1990). Brain data have revealed that network interactions differ between word and nonword repetition only in illiterates (Petersson et al., 2000). Illiterates also make not only more phonological errors than literates, but also more lexicalization errors (Castro-Caldas et al., 1998; Kosmidis et al., 2006; Reis & Castro-Caldas, 1997). These specific difficulties may reflect less precise phonemic boundaries in illiterates (Serniclaes, Ventura, Morais, & Kolinsky, 2005).

However, it is difficult to determine whether some literacy-related memory effects reflect differences in the quality (or nature/granularity) of perceptual representations, or the lack of support from explicit phonological representations (or both). Indeed, as suggested by Morais and Kolinsky (2002), in a nonword repetition task, literate people can use an attentional strategy based on the explicit awareness of phonemes that illiterate people cannot develop because they lack awareness of phonemic segments. Furthermore, there is also a literacy effect on word recognition when listening conditions are made difficult by the use of dichotic presentation, i.e., simultaneous presentation of two words, one to each ear (Morais, Castro, Scliar-Cabral, Kolinsky, & Content, 1987). This effect probably reflects the availability in literates of an attentional mechanism focusing on the phonemic structure of speech, as it is modulated by instructions to pay attention to phonemes (Morais, Castro, & Kolinsky, 1991). Although no direct evidence is available, similar strategies may intervene when coding spoken information for later memory retrieval.

Some findings in schooled children also support the idea that phonological awareness exerts qualitative changes on the underlying phonological representations which, in turn, make verbal memory processes more efficient. Melby-Lervåg and Hulme (2010) showed that training 7-year-old typically developing children to manipulate phonemes in phonologically complex and unfamiliar words improved not only phonemic manipulation of those words, but also their serial recall and, to a lesser extent, their free recall. These memory effects on recall were highly specific and did not generalize to untrained words. In contrast, vocabulary training on those words improved their free recall and, to a lesser extent, their serial recall. Training on rhyme only improved rhyming skills, without any significant effect on either serial or free recall. As proposed by the authors, the phoneme-awareness training, but not the rhyme training, has led to new phonemically structured representations for these complex unfamiliar words, which then have facilitated their immediate recall. In another training study on second- and third-grade children with specific language impairment and concurrent difficulties with reading, explicit phonemic awareness training improved not only word decoding, the targeted skill, but also (untrained) verbal memory (Park, Ritter, Lombardino, Wiseheart, & Sherman, 2014). More precisely, after 4 weeks, the experimental group (who received an additional intervention designed to foster phonemic awareness) significantly outperformed the controls (who received individual traditional language intervention) in word reading skills, as well as on all verbal memory measures. Interestingly, phonemic awareness training differentially impacted the three subcomponents of Baddeley’s (2000) verbal WM model, with the strongest gains found on tasks most dependent on phonological representations, namely the digit and word list recall tests. 
Significant effects were also observed in verbal WM tasks (backwards digit recall and listening recall). The weakest gains were found on the recalling sentences subtest, a task depending more on attention, integration, and LTM than on phonological representations (Park et al., 2014).

The emergence of phonemically structured representations could also improve the reconstruction of the to-be-remembered items at retrieval (Thomson, Richardson, & Goswami, 2005; Turner et al., 2004). Indeed, under the redintegration view the quality of the phonological representations is assumed to constrain the efficiency of the redintegration processes, such that poorly specified or impoverished phonological representations would be less efficient in reconstructing the decaying memory traces in immediate memory (e.g., Schweickert, 1993). Some findings support this idea. Turner et al. (2004) have examined developmental changes in the use of redintegration in children between 5 and 10 years of age. In several memory tasks requiring varying degrees of phonological specification for accurate performance, they showed that redintegrative processes are used by children as young as 5 years, but change in nature with age. More specifically, in a probed recognition task,Footnote 14 5- and 7-year-olds were less accurate in rejecting rhyming probes (e.g., sound, wheat, law, trout . . . POUND) than non-rhyming probes (e.g., cliff, wheat, throw, bolt . . . BLUSH), whereas 10-year-olds used the rhyming probes to help them to remember list items and, hence, were able to reject the rhyming probe more accurately. Thus, rhyme impaired 5-year-olds and helped 10-year-olds. As the efficient use of rhyming memory probes requires more detailed phonological knowledge of the items to produce a correct response, the authors suggested that older children rely on smaller sublexical chunks than younger children to reconstruct the decaying trace in memory. This change could be related to their enhanced phonological awareness capacities.

Because the development of phonemic awareness and of metaphonological representations mainly depends on reading acquisition, not on age (e.g., Morais et al., 1979), literacy might be the critical factor for this developmental change in the reconstruction processes. Consistently, some studies comparing verbal STM performance of reading-disabled children relative to typical readers suggest that the LTM contribution to immediate memory performance increases with reading age rather than chronological age (Roodenrys & Stokes, 2001; Thomson et al., 2005).

In summary, by allowing the development of phonemic awareness and more finely tuned phonemic boundaries, learning to read in an alphabetic system might improve WM for spoken materials at encoding and/or at retrieval. However, not all the effects of literacy on verbal memory can be accounted for by the availability of explicit speech representations or by more precise phonemic boundaries. Learning to read allows interconnecting phonological and orthographic representations (Ehri, 2014; Kolinsky, 2015), and this mapping could also support verbal memory.

Support of orthographic representations

Many researchers conceive the mental representations of words as networks comprising at least three interconnected components: phonology, meaning, and orthography (e.g., Seidenberg & McClelland, 1989). However, word spellings are rarely considered as strengthening memory for their phonological representations. Several lines of evidence do however support the hypothesis that orthographic representations help children and adults to acquire new vocabulary words. Rosenthal and Ehri (2008), for instance, showed that both second- and fifth-grade children learned and remembered new vocabulary words better when they were exposed to written forms of the words during study periods than when they only heard and repeated the words. Spelling presentation facilitated both word pronunciation (prompted by a drawing) and word meaning (prompted by pronunciation), improving learning rate and memory performance beyond the end of training (1 day after). Similarly, presenting children with the spellings of nonwords during the study period favors remembering their pronunciation (Ehri & Wilce, 1979) and even their “meaning,” namely their association with novel objects (Ricketts, Bishop, & Nation, 2009). Importantly, as spellings were not present when participants recalled the items, the boost observed in these studies must have come from the presence of spellings linked to pronunciations in memory. More subtle effects of spelling knowledge have also been reported in adults, with spelling regularity of novel spoken names (in which the initial phoneme could be spelled in a regular or irregular manner based on existing English spelling–sound relationships, e.g., /kIsp/ spelled <kisp> or <chisp>) influencing picture-naming latency in the learning of associations between these novel words and novel pictures (Rastle, McCormick, Bayliss, & Davis, 2011).

In adults, spelling may also help second language (L2) learning by facilitating non-native word recognition. Indeed, the availability of orthographic contrasts has been shown to improve learners’ abilities to distinguish lexical items containing difficult-to-perceive L2 contrasts. For instance, Dutch learners of English have trouble distinguishing the vowels /æ/ and /ɛ/, a contrast that does not exist in Dutch, which has only /ɛ/. In a learning task with English nonwords including these vowels, Escudero, Hayes-Harb, and Mitterer (2008) showed that Dutch-native learners’ ability to encode the /æ/-/ɛ/ contrast lexically was determined by the availability of a difference in spelling during the learning phase. Similarly, for native English speakers with no knowledge of Mandarin, even unfamiliar orthographic symbols, namely orthographic tone marks, help to associate auditory lexical tones with new L2 words (Showalter & Hayes-Harb, 2013).Footnote 15

Thus, for literate children and adults, orthography can support learning for more variable and transient phonological forms. As commented on by Ehri and Rosenthal (2007) and Ehri (2014), this should encourage revising current theories about the storage of words in memory and about the relations between phonological STM and reading abilities. Indeed, an influential view considers that good readers have superior phonological STM in comparison with poor readers, and that this memory difference partly explains why good readers have larger vocabularies (e.g., Gathercole, 2006; Gathercole, Service, Hitch, Adams, & Martin, 1999). However, the data on children cited above advocate a more direct link between new vocabulary acquisition and reading level: Children with better reading skills are more likely to benefit from the presence of orthography in learning new words than less skilled readers presumably because of their superior ability to connect spellings to pronunciations in memory (Ehri & Wilce, 1979; Ricketts et al., 2009; Rosenthal & Ehri, 2008). Orthographic knowledge seems even more important than phonological STM for learning new words. Indeed, in the study by Rosenthal and Ehri (2008), better readers outperformed the poorer readers only slightly in learning the pronunciations of new vocabulary words when items had only been presented orally in the study phase, which indicates only a small difference in phonological STM, but were far superior to lower readers when spellings were seen during study. This suggests that, once people become literate, it is not pure phonological memory but rather the superior ability to connect spellings to pronunciations in memory that explains why good readers build larger vocabularies more easily than poor readers.

In fact, there is now considerable evidence that spelling knowledge shapes a quite large variety of speech processes (see Kolinsky, 2015, for a review). The first evidence showed that literates use orthographic representations in purely auditory phonological awareness situations. For instance, using a rhyming judgment task in which adult participants had to rapidly decide whether two spoken words rhymed or not, Seidenberg and Tanenhaus (1979) observed that it took less time to say “yes” to two rhyming spoken words when their (never presented) spellings were similar (e.g., toast–roast) than when their spellings differed (e.g., toast–ghost). On the contrary, it took longer to say “no” to two non-rhyming words when their spellings were similar (e.g., leaf–deaf) than when their spellings were different (e.g., leaf–ref). Although such effects are hardly surprising, as explicit speech representations are closely linked to reading acquisition (e.g., Morais et al., 1979), orthographic effects have also been reported in on-line word recognition tasks. For instance, in an auditory lexical decision task, responses to words such as deep that include rimes that can be spelled differently in other words (e.g., heap) are slower and less accurate than responses to words with rimes that are spelled only one way, a finding first reported by Ziegler and Ferrand (1998). Orthographic knowledge even influences inattentive speech processing, as reflected by the orthographic modulation of the Mismatch Negativity event-related brain potential (ERP), an automatic index of experience-dependent auditory traces recorded in passive listening to spoken words (Pattamadilok, Morais, Colin, & Kolinsky, 2014).

Only a few studies have examined whether the influence of spelling knowledge can go beyond single word processing and may influence memory performance. This seems to be the case in visual immediate serial recall. Tree, Longmore, Majerus, and Evans (2011) explored the pseudo-homophony effect (i.e., visually presented pseudohomophone nonwords such as <skool> are better orally recalled than other nonwords such as <zool>, even under articulatory suppression; Besner & Davelaar, 1982). Tree and colleagues independently manipulated phonological familiarity (i.e., pseudohomophones vs. other nonwords) and orthographic familiarity, with pseudohomophones and nonwords visually similar to their parent word (e.g., <kamp>/“camp”; <speeze>/“cheese”) or not (e.g., <kloo>/“clue”; <gredd>/“red”). Regardless of the presence of a concurrent task, accuracy was higher for pseudohomophone items visually similar to the parent word (e.g., <kamp>) than for pseudohomophones that were dissimilar (e.g., <kloo>). The reverse was observed for the other nonwords: Nonwords visually similar to a real word were recalled more poorly (e.g., <speeze>) than nonwords visually dissimilar (e.g., <gredd>). According to the authors, when visual similarity to the parent word is high, the retrieval of a pseudohomophone item (<kamp>) is facilitated because both orthographic and phonological information act as strong cues (i.e., both activate the parent word “camp”) for the appropriate target, whereas for other nonwords (<speeze>), orthographic information cues a competitor (“cheese”) that creates a greater degree of interference with a consequential decrease in accuracy. Note that this presumed effect of orthographic representations was observed for the retrieval of item-based information in the immediate serial recall task, but not of order-based information.

Whether spelling knowledge also helps to maintain the representation of spoken items in purely aural immediate memory tasks has remained a neglected question for a long time, with the exception of one study by Baddeley (1966) that yielded negative results. With the purpose of separating the deleterious effects of phonological similarity from the potential effects of letter-structure similarity, Baddeley manipulated both the phonological and orthographic inter-item similarity in five-word sequences orally presented for immediate serial written recall. He observed the expected phonological similarity effect, with lower performance when the words had similar pronunciation (e.g., “bought, sort, taut, caught, wart”) than when they were phonologically dissimilar (e.g., “plea, friend, sleigh, row, board”). Performance with words similar in letter structure but dissimilar in pronunciation (e.g., “rough, cough, through, dough, bough”) was very similar to that in the control condition, in which words were dissimilar in both pronunciation and letter structure (“plea, friend, sleigh, row, board”). Baddeley concluded that auditory serial recall performance is insensitive to the orthographic similarity of the list, as no deleterious effect of orthographic similarity had been observed. However, in a rapid auditory immediate serial recall task, when the words of a list do not share the same pronunciation, there is no reason to expect a significant effect of spelling knowledge when phonological dissimilarity alone is sufficient to perform the task. This study, together with a lack of theoretical concern for the role of orthography in speech processing at that time, has probably discouraged any further investigation of possible effects of orthography in auditory serial recall.

More recently, however, given the numerous indications of a profound influence of orthographic knowledge on word recognition, Pattamadilok, Lafontaine, Morais, and Kolinsky (2010) examined the question of whether inter-item orthographic dissimilarity may help to reduce the deleterious effect of phonological similarity. Thus, in contrast to Baddeley (1966), who examined the potential deleterious effect of spelling similarity in the case of phonological dissimilarity, Pattamadilok and colleagues investigated the beneficial effect of spelling dissimilarity in cases of phonological similarity. When phonological coding is likely to lead to confusion, we might expect a coactivation of other types of representations, such as the spellings of the words, to support recall. In a seven-word auditory immediate serial recall task, they observed that, compared to words that shared neither the phonological nor the orthographic rhyme, performance of adult participants was less affected when words rhymed but had different spellings (as in the French sport, bord, corps, flore, etc., all ending with /oR/) than when they both rhymed and had the same spelling (as in the French classe, brasse, chasse). However, the recall benefit due to orthographic dissimilarity was observed only at positions four to six of the word list, that is, in three out of the four positions at which performance was the lowest. The authors suggested that orthographic representations could be activated and help to disambiguate between the rhyming words when the phonological codes are stored as abstract but unconsolidated representations.

The exact mechanisms by which spelling knowledge provides support to decaying memory traces remain, however, to be investigated. This is an open issue as regards the way spelling knowledge modulates spoken word recognition as well. According to the online account (e.g., Grainger & Ferrand, 1996; Ziegler & Ferrand, 1998; Ziegler, Muneaux, & Grainger, 2003), spoken words activate the corresponding orthographic code via online bidirectional connections between phonology and orthography. In other words, orthography is activated during speech processing and feeds back information to the phonological system. Alternatively, the offline, or restructuring, account proposes that learning about orthography permanently restructures the mentally stored phonological representations during the course of reading acquisition (e.g., Muneaux & Ziegler, 2004; Ziegler et al., 2003). Under this view, orthographic effects take place within the phonological system itself. In fact, online and offline effects are not mutually exclusive, as both might occur simultaneously: Orthography might be activated online, in addition to having changed the nature of the phonological representations (see further discussion in Kolinsky, 2015; Taft, 2011).

Both accounts share with the lexical quality hypothesis (Perfetti, 2007; Perfetti & Hart, 2001, 2002) the view that code quality has changed in some way under the influence of orthographic knowledge. According to the latter hypothesis, word comprehension depends on efficient word identification processes, namely on efficiency in the rapid retrieval from LTM of a high-quality representation. A lexical representation would have high quality to the extent that it has a fully specified orthographic representation (a spelling) and redundant phonological representations (one from spoken language and one recoverable from orthographic-to-phonological mapping). Due to their coherence and stability, such high-quality representations would be more likely to be fluently retrieved. Although the lexical quality hypothesis was proposed in the context of reading comprehension, a similar process may occur in verbal memory tasks, at encoding and/or retrieval, with items having high-quality lexical representations better recalled than items with poorly specified representations.

Conclusions and perspectives for future research

Whereas there is considerable literature on the role of WM in reading acquisition, very little attention has been paid to the converse causal relationship, namely, the effects of learning to read on verbal WM development. However, there are strong empirical and theoretical reasons to expect a positive and unique effect of literacy in this cognitive domain.

In schooled children, performance in several verbal STM and WM tasks considerably increases with age, especially between 5 and 8 years of age (e.g., Gathercole, 1999). Several findings presented in this review suggest that these developmental changes in memory performance may be partly due to literacy acquisition, typically occurring during the same period. One source of memory development is potentially linked to qualitative changes, around 6–7 years of age, in the strategies used to maintain a sequence of items in memory. We have proposed that the sequential decoding activities might bolster the use of phonological recoding (Bourdin, 2007; Palmer, 2000c) and cumulative rehearsal (Ellis, 1990; Wagner, 1974; Kolinsky et al., submitted); the latter would, in turn, improve the encoding and maintaining of serial order information (e.g., Henson et al., 2003). In line with this idea, a close relationship has been observed between decoding skills and serial order memory performance (Martinez Perez et al., 2012b; Nithart et al., 2011). More studies are needed to examine the causal nature of this relation, as well as whether greater efficiency in rehearsal underlies the developmental increase in memory capacity for serial order information. Although in most children this boost in memory strategies may also be driven by other classroom activities, such as numeracy, mental calculation, or poetry recitation, part of the developmental effect might be traced back to the acquisition of literacy (Kolinsky et al., submitted).

There are at least two other mechanisms by which literacy acquisition may shape the development of verbal WM. First, learning to read in an alphabetic writing system allows the emergence of explicit phonemic representations that could influence spoken word recognition (e.g., Morais, Alegria, & Content, 1987) and verbal memory performance (e.g., Ellis, 1990; Melby-Lervåg & Hulme, 2010; Park et al., 2014). Second, learning to read allows the interconnection of phonological and orthographic representations for words, along with semantic information (Ehri, 2014). This multiple mapping in LTM could improve the quality, strength, and precision of lexical representations, and thereby the encoding of to-be-remembered items or the efficiency and accuracy of the reconstruction process at retrieval. There is indeed compelling evidence that the immediate recall of verbal items relies on the temporary activation of language representations and is enhanced by rich and easily accessible language representations (Majerus, 2009, 2013).

More research is needed to show in what way the quality and specificity of lexical representations change as individuals become literate and to what extent literate individuals benefit from orthographic knowledge in auditory immediate memory tasks. It would also be relevant to examine whether the effects of literacy depend on the levels of reading and orthographic abilities, as well as on the degree of exposure to print. Stanovich, Cunningham, and West (1998) have indeed pointed out that, even in a literate population, individuals differ in their degree of print exposure (i.e., the amount of reading) and that these individual differences are associated with differences in vocabulary and general knowledge across the life span.

In contrast to the relative disregard of the consequences of reading acquisition by developmental psychologists, many other scholars (e.g., anthropologists, historians, linguists) have considered the cognitive consequences of literacy and emphasized its powerful impact (e.g., Goody, 1977; Olson, 1977, 1994; Opland, 1975). For instance, Ong (1982, p. 77) claimed that “without writing, the literate mind would not and could not think as it does, not only when engaged in writing but normally even when it is composing its thoughts in oral form.” In contrast to fully natural, oral speech, Ong saw writing as an interiorized technology laboriously learned (“There is no way to write ‘naturally’”, p. 81) that enhances human consciousness. Claims of this kind have been criticized for being too speculative about the cognitive effects of literacy, but, as we have discussed, there is now compelling behavioral and neuroimaging evidence that learning to read impacts the brain, especially the language network, with some very plausible consequences for the short-term processing and maintenance of verbal information. These findings have important implications for children with reading disabilities who commonly show deficits in verbal WM: These deficits could be a consequence, rather than a cause, of unsuccessful reading achievement.

Undoubtedly, many factors can drive the developmental changes in verbal WM. We share the view of Wagner (1974, p. 395), according to whom the development of memory depends on more than simple maturation: “it is conceivable that recognition memory, as with ‘echoic’ memory, develops maturationally and without formal schooling, but this is apparently not true for all that is considered to be memory. Higher level mnemonic strategies in memory may do more than ‘lag’ by several years – without formal schooling, such skills may never develop at all.” It is obviously far from simple to disentangle the respective roles of maturation, schooling experience, and literacy in schooled individuals, but it is crucial to do so in order to be able to generalize our knowledge of human cognition to all human beings. One way to unravel the roles of formal schooling and of literacy seems to be the comparison of unschooled literates and illiterates. Additionally, longitudinal studies comparing the development of WM in groups of same-language children who differ in whether they start formal literacy lessons at age 5 versus 7 years could help to disentangle the role of maturation from that of reading acquisition.