Imagine you are cleaning the living room and you have come to the part where you are ready to dust off the wooden furniture. You take out your rag and spray some pine-scented dust remover on it, mistakenly believing that this time the dust will stay away for a prolonged period. However, in the moment that you spray the contents from the can onto the rag, something odd happens. Instead of continuing to think about your task, you are spontaneously brought back to an earlier time. Specifically, you instantaneously relive an experience from your childhood where you were running around a camp site, waiting for your parents to get out of their tent so you could begin your day’s hike through the woods.

What seems to have happened is that you have experienced what has come to be known as the Proust phenomenon, whereby a currently perceived odor causes the spontaneous recollection of a past event. The Proust phenomenon has been described using the LOVER acronym (Larsson, Willander, Karlsson, & Arshamian, 2014), whereby odor-evoked autobiographical memories are Limbic, Old, Vivid, Emotional, and Rare. Furthermore, the Proust phenomenon refers to the notion that odors are more effective cues for eliciting autobiographical memories than other types of cues (Chu & Downes, 2000b; de Bruijn & Bender, 2017).

The name for this collection of effects, the Proust phenomenon, refers to a passage from Marcel Proust’s book Swann’s Way. In this passage, the author describes in great detail how the smell, among other cues, of a cookie eventually reminds him of a long forgotten memory from his childhood. As Jellinek (2004) pointed out, the name Proust effect is somewhat misleading, or at least oversimplified, as the experience that Proust described involved multiple sensory cues, including not only olfactory but also gustatory, textural, and temperature cues. Although the term Proust phenomenon is used too narrowly, the name has proven to be catchy and has become the commonly used term by researchers investigating olfactory cued autobiographical memory.

The Proust phenomenon has received some attention in the scientific literature. In fact, in the new millennium, we have identified four separate reviews discussing the topic of odor-evoked autobiographical memory (Chu & Downes, 2000b; Herz, 2012; Larsson et al., 2014; Saive, Royet, & Plailly, 2014), a large number considering the limited amount of empirical research that has been published on the topic. Although each of these reviews has made its own contribution to the collective understanding of the phenomenon, we still feel that there is room for one more.

According to Baumeister and Leary (1997), there are five distinguishable goals for a narrative review: (1) theory development, (2) theory evaluation, (3) surveying the state of knowledge on a particular topic, (4) problem identification, and (5) historical accounting. The previous reviews have mainly focused on surveying the state of knowledge about the Proust phenomenon, with the Saive et al. (2014) review focusing specifically on the neuropsychological correlates of the phenomenon (we refer readers to this review for more information about the limbic aspect of the LOVER model, as the current review focuses on behavioral results). The current review will also involve a survey on the state of knowledge on the Proust phenomenon but will additionally include (a) problem identification, especially in relation to an in-depth look at the methods used to study the Proust phenomenon over nearly 35 years of sporadic research, (b) theory evaluation, with a focus on, when appropriate, mini-meta-analyses regarding various aspects of the Proust phenomenon, and (c) theory, or actually hypothesis, development. Specifically, we will provide a summary of hypotheses that attempt to account for the variety of findings that have come to define the Proust phenomenon.

This type of review is, in our estimation, especially important given the fact that research in this area is still in its infancy. At this early stage, the research on odor-evoked autobiographical memory has been, as will become clear below, somewhat disorganized. There is no consensus on the appropriate methods that should be used, the dependent variables that should be tested, or even the specific research questions that can be answered. It is our hope that our review can serve as a guide for researchers in the field, as well as an introduction for researchers unfamiliar with the topic of investigation.

Theoretical interest in the Proust phenomenon

In two early papers on this topic, Chu and Downes (2000a, 2000b) defined current understanding of the Proust phenomenon. Specifically, they claimed that autobiographical memories evoked by odors are older and more emotional than autobiographical memories evoked by other cues. Furthermore, but more controversially, they also claimed that odor-evoked autobiographical memories are more vivid (see Herz, 2012, for a discussion of this issue) and that odors are the most effective cues of autobiographical memory (a claim that is contested, depending on the meaning of “effective”).

Interest in the Proust phenomenon seems to have developed from personal experiences, such as the one described at the beginning of this paper. Given this origin, the research to this point has focused on determining the existence of certain effects related to odor-evoked autobiographical memory. This focus, in turn, means that there is a lack of research into the underlying mechanisms that may account for the phenomenon.

Autobiographical memory

The term autobiographical memory (AM) can possibly be used as a synonym for episodic memory (Rubin, 2006) but also includes semantic knowledge about oneself (e.g., one’s birthdate). For the purposes of this paper, we will consider AMs to be either voluntarily or involuntarily retrieved (Berntsen, 1996, 1998) memories relating to a personal experience, which can be defined as a single or specific event. Although much research on AM involves voluntary memories, where the researchers instruct the subjects to specifically think back and retrieve an important event from their personal history or retrieve an event that is associated with a cue (e.g., Rubin & Schulkind, 1997), the event described by Proust as well as the example given at the beginning of the paper would be involuntary instances of AM, where something from the environment spontaneously cues a memory experience (Berntsen, 1996, 1998).

The assessment of voluntary memories in the laboratory usually involves the aid of specific cues. AM researchers have used different types of cues, such as words, pictures, and odors, to obtain a deeper understanding of how memory works. Some interesting issues that can be addressed by using odor cues include (a) whether the reminiscence bump, which is the disproportional recall of personal events stemming from adolescence and early adulthood (Rubin, Wetzler, & Nebes, 1986), is stable, or whether it is dependent on the mnemonic cue; (b) whether memory functions similarly for all modalities or whether there are fundamental differences, possibly indicating separate memory stores; and (c) whether the experience of reliving personal memories is influenced by, or dependent on, the mnemonic cue. Based on these issues, we feel that the amount of attention that the Proust phenomenon has garnered is justified, as it provides us with information about not just odor-evoked AM but also mnemonic processes in general.

Olfactory cognition

In addition to providing information about AM, research on odor-evoked AM also relates to research on olfactory cognition. Olfactory cognition refers to the relationship of olfactory perception with basic cognitive functions, such as attention and memory. The relationships of interest in this field of research are bidirectional, with researchers being interested in not only the influence of olfactory perception on cognition (e.g., Colzato, Sellaro, Paccani, & Hommel, 2014) but also how olfactory perception is influenced by various means, such as semantic processing (Herz, 2005). This research, which is still in its infancy, has found interesting differences between olfactory cognition and basic cognitive processes that have been traditionally investigated with visual or auditory materials.

One of the most active subareas of olfactory cognition concerns research into olfaction and memory, a subfield we term olfactory memory. To this point, there have been four main lines of investigation into olfactory memory. The most basic line concerns memory for odors themselves. Research on memory for odors themselves has shown that odors may be forgotten at a slower rate than other types of stimuli (Engen & Ross, 1973), although this finding has not always been replicated (e.g., Kärnekull, Jönsson, Willander, Sikström, & Larsson, 2015). The second line of research concerns odors as cues in paired-associates memory tests. Again here, the research has shown that memory seems to function differently with odor cues than other cues, whereby odors were shown to be less effective cues of paired associates than visual stimuli (Davis, 1975, 1977). The third line of research concerns odors as contextual mnemonic cues. Odors have been shown to be effective contextual cues of memory (Hackländer & Bermeitinger, 2017), although it is not clear that they are more effective than other types of cues (Herz, 1998).

The fourth line of research in olfactory memory concerns odor-evoked AMs. Research into odor-evoked AM fits into, as well as helps to advance, both the subfield of olfactory memory and the larger area of olfactory cognition research. Specifically, odor-evoked AM research provides information about how memory functions differently when personal memories are associated with odors than when personal memories are associated with other types of cues, such as visual images or sounds. Also, yet more tangentially, this research can provide us with more basic information about how odors are subjectively experienced, and how these experiences differ from experiences of other modalities. Finally, given humans’ reliance on language for reporting AMs, research into odor-evoked AMs provides us with another comparison of the differences in how we use language to describe odor-associated memories and memories associated with other modalities.

Summary

Research into odor-evoked AM has garnered much interest, especially in the past decade. This attention is justified, given the theoretical interest of the topic to a broad range of researchers. To date, research has shown that AMs evoked by odors are different from AMs evoked by other modalities. Specifically, researchers have made the claim, and attempted to find support for the claim, that odor-evoked AMs are older, rarer, more emotionally intense, associated with stronger feelings of mental time travel, and more vivid. Researchers have used a variety of methods to investigate odor-evoked AM, yet there are problems with these various methods that still need to be addressed.

Analysis of methods

Odor-evoked AM research, as we currently understand it, has been around for nearly 35 years, with the first studies being performed by Rubin, Groth, and Goldsmith (1984). In that time, there have been a variety of methods used to study olfactory-evoked AMs across a developmentally diverse set of samples.

In this section, we will review the methods of all the articles examining odor-evoked AM that conformed to the following criteria: (1) AM was investigated with cue modality (odor and at least one other) as an independent variable; (2) the retrieved memory conforms to our above definition of AM, in that the subject recalls a personal event (this is the reason that studies testing olfactory context-dependent memory, such as that of Aggleton & Waskett, 1999, are not included); (3) the study includes dependent variables measuring either (a) the age and rarity of AMs, (b) the experiential aspects of AMs, or (c) both (this excludes, in particular, a study by Karlsson, Sikström, & Willander, 2013, in which the only dependent variable was a measure of semantic representation, and a study by Knez, Ljunglöf, Arshamian, & Willander, 2017, in which the main dependent variables measured the importance of retrieved AMs to the sense of self); and (4) the target sample was not derived from a clinical population (which excludes a study by El Haj, Gandolphe, Gallouj, Kapogiannis, & Antoine, 2017). These criteria led to the inclusion of 13 studies, containing a total of 14 experiments (see Appendix A for a full list of included studies and Appendix B for a discussion of excluded studies and their potential to contribute to the understanding of the topics reviewed in this paper).

Cuing method

All of the studies reviewed here have used one of two methods for cuing AMs, either the single-cue or the double-cue method. In the single-cue method, a subject is first presented with a single cue (e.g., odorant, verbal label) and then asked to provide a description of an AM that is associated with that cue. From our analysis, we feel the prototypical method of studying odor-evoked AM is a single-cue procedure with a comparison between label and odorant cues. In such an experiment, subjects are assigned to one of two groups, either the label or olfactory group (i.e., a between-subjects design). In the label group, subjects are presented with a word (e.g., the word orange) and asked to write down a brief description of a memory that is associated with the word. After providing the description, the subjects are asked to rate their current state and their state when the event first occurred. This procedure is repeated for k trials (i.e., labels). After all trials are completed, the subjects are presented with their descriptions again and asked to date their memories. The procedure is exactly the same for subjects in the olfactory group, with the exception that instead of a word they are presented with an odorant (e.g., orange extract) and asked to retrieve an AM associated with the odor.
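The structure of this prototypical single-cue procedure can be sketched in a few lines of Python. This is only an illustrative outline under our own assumptions, not the protocol of any particular study: the function names, the step labels, and the equal split into groups are hypothetical. The point it shows is that description and rating happen cue by cue, whereas dating happens only after all trials are complete.

```python
import random

def assign_groups(subject_ids, seed=0):
    """Randomly split subjects into a label group and an odor group
    (between-subjects design). The equal split is an assumption."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"label": sorted(shuffled[:half]), "odor": sorted(shuffled[half:])}

def build_session(cues):
    """Return the ordered steps for one subject.

    For each cue: describe a memory, then rate it. Only after all
    trials are finished are the descriptions shown again for dating.
    """
    steps = []
    for cue in cues:
        steps.append(("describe_memory", cue))
        steps.append(("rate_experience", cue))
    for cue in cues:
        steps.append(("date_memory", cue))
    return steps

groups = assign_groups(range(1, 9))
session = build_session(["orange", "coffee"])
```

The same `build_session` outline applies to both groups; only the physical form of the cue (word or odorant) differs between them.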

The single-cue method corresponds most closely to how much past AM research has taken place (Crovitz & Schiffman, 1974; Galton, 1879; Robinson, 1976). The main benefit of the single-cue method is that it allows for the investigation of selection differences (i.e., which memories are selected) between modalities, such as the age of the memories and cue efficiency. This benefit is particularly important to odor-evoked AM research.

An alternative to the single-cue method is the double-cue method, wherein each subject is first presented with a verbal label (e.g., the word orange) and asked to provide a description of a memory that is associated with that label. After providing the memory description and experiential ratings, the subjects are then presented with either the same label (the word orange), a congruent odor (orange extract as an odorant), or a congruent picture (a picture of an orange). Following the second cue, subjects are again asked to describe and rate the AM.

Whereas the single-cue method allows for investigations of selection differences, the benefit of the double-cue method is that it actually controls for selection differences (i.e., the same memory is being retrieved for the different types of cues). By controlling for selection differences, it is assumed that any differences between modalities are due to the ways that the different cues influence the recollective experience of the memory rather than due to different cues leading to the recollection of qualitatively different memories (de Bruijn & Bender, 2017).

The use of both cuing methods is important for determining what makes AM evoked by odor different from AM evoked by other modalities. The single-cue method allows for investigation of selection differences, whereas the double-cue method allows for a more precise investigation of how the cues influence the recollective experience of the memory. However, both methods suffer from the potential confounds of experiential differences between the cues rather than between the cued memories. Although previous research has attempted to control for this issue (Chu & Downes, 2002; de Bruijn & Bender, 2017; Willander & Larsson, 2007), we feel that it has yet to be resolved.

Issues with double-cue method

Chu and Downes (2002) used the double-cue methodology when investigating odor-evoked AM. In an attempt to control for experiential differences being caused by perception of the cue rather than how the cue influences the recollective experience, they added an incongruent odor condition. They found that memories in the congruent odor condition (e.g., where the second cue was orange odor, after the first cue had been the word orange) were rated as more emotional than memories in the incongruent odor condition (e.g., where the second cue was coffee odor, after the first cue had been the word orange) and the congruent label condition (e.g., where the second cue was the word orange, after the first cue had been the word orange). Furthermore, they found no differences between the incongruent odor condition and the congruent label condition. Based on these findings, they argued that the differences were due to the way that the congruent odors influenced the memory rather than subjects mistakenly conflating their ratings of the memory with their experience of the odor.

Although the incongruent odor condition does seem to be an appropriate control condition, it does not allow one to rule out the conflation possibility. When one smells the incongruent odor as the second cue, it is easy to separate any feelings about the perception of the odor from the perception of the memory, as the incongruent odor should not act as a further cue of the memory. However, the congruent odor may act as a further cue, which would make it easier for a subject to conflate their experience of the odor with their experience of the memory. Another point that should be taken into consideration is that congruency itself may influence the memory or the recollective experience of the memory. For example, affective congruency between a contextual odor cue and a target item leads to better memory specifically for emotional details of the original event (Herz & Cupchik, 1995).

One final point that should be considered here relates to the likelihood of a congruent odor actually acting as a further mnemonic cue. Given the finding that odors are less likely to trigger memories than other types of cues (discussed below), it may be fair to assume that odors are actually associated with fewer AMs. If they are indeed associated with fewer AMs, it is possible that congruent odors are not actually associated with the memories that were originally cued by the verbal label in the double-cue procedure. If an odor was not encoded with the memory originally, the encoding specificity principle (Tulving & Thomson, 1973) would hold that the odor cannot act as a cue to retrieve further details of the active memory. Therefore, any differences found between the congruent odor condition and other conditions would be, in fact, due to conflation between the experience of the odor and the experience of the memory. It is important to note that this final point is, as of now, only speculative, so it should be taken with caution. Future tests, of the sort proposed by Chu and Downes (2000b) that combine laboratory setups with personal memories, could test whether there is any merit to this supposition.

Issues with single-cue method

Willander and Larsson (2007) were likewise aware of the potential problem of conflation between the experience of the odor and the recollective experience of the memory. In their study, which used a single-cue procedure, they manipulated whether the ratings of the retrieved memory occurred in the presence or absence of the cue. Across a number of dependent variables, they found that rating the memory in the presence of a cue (including odors) did not increase the ratings of the recollective experience of the memory.

Again, although this control condition seems appropriate, we feel it does not necessarily rule out the possibility of conflation. For one, the presence of the cue during rating was manipulated within subjects. Such an experimental design leaves open two possibilities: (1) subjects were able to recognize the manipulation and this awareness influenced their ratings, and (2) by the time subjects switched conditions (presence or absence), they had a personal mean rating, to which they stuck after the manipulation. More fundamental is a problem stemming from the fact that all of the memories were, by necessity, initially retrieved in the presence of the cue. The presence of the cue during retrieval allows for the possibility that any conflation of experience that occurred during the initial retrieval remained. Simply delaying the rating procedure may not have alerted subjects to the potential conflation of experience between the odor and the memory, which thus persisted until the memories were finally rated.

Potential solutions to conflation problem

Future research is required to control for the conflation problem, and we would like to propose two potential methods here. The first method we propose is that cues of the different modalities are chosen to be equally positive/negative and arousing. Using such an experimental design probably means that the cues will lose conceptual similarity (e.g., picture of an orange and orange odor would not be considered equally positive or arousing), but we argue the trade-off would be warranted for a limited number of studies. This type of control could be used in either the single-cue or double-cue method of AM investigation.

The second control we propose is based on the method used by Willander and Larsson (2007). In a single-cue method, we suggest a 2 × 2 design, where the first factor is the type of cue present during memory retrieval (label vs. odor) and the second factor is the type of cue present during rating (label vs. odor). If odor affects only the experience of the memory (i.e., no conflation), the odor-odor and the odor-label groups should report higher experiential ratings than the label-odor and the label-label groups. However, if the label-odor group is also found to report high experiential ratings, this finding would support the conflation hypothesis.
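The predictions of this 2 × 2 design can be made concrete with a short Python sketch. The cell means below are invented purely for illustration (they are not data from any study), and the function name and rating scale are our own assumptions. The sketch computes the main effect of the cue present at retrieval and the main effect of the cue present at rating from the four cell means. Under the no-conflation account, only the retrieval-cue effect should differ from zero; a nonzero rating-cue effect would point toward conflation.

```python
from statistics import mean

def main_effects(cells):
    """cells maps (retrieval_cue, rating_cue) -> mean experiential rating.

    Returns (retrieval_effect, rating_effect), each computed as the
    odor marginal mean minus the label marginal mean.
    """
    retrieval_effect = (
        mean([cells[("odor", "odor")], cells[("odor", "label")]])
        - mean([cells[("label", "odor")], cells[("label", "label")]])
    )
    rating_effect = (
        mean([cells[("odor", "odor")], cells[("label", "odor")]])
        - mean([cells[("odor", "label")], cells[("label", "label")]])
    )
    return retrieval_effect, rating_effect

# Hypothetical cell means on an assumed 1-7 rating scale.
no_conflation = {
    ("odor", "odor"): 5.0, ("odor", "label"): 5.0,
    ("label", "odor"): 3.0, ("label", "label"): 3.0,
}
with_conflation = {
    ("odor", "odor"): 5.5, ("odor", "label"): 5.0,
    ("label", "odor"): 4.0, ("label", "label"): 3.0,
}

print(main_effects(no_conflation))    # (2.0, 0.0)
print(main_effects(with_conflation))  # (1.75, 0.75)
```

In an actual study these contrasts would of course be tested inferentially (e.g., with a two-way analysis of variance) rather than read off the cell means.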

Further issues of cuing

As already described, all of the studies included in this review used either a single-cue or a double-cue method. However, this style of cuing only partially reflects the broader literature on AM. In addition to asking subjects to provide an AM in response to a cue, some AM research has simply asked subjects to think back and retrieve a personally important memory, with no specific cues being presented (e.g., Demiray & Janssen, 2015; Rubin & Schulkind, 1997). This type of investigation, here referred to as the important memories method, typically leads to a shift in the location of the reminiscence bump, whereby the peak occurs later than with the single-cue method, using verbal labels (Koppel & Berntsen, 2015).

Given the differences in findings with different cuing methods, we feel odor-evoked AM research would benefit from a study implementing a modified version of the important memories method, whereby subjects are asked to retrieve important memories and provide information about the various aspects of the memory (e.g., age of memory, experiential aspects) and how the memory is associated with the different modalities (e.g., Rubin, Schrauf, & Greenberg, 2003). An alternative method would be a variation of a diary study. Subjects would, over a period of time (e.g., 1 week), be asked to record any instance of an involuntary memory in a notebook or with an application on their mobile telephone. In addition to being asked to record the details of both the memory and the original event, they would also provide information about what modality cued the memory (e.g., Berntsen, 1996, 1998).

Such methodological adaptations would help in answering several questions, including the following: (1) Do Proust effects occur when a different style of assessing AM is used? (2) Is the presence of a physical odorant during retrieval necessary for Proust effects to occur? (3) Is olfactory information typically encoded in personally relevant memories? (4) Are important memories as likely to be triggered by olfactory stimuli as by cues from other modalities? (5) Are there differences in the typical contents of AMs triggered by different cues?

Storage and presentation of olfactory cues and related issues

Unlike verbal, visual, and auditory stimuli that can be saved digitally, odorants are physical objects whose attributes are, at least partially, determined by environmental factors. The influence of environmental factors on the qualities of the stimuli means that extra care must be taken with odorants to ensure that their attributes are constant for all subjects across an experiment. Furthermore, the way in which subjects are presented the odorants can also have an effect on how they perceive the odor.

Storage

In all studies, where information on storage is reported, the experimenters have stored the olfactory material in plastic, metal, or glass jars. Only two studies mentioned the temperature at which the odorants were stored: Rubin et al. (1984) stored odorants at room temperature, whereas Goddard, Pring, and Felmingham (2005) refrigerated the odorants. Furthermore, most studies reported replacing the olfactory material either periodically or regularly to maintain freshness for all subjects across the experiment. In only one case (Herz, Eliassen, Beland, & Souza, 2004) did the experimenters report using new odorants for each subject. Although we have no reason to believe that subjects were ever presented with “stale” smells that were diminished or otherwise altered in any of the experiments, we do suggest that future researchers report exactly how often the olfactory materials were replaced and how they were stored to ensure standardization.

Presentation

The method of releasing the olfactory materials to the subjects did vary between experiments, with methods ranging from unscrewing a lid, having puncture holes in a lid, and placing an odor pellet in a small pipe, to squeezing a plastic bottle to release odorants through a nozzle. Once the odorant was presented to the subject, the instructions seemed to vary from study to study (e.g., gently sniff, breathe normally, no instructions reported). Furthermore, the amount of time that a subject was exposed to an odorant also varied, with examples including 30 seconds, 3 minutes, until a memory was retrieved or they were sure no memory was forthcoming, or no information was provided. Finally, the length of the intertrial interval was mentioned in only a few studies.

Issues related to sniffing and intertrial intervals are important in olfaction research. Olfaction is an active sense, and sniffing behavior has been shown to have an influence on perception (Mainland & Sobel, 2006). Also, olfaction is different from other senses, in that the offset of presentation of a stimulus does not necessarily cause the end of perception as odorants remain in the environment. Given this lingering of odors, we implore future researchers to report at least a minimum intertrial interval.

As a further suggestion, we propose that future research, if possible, control for other extraneous environmental odorants. For example, one study asked subjects not to smoke for at least 1 hour before the experiment (Rubin et al., 1984). We feel this requirement is a sensible measure to eliminate any possible, long-lasting environmental odorants. Additionally, one could ask subjects not to wear any (new) perfumes or colognes on the day of testing and not to eat any strong-smelling foods for a certain period of time before the experiment, ideally at least 24 hours before the experiment is scheduled.

Cues

A variety of cues have been used in previous investigations of the Proust phenomenon. All experiments included in the analyses in this review considered odors as one of the levels of the independent variable (see Appendix C for a list of all odorants). Other levels of the independent variable included: verbal labels (n = 9), pictures (n = 9), sounds (n = 2), verbal labels plus odorants (n = 1), and multimodal cues (n = 1).

The variety of types of cues compared with olfactory stimuli has been a true strength of the odor-evoked AM research. This broad comparison group allows for hypotheses about sensory and semantic memory search to be tested, as well as allowing for investigations into the differences between different sensory modalities (e.g., olfactory vs. auditory). In fact, only two of the traditional senses (i.e., haptic and gustation) have not yet been used as comparisons for cuing AM. Future studies may consider using these comparison methods for their theoretical interest as well. For instance, tactile information is similar to olfactory information, in that people typically do not attend to it, although it is constantly present. As for gustatory information, taste has the obvious property of being the only chemosense aside from olfaction. Given these similarities to olfaction, one perceptual and one sensory, any findings that are similar to or different from odor-evoked AM would provide us with greater understanding of exactly what it is that accounts for the Proust effects.

In addition to the importance of the variety of comparison cues, it is also important to discuss how researchers select the particular cues they use in an experiment. For many studies (e.g., Willander & Larsson, 2007), the researchers attempt to find a broad sample of cues that are likely to be known to the population being sampled. However, at least one study (Miles & Berntsen, 2011) specifically selected cues that would lead to a high proportion of AMs. The results of that study do not conform well (as will be discussed in various sections below) to the broader literature, showing the importance of cue selection. Except in cases where cue selection is specifically manipulated to test various hypotheses (e.g., youth related or not youth related; de Bruijn & Bender, 2017), we argue that cue selection should generally attempt to provide a broad range of moderately familiar cues.

Subjects

It is also important to consider the properties of the samples tested. Whereas much of psychological research involves using undergraduates as subjects, there is an obvious problem with such sampling in AM research, aside from issues of external validity: determining differences in the age of the memories requires individuals who have lived for many years and can have very old memories. This issue may, in fact, explain the one study that did not find that AMs cued by odors were older than AMs cued by other modalities (Rubin et al., 1984), as that study used a young sample (of undergraduates). Using a young sample means there is little room for differences in the ages of AMs to show themselves. For this reason, it is important to use older samples in odor-evoked AM research (see Appendix D for information on samples from each study).

In general, the sum of odor-evoked AM research has sampled a developmentally diverse group of subjects. However, a study that specifically compares different cohorts would be particularly useful for two reasons: (1) This study would allow us to get a more exact picture of how the age of memories differs when cued by different modalities, and (2) this study would allow us to investigate cohort effects.

It is also reasonable to think that the importance of odors in daily life has changed as societies have changed over the past century. Given that modern environments are more and more sterilized and removed from odors, it could be that Proust effects become increasingly less salient with later cohorts. Related to the impact of societal changes on odor perception, testing subjects from a wide variety of cultures may also prove beneficial. If attention to, or importance of, odors is critical for associations to be formed and for the odors to later cue AM, it is reasonable to assume that Proust effects would be larger in societies where odor plays more of a role in daily life. Furthermore, testing for Proust effects in other cultures may also allow researchers to further investigate the role of semantic processing of odors on odor-evoked memories, as some cultures seem to be better able to express olfactory perception linguistically than Western cultures are (Majid & Burenhult, 2014), from which all samples included in this review have come (specifically, samples have come from Germany, the Netherlands, Sweden, the UK, and the USA; also note that one excluded study included a sample from Japan; Yamamoto, 2008). It may be that some societies have stronger preexisting notions of the association between odor and memory, which may lead subjects to respond in line with these notions. If cultures can be found in which these associations are weaker or absent but Proust effects are still found, this finding would not only secure knowledge in the universality of such effects but also inform us about their underlying mechanisms.

Dependent variables

For the most part, previous studies have agreed on what the important dependent variables should be. For example, all single-cue behavioral studies have measured the age of the memories (or the age of the subject at the occurrence of the personal event), several studies have measured rehearsal frequency (i.e., how often the memory had been thought of since the event occurred), and most studies have measured whether subjects could successfully retrieve a personal event related to each cue (i.e., cue efficiency). Furthermore, although research has shown general agreement that experiential differences are important, there has been a great deal of variability in which experiential measurements have been used (see Appendix E for a list of dependent variables included in the current review).

Two of the most common experiential ratings included are how vivid the memory was and how strong the feeling of being brought back (i.e., recollective experience) was. Other common measures concern how emotionally intense and how positive or negative the memory was. However, there is a great deal of variability associated with the latter two questions, with some studies interested in the emotions at encoding, other studies interested in the emotions at retrieval, and yet other studies interested in both aspects. Future research should specifically set out to clear up whether emotionality differences are related more to encoding or retrieval processes.

One further dependent variable that is worthy of consideration is mnemonic accuracy. Chu and Downes (2002, Exp. 2) made an indirect attempt at measuring accuracy by counting the number of details reported in memories. As mentioned above, they used a double-cue method, whereby subjects first provided a memory in response to a verbal label. Afterwards, subjects were presented with a second cue (e.g., the same label, a congruent visual image, a congruent odor, or an incongruent odor) and were asked to provide as many additional details as possible. The researchers found that subjects were able to provide more details, measured as the number of sentences spoken, following a congruent odor cue than for any other cue.

We argue that the data collected by Chu and Downes (2002, Exp. 2) actually test memory specificity rather than mnemonic accuracy. It is difficult to determine whether details from an AM are accurate, and Chu and Downes did not attempt to verify the veracity of the memories or find corroborative evidence (e.g., from parents or siblings). Aside from the study by Chu and Downes, no experiment has truly made an attempt at determining whether olfactory cues lead to more accurate AMs than other cues. The fact that the issue of accuracy has received so little experimental attention is, on the one hand, surprising, given that Chu and Downes (2000b) included the notion that odors are better cues of AM as part of their definition of the Proust phenomenon. On the other hand, it is understandable why this dependent variable has been omitted, as it is notoriously difficult to determine whether an AM is accurate: There are minor and major inaccuracies and also errors of omission (whereby details are left out) and of commission (whereby new details are added that never actually occurred). However, difficult does not mean impossible. We suggest that future research should use a variety of methods to measure or approximate accuracy, including corroboration of accounts by relatives of the subject, memory for public events (which can be attested to by some sort of record) associated with cues being tested, memories for experiences in a controlled experimental setting being tested, and finally by using a false memory paradigm. This line of research would allow us to investigate the full claims of the Proust phenomenon and would complement olfactory context-dependent memory research (e.g., Hackländer & Bermeitinger, 2017; Isarida et al., 2014).

Summary

Researchers have used a variety of methods to measure AMs cued by odors and how they differ from AMs cued by other stimuli. Although the variety of methods does allow for more generalizability, it also leads to potential problems of standardization, especially as they relate to the dependent variables assessed, which future research should aim to correct. Despite the variety of methods used, there is still need for future research to implement certain controls that would allow us to be more confident in the findings, especially as they relate to cuing methods. Furthermore, we see room for future research to expand the samples studied, to manipulate the properties of cues to test further hypotheses, and to set up more controlled studies to allow for tests of mnemonic accuracy.

One more area in which we see room for improvement is the level of reporting. We have created a simple list that includes the most relevant information for potential future reviews of the literature (see Appendix F). We advise researchers to include this list, answered with the appropriate information, as an appendix, or at a minimum ensure that all the relevant material is covered in the text, in all future articles related to odor-evoked AM. This checklist will ensure a precise level of reporting of important information and allow readers quick access to key methodological details.

Despite seeing room for improvement, we would like to clarify that we think the methods used to this point have been wholly appropriate for establishing the Proust phenomenon. Although the amount of research is limited, there is enough to allow for some conclusions to be made. Furthermore, there has been enough similarity in past research for the data to be combined, allowing for more powerful tests of the various effects associated with the Proust phenomenon that will aid us in drawing conclusions about the phenomenon.

Important findings

Research on odor-evoked AM has revealed several important findings. Most of these findings are related to the Proust effects described by Chu and Downes (2000b) and fit into the LOVER acronym (Larsson et al., 2014). However, one other finding does not fit so well—namely, that olfactory cues are generally less likely to evoke AMs than other cues are. The various findings, and the strength of their evidence, are considered in detail below (see Appendix D for a list of experiments included in the analyses). One set of findings that will not be discussed here, as they have already been discussed in detail elsewhere (Saive et al., 2014), are those related to neuropsychological processes.

When appropriate (i.e., when the reported data allow for it), we will provide mini-meta-analyses of the dependent variables considered here. We use the term mini-meta-analyses, as we are analyzing the individual dependent variables rather than the larger Proust phenomenon as a whole. We do this because of the limited amount of research up to this point and the fact that dependent variables have been treated as independent of one another in the literature.

Unless otherwise noted, the analyses will follow the same general format: The data reported about a certain dependent variable will be converted to provide an estimation of the effect size (i.e., Cohen’s d). This estimation will be done by comparing the means from odor-cued memories to the means from all other-cued memories. For example, if a study provided means for emotional intensity for odor-evoked, label-evoked, and visual-image-evoked memories, the ratings from the odor-evoked memories would be compared with the mean of the ratings from the label-evoked and visual-image-evoked memories. Positive effect sizes show (unless otherwise noted) higher ratings for the odor group, negative effect sizes show higher ratings for other groups, and effect sizes of zero indicate no differences based on cue type.
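The comparison described above can be sketched in Python. This is only an illustration under stated assumptions: the review does not say which variant of Cohen's d was computed, so the pooled-standard-deviation formula, the function name `cohens_d_odor_vs_other`, and all numbers below are hypothetical.

```python
from math import sqrt
from statistics import mean


def cohens_d_odor_vs_other(odor_mean, odor_sd, other_means, other_sds):
    """Cohen's d for odor-cued ratings versus the average of all other cue types.

    Assumption (not stated in the review): the non-odor conditions are
    averaged with equal weight, and d uses a pooled standard deviation.
    """
    other_mean = mean(other_means)                   # mean of label, image, ... means
    other_var = mean(sd ** 2 for sd in other_sds)    # average variance of non-odor cues
    pooled_sd = sqrt((odor_sd ** 2 + other_var) / 2)
    return (odor_mean - other_mean) / pooled_sd


# Hypothetical emotional-intensity ratings: odor vs. label and visual-image cues.
d = cohens_d_odor_vs_other(5.0, 1.0, other_means=[4.2, 4.4], other_sds=[1.0, 1.0])
print(round(d, 2))  # positive value: higher ratings for odor-cued memories
```

A positive result corresponds to higher ratings for the odor condition, and a negative result to higher ratings for the other cues, matching the sign convention described above.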

One potential issue in the analyses is that we were unable to consider cuing method (i.e., single cue or double cue) as an independent variable, due to the limited number of relevant data points. Combining these two methods potentially obfuscates any differences that may arise with their implementation. As discussed above, these differences could be due to differences in memory selection versus the way that the same memories are experienced based on how they are cued. Despite this potential issue, we feel it is warranted to include both types of cuing methods in our analyses, because, as was mentioned above, both methods are needed to fully investigate the Proust phenomenon, and each method is plagued by its own drawbacks, making the actual differences between the two somewhat unclear.

When possible, we provide analyses of data across experiments. The experiential data discussed below were assessed with Likert scales. The range of these scales, however, differed depending on the study. Therefore, when we report analyses and figures, we are reporting data with a transformed Likert score using the following formula, where x refers to the mean for a certain cue type and x′ refers to the transformed mean for that cue type: x′ = (x − minimum of scale) / (range of scale).
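As a small worked example of the transformation x′ = (x − minimum of scale) / (range of scale), the following Python sketch rescales means from scales of different lengths onto a common 0 to 1 range. The function name `transform_likert` and the example scale endpoints are assumptions chosen for illustration.

```python
def transform_likert(x, scale_min, scale_max):
    """Rescale a mean rating x to the 0-1 range: (x - minimum) / range."""
    scale_range = scale_max - scale_min
    if scale_range <= 0:
        raise ValueError("scale_max must be greater than scale_min")
    return (x - scale_min) / scale_range


# Hypothetical means from two studies that used different scales.
print(transform_likert(5.0, 1, 7))   # mean of 5 on a 1-7 scale
print(transform_likert(3.0, 1, 5))   # mean of 3 on a 1-5 scale (the scale midpoint)
```

After the transformation, a value of 0 corresponds to the lowest point of the original scale and 1 to the highest, so means from studies with different scale ranges can be placed on the same axis.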

When conducting analyses such as those reported here, it is important that researchers take into account unreported data as well. For example, in many of the papers that we review, the authors only reported the data for significant results, even though they may have collected data on other dependent variables. To reduce bias toward significance, we code unreported data in our analyses as having an effect size of d = 0 (see Footnote 3). When unreported data are included in the analysis, it will be noted in the sections referring to the particular dependent variable.

After calculating effect sizes for each individual experiment, the combined effect sizes (see Fig. 1 for an overview) will be compared with zero by means of a one-tailed one-sample t test. Significant results indicate that odor-evoked memories lead to differences from other-evoked memories, in support of the Proust phenomenon. Given that this study is a review of a range of various studies, null effects will be viewed as strong support for the null hypothesis (Baumeister & Leary, 1997).
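The general procedure (per-experiment effect sizes, unreported results coded as d = 0, then a one-tailed one-sample t test against zero) can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the authors' analysis script: the function names are hypothetical, the effect sizes are made-up numbers, and the p value is obtained by numerically integrating the t density rather than from a statistics package.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t, df, steps=2000):
    """Upper-tail probability P(T >= t), using Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3
    return upper if t >= 0 else 1 - upper


def one_sample_t(effect_sizes, n_unreported=0):
    """t test of mean effect size against 0; unreported results enter as d = 0."""
    ds = list(effect_sizes) + [0.0] * n_unreported
    n = len(ds)
    t = mean(ds) / (stdev(ds) / sqrt(n))
    return t, n - 1, one_tailed_p(t, n - 1)


# Hypothetical effect sizes from three experiments, plus one unreported result.
t, df, p = one_sample_t([0.2, 0.4, 0.6], n_unreported=1)
print(f"t({df}) = {t:.2f}, one-tailed p = {p:.3f}")
```

Coding unreported results as zero pulls the mean effect size toward zero and so makes the test more conservative, which is the stated purpose of that coding rule.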

Fig. 1. Effect sizes calculated from relevant experiments for a range of dependent measures. Error bars refer to standard errors of the mean

Age (O of the LOVER acronym)

The most well-known of the Proust effects is that AMs evoked by odors are significantly older (or have occurred at an earlier age) than those evoked by cues of other modalities. The evidence for some effect of age seems to be strong, with odor-cued AMs being older (or more likely to come from childhood) than other cued AMs in 5 out of 7 experiments that included information on this variable. Indeed, the two experiments that failed to find differences in age (Rubin et al., 1984, Exps. 1 & 2) used young adults, which may have reduced the ability for differences to emerge (although differences were found in other studies using young adults; Miles & Berntsen, 2011; Willander, Sikström, & Karlsson, 2015). To provide a more detailed assessment of the differences in the age of the memories, we combined the data from several different experiments in a mini-meta-analysis (see Fig. 2), which differs from the general method, described above, used for analyzing the other dependent variables.

Fig. 2. Transformed proportion of memories across first 3 decades as a function of cue modality across experiments. Data are estimated from visual analysis of published figures

Given the wide variation in the range of the age of the subjects, leading to differences in which time-periods recency effects occur, we decided to focus our analysis on the first 3 decades of life. We first transformed all data, estimated from visual inspection of published figures, to fit a model wherein all responses from a single group for each experiment came from the first 3 decades of life (see Footnote 4). Transforming the data to focus on the first 3 decades of life allows for a more precise comparison between studies that included older adults and studies that only included young adults. We then submitted these data, from the five studies in which relevant data were reported (i.e., all relevant studies except for the two experiments from Rubin et al., 1984), to a 3 (decade: 0–10, 11–20, 21–30) × 2 (cue modality: odor, other) repeated-measures ANOVA. This analysis, as predicted, revealed a significant interaction between decade and cue modality, F(2, 8) = 16.20, p = .002, ηp² = .80. To investigate this interaction, the differences between cue modality were submitted to three separate paired-samples t tests, one for each of the first 3 decades of life. These tests showed that, for AMs from the first decade of life, a greater proportion was elicited by odor (M = .51) than by other (M = .22) cues, t(4) = 5.63, p = .005; the proportion of AMs from the second decade of life did not differ between odor (M = .29) and other (M = .44) cues, t(4) = 2.32, p = .081; and, for AMs from the third decade of life, a lower proportion was elicited by odor (M = .20) than by other (M = .34) cues, t(4) = 4.54, p = .010.
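One plausible reading of the transformation described above is a simple renormalization, so that the proportions for the first 3 decades sum to 1 within each cue type. The sketch below assumes that reading; the exact procedure is given in the article's footnote and may differ. The function name and the decade proportions are hypothetical.

```python
def first_three_decades(proportions_by_decade):
    """Renormalize so that the first three decades sum to 1.

    Assumption: proportions_by_decade lists the proportion of memories per
    decade of life (0-10, 11-20, 21-30, ...) for one cue type in one study.
    """
    first3 = proportions_by_decade[:3]
    total = sum(first3)
    if total <= 0:
        raise ValueError("no memories reported in the first three decades")
    return [p / total for p in first3]


# Hypothetical odor-cued distribution from a sample that included older adults.
odor = [0.30, 0.20, 0.10, 0.15, 0.25]
print([round(p, 2) for p in first_three_decades(odor)])
```

The reported decade means for each cue type sum to 1.00 (odor: .51, .29, .20; other: .22, .44, .34), which is consistent with a renormalization of this kind.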

Cue efficiency

One finding that has received less discussion than the age differences is that odors are less likely to evoke AMs than are other cues. The limited discussion about this finding is somewhat odd, given its consistency dating back to Rubin et al. (1984). However, although the pattern has been consistent, with all studies that reported data relevant to the question showing lower numbers of AMs evoked by odorants than other cues, the effect has not always been significant.

To obtain a better understanding of this effect, we analyzed the data from all eight experiments that reported data relevant to this question (see Appendix E for a list of experiments included; see Fig. 3 for results). We calculated the proportion of trials on which an AM was reported in response to each type of cue. These calculations revealed that there was a great deal of variance across experiments, even when using the same cue type (e.g., Miles & Berntsen’s, 2011, subjects produced AMs in response to odor cues on 95% of the trials, whereas Chu & Downes’s, 2000b, subjects produced AMs on only 41% of the trials). To address this variance, we decided to compare odor to other cues using a nonparametric test (Wilcoxon signed-ranks test). This analysis revealed a significant difference, Z = 2.52, p = .012, whereby odors (M = .63) elicited AMs on a lower proportion of trials than did other cues (M = .76).
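The Wilcoxon signed-ranks comparison can be illustrated with a short Python sketch that uses the normal approximation. The per-experiment proportions below are hypothetical placeholders, not the values from the reviewed studies, and the function name `wilcoxon_z` is an assumption. The sketch omits the tie correction to the variance, so it is only a rough guide when many differences are tied.

```python
from math import erfc, sqrt


def wilcoxon_z(x, y):
    """Wilcoxon signed-ranks test (normal approximation, two-tailed p)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # drop zero differences
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                                       # average ranks for ties
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    return z, erfc(abs(z) / sqrt(2))


# Hypothetical proportions of trials with an AM, eight experiments,
# odor lower than other cues in every experiment.
odor = [0.41, 0.55, 0.60, 0.62, 0.66, 0.70, 0.75, 0.95]
other = [0.61, 0.71, 0.73, 0.80, 0.74, 0.87, 0.85, 0.97]
z, p = wilcoxon_z(odor, other)
print(f"Z = {abs(z):.2f}, p = {p:.3f}")
```

Because the test uses only the signs and ranks of the differences, any eight pairs in which odor is lower every time give |Z| = 2.52 and p = .012 under this approximation, which matches the reported statistic and the observation that every experiment showed fewer AMs for odor cues.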

Fig. 3. Proportion of trials on which an autobiographical memory (AM) was evoked as a function of cue modality for each of the experiments reporting relevant data. CD = Chu & Downes; GPF = Goddard, Pring, & Felmingham; MB = Miles & Berntsen; RGG = Rubin, Groth, & Goldsmith; WL = Willander & Larsson; WSK = Willander, Sikström, & Karlsson

We would like to note that we believe this difference is actually larger than the data indicate. The difference is likely underestimated because of the Miles and Berntsen (2011) study, which reported the smallest difference between odors and other cues in the proportion of trials on which an AM was retrieved. In their experiment, the researchers specifically used cues that had a high chance of leading to an AM being retrieved (based on pilot studies they had performed). The use of cues specifically chosen for their ability to elicit AMs thus seems to have led to an overestimation of the efficiency of odor cues in triggering AMs. Based on our analysis, and the fact that this difference may be even larger than indicated here, we feel that this difference should be considered to be a medium-sized effect (d = .50) and one that must be considered when theorizing about what accounts for the Proust phenomenon.

Rehearsal (related to Rare of the LOVER acronym)

The frequency with which evoked memories have been thought of since the event is an interesting dependent variable, as it possibly gets at the heart of the literary version of the Proust effect (Jellinek, 2004). Namely, it is not only that odors have the potential to evoke particularly old memories but that they have the potential to evoke memories that have been long forgotten, or, in other words, memories that have not been thought of frequently. Although this variable was explored in the original research on odor-evoked AM (Rubin et al., 1984), it seems to have been relegated to a less important finding in more recent work, as we could find only two other studies in which the frequency of thinking about a memory was considered as a dependent variable (Miles & Berntsen, 2011; Willander & Larsson, 2006).

To assess the findings across the four experiments in which data were reported, we followed the method outlined above and calculated effect sizes for each of the experiments (see Table 1), with positive effect sizes in this case representing odor-cued memories having been relived less frequently than other-cued memories. These effect sizes (M = .52) were then submitted to a one-sample t test against a value representing no effect (d = 0). This analysis revealed clearly that odor-evoked AMs were thought of less frequently since the time of the event than were other-evoked AMs, t(3) = 3.54, p = .019.

Table 1 Means, standard deviations (in parentheses), and effect sizes related to frequency of rehearsal from individual experiments

In addition to the data reported above, Herz and Cupchik (1992) asked how often a memory had been thought about in a descriptive study, where modality was not manipulated as an independent variable, and all subjects reported information about their AMs in response to odor cues. They found that, when an AM was evoked in response to the odor cue, subjects reported having “hardly ever thought of” the memory in a majority of cases (55.1%). By comparison, the subjects reported having “frequently thought of” the memory in a small minority of cases (12.6%).

Although there is only a limited amount of research into this finding, the current evidence clearly supports the notion that AMs evoked by odors are thought of less frequently than are AMs evoked by other cues, and low rehearsal frequency must be taken into account when considering the causes of the Proust phenomenon. It should be noted that the findings reported here could be interpreted in two different ways. We have framed the discussion as a difference between AMs evoked by odors and AMs evoked by other cues in frequency of rehearsal, as we believe this explanation is more likely (see Hypotheses section). However, it could also be that there is differential memory for instances of remembering odor-associated and other-associated events. Humans are generally poor at remembering their instances of remembering (Parks, 1999), especially when the memory for the original event takes place in a different context than that in which memory for remembering is happening (Arnold & Lindsay, 2002, 2005). As it relates to research on the Proust effect, it could be that AMs evoked by odors are not actually rehearsed less frequently, but that they are not typically cued by odors (but rather by other cues). This difference in cuing could lead to an underestimation of the frequency of rehearsal.

Emotional intensity (E of the LOVER acronym)

The dependent variable we here call emotional intensity refers to the overall strength of emotions that a subject experiences in association with the retrieved AM. Emotional intensity is considered to be integral to the Proust phenomenon, with odor-evoked AMs typically being considered to be especially emotionally intense.

In total, nine experiments provided 10 data points (see Footnote 5) relevant to the question of whether AMs evoked by odors are more emotionally intense than AMs evoked by other cues (see Fig. 4). Effect sizes were calculated for each of these experiments (M = .47) and compared in a one-tailed one-sample t test to a value representing no effect (d = 0). This analysis revealed that odor-evoked AMs were, across the studies, more emotionally intense than other-evoked AMs, t(9) = 2.33, p = .022. This analysis, however, considers emotional intensity as if the formulation of the question is irrelevant (see also Analysis of Methods section).

Fig. 4. Transformed mean scores (x′) of emotionality as a function of cue modality for each of the experiments reporting relevant data. AIG = Arshamian et al.; deB&B = de Bruijn & Bender; H = Herz; HS = Herz & Schooler; HEBS = Herz, Eliassen, Beland, & Souza; MB = Miles & Berntsen; WL = Willander & Larsson; WSK = Willander, Sikström, & Karlsson

Broadly classified, there are two questions used: one asks for the emotional intensity at retrieval, the other asks for emotional intensity at encoding. To get a picture of how the effect has been found across studies, using different querying methods, we submitted the transformed means to a 2 × 2 mixed-design ANOVA, with cue modality (odor vs. other) as the within-subjects variable and time of emotion (retrieval vs. encoding) as the between-subjects variable (see Fig. 5). This analysis revealed only a significant main effect of cue modality, F(1, 7) = 11.05, p = .013, ηp² = .61. The main effect of time of emotion and the interaction between cue modality and time of emotion were not significant, Fs < 3.0, ps > .10.

Fig. 5. Transformed mean scores (x′) of emotional intensity as a function of cue modality and time when emotion was experienced for each of the experiments reporting relevant data. Error bars refer to standard errors of the mean

Although the results of this analysis would seem to indicate that odor-evoked AMs are more emotionally intense than other-evoked AMs, regardless of the framing of the question, we urge caution here, as the interaction analysis was based on a limited number of data points, and one of the data points included in the effect size analysis had to be left out of the interaction analysis due to missing data. Future researchers should include separate questions about emotion experienced during the encoding of the event as well as during the retrieval of the memory. This information could help determine whether experiential differences are due to odors influencing the recollection of the emotion or whether odors are more likely to be associated with emotional memories.

It should be noted here that there are potential interpretation issues, even if both framings (encoding/retrieval) are used. The formulation for assessing emotionality at encoding is measured at retrieval, making it difficult to distinguish emotions felt at encoding from emotions felt at retrieval. Indeed, research has shown that there are frequent errors in memory for emotional experiences (Levine & Safer, 2002). There are several reasons why these errors occur, including a dissociation between the context of encoding and retrieval, a change in the goals/appraisals of individuals between encoding and retrieval, and a reliance during the rating on the peak emotional intensity during encoding (Levine, Lench, & Safer, 2009). These sources of errors make it difficult for a true dissociation between emotional intensity during encoding and retrieval based only on self-report to be accurate. Indeed, we believe this dissociation can only truly be made with experiments performed in a controlled setting and/or in a longitudinal study.

Pleasantness (associated with the E of the LOVER acronym)

Related to emotional intensity experienced when recalling an odor-evoked AM is how that emotionality is experienced in terms of valence. Given that most AMs are associated with positive emotions (Walker, Skowronski, & Thompson, 2003), we refer to this dependent variable here as pleasantness. Although Chu and Downes (2000a, 2000b) did not consider pleasantness to be part of the Proust phenomenon when they originally formulated their hypotheses, several studies have shown that AMs evoked by odors are experienced as more pleasant than AMs evoked by other cues (see Fig. 6 and Footnote 6). In total, eight experiments provided 10 data points relevant to the question of whether AMs evoked by odors are more positive than AMs evoked by other cues. Effect sizes were calculated for each of these experiments (M = .35) and compared in a one-tailed one-sample t test to a value representing no effect (d = 0). This analysis revealed that odor-evoked AMs were, across the studies, more positive than other-evoked AMs, t(9) = 3.43, p = .004.

Fig. 6. Transformed mean scores (x′) of pleasantness as a function of cue modality for each of the experiments reporting relevant data. AIG = Arshamian et al.; CD = Chu & Downes; MB = Miles & Berntsen; RGG = Rubin, Groth, & Goldsmith; WL = Willander & Larsson; WSK = Willander, Sikström, & Karlsson

Unfortunately, the pleasantness variable, or emotional valence, suffers from the same framing (encoding/retrieval) problems as the emotional intensity variable. Even more unfortunately, some experiments that asked for information about both time periods only report the data for the time period in which significant differences were found (e.g., Rubin et al., 1984). We handle these problems in the same way as above by submitting the transformed means (see Footnote 7) to a 2 × 2 mixed-design ANOVA, with cue modality (odor vs. other) as the within-subjects variable and time of positivity (retrieval vs. encoding) as the between-subjects variable (see Fig. 7). This analysis revealed only a significant main effect of cue modality, F(1, 4) = 15.83, p = .016, ηp² = .80. No other effects were significant, Fs < 2, ps > .20, indicating that the formulation of the question seems irrelevant, although, as with emotional intensity, we urge caution in interpreting this result.

Fig. 7. Transformed mean scores (x′) of positivity as a function of cue modality and time when positivity was experienced for each of the experiments reporting relevant data. Error bars refer to standard errors of the mean

Being brought back

One of the key findings of the odor-evoked AM literature is that AMs evoked by odors are associated with exceptionally strong feelings of “being brought back” to the moment when the event occurred. Although researchers frequently fail to provide a clear definition for “being brought back,” “mental time travel,” or “autonoetic consciousness,” “being brought back” is understood as a true reliving of the event, with the feeling of being in the place and time of the event and reexperiencing the emotions and feelings that were felt when the event occurred (Tulving, 1985).

In total, seven studies have reported comparisons between AMs evoked by odors and AMs evoked by other cues in terms of the feelings of “being brought back” to the event (see Fig. 8). Effect sizes were calculated for each of these experiments (M = .47) and compared in a one-tailed one-sample t test to a value representing no effect (d = 0). This analysis revealed that odor-evoked AMs were, across the studies, associated with stronger feelings of being brought back than were other-evoked AMs, t(6) = 2.36, p = .028.

Fig. 8. Transformed mean scores (x′) of feeling of “being brought back” to the moment when experiencing an autobiographical memory evoked by either an odor or a different cue for each of the experiments reporting relevant data. AIG = Arshamian et al.; H = Herz; HS = Herz & Schooler; MB = Miles & Berntsen; WL = Willander & Larsson; WSK = Willander, Sikström, & Karlsson

Vividness (V of the LOVER acronym)

One of the most common claims about odor-evoked AMs is that they are especially vivid and clear. This notion was supported in a descriptive study by Herz and Cupchik (1992), where subjects were asked to rate their AMs elicited only by odors. Subjects reported their memories to be “very clear” in a majority of cases (50.9%), and “very vague” in only a small minority of cases (14.0%). However, despite this finding and the claims made about vividness, the scientific evidence has been equivocal. In fact, across 10 experiments that investigated this question, only three (Chu & Downes, 2002, Exp. 1; de Bruijn & Bender, 2017; Herz & Schooler, 2002) found that AMs evoked by odors were significantly more vivid than AMs evoked by other cues (see Footnote 8).

Of the 10 experiments that investigated this question, only five reported the data (see Table 2). Four experiments reporting no significant differences were added as d = 0 to our analyses. Finally, one experiment (Goddard et al., 2005) indicated that vividness was significantly lower for odor-evoked than other-evoked memories but did not provide data to allow us to calculate an effect size. To remain conservative (in terms of rejecting the notion that vividness is greater for odor-evoked AMs), we entered a value of d = −.05 (i.e., the absolute value is equal to the lowest positive effect size in the analysis; Willander et al., 2015).

Table 2 Effect sizes related to vividness from individual experiments

Effect sizes (M = .10) were compared in a one-tailed one-sample t test to a value representing no effect (d = 0). This analysis revealed that odor-evoked AMs were, across the studies, more vivid than other-evoked AMs, t(9) = 2.02, p = .037. Thus, in line with the original hypothesis by Chu and Downes (2000a, 2000b), we found evidence that odor-evoked AMs are more vivid than AMs evoked by other cues.

However, we urge caution in this (strong) interpretation. For one, the effects of vividness are likely overestimated in the analysis, due to the way we made our estimations. This overestimation is especially likely, as the negative effect reported in Goddard et al. (2005) was likely larger than our conservative estimation. Indeed, if we entered a negative effect size equal in magnitude to the lowest significant positive effect size included in the analysis (d = .11), rather than to the lowest nonsignificant positive effect size (d = .05), the one-tailed t test would no longer be significant (p > .05). A second point to be made is that one could question how we calculated effect sizes for the experiment by de Bruijn and Bender (2017). In this case, we only used the data we felt were most relevant to the question (those corresponding to childhood odors). This strict data selection actually means that the estimate is based on a single odor and only half of the possible data. Such a strict method of data selection likely also leads to an overestimation of the effect.

Although the vividness variable is worthy of being included in future studies, we feel it is most interesting for the revelation that the differences found by cue modality are neither stable nor convincing. This finding is interesting for two reasons in particular. First, the claim that odor-evoked AMs are especially vivid seemed to be integral to the formulation of what the Proust phenomenon is (Chu & Downes, 2000a, 2000b) and has remained an important part of more recent formulations, such as the LOVER acronym (Larsson et al., 2014). Second and more interestingly, although vividness differences either do not exist or are only small, differences in reported emotional intensity and the feeling of “being brought back” are large and stable.

This contrast seems confusing, given that it could reasonably be assumed that part of the feeling of being brought back is that the memory seems particularly vivid (Rubin et al., 2003). For example, Demiray and Janssen (2015) asked subjects to recall the most important events of their lives. After providing a short description for each event, subjects rated the memories on emotional intensity, sense of reliving, and vividness. Emotional intensity correlated positively with sense of reliving (r = .44) and vividness (r = .41), and sense of reliving had a strong and positive correlation with vividness (r = .69). It seems likely that the correlations between vividness and emotional intensity and between vividness and sense of reliving will be weaker for memories that are elicited with the help of odors. The inclusion of correlational analyses between the experiential variables in future studies would help researchers understand this relationship more fully.

Summary of findings

Based on our review of the literature, we conclude that there is ample evidence supporting the Proust phenomenon, as proposed by Chu and Downes (2000a, 2000b), insofar as AMs evoked by odors are different from AMs evoked by other cues. The LOVER model (Larsson et al., 2014) states that odor-evoked AMs are especially limbic, old, vivid, emotional, and rare. In agreement with the model, we found support for the notion that odor-evoked AMs are old, emotional, and rare (low frequency of rehearsal). Somewhat at odds with the LOVER model is the fact that we found only weak support for the notion that odor-evoked AMs are especially vivid. Finally, although the LOVER model is catchy and a useful shorthand for the findings associated with the Proust phenomenon, our review shows that the Proust phenomenon extends beyond the LOVER model. Additional findings that are not covered by the LOVER model may include those that suggest that odor-evoked AMs are not particularly relevant to the sense of self (e.g., Knez et al., 2017), findings related to semantic processing (e.g., Herz & Cupchik, 1992; Yamamoto, 2008, Exp. 2), and findings related to direct versus indirect search and the speed to retrieve memories (Goddard et al., 2005). These additional variables, which were beyond the scope of the current review, warrant more investigation in the future.

In extending our focus beyond the LOVER model, we believe our analyses will make it easier for researchers to consider all the various effects of the Proust phenomenon together. This overview of the effects should encourage future researchers to focus more on developing a theory as to what accounts for the Proust phenomenon, and whether the various effects are independent of each other or stem from a common, underlying cause. Future research should shift its focus from establishing the existence of Proust effects to explaining their causes.

Limitations

There are two limitations to our review of the important findings that we wish to highlight. These issues are specifically related to our analyses of the various Proust effects. The first issue concerns the limited number of variables that we analyzed. The second limitation concerns the potential for bias in our review.

Limited number of variables

In our analyses, we focused on the variables that have received the most attention by previous researchers, such as the age and emotional intensity of memories. However, our analyses do not fully cover either (a) the entirety of dependent variables that have been of interest to previous researchers or (b) the entirety of potential dependent variables related to the Proust phenomenon.

There are many variables that previous researchers have investigated but which were not included in our analyses. Some of these variables may even provide important information that could help narrow our theoretical understanding of the cause or causes for the Proust effects. Although we feel future reviews should certainly aim to include a wider range of dependent variables, this inclusion was not possible in our analyses, as there is not yet enough data to allow for any meta-analyses to be performed. This issue can only be addressed by continuing research on odor-evoked AM.

Future research may also aim to include new variables that may get at the heart of the literary version of the Proust phenomenon (Jellinek, 2004). Some variables that have not yet been, or have only rarely been, investigated, but which are integral to the idea of the Proust phenomenon, are the feelings of surprise that accompany odor-evoked AMs, the ability of odors to evoke “long forgotten” memories, and the sense that memories evoked by odors are less likely to contain any information about the source of the evocation (i.e., the presence of the odor itself).

Bias

Although we have done our best to avoid bias in our analyses, such as by estimating effect sizes for nonreported, nonsignificant findings, there is still one potential source of bias that may affect our interpretations: the file-drawer problem (Rosenthal, 1979). One of the current authors (S.J.) attempted to address this issue by sending out a public request for relevant data via social media and by personally contacting researchers in the field of autobiographical memory research at an international conference. While these efforts did reach a broad audience, it is impossible to know how many researchers did not receive the notice. This limitation could mean that the studies that we included are more likely to report significant effects than is the actual state of affairs. Although we do consider this issue to be a potential problem, there are three reasons to believe the problem may not be too great for this topic.

The first study investigating Proust effects (Rubin et al., 1984) actually reported a nonsignificant effect on the age of memory dependent variable. The fact that nonsignificant results were published directly from the beginning of investigation into the Proust phenomenon makes it less likely that later findings were refused for publication based on their nonsignificance. A second fact that makes it unlikely that this area of research suffers substantially from the file-drawer problem involves the large number of dependent variables that are included in each study. A study that includes a large number of nonsignificant findings touching on a single topic is generally persuasive, in the same way that null effects from a meta-analysis are persuasive: the more data collected, the more likely the results are to be accurate. Finally, even though the public request sent out by S.J. was widely read, and he personally spoke with many individuals, only one researcher responded saying they had potentially relevant unpublished data (see Footnote 9). The low response rate makes us more confident that there is not a large pool of relevant data hidden from the public view.

Hypotheses

Although research on odor-evoked AMs to this point has focused mainly on finding effects rather than investigating the causes of said effects, some hypothetical accounts have been proposed that may explain some of the described effects (see Table 3 for an overview of previously stated and new hypotheses). In this section, we will discuss these, not necessarily mutually exclusive, accounts. Note that when not otherwise explicitly stated, all hypotheses are novel in their application to the Proust phenomenon.

Table 3 List of hypotheses that attempt to account for the variety of effects associated with odor-evoked autobiographical memory (AM)

For structural and readability purposes, the hypotheses will be organized by the stage of memory with which they are best associated: encoding, storage, or retrieval. Where appropriate, we will provide evidence in support of the various hypotheses, but, given that there is only a limited amount of research that touches on the causes of the Proust effects, the main function of this section is to list the hypotheses with the aim of inspiring future research.

Crucial to understanding our hypotheses is an understanding of the notion of dependency. What we mean by dependency is that it is more parsimonious to consider the various effects of the Proust phenomenon as dependent on and/or related to each other, with either a single or a limited number of causes, rather than independent of each other. When we list the hypotheses below, we will not go through the chain of causes each time, so we feel an example of what we mean is appropriate here. Two of the effects that need to be accounted for are that odor-evoked memories tend to be older and recalled less frequently. As opposed to thinking of these as separate effects, we argue it is more logical and parsimonious to hypothesize that a person is less likely to report the rehearsal of older memories, as remote instances of rehearsal will also be more likely to be forgotten than recent instances of rehearsal. Given the relationship between the age of memories and their importance for current/future behavior (Anderson & Milson, 1989), it makes sense that older memories will be less likely to be retrieved at any given time than newer memories, even if they have been recalled more often over the course of the lifetime. This logic can be applied to many of the hypotheses below, so that accounting for one aspect of the Proust phenomenon may actually account for several aspects.

Finally, there is a potential confound in the variable we have termed age of memories. Research has shown that AMs cued by odors are older than AMs cued by other stimuli. This effect could be due to a slower rate of forgetting of odor-associated AMs, or it could be due to a greater tendency to associate AMs with odors during a specific period of life, namely, childhood. When appropriate, we will highlight the theoretical differences when discussing specific hypotheses. With a few exceptions, hypotheses emphasizing that odor-associated AMs are more resistant to forgetting will focus on storage, whereas hypotheses emphasizing a greater tendency to associate odors with memories during childhood will focus on encoding.

Encoding

We use the following criteria to consider a hypothesis as belonging to the encoding stage: (a) it considers how a situation/target or an ambient odor is attended to, or (b) it considers how odors are originally bound to situations/targets.

Differential encoding bias

Chu and Downes (2002) formally stated the differential encoding bias hypothesis. The basic tenet of this hypothesis is that AMs differ in terms of their complexity. Those AMs that are more complex are also more likely to contain peripheral details. One of the peripheral details that is more likely to be included in complex AMs is olfactory information. Because, according to this hypothesis, the likelihood of an odor being associated with an AM is positively correlated with the complexity of the AM, odors are more likely to cue the retrieval of complex AMs, which will also be associated with more details. The cuing and retrieval of more complex memories could lead to a stronger feeling of being brought back to the event and a more vivid sense of the memory. Also, if odors were only associated with a portion of all AMs (i.e., those that are especially complex), then this association would explain why they are less efficient cues of AM than other types of cues.

Although Chu and Downes (2002) formally stated the differential encoding bias hypothesis, they ultimately rejected it in favor of their preferred hypothesis, the differential cue affordance hypothesis (see Retrieval hypotheses section). The authors’ rejection of the differential encoding bias hypothesis was, however, based purely on their use of the double-cue method that they had implemented. As we discussed above, there are potential issues with this method, which complicate interpretations of results. Given these difficulties, we would like to suggest that future researchers consider the differential encoding bias hypothesis as a potential explanation for certain aspects of the Proust phenomenon and a hypothesis that requires further testing.

Odors important during childhood

It could be that, as young children, people attend to odors to a greater degree, and, as they age, people come to rely more heavily on other senses. A shift in the reliance on different senses could account for why a large proportion of odor-related memories come from early childhood. It is important to note that this hypothesis explains age of memory differences (i.e., that odor-evoked AMs are older than other-evoked AMs) due to the fact that odors are more likely to be associated with memories during childhood than at other periods of life.

Based on the odors important during childhood hypothesis, it would not necessarily be the case that odors are “better” cues of older memories, but rather that a larger proportion of odor-associated memories come from childhood than do, for example, sound-associated memories. Furthermore, if odors are more important during childhood than during other periods of life, the odors important during childhood hypothesis could also account for why odors are less efficient cues. Once odors are less attended to, they may be less likely to be associated with new experiences.

Pleasant childhood

For children, the world is a new, exciting, and fascinating place. If we assume that novelty and excitement lead to greater emotional processing, it stands to reason that children experience the world as more emotional than older persons do. In turn, on average, memories from childhood should be more emotionally intense and pleasant than memories from other lifetime periods. If odors are more likely to evoke childhood memories than other cues, the pleasant childhood hypothesis would explain why odor-evoked AMs are more positive and more emotionally intense than memories evoked by other cues (see Footnote 10).

There is some indirect support for the notion that children process situations generally as more emotional, or at least that children process emotion differently than older individuals (Batty & Taylor, 2006), but we are cautious about overinterpreting this data without a test aimed at specifically determining whether similar situations are processed as more emotionally intense by children than by older individuals. In opposition to the notion that the world is more emotional for children, Janssen and Murre (2008) asked subjects to rate memories that were cued with the help of words on emotional valence, but they did not find any evidence showing that childhood memories were more positive or emotionally intense than memories from other periods. Given the ambiguity in the limited amount of extant research relevant to this hypothesis, future research may find it a fruitful topic for investigation.

Proactive interference

The proactive interference hypothesis, as it relates to the Proust phenomenon, was first laid out by Herz (2012). This hypothesis states that odor-associated memories are more susceptible to proactive interference (whereby earlier memories interfere with later memories) than memories associated with other stimuli, possibly due to the importance of first associations with odors for informing individuals about the edibility of foodstuffs (e.g., Domjan, 2015; Garcia & Koelling, 1966). If the proactive interference hypothesis is correct, it would account for the finding that AMs evoked by odors are older than AMs evoked by other cues. It is important to note here that, unlike the odors important during childhood hypothesis, this hypothesis is not limited to the notion that odor-associated memories are more likely to come from childhood. Additionally, the proactive interference hypothesis would account for why odors are less effective cues of memory. If proactive interference is strong, the total number of memories associated with any given odor would be limited (because the memory for the first association would interfere with later associations), leading to a smaller pool of memories to retrieve during a test of AM.

Although there is, to the best of our knowledge, no research specifically focused on the role of proactive interference as it is related to AM, there is research from laboratory studies on odors and episodic memory that is relevant to the proactive interference hypothesis. Research using a paired-associates paradigm has shown that odors bind particularly well to the first item with which they are associated. Although the first item is well associated with odors, it is difficult to associate the odor with a new item (Lawless & Engen, 1977). The study by Lawless and Engen supports the notion that proactive interference is indeed stronger for odor-associated memories than for memories associated with other stimuli, but it is still unclear how generalizable this finding is. Future research would ideally attempt to directly test this hypothesis as it relates to AM. However, indirect support for the proactive interference hypothesis would come from evidence showing that each individual odor cues fewer AMs than individual cues from other modalities.

Binding difficulty

The binding difficulty hypothesis states that it is, in general, difficult to bind odors to other items/experiences. Given this difficulty, it could be that one must be acutely aware of the presence of an odor for the odor to become associated with an event (for a similar notion, see Köster, Moller, & Mojet, 2014). Given the inherent properties of odors, which are rapidly perceived in terms of their emotional qualities (e.g., Smeets & Dijksterhuis, 2014), the presence of an odor would make it likely that the event is experienced as emotional. The difficulty in binding odors to events, and the inherent properties of odors, could then account for the cue efficiency and emotional intensity effects of the Proust phenomenon. Evidence supporting this hypothesis comes from studies showing that odors are less effective cues in paired-associates tests than are other stimuli (Davis, 1975, 1977).

The binding difficulty hypothesis is difficult to reconcile with the abundance of evidence showing that odors are effective contextual mnemonic cues (Isarida & Isarida, 2014). It could be that the difference in findings lies in the explicit instruction to associate odors with other stimuli in paired-associates tests, whereas in context-dependent memory studies the odors are generally presented ambiently and without further instruction (but see Hackländer & Bermeitinger, 2017). Future research is needed to explain the difference in findings with different paradigms and to assess the relevance of the binding difficulty hypothesis to the Proust phenomenon.

Semantic processing differences

Odors are difficult to name (Cain, 1979). This difficulty potentially shows that odors undergo less semantic processing and possibly more emotional processing (Yeshurun & Sobel, 2010). If odor perception is indeed more reliant on emotional than semantic processing, the semantic processing differences hypothesis would explain not only the emotionality effects but also the cue efficiency and the frequency of rehearsal. Semantic processing makes it more likely for a memory to be encoded (related to cue efficiency), and semantic processing leads to greater lexical connections, increasing the chance that a different but related cue will also trigger the AM (related to frequency of rehearsal). Support for the semantic processing differences hypothesis comes from a study by Herz and Cupchik (1992), in which the authors showed that odors need not be nameable to evoke an AM (this finding is also supported in Yamamoto, 2008). Future studies could investigate whether the ability to identify the cue leads to differences in the emotional experience of the AM, as would be predicted by this hypothesis.

Shift in naming abilities

Related to the semantic processing differences hypothesis is the shift in naming abilities hypothesis. This hypothesis rests on the fact that odor identification ability changes over time. Namely, people become increasingly better at naming odors up until their peak ability in their early 20s (Doty et al., 1984). This age difference could mean that, as people age, they encode odor-associated AMs differently, as the presence of semantic information seems to change the experience of an odor-evoked AM (Willander & Larsson, 2007). The shift in odor naming abilities, and the following processing of semantic information related to the odor, could be why earlier odor-associated memories are perceived as more emotional.

Both the semantic processing differences and the shift in naming abilities hypotheses highlight the interesting connection between olfactory and language processing (a topic of much current research interest). These two hypotheses underscore the importance of broadening odor-associated AM research to more diverse samples. Cultures that place a stronger emphasis on odor identification may provide important information about the plausibility of these two hypotheses and how well they account for various Proust effects.

Shift in odor perception

People’s perception of odors seems to change with age, as does their ability to identify them (Doty et al., 1984). Several researchers (Chu & Downes, 2000a; Goddard et al., 2005) have speculated about such a shift, which forms the basis of the shift in odor perception hypothesis. These authors have made the claim that this shift actually leads to greater attention to odors, especially in the lifetime period (i.e., childhood) when people learn to combine perceptions from different modalities. If there is indeed a shift in people’s perceptual abilities with age, the shift in odor perception hypothesis would account for the large proportion of odor-evoked memories coming from childhood. Depending on the nature of the shift in perception during childhood, this hypothesis could potentially also account for the emotion-related Proust effects. For example, if the shift led to greater emotional processing of odors as well, the memories associated with the odors may be perceived as particularly emotional. Crucial evidence for the shift in odor perception hypothesis would necessarily come from longitudinal studies investigating the development of odor perception across childhood.

Sterilization

Over the last century, the world has experienced, and continues to experience, great changes. One of those changes is that surroundings have become more sterilized. Given these changes, it could be that odors have become less present within the lifetime of the individuals who have participated in the various AM studies reviewed here. If there has in fact been a decrease in the presence of odors within the lifetimes of those who have participated in the reviewed studies, it would follow that the largest proportion of their odor-associated memories would be from earlier ages (i.e., childhood).

Although limited, there is evidence related to the sterilization hypothesis. Two experiments have found that age differences based on cue modality are not present when using a younger sample (Rubin et al., 1984), as would be predicted by this hypothesis. However, further research has indeed found that, even in younger samples, odors evoked older memories than did other cues (Miles & Berntsen, 2011; Willander et al., 2015). To obtain a clearer sense of the plausibility of the sterilization hypothesis, future research should test different cohorts, as well as individuals from societies which have not undergone such drastic changes and in which odors are still as present as they have been throughout their entire lifetimes.

Storage

For the purposes of our review, storage refers to the time between the encoding and retrieval of the event in question. Two hypotheses that focus on this stage of memory are highlighted here. One hypothesis focuses on forgetting and the other hypothesis focuses on consolidation.

Forgetting

The forgetting hypothesis states that odors are forgotten at a slower rate than are other stimuli. Given this slower rate of forgetting for odors, odors can act as cues of associated memories for longer periods of time. The forgetting hypothesis would therefore account for the finding that AMs evoked by odors are older than AMs evoked by other cues. The forgetting hypothesis is not specific to the notion that odor-associated memories are more likely to come from childhood. This hypothesis would hold that, regardless of when encoding occurred, odor-associated memories should be forgotten more slowly than memories associated with other cues.
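The logic of the forgetting hypothesis can be made concrete with a toy sketch. It rests on assumptions that are not part of the hypothesis as stated: one event is encoded per year of life, retention decays exponentially with the age of the memory, and the decay rates are arbitrary illustrative values. The function name `mean_age_of_retrievable` is hypothetical.

```python
import math


def mean_age_of_retrievable(decay_rate, max_age_years=50):
    """Mean age (in years) of memories that remain retrievable.

    Assumes one event encoded per year and exponential retention
    exp(-decay_rate * age); both are simplifying assumptions.
    """
    ages = range(1, max_age_years + 1)
    weights = [math.exp(-decay_rate * age) for age in ages]
    return sum(a * w for a, w in zip(ages, weights)) / sum(weights)


# Hypothetical decay rates; a smaller value means slower forgetting.
odor_rate = 0.05
other_rate = 0.15

print(f"odor-associated: {mean_age_of_retrievable(odor_rate):.1f} years")
print(f"other cues:      {mean_age_of_retrievable(other_rate):.1f} years")
```

In this toy model, a slower decay rate leaves proportionally more old memories available, so the average retrievable memory is older regardless of the life period in which it was encoded. Whether real forgetting curves differ by cue modality in this way is exactly the open empirical question.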

There is evidence that memories for odors themselves are forgotten at a slower rate than are memories for other types of cues. Lawless and Cain (1975) found that, although recognition of odors was relatively poor after a short retention interval, there was little additional forgetting up to 1 month later (see also Engen & Ross, 1973). Although the finding from Lawless and Cain has been widely cited, recent evidence failed to support the notion that memories for odors are especially resistant to forgetting (Kärnekull et al., 2015). Future research is needed both to (1) determine whether memories for odors are truly forgotten at a slower rate than are memories for other stimuli and (2) determine how relevant the forgetting hypothesis is to the Proust phenomenon, as memory for odors is not the same as AMs evoked by odors.

Consolidation

Given the strong and close connections between odor processing areas and the amygdala and hippocampus, it could be that odor-evoked memories undergo more rapid and stronger consolidation processes than AMs evoked by other cues. Research suggests that emotional arousal does lead to enhanced consolidation of memories (e.g., Phelps & LeDoux, 2005). If events associated with odors are experienced as more emotionally intense, then it would follow that they are more strongly consolidated. Enhanced consolidation could potentially explain the age of memory effect, as older memories would be stored in a secure way and would be less labile and open to interference or reconsolidation. There is, however, no evidence that we know of to support the consolidation hypothesis, so it should for the moment be considered as speculation.

Retrieval

In this section, we highlight hypotheses that focus on how odors presented while recalling AMs affect retrieval processes. These hypotheses could involve either the memory search or the way that recollection is subjectively experienced.

Differential cue affordance

Chu and Downes (2002) formally laid out what they referred to as the differential cue affordance hypothesis. This hypothesis states that AMs are associated with a variety of stimuli, each of which could potentially cue the AM. Each of the stimuli related to the AM provides a differential cue affordance (or efficiency in cuing the AM). Although cue affordances may differ for idiosyncratic memories, across all memories odors provide the highest cue affordance. In other words, odors are the most efficient cues of particular memories. This hypothesis argues that odors are specifically good retrieval cues, but it does not provide any more details about the potential mechanisms or any more explanations about why odors are good retrieval cues (see Footnote 11).

In opposition to the predictions of the differential cue affordance hypothesis, research using multimodal cues (including odors) has found that odors contribute less to the retrieved memories than visual and auditory cues (Karlsson et al., 2013; Willander et al., 2015). If odors were truly more efficient than other stimuli as cues of AMs, then they would contribute more, rather than less, to the experience of memories cued by multimodal cues.

We would like to emphasize that the evidence from the studies with multimodal cues (Karlsson et al., 2013; Willander et al., 2015) does not necessarily rule out the differential cue affordance hypothesis. Indeed, it could be that odors are more efficient cues of AMs, but people still ignore them when visual and auditory cues are present (possibly due to greater familiarity with relying on such cues). Evidence in favor of this view comes from Willander and Larsson (2007), who found that the presence of a label in combination with an odor cue decreased the experiential ratings, as compared with when only an odor was present. Given the variety of findings related to this hypothesis, future research aimed at specifically testing it needs to define precisely what is meant by cue affordance or cue efficiency.

Misattribution

The misattribution hypothesis was outlined in the Analysis of Methods section above when discussing cuing methods. This hypothesis states that when subjects retrieve a memory in response to an odor cue, they conflate their emotional responses. Specifically, subjects misattribute their emotional responses elicited by odors themselves to their emotional responses to the AMs.

Odors cue emotional memories

Odors themselves are emotional and are possibly processed in terms of their emotionality (Yeshurun & Sobel, 2010). Given their emotionality, when an odor cue is presented, it probably leads to emotional processing. The emotional processing could, in turn, lead to the retrieval of affectively congruent AMs (although past research on congruency effects of AM is equivocal, see Eich, Macaulay, & Ryan, 1994). The retrieval of affectively congruent AMs would, obviously, account for the finding that odor-evoked AMs are more emotionally intense, because other cues are generally emotionally neutral and would tend to cue less emotional AMs. Because mostly positive odors are used in experiments, affective congruency would also account for the positivity bias effect.

The odors cue emotional memories hypothesis may also account for the fact that odors are less effective cues of memory than other stimuli. If (a) odors are experienced as highly emotional, (b) odors only lead to the retrieval of affectively congruent memories, and (c) the majority of AMs are either unemotional or only moderately emotional (for evidence supporting this claim, see Janssen & Murre, 2008, p. 1853), then it would make sense that odors have only a limited pool of memories that they would effectively cue. This limited pool of memories would account for why odors are not particularly effective cues of memories, as compared with other types of cues.

Nostalgia

The nostalgia (or “everything was better back then”) hypothesis states that, as we age, we tend to evaluate our childhood as more emotionally intense and more positive than it actually had been. In line with this reappraisal of our childhood, any particular memory from that period that is retrieved is more likely to be experienced in correspondence with this shift in appraisal (Levine & Safer, 2002). Therefore, memories from childhood are, during retrieval, experienced as more emotionally intense and positive than they actually had been during encoding. This shift in appraisal is, for the sake of this hypothesis, assumed to be specific to childhood.

If we assume that odors are more likely to evoke childhood memories than other cues, then it follows from the nostalgia hypothesis that these memories should be experienced during retrieval as being, on average, particularly emotionally intense and positive. The nostalgia hypothesis, therefore, would account for the fact that odor-evoked AMs are more positive and emotionally intense than AMs evoked by other cues.

It should be noted that the nostalgia hypothesis is different from the pleasant childhood hypothesis outlined above, in that the nostalgia hypothesis focuses on the experience during retrieval, whereas the pleasant childhood hypothesis focuses on the experience during encoding. Although different, both hypotheses share a set of data that speaks against them. Janssen and Murre (2008) had people rate the valence and emotional intensity of their AMs from different periods of life. They found no evidence that AMs from childhood were rated as more pleasant or more emotionally intense than AMs from other lifetime periods.

Feel older

If one accepts that odor-associated AMs are rehearsed less frequently (perhaps because the odor cues that would lead to retrieval do not enter awareness as frequently as other cues), then this limited amount of rehearsal could make the AMs feel older than they actually are. In other words, it could be that the process of coming up with a date for when the original event occurred is influenced by how often the experience is recalled. If experiences are relived infrequently, then they would seem more unfamiliar and older, leading to odor-evoked AMs mistakenly being considered to be older than other-evoked AMs.

There is some support for this explanation from research that has examined how people date personal and public events. The accessibility principle states that the amount of knowledge about an event determines whether the event is perceived to be recent or old (Brown, Rips, & Shevell, 1985). People may infer that, if they recall few details about an event, then the event must have happened a long time ago, and if they recall many details, then the event must have happened recently. In accordance with this principle, Brown et al. (1985) have found that people judge events they knew much about as occurring more recently than the events in fact had, and events they knew little about as occurring more remotely than actually was the case. If odor-evoked memories are rehearsed less often, then they are less likely to be detailed, which could make them feel older.

Direct retrieval

It has been proposed (Conway & Pleydell-Pearce, 2000) that sensory stimulation leads to a direct search of memory rather than a cyclical search. If odors are considered as sensory percepts (Yeshurun & Sobel, 2010), then it would follow that they lead to a direct search of memory. The direct retrieval hypothesis is similar to the argument that odors are highly specific cues for particular AMs (Larsson et al., 2014). This hypothesis could account for the emotionality and positivity effects, the feelings of being brought back, and possibly also for the age of memory effects.

There is little evidence directly related to the direct retrieval hypothesis. One indirect piece of evidence may come from response latencies. If odors lead to a more direct search of memory than other cues, then it stands to reason that the latency between perceiving the cue and retrieving the memory should be shorter than for other cues (e.g., Uzer, Lee, & Brown, 2012). Research relevant to this question has, however, found that odors lead to longer response latencies than other cues (Goddard et al., 2005), which incidentally fits with Proust’s madeleine-episode description.

Although the results from Goddard et al. (2005) speak against the direct retrieval hypothesis, the evidence cannot be seen as conclusive, as there was no control for perception time. Indeed, it takes longer to sense and perceive odors than other stimuli (Herz & Engen, 1996; for a discussion of the time course of olfactory perception, see Olofsson, 2014). Furthermore, a trial would only end once a memory was retrieved (or, presumably, once a certain amount of time had passed). Because odors are less efficient cues, it is not surprising that the response latency for odors was longer than the response latency for other cues. Finally, the memories in the study by Goddard and colleagues were gathered in the laboratory, which means that they were voluntary memories and therefore could have been retrieved either directly or generatively. Prototypically, however, odors evoke memories involuntarily. It has been shown that involuntary memories, which are always retrieved directly, are more often specific (Berntsen, 1996), which causes their retrieval to be accompanied by more physical reactions and greater emotional impact (Berntsen & Hall, 2004).

Language bypass

Related to the direct search hypothesis, although based upon different assumptions, is the language bypass hypothesis. Koppel and Rubin (2016) argued that the weak neuroanatomical connections between olfactory processing and language centers in the brain could lead to odors bypassing normal, language-dependent memory search functions. This aspect of neuroanatomy could mean that odors themselves receive less cognitive elaboration and lead to fewer semantic associations. The limited cognitive elaboration could lead to direct access to older memories, as there would be less language-based interference from semantically related memories.

Conclusions

In this review, we have shown that there are several effects associated with the Proust phenomenon that have received strong support throughout the admittedly limited literature on odor-evoked AM. Although it remains important for future research to use a variety of methods to ensure the validity of these effects (as we highlighted in the Analysis of Methods section), it is also important for future research to focus on hypothesis testing and theory building. Of particular interest are the questions of (a) whether the various effects associated with the Proust phenomenon are dependent on or independent of one another, and (b) what the underlying causes of these effects are.

Further investigations into the Proust phenomenon can provide us with information relevant to a wide range of topics related to olfactory cognition. The most obvious connections are between AM and other areas of memory research. These include the other three lines of olfactory memory research highlighted above: (a) memory for odors themselves, (b) paired-associates tests with odor cues, and (c) odors as contextual mnemonic cues. However, the relevance of the Proust effects for olfactory cognition goes beyond memory research. Researching the Proust phenomenon can also provide information about attention to odors, the relationship between olfactory and language processing, cultural differences in olfactory processing, developmental differences in olfactory processing, and how odors influence our emotions and our memories for emotions.

More broadly, research into the Proust phenomenon is of interest to researchers of AM in general. Focusing on differences between how odors and other stimuli influence the encoding, storage, and retrieval of memories can provide important information about modality differences in memory structure. These differences relate to several important aspects of AM research, such as childhood amnesia and the reminiscence bump, the differences between voluntary and involuntary retrieval, and direct and generative memory search.

The findings related to the Proust effect are not only of interest to researchers concerned with the theoretical aspects of how olfaction and memory function. There is an increasing push to use the research related to the Proust phenomenon to find practical benefits. One recent study attempted to use odors to improve narrative storytelling (Camara Leret & Visch, 2017). Specifically, the authors argue that storytelling could help patients in clinical settings (in this case, patients suffering from addiction) discuss their problems with others in their peer groups. Similarly, a study by El Haj et al. (2017) found that odors can help patients with Alzheimer’s disease when retrieving AMs. A third area of practical interest relates to individuals suffering from PTSD. The connection between traumatic experiences and olfactory cues has been of interest to clinicians for some time (e.g., Ferguson & Cassaday, 1999, 2002). A recent study (Woodward, Kahn, Ball, & Sizemore, 2017) found that traumatic memories associated with odors are more likely to intrude into everyday life and cause more severe symptoms of hyperarousal than traumatic memories not associated with odors. The interest in the practical application of Proust-related research makes it even more necessary for researchers to be confident in their findings and to investigate the causes of the various Proust effects.

Research on the Proust phenomenon is of theoretical interest to a wide range of researchers. Furthermore, this research is becoming increasingly interesting to clinicians and others interested in the practical applications of using odors as cues for AM. Based on our analyses, we are confident that the basic tenet of the Proust phenomenon, that odor-evoked AMs are in some way different from AMs evoked by other cues, is correct. The next exciting step for this field of research is to determine exactly what these differences are, what causes them, and in what contexts the various hypotheses do or do not fit.

Author note

Ryan P. M. Hackländer, Department of Psychology, University of Hildesheim; Steve M. J. Janssen, School of Psychology, University of Nottingham–Malaysia Campus; Christina Bermeitinger, Department of Psychology, University of Hildesheim.

The work is supported by the German Research Foundation (Deutsche Forschungsgemeinschaft), BE 4851/3-1 (to C. Bermeitinger).

We would like to thank Haruka Janssen for translating the study by Yamamoto (2008) from Japanese into English.