This study examined whether presenting visual information in an object-based manner improves memory maintenance in VWM. Together, the four experiments demonstrated that representations of two visual stimuli were indeed more effectively remembered when they were part of the same object. This benefit was obtained with simple contour-based objects, and regardless of whether a pair of memory features came from the same or from different dimensions. Overall memory performance and the object benefit were specifically enhanced for features from orthogonal dimensions; however, this came at the cost of lower memory precision. The object benefit furthermore still emerged when the relative importance of the objects themselves was reduced by presenting them in fixed spatial locations. Finally, it was also confirmed that the object benefit arose automatically, or at least did not depend on strategic use of object information.
Object benefits in WM
Our findings were consistent with the studies of Xu (2002a, 2002b, 2006), who observed memory benefits for features from different parts of an object in a change detection paradigm. Similarly, in the current study, we tested participants' VWM for features that were organized as multiple parts of an object, but instead of the same/different probes of the change detection paradigm, we used continuous reproduction of memory features, which allows model-based parameter estimation that provides further insight into the nature of memory representations in VWM. We found that features that were part of the same object had a higher chance of being maintained in memory (i.e., a lower guess rate), while the precision of these representations was not improved. In fact, memory precision decreased when the objects contained non-interfering features (i.e., features from different dimensions; Experiment 2). A possible explanation for this reversed effect on memory precision is that non-interfering features might be processed more in parallel (e.g., attentionally; Krummenacher et al., 2001; Müller et al., 1995; Wheeler & Treisman, 2002; Wolfe et al., 1990), which might facilitate their entry into memory, such that more information is retained overall. This might in turn reduce memory precision, as comparatively more information is then held in memory.
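The model-based parameter estimation referred to above can be illustrated with the standard two-component mixture model for continuous reproduction: response errors are modeled as a target-centered von Mises component (precision, κ) mixed with a uniform guessing component (guess rate, g). The sketch below is a minimal illustration of this class of model, not our exact fitting pipeline; function names, starting values, and bounds are ours.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import vonmises

def mixture_nll(params, errors):
    """Negative log-likelihood of a von Mises + uniform mixture.

    errors: response minus target angle, in radians, wrapped to (-pi, pi].
    params: (g, kappa) = guess rate and von Mises concentration.
    """
    g, kappa = params
    likelihood = (1 - g) * vonmises.pdf(errors, kappa) + g / (2 * np.pi)
    return -np.sum(np.log(likelihood))

def fit_mixture(errors):
    # Maximum-likelihood fit of guess rate g and precision kappa.
    # Starting values and bounds are illustrative choices.
    result = minimize(mixture_nll, x0=[0.2, 5.0], args=(errors,),
                      bounds=[(1e-4, 1 - 1e-4), (1e-2, 200.0)])
    return result.x  # (g_hat, kappa_hat)
```

A lower fitted g indicates a higher probability that the feature was present in memory at all, while κ indexes the precision of the stored representation; the two parameters can dissociate, as in our results.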
Furthermore, we found that object-based representation did produce a benefit for same-dimensional features, as originally reported by Luck and Vogel (1997), unlike several subsequent change detection studies that failed to find object benefits for same-dimensional feature conjunctions (typically two-color combinations; Delvenne & Bruyer, 2004; Wheeler & Treisman, 2002; Xu, 2002b). It has been proposed that this failure to replicate the object benefit for same-dimensional features could be specific to the change detection task. Awh et al. (2007) used a change detection task in which they tested cross-category versus within-category changes between sample and test arrays. It was assumed that sample-test similarity was higher for within-category than for cross-category changes, and that this would consequently reduce change detection performance. Indeed, a strong correlation was found between the reduction of memory capacity and sample-test similarity, suggesting that comparing sample and test may be more difficult for within-category changes, causing more confusion in identifying the change.
A study by Luria and Vogel (2011) further tested this assumption using the Contralateral Delay Activity (CDA), an electrophysiological marker of the number of objects maintained during WM (e.g., Akyürek et al., 2017; Balaban & Luria, 2015a, 2015b; Luria & Vogel, 2011, 2014; Peterson et al., 2015; Wilson et al., 2012; Woodman & Vogel, 2008). Indeed, Luria and Vogel (2011) found that a small cost was visible in CDA amplitude for a bicolor object, compared to a single-color object, during the retention interval, even though no accuracy advantage was found for the former in the behavioral results. This outcome supported the hypothesis that two color features can be maintained within a single, bound object in VWM.
Since we used a continuous reproduction paradigm, in contrast to a change detection paradigm, our task did not require the comparison/decision process between memory and test arrays. Therefore, an object benefit that arises before the test phase of the task can surface in behavioral performance, as we indeed observed. Additionally, another important aspect of our study that might have facilitated the object benefit for same-dimensional features was that the total number of items participants had to memorize was below the typical working memory capacity limit (presumably at least four items; Cowan, 2001; Irwin & Andrews, 1996; Luck & Vogel, 1997; Vogel & Machizawa, 2004). This might have limited overall interference amongst same-dimensional features, and consequently revealed memory advantages for objects. Anecdotal evidence was found that memory precision also decreased in Experiment 4B. Precision also decreased in all other experiments (Table 1 and Table 2 in the Supplementary material show mean parameter estimates for the best-fitting model in all experiments), although the effect was not strong enough to count as evidence in the individual Bayesian analyses. With that caveat, this potential broader effect might be explained as follows. Possibly, there are two different types of recall: recall of the second feature from a discrete memory of the second target, which has relatively high precision, and recall of the second feature from a memory of the object that includes the second target, which has somewhat lower precision. The increased probability of recalling the second feature in the in-object conditions then goes hand in hand with decreased precision of the second target response, as the relative frequency of the second type of recall increases. Future research into this effect and its background could be of benefit to the field.
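The two-route account sketched above can be made concrete with a small simulation. Assuming a precise discrete-memory route and a less precise object-mediated route (the concentration values and route frequencies below are arbitrary illustrations, not estimates from our data), adding the object-mediated route increases the number of trials on which the second feature is recalled at all, while lowering the aggregate precision of those responses:

```python
import numpy as np
from scipy.stats import vonmises

def circ_sd(errors):
    # Circular standard deviation: higher values mean lower precision.
    R = np.abs(np.mean(np.exp(1j * np.asarray(errors))))
    return np.sqrt(-2 * np.log(R))

rng = np.random.default_rng(1)
n = 20_000
kappa_discrete = 12.0  # precise, discrete-memory route (assumed value)
kappa_object = 4.0     # less precise, object-mediated route (assumed value)

# Out-of-object condition: recall only via the discrete route, on 60% of trials.
out_obj = vonmises.rvs(kappa_discrete, size=int(0.6 * n), random_state=rng)

# In-object condition: recall on 80% of trials, half via each route (assumed split).
in_obj = np.concatenate([
    vonmises.rvs(kappa_discrete, size=int(0.4 * n), random_state=rng),
    vonmises.rvs(kappa_object, size=int(0.4 * n), random_state=rng),
])

# More responses are recalled in-object, but their aggregate precision is lower.
assert len(in_obj) > len(out_obj)
assert circ_sd(in_obj) > circ_sd(out_obj)
```

In mixture-model terms, this corresponds to a lower guess rate together with a lower fitted precision in the in-object conditions, which is the pattern hinted at in our data.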
This account can also explain the larger increase in memory probability for object features that were a combination of color and orientation in Experiment 2, a finding consistent with Treisman's feature integration theory (Treisman & Gelade, 1980), which suggests that less interference should occur when different-dimensional features are maintained in memory. In line with this idea, memory advantages for objects containing a conjunction of different-dimensional features have been found in several studies (Delvenne & Bruyer, 2004; Olson & Jiang, 2002; Riggs et al., 2011; Wheeler & Treisman, 2002).
Other neurophysiological evidence supporting object-based representation in memory comes from functional magnetic resonance imaging (fMRI) data. Brain activity in the parietal cortex has been shown to correlate with object-based representation and grouping of visual elements (Xu & Chun, 2007). These authors described two stages in visual object processing. The first stage, called object individuation, is characterized by attention-related processing. In this stage, a fixed number of objects can be selected, regardless of object complexity; neural activity in the inferior intraparietal sulcus (IPS) increases linearly up to four items, after which it reaches a plateau. In the second stage, called object identification, the objects selected in the previous stage are encoded and stored in more detail in VWM. The brain response during this stage was strongest in the superior IPS.
The present results may relate to this two-stage model of object perception as follows. Since the total number of items (i.e., three) was likely below the maximum capacity of the first object processing stage, all items might have been processed in this stage, regardless of whether they were perceived as parts of an object or individually. However, reproducing object-related (featural) information requires not only detecting and attending to the objects, but also successful storage of that information in VWM. Therefore, the increased probability of the second target feature being present in memory when it was part of the object might originate in the second stage of object processing, which is sensitive to object complexity.
It is also worth mentioning that some of our results appear to be more compatible with the continuous resource model than with the discrete slots model of WM. These are the two main models that have been introduced to explain the nature of WM capacity limits. The discrete slots model assumes that working memory storage is limited to a number of discrete slots, typically three or four (Irwin, 1992; Luck & Vogel, 1997; Vogel et al., 2001). In this model, once the number of items in WM reaches the limit of these slots, no further items can enter memory. Consequently, the discrete slots model predicts that the precision of memory representations remains constant when the number of presented items exceeds the maximum capacity of the slots. In contrast, the continuous resource model assumes that there is no upper limit on the number of items that can be maintained in working memory, and that memory resources can be flexibly allocated to each item (van den Berg et al., 2012). The discrete slots model thus predicts fixed precision regardless of item complexity, whereas the continuous resource model allows more variability in the precision of items. In this study we found object effects on the probability of having the second target in memory. However, we also found some evidence that precision differed between object conditions in Experiment 2 and Experiment 4A, which may suggest that there is some variability in the precision of the encoded items in memory.
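The contrasting predictions just described can be sketched schematically. Under a slots model, each stored item keeps a fixed precision and supernumerary items are simply absent from memory (raising the guess rate); under a resource model, per-item precision falls continuously as the resource is divided among more items. The parameter values below are arbitrary illustrations, not estimates from our data, and this is a sketch of the models' qualitative predictions rather than a full formal implementation.

```python
import numpy as np

def slot_model_precision(set_size, kappa_item=10.0):
    # Discrete slots: every stored item gets the same fixed precision,
    # so precision does not change with set size.
    return np.full_like(np.asarray(set_size, float), kappa_item)

def slot_model_guess_rate(set_size, capacity=3):
    # Beyond capacity, extra items are not stored at all: the probability
    # that a probed item is absent from memory grows with set size.
    s = np.asarray(set_size, float)
    return np.clip(1 - capacity / s, 0.0, 1.0)

def resource_model_precision(set_size, kappa_total=30.0, power=1.0):
    # Continuous resource: a fixed resource is divided among all items,
    # so per-item precision falls continuously with set size.
    return kappa_total / np.asarray(set_size, float) ** power
```

Our finding that precision varied between object conditions is the kind of variability the resource view accommodates more naturally than the fixed-precision slots view.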
Attentional effects
Although the numbers of both features and objects were presumably well under the maximum capacity of VWM, recall accuracy clearly differed between the two targets regardless of object condition: Accuracy for the second target was always (much) lower than that for the first. This was expected, first because the first target feature was also the one tested first after the memory display. Second, in our experimental design the first target feature was always flagged as such, because it appeared in red or within a colored circle, to ensure that this part of the object was always encoded. This made the first target feature salient and likely to draw the focus of attention. The first target should therefore have had perceptual priority in the encoding stage of VWM. Indeed, we hypothesized that such prioritization might facilitate processing of the rest of the object as well, and result in better memory for the second target when it was presented within the same object as the first.
The study of Egly et al. (1994) suggested that when attention is drawn towards one part of an object, it can spread within the boundaries of that object, so that the rest of the attended object is selected automatically. Furthermore, it has been argued that perceiving individual parts as an integrated object depends on where exactly attention is focused, and whether this includes the object structure (Driver & Baylis, 1998; Marr, 1982). Our finding is consistent with such an object-based attention account. It must be noted that in our study, participants were only required to memorize the individual features of orientation or color, so the simple background shapes (i.e., the objects) that encompassed these memory stimuli were not task-relevant. The object was also never predictive of the second feature to be tested. One might thus expect participants to attend only to the task-relevant parts of the object, rather than to the object as a whole. Moreover, with the task-relevant parts of the object presented in fixed locations in Experiment 3, participants might even have attended to the locations of the features themselves, rather than to the object. Nevertheless, even under these conditions, which rendered the object itself completely task-irrelevant, the features in all of our experiments were still perceived as parts of a bound object, as indicated by the presence of the object effect in each experiment. This finding suggests that the objects were processed at an early stage of visual processing, possibly relying on automatic perceptual grouping (Driver et al., 2001; Duncan, 1984; Duncan & Humphreys, 1989; Kahneman & Treisman, 1984).
Conversely, a recent study by Chen et al. (2021) investigated perceptual grouping benefits for features that were either grouping-relevant or not. While grouping-relevant features produced clear benefits, grouping-irrelevant features did not, unless both feature and grouping were task-relevant. The authors concluded that features may be encoded independently in VWM, and that integrated object representations are not generated automatically, but instead depend on task demands. This would seem to be at odds with the object benefits presently observed, given that our objects were always task-irrelevant. However, the highlighted first feature used in our study may actually have put the object as a whole into focus, as argued above, thereby 'activating' its benefits.
Finally, the current results seem inconsistent with the view that attentional prioritization of WM items can occur before or after the appearance of object features, even though the first target feature clearly benefitted from prioritization due to its unique color. Previous studies that showed pre- or retro-cue benefits on memory (Bays & Husain, 2008; Schmidt et al., 2002) used those cues to draw attention to certain memory items, indicating that these were more likely to be tested. The cued item in those studies was therefore directly task-relevant information essential for recall. In Experiment 4 of our study, the features that needed to be memorized were always presented together in the same display, and the object shapes that made them part of the same or of a different object either preceded or followed the memory features. The part of the object presented before or after the memory features was thus not directly task-relevant information that participants needed to retain in memory, and this object information did not necessarily need to be used. Any encoding of this shape, and any benefit it might bring, would therefore be purely strategic in nature. We found that participants were unable to use the object shape strategically to help structure VWM contents during either encoding or maintenance, depending on whether the object preceded or followed the features. The object effects we observed thus seem to be driven by processing stages that precede the strategic level. However, this finding does not imply that memory benefits for objects can emerge only with simultaneous perception of visual information, or that they are limited to the perceptual/encoding stage in all cases. For instance, one previous study found that presenting objects based on various Gestalt principles (collinearity, closure, and similarity) across two sequential stimulus displays improved VWM performance (Gao et al., 2016). In the current study, the complete task-irrelevance of the object might have prevented it from being selected attentionally, thereby precluding any such positive effect.
To conclude, in four experiments we presented consistent evidence that object-based presentation of visual information helped our participants retain more information in VWM, even when the total number of items was well below the VWM capacity limit. Recall advantages were obtained when features from either the same or different dimensions were combined into a single object. The object benefit seemed to arise automatically, at a relatively early stage of visual processing, as indicated by the persistence of the object effect even when participants' attention was directed more to the locations of the features than to the object surrounding the stimuli, and by the lack of evidence for strategic encoding or maintenance.