The process of visual word recognition in skilled readers involves extremely efficient mechanisms that, in a few hundred milliseconds, convert the visual signal into the appropriate long-term lexical representation despite the similarity between letters (e.g., prescribe, but not the visually similar word proscribe) (see Grainger, 2018; Grainger, Dufau, and Ziegler, 2016, for recent reviews). However, visual similarity among stimuli seems to influence this process. Previous research has shown that sentences in which some of the letters are replaced by similar symbols or digits (e.g., MΔT3R1ΔL 7H1NGS C0M3 ΔND G0) can be read without much cost (Duñabeitia, Perea, and Carreiras, 2009). Indeed, a number of experiments using Forster and Davis’ (1984) masked priming technique (i.e., a procedure that taps the initial stages of word processing; Grainger, 2008) have consistently reported facilitative effects of visual similarity with letter-like digits and letter-like symbols during word recognition. In the initial demonstration of the effect, Perea, Duñabeitia, and Carreiras (2008) found faster lexical decision times to a target word like MATERIAL when preceded by a prime with similar letter-like digits (M473RI4L) or symbols (MΔT€R!ΔL) than when preceded by a dissimilar prime (M568RI2L or M□T%R?□L). Furthermore, Perea et al. (2008) found that visually similar primes were nearly as effective as the identity primes (see also Kinoshita, Robidoux, Mills, & Norris, 2013; Perea, Duñabeitia, Pollatsek, & Carreiras, 2009, for converging behavioral evidence).

Molinaro, Duñabeitia, Marín-Gutiérrez, and Carreiras (2010) used event-related potentials (ERPs) to examine the time course of visual similarity effects with words containing similar letter-like digits. In their masked priming experiment, targets that were preceded by an identity prime (e.g., PRIMAVERA-PRIMAVERA [spring]) or a similar prime containing letter-like digits (PR1M4V3R4-PRIMAVERA) elicited similar ERP waves in the 140- to 170- and 250- to 300-ms time windows. However, both ERPs differed from those elicited by a visually dissimilar control condition (e.g., PR2M8V6R8-PRIMAVERA). Likewise, in an unprimed semantic categorization experiment, Lien, Allen, and Martin (2014) found an N400 effect (i.e., a component that reflects lexical-semantic processing) that was similar in magnitude for the regular word APPLE and its counterpart 4PPL3.

The above-cited findings strongly suggest that the perceptual input produced by words composed of letter-like digits or symbols (4PPL3, M473RI4L, MΔT€R!ΔL) is comparable to that of regular words. Models of visual word recognition can easily accommodate this phenomenon by assuming that letter detectors tolerate “some shape distortion” (Dehaene & Cohen, 2008, p. 458). That is, in a reading context, the letter A is the best-matching letter for the nonletter form 4 in M4TERI4L (Kinoshita et al., 2013). Therefore, upon presentation of the embedded letter-like digit 4 in M4TERI4L, the letter A would be activated resulting in a processing advantage of M4TERI4L-MATERIAL over the control M5TERI2L-MATERIAL.

A research question with important implications for models of visual word recognition is whether these visual similarity effects are restricted to nonletter forms during word recognition or whether they also occur with visually similar letters (e.g., H and A: MHTERIHL is more visually similar to MATERIAL than MDTERIDL is). Prior studies using isolated letters have consistently shown that visually similar letters (e.g., A and H) are more confusable than visually dissimilar letters (e.g., A and D) (Mueller & Weidemann, 2012). For instance, using a two-alternative perceptual identification task with masked isolated letters, Kinoshita et al. (2013) found that participants made more errors on a target letter (e.g., A) when the distractor was visually similar (e.g., H) than when the distractor was visually different (e.g., D); they found exactly the same pattern with letter-like digits (e.g., 4 [but not 6] was confusable with the letter A). Thus, one might expect a parallel letter visual-similarity effect with words. However, as argued by Kinoshita et al. (2013), the letter H itself is the best-matching letter for the letter form H in MHTERIHL, and hence the letter A would not be activated during the processing of MHTERIHL (note that A would be the best-matching letter for 4 [i.e., a nonletter form] in M4TERI4L). Therefore, the visually similar prime MHTERIHL would not activate MATERIAL to a larger degree than the visually dissimilar prime MDTERIDL. Indeed, the interactive activation model (McClelland & Rumelhart, 1981) predicts a null effect of letter visual-similarity using the default parameters. For instance, the number of cycles to identify the word CODE is virtually the same when briefly preceded by the visually similar one-letter different prime CQDE (O and Q share all letter features except one in the orthographic scheme of the interactive activation model) and by the visually different one-letter different prime CXDE (115 vs. 115 processing cycles using the Davis, 2010, simulator). A similar null effect occurs when running simulations on other leading models of word recognition (e.g., Davis, 2010, spatial coding model).

Alternatively, one could argue that, in the initial stages of word processing, the identity of the letters that constitute the visual input comes with some degree of uncertainty, which also happens with letter order (e.g., JUGDE may be initially processed as JUDGE; see Perea & Lupker, 2004). In this scenario, the groups of neurons responsible for encoding the visual features of the letter H may initially produce evidence compatible not only with the letter H but also with other visually similar letters (e.g., A). This perceptual uncertainty would be resolved with further processing at later stages (Bicknell & Levy, 2010; Norris & Kinoshita, 2012). Consequently, the letter H in MHTERIHL could activate to some degree the letter representation of A at the early stages of word processing, thus producing a processing advantage of MHTERIHL-MATERIAL over a visually different control condition like MDTERIDL-MATERIAL.

The empirical evidence of letter visual-similarity effects at the initial moments of word processing is scarce and restricted to behavioral experiments. Kinoshita et al. (2013) conducted a masked priming lexical decision experiment that included priming conditions with letter-like digits ([visually similar] 484NDON-abandon vs. [visually dissimilar] 676NDON-abandon) and priming conditions with replaced letters ([visually similar] HRHNDON-abandon vs. [visually dissimilar] DWDNDON-abandon). For letter-like digits, Kinoshita et al. (2013) found faster word identification times for 484NDON-abandon than for 676NDON-abandon, hence replicating the findings reported by Perea et al. (2008). For letter-replacement primes, word identification times were faster for HRHNDON-abandon than for DWDNDON-abandon, but the difference only approached significance (p = 0.09). More recently, Marcet and Perea (2017) conducted two masked priming lexical decision experiments with a larger number of data points per condition than in the Kinoshita et al. (2013) experiment (2,160 vs. 740, respectively). To create the replaced-letter primes from the target words, Marcet and Perea (2017) substituted a single letter that was visually very similar (e.g., i→j, as in dentjst-DENTIST) or not (e.g., i→g, as in dentgst-DENTIST) using the Simpson et al. (2012) ratings of visual letter similarity. To assess how effective the visually similar primes were, they also included an identity condition. Marcet and Perea (2017) found faster word identification times in the visually similar substituted-letter condition than in the visually dissimilar substituted-letter condition (e.g., dentjst-DENTIST was responded to faster than dentgst-DENTIST). Furthermore, the visually similar substituted-letter condition produced word identification times that were only slightly slower than those in the identity condition (see also Marcet & Perea, 2018b, for evidence of this pattern during sentence reading using the boundary technique). 
Likewise, Marcet and Perea (2018a) found faster lexical decision times on a target word when preceded by a visually similar prime containing a multi-letter homoglyph (docurnent-DOCUMENT, where rn is visually similar to m) than when preceded by an orthographic control prime (e.g., docusnent-DOCUMENT); again, the visually similar condition yielded word identification times only slightly slower than those in the identity condition. Thus, the behavioral evidence suggests that letter visual-similarity affects the early moments of word recognition (note that Gomez et al., 2013, provided empirical and modelling evidence that masked priming effects reflect early encoding processes). The limitation of the Marcet and Perea (2017, 2018a) experiments is that the obtained letter visual-similarity effects cannot be unambiguously attributed to orthographic overlap between the stimuli (i.e., more orthographic overlap between dentjst and DENTIST than between dentgst and DENTIST) or to lexical-semantic activation (i.e., more lexical activation from dentjst to DENTIST than from dentgst to DENTIST), thus highlighting the need for additional evidence from a technique with better temporal resolution.

The main purpose of the current masked priming lexical decision experiments was to track the time course of the effects of letter visual-similarity as they unfold in time. Unlike response times, which only provide a response at the end of processing, ERPs provide online, continuous measures during the course of word processing. We focused on two key components that have been respectively associated with orthographic overlap between prime-target pairs (the N250) and with lexical-semantic interactions (the N400) in masked priming experiments (see Footnote 1). The N250 is a negative-going component that peaks around 250 ms post target onset, usually ranges from 150 to 350 ms, and has a widespread scalp distribution centered over midline and central-anterior electrode sites. The N250 component is thought to reflect the sublexical orthographic overlap between prime and target, because it shows a gradient modulated by orthographic similarity (being more negative to porch-TABLE [unrelated] < teble-TABLE [one-letter replaced] < table-TABLE [identity]; Holcomb & Grainger, 2006; Kiyonaga, Grainger, Midgley, & Holcomb, 2007). The N400 is a negative-going component that peaks around 400 ms, ranges approximately from 350 to 500 ms, and has a widespread central scalp distribution. In masked priming experiments, the N400 is larger (i.e., more negative) for word targets when preceded by an unrelated prime (porch-TABLE) than when preceded by an orthographically related prime (teble-TABLE). In turn, negative-going amplitudes in the N400 are larger in the orthographically related condition (teble-TABLE) than in the identity condition (table-TABLE) (Holcomb & Grainger, 2006). The N250 is a domain-specific component thought to reflect the mapping of orthographic information onto whole-word representations, either directly or using phonological codes. 
In the context of visual word recognition, the N400 is thought to reflect an interaction between the lexical level (i.e., whole-word units) and the semantic level, matching the orthographic word representations to the concepts stored in memory (see Grainger & Holcomb, 2009, for discussion).

In the present experiments, we recorded both behavioral (word identification times, accuracy) and event-related potential (ERP) measures to target stimuli preceded by a masked prime. The priming conditions were the same as in the behavioral experiments conducted by Marcet and Perea (2017): (a) a visually similar one-letter replacement prime [SIM] (e.g., dentjst-DENTIST); (b) a visually dissimilar one-letter replacement prime [DIS] (e.g., dentgst-DENTIST); and (c) an identity prime [ID] (e.g., dentist-DENTIST). As in the Marcet and Perea (2017) experiments, the visually similar letters were i/j (Experiment 1) and u/v (Experiment 2); these pairs of letters are visually very similar in letter confusability ratings (5.17 and 4.93 out of 7, respectively, in the ratings of Simpson et al., 2012). Behaviorally, we expected faster word identification times in the visually similar (SIM) letter condition than in the visually dissimilar (DIS) letter condition (e.g., dentjst-DENTIST faster than dentgst-DENTIST), that is, the same pattern as in the Marcet and Perea (2017) experiments. More importantly, the examination of the ERPs allowed us to track the time course of these differences. If, early in orthographic processing, there were initial uncertainty about the identities of the letters that constitute a word, and this uncertainty were resolved later in processing, the perceptual input initially produced by the SIM condition would be comparable to that of the ID condition (dentjst-DENTIST would be processed similarly to dentist-DENTIST) but not to that of the DIS condition (dentgst-DENTIST). In this scenario, we expected that the N250 (i.e., the sublexical orthographic component) would be more negative for the DIS than for the SIM condition (in the extreme case, the SIM condition would elicit ERPs close to those of the ID condition), whereas at a later time window (N400; the lexical-semantic component), the SIM and DIS conditions would behave similarly, both with larger negative-going amplitudes than the ID condition. 
Alternatively, if there were some degree of uncertainty concerning letter identity at both the orthographic and lexical-semantic stages, one would expect larger negative-going amplitudes in the DIS than in the SIM condition not only in the N250 component but also in the N400 component. Finally, if the abstract representations of the letters were accessed early in processing regardless of visual letter-similarity, one would expect similarly larger negative-going amplitudes in the SIM and DIS conditions than in the ID condition in both the N250 and N400 components.

Experiment 1

Method

Participants

A group of 27 undergraduate students from the University of Valencia (Spain) were recruited for this study. The data of four participants were discarded because of incomplete data sets (1 participant) or noisy electroencephalogram (EEG) recordings (3 participants; mostly due to blinks and alpha activity). The remaining 23 participants’ ages ranged from 18 to 32 years (15 females; mean age = 22 years, standard deviation [SD] = 3.7). All participants were native speakers of Spanish with no history of neurological or psychiatric impairment and with normal or corrected-to-normal vision. All participants were right-handed, as assessed with a Spanish abridged version of the Edinburgh handedness inventory (Oldfield, 1971). Written, informed consent was obtained from all participants.

Materials

We employed a set of 228 word targets extracted from the stimuli used by Marcet and Perea (2017). The average Zipf frequency in the EsPal database (Duchon, Perea, Sebastián-Gallés, Martí, & Carreiras, 2013) was 3.65 (range: 1.72–5.91), the average number of letters was 7.6 (range: 5–11), and the average Levenshtein distance was 2.2 (range: 1.3–4.3). All words had the letters i or j in an internal position (e.g., DENTISTA [dentist]; PASAJERO [passenger]). For each target word, we created three primes: (1) an identity prime (dentista-DENTISTA; pasajero-PASAJERO); (2) a pseudoword prime created by replacing a single letter with a visually similar letter (e.g., i→j, as in dentjsta-DENTISTA; j→i, as in pasaiero-PASAJERO); (3) a pseudoword prime created by replacing a single letter with a visually dissimilar letter (e.g., dentgsta-DENTISTA; pasauero-PASAJERO). Note that the outline letter shape was the same in the visually similar and visually dissimilar primes. To act as foils in the lexical decision task, we selected 228 pseudoword targets from the Marcet and Perea (2017) stimuli. All the pseudoword targets had the letters i or j in a middle position (e.g., BESTINDA; MOMAJERA) and were preceded by a prime with the same characteristics as in the word trials. To rotate the priming conditions across the word/pseudoword targets, we created three counterbalanced lists in a Latin square manner. The complete set of prime-target stimuli is presented in Appendix A.
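
The rotation of priming conditions across lists can be illustrated with a minimal sketch. It assumes a simple modular rotation over item indices; the function name and the indexing scheme are hypothetical and are not taken from the original stimulus lists.

```python
CONDITIONS = ["ID", "SIM", "DIS"]  # identity, visually similar, visually dissimilar

def latin_square_lists(n_items, conditions=CONDITIONS):
    """Return one dict per list, mapping item index -> priming condition.

    Assumption: items are rotated through the conditions by index, so each
    item appears in every condition exactly once across the lists.
    """
    n_lists = len(conditions)
    lists = []
    for list_idx in range(n_lists):
        assignment = {
            item: conditions[(item + list_idx) % n_lists]
            for item in range(n_items)
        }
        lists.append(assignment)
    return lists

# 228 targets rotated across three counterbalanced lists.
lists = latin_square_lists(228)
per_condition = [
    {cond: sum(1 for c in lst.values() if c == cond) for cond in CONDITIONS}
    for lst in lists
]
```

With 228 targets and three conditions, this scheme yields 76 targets per condition in each list, and every target is paired with each prime type in exactly one list.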

Procedure

Participants sat comfortably in a dimly lit and sound-attenuated room. All stimuli were presented on a high-resolution monitor that was positioned slightly below eye level, 85–90 cm in front of the participant. The size of the stimuli and distance from the screen allowed for a visual angle of less than 5 degrees horizontally. Stimuli were presented in white 24-pt Consolas font against a dark-gray background. Stimulus display was controlled by Presentation software (Neurobehavioral Systems). All of the stimuli were displayed at the center of the screen.
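
As a rough check on the viewing geometry, the sketch below applies the standard visual-angle formula, θ = 2·arctan(w / 2d). The function names are illustrative assumptions, and the physical width of the stimuli on screen is not reported here, so the code only derives the width limit implied by the stated angle and distance.

```python
import math

def visual_angle_deg(width_cm, distance_cm):
    """Visual angle (degrees) subtended by a stimulus of width_cm at distance_cm."""
    return math.degrees(2 * math.atan(width_cm / (2 * distance_cm)))

def max_width_cm(angle_deg, distance_cm):
    """Largest stimulus width (cm) that stays within angle_deg at distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg / 2))

# Widest stimulus that subtends 5 degrees at the nearest viewing distance (85 cm).
limit_cm = max_width_cm(5, 85)
```

Under these assumptions, a stimulus would need to be narrower than roughly 7.4 cm at 85 cm to subtend less than 5 degrees horizontally.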

The sequence of events in each trial was as follows: the participant was presented with a pattern mask (a series of “#” signs that matched the length of the target item) for 500 ms. A lowercase prime replaced the mask in the same spatial location for 50 ms and was replaced by an uppercase target (either a word or a pseudoword), which remained on the screen until the participant responded or 2,000 ms had elapsed. After participants’ response, a blank screen of a random duration between 700 and 1,000 ms was shown. To minimize participant-generated artifacts in the EEG signal during the presentation of the experimental stimuli, participants were asked to refrain from blinking and moving from the onset of each trial to the set up period after response. Brief 10-second breaks occurred every 60 trials. Every 270 trials, there was a brief pause for resting and impedance checking. Participants were asked to decide as fast and accurately as possible if the target stimulus was a real Spanish word or not. They pressed one of two response buttons (either the YES button or the NO button). The hand used for each response was balanced across participants. Lexical decision times were measured from target onset until the participant’s response. Each participant was randomly assigned to one of the three counterbalanced lists. The order of the trials was randomized for each participant. Before the experiment began, participants were given a brief 16-trial practice session to acquaint them with the task. The stimuli used in the practice session were different from those used in the actual experiment. The whole session, including set up, lasted approximately 1.5 hours.

EEG recording and analysis

The electroencephalogram (EEG) was recorded from 29 Ag/AgCl electrodes mounted in an elastic cap (EASYCAP GmbH, Herrsching, Germany) according to the 10/20 system (Figure 1). Eye movements and blinks were monitored with four electrodes providing bipolar recordings of the horizontal and vertical (over the left eye) electrooculogram (EOG). Signals were sampled continuously throughout the experiment with a sampling rate of 250 Hz and filtered offline with a bandpass filter of 0.01–40 Hz. Data from scalp and eye electrodes were referenced offline to the average of left and right mastoids. Initial analysis of the EEG data was performed using the ERPLAB plugin (Lopez-Calderon & Luck, 2014) for EEGLAB (Delorme & Makeig, 2004). Epochs of the EEG corresponding to 100 ms pre- to 550 ms post-target onset were analyzed. Baseline correction was performed using the average EEG activity in the 100 ms preceding the onset of the target stimuli. Following baseline correction, trials with muscle activity, blinks, or other artifacts were rejected (9.4%); rejecting trials with blinks also ensured that participants saw the briefly presented prime. All participants had a minimum of 46 acceptable correct trials per condition (ID: M = 64, SD = 7; SIM: M = 65, SD = 5; DIS: M = 64, SD = 7). There were no significant differences in the number of trials accepted per condition (ID vs. SIM: t(22) = −1.06, p = 0.30; SIM vs. DIS: t(22) = 1.38, p = 0.18; ID vs. DIS: t(22) = 0.19, p = 0.85). As in previous similar studies (Kinoshita et al., 2013; Lupker, Perea, & Davis, 2008; Marcet & Perea, 2017, 2018a), we focused on the word trials, because masked priming effects for nonword trials are absent or minimal.
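
The baseline-correction step can be sketched in a few lines. This is a simplified single-channel illustration, not the ERPLAB routine used in the study; the function names and the list-based epoch layout (baseline samples first) are assumptions made for the example.

```python
SAMPLING_RATE_HZ = 250
MS_PER_SAMPLE = 1000 / SAMPLING_RATE_HZ  # 4 ms between samples at 250 Hz

def ms_to_samples(duration_ms):
    """Convert a duration in ms to a number of samples at 250 Hz."""
    return round(duration_ms / MS_PER_SAMPLE)

def baseline_correct(epoch, baseline_ms=100):
    """Subtract the mean of the pre-target baseline from every sample.

    Assumption: `epoch` is a list of voltages for one channel whose first
    `baseline_ms` milliseconds precede target onset.
    """
    n_base = ms_to_samples(baseline_ms)
    baseline_mean = sum(epoch[:n_base]) / n_base
    return [value - baseline_mean for value in epoch]

# Toy epoch: a flat 2.0 microvolt baseline followed by a 5.0 microvolt plateau.
toy_epoch = [2.0] * 25 + [5.0] * 10
corrected = baseline_correct(toy_epoch)
```

At a 250-Hz sampling rate, the 100-ms baseline corresponds to 25 samples, so the correction removes any constant offset present before target onset.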

Fig. 1

Schematic representation of the electrode montage. Electrodes are grouped in four different areas (anterior-left, anterior-right, posterior-left, and posterior-right) for statistical analyses.

To characterize the time course and scalp distribution of letter visual-similarity effects (similar vs. dissimilar condition), we performed statistical analyses on the mean voltage values for two different time windows: 230-350 ms and 400-500 ms. These epochs allowed for detailed assessment of the N250 and N400 components, respectively. The selection of these epochs was based on visual inspection of the waveforms and on prior literature (see Laszlo & Federmeier, 2014, for a data-driven approach to investigating the time course of orthographic and semantic effects that validates the typically used a priori time windows). To further depict the data in a more data-driven manner, we also conducted repeated-measures t-tests at every 4-ms interval between 1 and 550 ms at all scalp sites for the effects of visual letter similarity (SIM vs. DIS and SIM vs. ID; also for ID vs. DIS; Figure 2c). The differences shown by this approach are consistent with the selected time windows and electrode groups (Figure 2, left panel).
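
The two measures described above, mean voltage within a time window and a repeated-measures t statistic, can be sketched as follows. The function names, the time axis starting at −100 ms, and the toy data are assumptions for illustration; the p value would additionally require the t distribution with n − 1 degrees of freedom, which is omitted here.

```python
from math import sqrt
from statistics import mean, stdev

def mean_amplitude(epoch, times_ms, start_ms, end_ms):
    """Mean voltage of the samples whose time stamps fall within [start_ms, end_ms]."""
    selected = [v for v, t in zip(epoch, times_ms) if start_ms <= t <= end_ms]
    return sum(selected) / len(selected)

def paired_t(values_a, values_b):
    """Repeated-measures t statistic for two conditions measured on the same participants."""
    diffs = [a - b for a, b in zip(values_a, values_b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Assumed time axis: one sample every 4 ms (250 Hz), starting 100 ms before target onset.
times_ms = [-100 + 4 * i for i in range(163)]
toy_epoch = [float(t) for t in times_ms]  # toy signal equal to its own time stamp

n250_mean = mean_amplitude(toy_epoch, times_ms, 230, 350)
n400_mean = mean_amplitude(toy_epoch, times_ms, 400, 500)
```

In practice, `mean_amplitude` would be applied per participant, condition, and electrode group before the ANOVAs, whereas `paired_t` would be applied at each 4-ms sample and electrode for the point-by-point comparisons.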

Fig. 2

(a) Grand average ERPs to targets in the ID (black line), SIM (blue line), and DIS (red line) priming conditions. The differences between the SIM and DIS conditions in the first time window (230-350 ms) are highlighted in grey. The differences between the SIM and ID conditions in the second time window (400-500 ms) are highlighted in blue. (b) Topographic distribution of the effects of visual letter similarity (calculated as the difference in voltage amplitude between the ERP responses to the SIM vs. DIS and ID vs. SIM priming conditions) in the two time windows of the analysis. (c) Results of the (uncorrected) univariate statistical analyses of the time course of the effects of letter visual-similarity for each of the three comparisons (SIM vs. DIS; SIM vs. ID; DIS vs. ID). The plots convey the results of the comparisons between 80 and 550 ms at all 27 electrodes (listed in an anterior-posterior progression within the left hemisphere at the top, midline, and right hemisphere at the bottom). p values below 0.001 are shown in blue; p values between 0.001 and 0.05 are shown in grey.

The contrast between the DIS and ID conditions is included in Figure 2 for comparison purposes (i.e., it shows the typical N250 and N400 effects reported in previous masked form priming experiments; Holcomb & Grainger, 2006). We analyzed the topographical distribution of the ERP results by including the averaged amplitude values across three electrodes of four representative scalp areas (Figure 1) that resulted from the factorial combination of the factors hemisphere (left vs. right) and anterior-posterior (A-P) distribution (anterior vs. posterior): anterior left (F3, FC1, FC5), anterior right (F4, FC2, FC6), posterior left (CP1, CP5, P3), and posterior right (CP2, CP6, P4). Of note, we employed the same grouping of electrodes as in recent masked priming experiments examining the N250 and N400 conducted in our lab (Gutierrez-Sigut, Vergara-Martínez & Perea, 2017; Vergara-Martínez, Gómez, Jiménez & Perea, 2015). We performed a separate repeated-measures analysis of variance (ANOVA) for each of the two time windows; each ANOVA included the factors hemisphere, A-P distribution, and type of prime (SIM, DIS, and ID). As in the Marcet and Perea (2017, 2018a, 2018b) experiments, we mainly focused on the two novel, theoretically motivated a priori contrasts (i.e., SIM vs. DIS and ID vs. SIM). Nonetheless, for the sake of completeness, Figure 2a displays the ID versus DIS differences in the ERP waves. As expected, we obtained larger negative-going amplitudes in the one-letter visually dissimilar priming condition than in the identity condition in the N250 and N400 components, thus replicating prior research (Holcomb & Grainger, 2006). In all analyses, List (1–3) was included as a dummy between-subjects factor to remove the variance due to the lists (Pollatsek & Well, 1995). Effects of hemisphere or A-P distribution factors are only reported when they interact with the experimental manipulation. Interactions between factors were followed up with simple effects tests. 
When the sphericity assumption did not hold, we applied the Greenhouse-Geisser correction to adjust the degrees of freedom. For the pairwise comparisons across the factor Type of prime, p values were corrected using the Šidák correction (1967).
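
For reference, the Šidák adjustment for m comparisons is p_adj = 1 − (1 − p)^m. The sketch below implements this standard formula for the three pairwise comparisons among prime types; the function names are hypothetical, and whether the statistical software applied exactly this computation is an assumption.

```python
def sidak_adjust(p_value, n_comparisons):
    """Sidak-adjusted p value for one of n_comparisons pairwise tests."""
    return 1 - (1 - p_value) ** n_comparisons

def sidak_alpha(alpha, n_comparisons):
    """Per-comparison alpha that keeps the familywise error rate at `alpha`."""
    return 1 - (1 - alpha) ** (1 / n_comparisons)

# Three pairwise comparisons: SIM vs. DIS, ID vs. SIM, and ID vs. DIS.
adjusted = sidak_adjust(0.01, 3)          # an unadjusted p of 0.01, chosen for illustration
per_test_alpha = sidak_alpha(0.05, 3)
```

The adjustment is slightly less conservative than the Bonferroni correction, which would multiply the unadjusted p value by the number of comparisons.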

Results and Discussion

Behavioral results

Error responses (5.6%) and lexical decision times shorter than 250 ms or longer than 2,000 ms (1 observation) were omitted from the latency analyses. To examine the effect of type of prime, we performed ANOVAs that paralleled those conducted with the ERP data (Type of prime: SIM, DIS, and ID; the dummy factor List also was included) separately for the latency and accuracy data. These analyses were conducted over subjects (F1) and over items (F2).
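
The trimming procedure for the latency analyses can be sketched as follows. The trial records and field names (`prime`, `rt_ms`, `correct`) are hypothetical toy data, not the experimental data set.

```python
from statistics import mean

def trimmed_condition_means(trials, low_ms=250, high_ms=2000):
    """Mean correct RT per prime type, excluding errors and out-of-range latencies."""
    kept = {}
    for trial in trials:
        if trial["correct"] and low_ms <= trial["rt_ms"] <= high_ms:
            kept.setdefault(trial["prime"], []).append(trial["rt_ms"])
    return {condition: mean(rts) for condition, rts in kept.items()}

# Toy trials (invented values for illustration only).
toy_trials = [
    {"prime": "ID", "rt_ms": 660, "correct": True},
    {"prime": "ID", "rt_ms": 680, "correct": True},
    {"prime": "SIM", "rt_ms": 690, "correct": True},
    {"prime": "SIM", "rt_ms": 200, "correct": True},   # too fast: excluded
    {"prime": "DIS", "rt_ms": 700, "correct": True},
    {"prime": "DIS", "rt_ms": 710, "correct": True},
    {"prime": "DIS", "rt_ms": 650, "correct": False},  # error: excluded
]

means = trimmed_condition_means(toy_trials)
similarity_effect_ms = means["DIS"] - means["SIM"]
```

The priming effects reported below correspond to differences of this kind between condition means (e.g., DIS minus SIM for the visual-similarity effect).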

The statistical analyses of the word identification times showed a main effect of type of prime, F1(2,40) = 20.92, MSE = 366.3, p < 0.001, ƞ2 = 0.51; F2(2, 450) = 21.86, MSE = 4085.5, p < 0.001, ƞ2 = 0.09. This reflected an 18-ms advantage of the SIM condition over the DIS condition (687 vs. 705 ms; p = 0.004 and p < 0.001 in the by-subjects and by-items analyses, respectively). We also found a 20-ms advantage of the ID condition over the SIM condition (667 vs. 687 ms, respectively; p = 0.016 and p = 0.007 in the by-subjects and by-items analyses, respectively). Finally, the 38-ms advantage of the ID condition over the DIS condition also was significant (both ps < 0.001).

The analyses of the error rates showed a main effect of type of prime, F1(1.36, 27.22) = 9.18, MSE = 4.454, p = 0.003, ƞ2 = 0.32; F2(1.88, 423.49) = 7.31, MSE = 41.2, p = 0.001, ƞ2 = 0.03. This reflected that participants committed fewer errors in the SIM than in the DIS condition (2.2 vs. 4.0%; p = 0.008 and p = 0.003 in the by-subjects and by-items analyses, respectively), whereas there were no differences between the SIM and ID conditions (2.2% and 2.0%, respectively), both ps > 0.50. Finally, participants committed fewer errors in the ID than in the DIS condition (p = 0.002 and p = 0.003 in the by-subjects and by-items analyses, respectively; see Footnote 2).

ERP results

Figure 2a shows the ERP waves of the identity (ID), similar (SIM), and dissimilar (DIS) conditions in six representative electrodes from the four areas of interest. The ERPs showed a small negative-going potential peaking around 50 ms, followed by a positive potential peaking around 190 ms (range: 100-240 ms). These early components were followed by negative-going waves that started at around 240 ms and lasted until the end of the epoch (550 ms). Within this negativity, two negative peaks can be observed at approximately 320 and 390 ms, respectively. The first ERP component to show differences in amplitude was the N250, a negative-going component that peaked around 320 ms after target onset. For this component, the DIS condition showed a larger negativity than both the ID and SIM conditions. Further differences were found in the N400 time window, where both DIS and SIM showed larger negative-going amplitudes than the ID condition.

230- to 350-ms window

The main effect of type of prime was significant, F(2,40) = 5.14, MSE = 10.96, p = 0.010, ƞ2 = 0.21, whereas none of the interactions approached significance (all ps > 0.23). Unsurprisingly, the effect of type of prime reflected larger negative-going amplitudes for the DIS condition than for the ID condition, p = 0.007. More important, the DIS condition also showed larger negative-going amplitudes than the SIM condition (p = 0.007), whereas there were no signs of a difference between the SIM and ID conditions (p = 0.624).

400- to 500-ms window

The main effect of type of prime also was significant, F(2,40) = 9.47, MSE = 19.71, p < 0.001, ƞ2 = 0.32, and again, none of the interactions was significant (all ps > 0.19). The DIS condition showed larger negative-going amplitudes than the ID condition (p < 0.001). More importantly, the DIS condition also showed larger negative-going amplitudes than the SIM condition, although the difference only approached significance (p = 0.074). Finally, the SIM condition showed larger negative-going amplitudes than the ID condition, p = 0.036.

In summary, the behavioral data showed that the visually similar condition (SIM condition; e.g., dentjst-DENTIST) produced faster word identification times and fewer errors than the visually dissimilar condition (DIS condition; e.g., dentgst-DENTIST), thus replicating the behavioral findings reported by Marcet and Perea (2017). More important, the ERP data revealed differences between the SIM and DIS conditions at the 230- to 350-ms time window, with larger negative-going amplitudes for the DIS than for the SIM condition. Furthermore, the SIM condition produced ERP waves comparable to those of the identity (ID) condition (e.g., dentist-DENTIST) (Figure 2). When inspecting the N400 component, we found more negative-going amplitudes in the SIM than in the ID condition, whereas the difference between the SIM and DIS conditions only approached significance. These findings suggest that, at an early orthographic stage, there is some degree of confusability when encoding letter identities (N250: ID = SIM < DIS), which tends to vanish at later processing stages.

Experiment 2 was designed to replicate Experiment 1 using another set of items in which the visually similar pairs were u/v instead of i/j (see Marcet & Perea, 2017, for a similar strategy). This new experiment also allowed us to conduct a combined, more powerful analysis of the time course of visual letter-similarity effects.

Experiment 2

Method

A group of 24 students from the University of Valencia (Spain) were recruited for this study. The data of four participants were discarded because of noisy electroencephalogram (EEG; mostly due to alpha activity and blinks) recording. The remaining 20 participants’ ages ranged from 18 to 30 years (12 females; mean age = 22.6 years, SD = 4.5). All participants were right-handed, native speakers of Spanish with no history of neurological or psychiatric impairment and with normal or corrected-to-normal vision. Written, informed consent was obtained from all participants.

Materials

We employed a set of 228 word targets extracted from the stimuli used by Marcet and Perea (2017). We chose the same number of items as in Experiment 1. The average Zipf frequency in the EsPal database (Duchon et al., 2013) was 4.08 (range: 3.33–5.50), the average number of letters was 7.5 (range: 5–11), and the average Levenshtein distance was 2.1 (range: 1.2–4.3). All words had the letters u or v in a middle position (e.g., NEUTRAL; CAVERNA [cavern]). The prime-target conditions were parallel to those in Experiment 1 (i.e., identity condition [neutral-NEUTRAL; caverna-CAVERNA]; visually similar condition [nevtral-NEUTRAL; cauerna-CAVERNA]; visually dissimilar condition [neztral-NEUTRAL; caoerna-CAVERNA]). We also selected 228 pseudoword targets from the Marcet and Perea (2017) stimuli; these stimuli had the letters u or v in a middle position (e.g., CARCURA; OLCLIVO) and were preceded by a prime with the same characteristics as in the word trials. We created three counterbalanced lists to rotate the priming conditions across the word/pseudoword targets. The complete set of prime-target stimuli is presented in Appendix B.

Procedure

The procedure was the same as in Experiment 1.

EEG recording and analysis

The EEG recording and analysis were the same as in Experiment 1. Trials with artifacts (i.e., eye movements, blinks, muscle activity, etc.) were rejected (14.4%). All participants had a minimum of 33 acceptable correct trials per condition (ID: M = 63, SD = 10; SIM: M = 63, SD = 11; DIS: M = 62, SD = 10). There were no significant differences in the number of trials accepted per condition (ID vs. SIM: t(19) = −0.042, p = 0.97; SIM vs. DIS: t(19) = 1.11, p = 0.28; ID vs. DIS: t(19) = 1.28, p = 0.22). Importantly, because this experiment was parallel to Experiment 1, except for the set of items and the visually similar letters (u/v instead of i/j), we performed statistical analyses on the mean voltage values for the same two time windows (230-350 ms and 400-500 ms) and the same electrode groups. Visual inspection of the morphology of the ERP waves (see below and Figure 3) confirmed that the selected time windows and electrode groups allow for the examination of the N250 and N400 components, respectively.
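As an illustration of the dependent measure, the mean voltage in each time window can be computed from an averaged waveform as sketched below. This is not the authors' analysis pipeline: the function `window_mean`, the 1-ms sampling grid, the epoch limits, and the random waveforms are assumptions made only for the example.

```python
import numpy as np

# Time windows taken from the text (ms after target onset).
WINDOWS_MS = {"N250": (230, 350), "N400": (400, 500)}


def window_mean(erp, times_ms, window):
    """Mean voltage of a waveform within [start, end] ms (inclusive).

    `erp` may have leading dimensions (e.g., electrodes); time is the last axis.
    """
    start, end = window
    mask = (times_ms >= start) & (times_ms <= end)
    return erp[..., mask].mean(axis=-1)


# Synthetic waveforms for illustration only: one sample per ms from -100 to 549 ms
# (sampling rate and baseline period are assumptions, not the study's settings).
times_ms = np.arange(-100, 550)
rng = np.random.default_rng(0)
erps = {cond: rng.normal(size=times_ms.size) for cond in ("ID", "SIM", "DIS")}

means = {
    (cond, name): window_mean(wave, times_ms, win)
    for cond, wave in erps.items()
    for name, win in WINDOWS_MS.items()
}

# Condition differences of the kind shown in topographic maps.
sim_minus_dis_n250 = means[("SIM", "N250")] - means[("DIS", "N250")]
id_minus_sim_n400 = means[("ID", "N400")] - means[("SIM", "N400")]
```

In a real analysis these window means would be computed per participant and electrode group before entering the ANOVA.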

Fig. 3

(a) Grand average ERPs to targets preceded by the ID (black line), SIM (blue line), and DIS (red line) priming conditions. The differences between the SIM and DIS conditions in the first time window (230-350 ms) are highlighted in grey. The differences between the SIM and ID conditions in the second time window (400-500 ms) are highlighted in blue. (b) Topographic distribution of the effects of visual letter similarity (calculated as the difference in voltage amplitude between the ERP responses to the SIM vs. DIS and ID vs. SIM conditions) in the two time windows of the analysis.

Results and discussion

The inferential analyses, both behavioral and ERP, were parallel to those conducted in Experiment 1.

Behavioral results

Error responses (3.2%) and lexical decision times shorter than 250 ms or longer than 2,000 ms (0 observations) were omitted from the latency analyses. The statistical analyses of the word response times showed a main effect of type of prime, F1(2,34) = 14.17, MSE = 424.9, p < 0.001, ƞ2 = 0.46; F2(2, 450) = 13.33, MSE = 5168.1, p < 0.001, ƞ2 = 0.06. This reflected a 22-ms advantage of the SIM condition over the DIS condition (671 vs. 693 ms, p = 0.003 and p < 0.001 in the by-subjects and by-items analyses, respectively). In addition, the ID condition showed a 16-ms advantage over the SIM condition (655 vs. 671 ms, p = 0.040 and p = 0.054 in the by-subjects and by-items analyses, respectively). Finally, the ID vs. DIS comparison also was significant, both ps < 0.001.
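The trimming rule and the priming effects described above amount to simple arithmetic, sketched here. The trial tuples and the function `trimmed_condition_means` are hypothetical and the trial values are made up; only the cutoffs (250 and 2,000 ms) and the three condition means (655, 671, and 693 ms) come from the text.

```python
from statistics import mean

RT_MIN_MS, RT_MAX_MS = 250, 2000  # latency cutoffs reported in the text


def trimmed_condition_means(trials):
    """Mean correct RT per condition after applying the latency cutoffs.

    `trials` is an iterable of (condition, rt_ms, correct) tuples; this
    layout is a hypothetical one chosen for the example.
    """
    by_condition = {}
    for condition, rt_ms, correct in trials:
        # Drop error responses and RTs shorter than 250 ms or longer than 2,000 ms.
        if correct and RT_MIN_MS <= rt_ms <= RT_MAX_MS:
            by_condition.setdefault(condition, []).append(rt_ms)
    return {cond: mean(rts) for cond, rts in by_condition.items()}


# Made-up trials for illustration only.
trials = [
    ("ID", 640, True), ("ID", 670, True), ("ID", 2300, True),
    ("SIM", 660, True), ("SIM", 682, True), ("SIM", 700, False),
    ("DIS", 690, True), ("DIS", 696, True), ("DIS", 180, True),
]
means = trimmed_condition_means(trials)

# Priming effects computed from the condition means reported in the text.
reported = {"ID": 655, "SIM": 671, "DIS": 693}
sim_advantage = reported["DIS"] - reported["SIM"]  # SIM vs. DIS: 22 ms
id_advantage = reported["SIM"] - reported["ID"]    # ID vs. SIM: 16 ms
```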

The statistical analyses of the error rates showed a main effect of type of prime, F1(2,34) = 3.32, MSE = 4.05, p = 0.048, ƞ2 = 0.16; F2(1.90, 426.3) = 3.77, MSE = 47.8, p = 0.026, ƞ2 = 0.02. On average, participants committed a similar percentage of errors in the SIM and DIS conditions (3.5% vs. 3.9%, both ps > 0.50). In addition, participants committed more errors in the SIM than in the ID condition (3.5% vs. 2.2%, respectively; p = 0.028 and p = 0.021 in the by-subjects and by-items analyses, respectively). Finally, participants made more errors in the DIS than in the ID condition (p = 0.05 and p = 0.007 in the by-subjects and by-items analyses, respectively; see Footnote 3).

ERP results

Figure 3a shows the ERP waves of the identity (ID), similar (SIM), and dissimilar (DIS) conditions in six representative electrodes from the four areas of interest. The ERPs showed the same morphology as in Experiment 1. There was a small negative-going potential peaking around 50 ms, followed by a positive potential peaking around 200 ms (range: 100-240 ms). These early components were followed by negative-going waves from 240 ms that remained negative until the end of the epoch (550 ms). Within this negativity, two negative peaks can be observed at approximately 310 and 380 ms, respectively. The first ERP component to show differences in amplitude was the N250, a negative-going component that peaked around 320 ms after target onset. For this component, the DIS condition showed a larger negative-going amplitude than both the ID and SIM conditions. Further differences were found in the N400 time window, where both the DIS and SIM conditions showed larger negative-going amplitudes than the ID condition, a difference that appears to be slightly larger at posterior electrodes.

230- to 350-ms window

The main effect of type of prime was significant, F(2,34) = 3.36, MSE = 16.3, p = 0.046, ƞ2 = 0.17, whereas none of the interactions approached significance (all ps > 0.16).

The effect of type of prime reflected larger negative-going amplitudes for the DIS condition than for the ID and SIM conditions (p = 0.047 and p = 0.032, respectively). There were no differences between the SIM and ID conditions, p = 0.415.

400- to 500-ms window

We found a main effect of type of prime, F(2,34) = 4.23, MSE = 19.96, p = 0.023, ƞ2 = 0.20, whereas none of the interactions was significant (all ps > 0.20). This reflected that the DIS condition showed larger negative-going amplitudes than the ID condition, p = 0.010. The DIS also showed larger negative-going amplitudes than the SIM condition, p = 0.041, whereas we did not find a significant difference between SIM and ID conditions, p = 0.436.

To sum up, the analyses of the latency data mimicked those in Experiment 1. With respect to the ERP data, we found a pattern very similar to that of Experiment 1 in the N250 component (i.e., ID = SIM < DIS). However, the results for the N400 time window showed that the SIM condition differed from the DIS condition, whereas there were no significant differences between the SIM and ID conditions. To offer a more powerful test of the effects of letter visual-similarity during word processing, we carried out combined analyses of Experiments 1 and 2 with Experiment as a between-subjects factor.

Combined analyses of Experiments 1 and 2

Behavioral analyses

The statistical analyses of the word response times showed a main effect of type of prime, F1(2,74) = 34.32, MSE = 393.2, p < 0.001, ƞ2 = 0.48; F2(1.97, 886.3) = 34.22, MSE = 4698.2, p < 0.001, ƞ2 = 0.07. This reflected faster response times on the target words when preceded by a visually similar prime than when preceded by a visually dissimilar prime (p < 0.001 and p < 0.001 in the by-subjects and by-items analyses, respectively). In turn, response times on the target words were faster when preceded by an identity prime than when preceded by a visually similar prime (p < 0.001 and p < 0.001 in the by-subjects and by-items analyses, respectively). The interaction between type of prime and experiment was not significant (both ps > 0.1).

The ANOVAs on the error data showed a main effect of type of prime, F1(1.67, 61.87) = 9.20, MSE = 4.18, p = 0.001, ƞ2 = 0.20; F2(1.92, 866.56) = 8.76, MSE = 43.65, p < 0.001, ƞ2 = 0.01. This reflected that participants committed more errors to target words when preceded by a visually dissimilar prime than when preceded by a visually similar prime (p = 0.034 and p = 0.030 in the by-subjects and by-items analyses, respectively). In turn, participants committed more errors to target words when preceded by a visually similar prime than when preceded by an identity prime (p = 0.017 and p = 0.048 in the by-subjects and by-items analyses, respectively). The interaction between type of prime and experiment was not significant (both ps > 0.1).

ERP analyses

230- to 350-ms window

We found a main effect of type of prime, F(2,82) = 6.83, MSE = 12.97, p = 0.002, ƞ2 = 0.14. None of the interactions, including those with the factor Experiment, was significant (all ps > 0.16). The effect of prime reflected that the DIS condition showed larger negative-going amplitudes than the ID condition (p = 0.001) and the SIM condition (p = 0.015), whereas there were no significant differences between the SIM and ID conditions, p = 0.182.

400- to 500-ms window

The main effect of type of prime was significant, F(2,82) = 10.43, MSE = 20.75, p < 0.001, ƞ2 = 0.20. The interaction between type of prime and AP distribution approached significance, F(2,82) = 2.56, MSE = 2.65, p = 0.084, ƞ2 = 0.06, whereas the other interactions, including those with the factor Experiment, did not approach significance, all ps > 0.17. The effect of prime reflected that the DIS condition showed larger negative-going amplitudes than the ID condition (p < 0.001) and the SIM condition (p = 0.046), and that the SIM condition showed larger negative-going amplitudes than the ID condition, p = 0.023.

Thus, this combined analysis corroborated the behavioral findings and the N250 pattern (i.e., ID = SIM < DIS) from Experiments 1 and 2. More importantly, these analyses, which used a larger sample size (N = 43), provided a more complete picture of the effects of letter visual-similarity at a later stage of processing (i.e., N400: ID < SIM < DIS) than the data from the individual experiments.

General discussion

The main purpose of the present ERP masked priming experiments was to track the time course of letter visual-similarity effects during word processing. The behavioral data in both experiments replicated previous findings (Marcet & Perea, 2017) of faster word identification times for target words when preceded by a visually similar nonword (SIM condition; e.g., dentjst-DENTIST) than when preceded by a visually dissimilar nonword (DIS condition; e.g., dentgst-DENTIST) (see also Marcet & Perea, 2018b, for parallel evidence in eye fixation times when using parafoveal previews during sentence reading). The ERP data showed that, in both experiments, the DIS condition elicited larger negative-going amplitudes in the N250 time window than the SIM condition (i.e., DIS > SIM). Importantly, at this time window, the SIM condition did not differ from the ID condition (i.e., SIM = ID). Assuming that the N250 reflects a gradient of orthographic overlap between prime and target (Grainger & Holcomb, 2009), the present data strongly suggest that the visually similar one-letter different prime dentjst, but not the visually dissimilar one-letter different prime dentgst, initially produced a perceptual input similar to that of the identity prime dentist. This outcome favors the view that, early during orthographic processing, there is some uncertainty when attaining the abstract letter identities for visually similar letters (e.g., Bayesian Reader model; Norris & Kinoshita, 2012).

Later on, when inspecting lexical-semantic activation via the N400 component, the combined analyses showed not only larger negative-going amplitudes for the DIS than for the SIM condition, as in the N250 component, but also that the SIM condition elicited larger negative-going amplitudes than the ID condition. This latter difference suggests that, at this time window, the identity condition activated the lexical-semantic representations of the target words to a larger degree than the SIM condition. This can be taken as an indication that the visual ambiguity has been resolved. At this same time window, the DIS condition elicited larger N400 amplitudes than the SIM condition, suggesting that visual letter similarity might play a role at late processing stages (see Carreiras, Perea, Gil-López, Abu Mallouh, & Salillas, 2013; Madec, Rey, Dufau, Klein, & Grainger, 2012, for late ERP effects of visual similarity in single letter identification experiments). Further research is needed to examine the exact contribution of letter visual-similarity to the N400 component (see Carreiras, Armstrong, Perea, & Frost, 2014, for a discussion of recent advances that can aid the study of the contributions of feed-forward and top-down activations to the amplitude of the N400 component).

Therefore, the present experiments confirmed that visual similarity effects during word recognition are not limited to non-letter forms (numbers and symbols; e.g., M4TERI4L or MΔTERIΔL), but that they also occur with visually similar letters (see also Marcet & Perea, 2017, 2018a, for behavioral evidence). Critically, the ERP results in the 230- to 350-ms time window showed that the identity condition and the visually similar condition (e.g., dentist–DENTIST and dentjst–DENTIST) behaved similarly, whereas the visually dissimilar primes produced larger negative-going amplitudes. This pattern of data qualifies those hierarchical accounts of orthographic processing that propose access to the abstract orthographic representations by this time window, regardless of the physical similarities among letters. For instance, Grainger and Holcomb (2009) postulated a bimodal interactive activation model in which the orthographic information would be attained in the N250 component. Indeed, Holcomb and Grainger (2006) found more negative-going amplitudes in the one-letter different priming condition than in the identity condition in this component, which suggests that readers had access to the abstract letter representations of the stimuli. However, the one-letter replaced condition in the Holcomb and Grainger (2006) experiment was not visually similar to the identity condition (e.g., teble-TABLE vs. table-TABLE). Indeed, in the two experiments, we found more negative-going amplitudes in the visually dissimilar condition than in the identity condition in the N250 (Figures 2 and 3), hence replicating the Holcomb and Grainger (2006) experiment. Thus, the most parsimonious account of the current masked priming data is that in the 230- to 350-ms time window there is still some degree of uncertainty concerning letter identity during word recognition when the perceptual input involves visually similar letters (e.g., dentjst-DENTIST).

The present results can be accommodated by those models that assume that there is uncertainty concerning letter identities at the early stages of word recognition (e.g., Bayesian Reader model; Norris & Kinoshita, 2012; see also Bicknell & Levy, 2010, for a Bayesian model of eye movement control in reading). As Norris and Kinoshita (2012) indicated, “letter-identity and letter-order information accumulate gradually over time by a stimulus-sampling process” (p. 540), and this uncertainty is eventually resolved, as shown by the larger negative-going amplitudes for the SIM condition than for the ID condition in the N400 component. Therefore, the Bayesian Reader model can readily capture that at an earlier stage in processing (230- to 350-ms time window), the ERPs of the visually similar condition, but not those of the visually dissimilar condition, behave similarly to those of the identity condition, whereas later in processing, the ERPs of the identity condition behave differently from those of the visually similar condition. (We acknowledge that the Bayesian Reader model does not make any claims on the specific time windows of these effects, because this is a formal model that focuses on word identification times and accuracy rates.) The present data are also consistent with the idea of uncertainty with respect to the order of the letters during word processing (Davis, 2010; Gomez, Ratcliff, & Perea, 2008; see Vergara-Martínez, Perea, Gomez, & Swaab, 2013, for ERP evidence of the time course of transposed-letter effects). An avenue for future research would be to determine whether letter visual-similarity effects in a reading context occur in a purely bottom-up manner or whether they are modulated by higher level elements, such as the transitional probabilities of letters or top-down feedback from the lexical level (Dehaene & Cohen, 2007). That is, the letter j in dentjst may be interpretable as a letter i during word processing not only because of its visual similarity but also because of orthotactic/phonotactic or lexical constraints. Furthermore, because the critical features that determine letter perception are not entirely understood yet (see Rosa, Enneson, & Perea, 2016, for discussion), additional ERP experimentation should also investigate whether low-level spatial information interacts with the effects of letter visual-similarity during word recognition (e.g., by manipulating the outline letter shape, as in dentgst-DENTIST vs. dentcst-DENTIST).

In summary, the present experiments sought to shed some light on the mechanisms used to access abstract orthographic codes from the visual input. To that end, we recorded the participants’ ERPs in two masked priming experiments in which each word could be preceded by a visually similar or visually dissimilar one-letter replaced prime (dentjst-DENTIST vs. dentgst-DENTIST). In the 230- to 350-ms time window, the identity and visually similar conditions behaved similarly, whereas there were larger negative-going amplitudes in the visually dissimilar condition. Thus, there is some degree of uncertainty in attaining letter identities during the first moments of word processing that is modulated by letter visual-similarity. Additional work is necessary to comprehend the intricacies underlying the processes that mediate between the printed stimulus and the long-term abstract orthographic representations.

Authors’ notes

The order of authors is alphabetical (i.e., contributions were equal). This research has been partly supported by Grants PSI2014-60611-JIN (EG), BES-2015-07414 (AM), and PSI2017-86210-P (MP) from the Spanish Ministry of Economy and Competitiveness. The authors thank Daniel Díaz for his help programming the experiments and Marta Vergara-Martínez for her helpful comments. They also thank the three anonymous reviewers, whose comments and suggestions helped improve the manuscript.