
Bridging Languages, Broadening Access: Examining an Observation-Based Autism Assessment with a Latinx Sample

  • Open Access
  • 09-02-2026
  • Original Article

Abstract

Purpose

Standardized observational tools are part of the gold standard for autism assessment, leading to the most reliable diagnoses. Widely used tools are often costly, require extensive training, and lack validation for use with multilingual and low-income populations, factors that contribute to prolonged diagnostic wait times.

Method

This study examined psychometric properties of the Brief Observation of Symptoms of Autism (BOSA), a 12- to 14-min semi-structured observation designed as an autism assessment tool for both virtual and in-person administration. We evaluated the BOSA’s sensitivity and specificity in both English and Spanish within a Latinx, predominantly low-income sample (N = 98), among other psychometric properties.

Results

Findings indicate that the BOSA is a promising tool that can be used as a screener or as a part of a comprehensive evaluation administered across languages, settings (home, clinic, community), and interactants (caregivers, clinicians) for individuals with limited verbal abilities, though further research is needed to optimize its use with more verbally fluent populations.

Conclusion

These results add to the literature, positioning the BOSA as a promising, affordable, and adaptable tool for improving timely access to high-quality autism assessments in culturally and linguistically diverse, underserved communities. Additional research is needed to assess its usefulness in different circumstances while aiming to increase ease and efficiency of coding. The BOSA’s suitability for use by non-specialists in intervention and school-based settings could help reduce diagnostic delays that disproportionately affect families of color from non-English-speaking households, making its optimization an important future goal.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Autism spectrum disorder (ASD), hereinafter referred to as autism, is diagnosed based on the presence of social communication challenges and restricted or repetitive behaviors that typically manifest early in life (American Psychiatric Association, 2013). On average, individuals in the U.S. are diagnosed with autism at 47 months of age, but age of diagnosis varies by geographical location (from 36 months in California to 69.5 months in Laredo, Texas; Shaw et al., 2025). There are benefits to being identified as having autism earlier in life, such as accessing intervention services that support the development of social communication skills (Wallis & Guthrie, 2024). These benefits of early intervention have motivated specialists and researchers to screen for and identify autism as early and as reliably as possible.
Various tools have been implemented to screen for autism in early childhood. Level 1 screeners (e.g., Modified Checklist for Autism in Toddlers, Revised, Ages & Stages Questionnaires, Parents’ Evaluation of Developmental Status, etc.) are designed to detect individuals who may be at risk for developmental delays in the general population (Sanchez-Garcia et al., 2019). These screeners may indicate that an individual needs close monitoring and further testing. While attempts to implement universal screening are warranted, the use of level 1 screeners often leads to high false positive rates because they err on the side of over-identification, so as to not miss children who need support (Wetherby et al., 2021). To address this issue, level 2 screeners were developed to differentiate between autism and other conditions (such as global developmental delays or language disorders) among those who have already been identified as being developmentally at risk. Level 2 screeners help triage individuals at higher risk for autism and reduce overall wait times by minimizing false positives before referral for a comprehensive evaluation (Khowaja et al., 2018). These screeners aim to provide greater specificity than broader caregiver-report level 1 screeners and require fewer resources than full, comprehensive assessments (Khowaja et al., 2018). These tools are available in various formats, including caregiver rating scales, brief caregiver interviews, and observation-based measures.
Observation-based level 2 screeners are particularly useful because direct observations can help offset the limitations of informant-only reports, demonstrating strong clinical utility for differentiating among neurodevelopmental disorders (Miller et al., 2017; Nordahl-Hansen et al., 2014; Pellegrini, 2001). In fact, medical autism diagnoses rely primarily on direct behavioral observation by trained clinicians, along with a developmental history, typically obtained through caregiver report or medical chart review. Observation-based tools have the potential to speed up access to services. Widely used and standardized tools like the Autism Diagnostic Observation Schedule, Second Edition (ADOS-2; Lord et al., 2012) remain the gold standard of observational assessments, but they can be time intensive to administer and score, and families must travel to the clinic. For clinicians working in organizations with long waitlists, level 2 observational tools can offer valuable structure and guidance, standardizing which autism-related behaviors to look for and how to score them rather than relying solely on subjective impressions. When combined with other assessment tools, such as medical chart review, parent interviews, and questionnaires, the ability to diagnose individuals confidently and to assess their strengths and weaknesses quickly may help individuals access services sooner and allow clinicians to serve a greater number of individuals. Overall, the use of brief, low-cost standardized observational tools is one practical strategy that can address the ongoing waitlist crisis in the autism field (Kanne & Bishop, 2021) without compromising assessment quality.
The dual function of these brief standardized observation-based assessments became particularly evident during the COVID-19 pandemic. During the pandemic, the ADOS-2 could not be administered in a standardized manner due to masking requirements. This limitation underscored the need for clinicians to be able to observe core autism features (i.e., social communication differences and restricted, repetitive behaviors) within a relatively structured, yet ecologically valid context and to code these behaviors using a standardized framework modeled after the ADOS-2. As a result, the Brief Observation of Symptoms of Autism (BOSA; Dow et al., 2021) was made widely accessible, with providers receiving free training in its administration and coding to ensure continued access to standardized behavioral assessments when in-person visits were not feasible. The BOSA is a 12- to 14-min semi-structured, video-recorded interaction between the individual being assessed and a familiar social partner (e.g., caregiver, sibling, or another familiar adult). It can be administered remotely or in person. There are four versions of the BOSA, designed for different ages and language levels (see the Method for details). During the pandemic, individuals being assessed were observed (through video recordings or from an observation room) interacting with a familiar individual, which gave clinicians the opportunity to observe and score behaviors systematically.
Not only did the use of the BOSA as part of the comprehensive assessment lead to a diagnostic conclusion, but having a standardized observation between the individual being assessed and a familiar adult allowed clinicians to better support goal setting and treatment recommendations. Beyond its use during the COVID-19 pandemic, the BOSA remains a valuable tool for clinicians, allowing them to observe interactions between the individual being assessed for autism and familiar individuals (e.g., caregivers, siblings) and incorporate these observations into comprehensive assessments.
Widely used observational assessments include the Screening Tool for Autism in Toddlers (STAT; Stone et al., 2004), the Naturalistic Observation Diagnostic Assessment (NODA; Smith et al., 2017), the TELE-ASD-PEDS (Wagner et al., 2021), the Autism Detection in Early Childhood (ADEC; Young & Nah, 2016), the Systematic Observation of Red Flags (SORF; Dow et al., 2019), and the Brief Observation of Symptoms of Autism (BOSA; Dow et al., 2021). All of these tools were created with the goal of expediting the diagnostic process, and most also aim to reduce the need for in-person visits. Because these level 2 screeners are also available remotely, they have the potential to promote greater inclusion for individuals with limited access to trained providers (Corona et al., 2021). A unique strength of the BOSA in particular is its utility across a broad age range. As the field of autism assessment has expanded to include and diagnose older individuals with autism, the availability of tools validated across a wide age range has become increasingly important (Wigham et al., 2019).
Despite overall advances in autism screening practices and the proliferation of observation-based tools, including level 2 screeners, families of color from non-English-speaking households remain more likely than families from primarily English-speaking households to experience longer wait times for assessment, later diagnoses, and delays in starting early intervention services (Chavez et al., 2021; Imanpour, 2024; Lim et al., 2020; Zuckerman et al., 2017). Implementation of observation-based assessments in real-world settings remains limited by cost, time, and a shortage of qualified multilingual providers. There is limited research on the psychometric properties of observation-based level 2 screeners with underserved populations across a broader age range. As such, additional research is needed to assess how observational tools such as the BOSA perform in community-based samples of multilingual individuals from low-income households, in order to address existing disparities and promote equitable, timely access to autism diagnostic services.

The Current Study

The present study evaluates the reliability of the BOSA among Latinx youth and adults with autism and related neurodevelopmental conditions. Currently, the BOSA is available to researchers and experienced ADOS-2 users for free upon request. Preliminary validation of the BOSA in English-speaking samples has shown strong sensitivity, specificity, and convergent validity with the ADOS-2 across assessment settings (Dow et al., 2021). More recent work in Latin America also supports its feasibility in Spanish-speaking contexts, though some modules exhibited reduced sensitivity or specificity, highlighting the need for continued investigation across varying linguistic and cultural settings (Granana et al., 2025).
The goal of our study was twofold. We aimed to (1) assess the utility of the BOSA in Spanish with a bilingual sample and (2) expand the literature on its use in English with a culturally diverse Latinx sample fluent in English. We calculated the psychometric properties, including sensitivity and specificity of the BOSA in Spanish and English. Then, using the subset for whom we had both Spanish and English BOSAs, we calculated whether its performance is comparable across languages. Furthermore, we examined the role of individual language proficiency on BOSA performance in each of the languages. By centering a standardized, time- and cost-efficient observational tool, this study aims to inform more equitable yet reliable approaches to autism diagnosis in under-resourced, multilingual communities.

Method

Participants

A total of 98 Latinx participants (ranging in age from 15 months to 42 years) who completed at least one BOSA in English were included in this study (see Table 1 for demographic information). All participants were living in the United States and were fluent in English. Of the 98 participants, a subset (N = 42) was recruited as part of a separate and subsequent study focused on English–Spanish bilingual individuals and families from Southern California (see Tafolla et al., 2025) and thus were administered BOSAs in both English and Spanish. The 42 bilingual families were predominantly from low-income (69% of participants reported a household income below $65,000), primarily Spanish-speaking households (74% of caregivers reported Spanish was their primary language). The bilingual group was a community-based sample recruited from schools, Regional Centers, community non-profit organizations, and local autism organizations.
Table 1
Demographics for participants with English (N = 98) and Spanish (N = 42) data by BOSA version

| Characteristic | MV-T English | MV-T Spanish | MV-1 English | MV-1 Spanish | PSYF English | F1 English | F1 Spanish | F2 English | F2 Spanish |
|---|---|---|---|---|---|---|---|---|---|
| n | 19 | 9 | 19 | 8 | 14 | 34 | 13 | 12 | 12 |
| Age in months [M (SD)] | 25.95 (5.14) | 23.2 (4.7) | 51.95 (17.7) | 53.6 (17.5) | 65.7 (26.3) | 96.7 (43.8) | 131.0 (28.2) | 298.8 (92.6) | 298.8 (92.6) |
| Sex male | 63% | 67% | 89% | 75% | 78% | 68% | 54% | 50% | 50% |
| Autism diagnosis | 84% | 63% | 84% | 67% | 79% | 88% | 77% | 75% | 75% |
| Verbal IQ [M (SD)] | 54 (23) | 51 (28) | 47 (29) | 41 (28) | 68 (9) | 93 (15) | 85 (18) | 103 (25) | 103 (25) |
| Nonverbal IQ [M (SD)] | 86 (19) | 87 (26) | 72 (29) | 71 (31) | 86 (18) | 102 (14) | 96 (15) | 99 (14) | 99 (14) |
| Interactant: Caregiver | 47% | 89% | 21% | 88% | 14% | 35% | 92% | 75% | 75% |
| Interactant: Clinician | 53% | 11% | 79% | 12% | 79% | 65% | 8% | 17% | 17% |
| Interactant: Other | 0% | 0% | 0% | 0% | 7% | 0% | 0% | 8% | 8% |
| Test location: Home | 16% | 33% | 21% | 50% | 21% | 65% | 46% | 33% | 33% |
| Test location: Clinic | 84% | 67% | 63% | 12.5% | 79% | 29% | 38.5% | 50% | 50% |
| Test location: Community | 0% | 0% | 16% | 37.5% | 0% | 6% | 15.5% | 17% | 17% |
| Dominant language: Spanish | - | 78% | - | 88% | - | - | 38% | - | 25% |

Note. PSYF Spanish is not included here due to the limited number of participants with Spanish videos. Autism diagnosis refers to the percentage of those in the sample that received a best estimate diagnosis of autism. All participants in this study were Latinx. “Other” interactants included romantic partners and siblings. Dominant language information is only available for a subset (N = 42) of the sample.

Diagnostic Procedures

All 98 participants had diagnoses of autism or other neurodevelopmental (e.g., global developmental delays, ADHD) or mental health conditions (e.g., depression, anxiety). Most participants (84%) received a best estimate diagnosis of autism. Because participants were recruited as part of different research studies, diagnostic procedures varied across participants (Dow et al., 2021; Tafolla et al., 2025).
Comprehensive evaluations were conducted for the subset of 42 English–Spanish bilingual participants (Tafolla et al., 2025). Their battery of assessments included the ADOS-2 in Spanish and in English, a measure of cognitive functioning, and caregiver reports of autism-related symptoms, adaptive skills, and externalizing/internalizing behaviors gathered via questionnaires. For some participants, an additional semi-structured interview was also administered (when a more detailed history was necessary to determine the best estimate diagnosis).
For the remaining English-speaking participants, various strategies were used to determine the best estimate diagnoses (Dow et al., 2021). Some of these participants received comprehensive evaluations, which also included a battery of assessments such as the ADOS-2, measures of cognitive functioning, and caregiver reports of autism-related symptoms gathered via questionnaires or a semi-structured interview. Other participants presented with historical diagnoses given by community providers, which the clinical team confirmed by administering an ADOS-2 and/or a semi-structured interview (the Autism Diagnostic Interview, Revised [ADI-R; Rutter et al., 2003]) and by reviewing all available information. Assessments were conducted in clinic or community-based settings (e.g., homes and libraries).

Measures

A demographic form was collected to gather background information about participants and caregivers, including child characteristics (e.g., sex and age).

Bilingual Participants

For the 42 bilingual participants in the second study, we collected additional data including sociodemographic information (e.g., income and caregiver primary language), language dominance, and BOSAs in both English and Spanish. To quantify these participants’ language dominance, we used a parent questionnaire, the Bilingual Input–Output Survey (BIOS; Peña et al., 2018). The BIOS provides an estimate of how much of each language individuals hear and speak during each waking hour of weekdays and weekends. We used the percentage of exposure to each language to determine language dominance for less verbally fluent participants and the percentage of output in each language to determine language dominance for verbally fluent participants. Less verbally fluent participants who were exposed to Spanish more than 50% of the time were considered Spanish dominant, as were verbally fluent participants who spoke Spanish more than 50% of the time. See Tafolla et al. (2025) for more information on calculating language dominance. BOSAs were administered in English and Spanish approximately four weeks apart, with the order of administrations randomized. Parents were asked to participate as their child’s social partner in both the English-only and Spanish-only BOSAs if they were able; however, many parents in this sample had limited English proficiency. If a parent was unable to administer the BOSA in either language, a bilingual examiner conducted the administration instead. All adult participants provided written informed consent, and parents provided consent for minors. All participants were compensated following each in-person visit. The study was approved by and conducted in compliance with the Institutional Review Boards at all participating institutions.

The Brief Observation of Symptoms of Autism (BOSA)

The BOSA was adapted from the Autism Diagnostic Observation Schedule, Second Edition (ADOS-2; Lord et al., 2012), one of the most widely used standardized observation-based assessments used to inform diagnoses of autism. The ADOS-2 is a 40- to 60-min semi-structured assessment administered and scored by trained clinicians, consisting of five modules that are commercially available (Toddler, Module 1, Module 2, Module 3, Module 4) and two modules available for research (Adapted Modules 1 and 2 for adolescents and adults with limited language; Bal et al., 2020) tailored to different ages and language levels.
Similar to the ADOS-2, the BOSA is an observation-based assessment that provides a naturalistic social context for interaction between a familiar adult (e.g., caregiver, teacher, or clinician; referred to as the interactant or social partner) and the individual, using standardized materials and activities (Dow et al., 2021). The measure entails a brief 12–14-min semi-structured interaction modeled on the ADOS-2 and adapted from the Brief Observation of Social Communication Change (BOSCC; Byrne & Lord, 2023, 2024; Grzadzinski et al., 2016). The BOSA can be administered remotely or in person, as all sessions are video recorded. Administrations may take place in a variety of settings, including the participant’s home, a clinic, or community-based locations (e.g., libraries, community centers). Interactants are given instructions on each of the tasks approximately five minutes before they begin the interaction, during which they can ask clarifying questions. The observing clinician helps guide the interaction by signaling when it is time to move on to the next activity, either in person or virtually.
The BOSA consists of four versions based on the individual’s age and language level. The BOSA-MV is designed for individuals of any age who are nonspeaking or who use only single words or rote phrases. It corresponds to the ADOS-2 Toddler Module, Module 1 or the Adapted Module 1. Each administration uses two sets of developmentally appropriate toys and bubbles, which can be drawn from existing ADOS-2 kits or purchased through the University of California, Los Angeles (Byrne et al., 2023, 2024). The BOSA-PSYF is for individuals of any age with phrase speech or for verbally fluent children under age eight and aligns with the ADOS-2 Module 2 or the Adapted Module 2. It includes developmentally appropriate toys or materials (some overlapping with the BOSA-MV), a dollhouse or mailbox, interactive items (e.g., a rocket launcher), and bubbles. The BOSA-F1, intended for verbally fluent children ages 6 to 10 who generally meet criteria for the ADOS-2 Module 3, incorporates turn-taking games and structured conversational prompts. The BOSA-F2 is designed for verbally fluent individuals ages 11 through adulthood and corresponds to the ADOS-2 Module 4. It includes more complex games (e.g., Jenga, Slap Jack) and social-emotional questions adapted from the ADOS-2.
Each BOSA version uses two sets of similar materials (one set per half-session) to support standardized elicitation and social presses by providing novel materials to the participants throughout. After administration, a clinician experienced in using the ADOS-2 codes the interaction using the corresponding ADOS-2 protocol. Research reliability on the ADOS-2 is not a requirement for community-based clinicians to use the BOSA. These codes are then transferred to a BOSA scoring sheet using a binary system (0/1). Select items, chosen based on empirically supported psychometric properties that prioritize sensitivity (Dow et al., 2021), comprise the BOSA algorithm. Each version has a distinct cutoff score and categorizes individuals into one of three autism concern ranges: little-to-no concern, mild-to-moderate concern, and moderate-to-severe concern. Cutoffs for concern are tied to the ADOS-2 module that would have been most appropriate. Thus, when the BOSA-MV is given to a child who would have received the Toddler Module on the ADOS-2 (BOSA-MV-T), it has a different cutoff score than when it is given to a child who would have received Module 1 (BOSA-MV-1). Meeting the BOSA cutoff indicates a level of concern equivalent to moderate-to-severe symptoms consistent with DSM-5 criteria for autism.

Coding Procedure

The BOSA uses the ADOS-2 scoring protocol, which requires experience using and coding the ADOS-2. While research reliability on the ADOS-2 is not required for community-based clinicians who use the BOSA, all coders in this study were ADOS-2 research reliable because the data were collected for research purposes. Coders were instructed to watch the 12–14-min semi-structured interaction, take notes using ADOS-2 protocols, and assign item scores based on their observations. The number of items coded by the clinicians ranged from 29 to 41, depending on the ADOS-2 module. These items reflect behaviors related to communication, reciprocal social interaction, and restricted and repetitive behaviors in alignment with DSM-5 and ICD-11 diagnostic criteria.
Each item was scored on a scale from 0 (no abnormality in the behavior) to 2 or 3 (abnormality of the behavior clearly present). Because the BOSA interaction is substantially shorter than a full ADOS-2 administration, coders sometimes lacked sufficient information to score certain items. In these cases, items were marked as ‘8’ (unable to code), and these codes were later converted to 0 to avoid over-penalizing the participant. Select ADOS-2 items that comprise the BOSA algorithm were then converted to binary scores: 0 (no abnormality of behavior observed) or 1 (abnormality of behavior clearly present). Algorithm items were summed to determine whether the individual met the established cutoff indicating moderate-to-severe concern for autism (Dow et al., 2021).
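The scoring steps described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published BOSA algorithm: the item names, raw codes, and cutoff are hypothetical, and the rule that any raw code at or above `threshold` becomes 1 is an assumption, since the text does not list the algorithm items, the version-specific cutoffs, or how intermediate codes are mapped.

```python
UNABLE_TO_CODE = 8  # raw code used when an item could not be scored


def binarize(raw_code, threshold=1):
    """Convert one raw item code to 0 or 1.

    Assumption: codes at or above `threshold` count as 1. The text does not
    state how intermediate codes are mapped, so the threshold is a parameter.
    """
    if raw_code == UNABLE_TO_CODE:
        return 0  # 'unable to code' is treated as 0, as described in the text
    if raw_code not in (0, 1, 2, 3):
        raise ValueError(f"unexpected raw code: {raw_code}")
    return 1 if raw_code >= threshold else 0


def algorithm_total(raw_codes, algorithm_items, threshold=1):
    """Sum the binarized codes over the items that belong to the algorithm."""
    return sum(binarize(raw_codes[item], threshold) for item in algorithm_items)


def meets_cutoff(raw_codes, algorithm_items, cutoff, threshold=1):
    """True when the total reaches the cutoff (assumed to mean total >= cutoff)."""
    return algorithm_total(raw_codes, algorithm_items, threshold) >= cutoff


# Hypothetical example: item names, raw codes, and the cutoff are made up.
codes = {"item_a": 2, "item_b": 8, "item_c": 0, "item_d": 3, "item_e": 1}
items = ["item_a", "item_b", "item_d", "item_e"]  # item_c is not an algorithm item
print(algorithm_total(codes, items))         # 3
print(meets_cutoff(codes, items, cutoff=3))  # True
```

Keeping the binarization threshold and the cutoff as parameters makes it explicit that both are version-specific choices that would need to be taken from the BOSA materials themselves.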
Spanish-language BOSAs were randomly assigned to four coders who fluently spoke and/or understood Spanish, while English BOSAs were randomly assigned to more than nine coders (including those who scored videos in Spanish). We controlled the assignment of BOSAs so that no coder coded the same participant’s BOSA in both languages. All coders were blind to participants’ diagnostic status. In this project, all participants were asked to stick to the assigned language during the administration. However, coders noted that participants often codeswitched from one language to the other; for example, participants assigned to speak in Spanish sometimes used English. This did not affect the scoring, as the administrator of the BOSA would respond in the language that was assigned for the BOSA and it did not occur frequently. Furthermore, coders were asked not to count instances of codeswitching against the participant and to give credit for the behavior.

Analytic Plan

Using best-estimate clinical diagnoses as the reference standard, sensitivity and specificity with exact (Clopper–Pearson) 95% confidence intervals were calculated for each BOSA version (i.e., MV-T, MV-1, F1, and F2) in Spanish (N = 42). The PSYF version was excluded from the Spanish analyses due to the limited number of available Spanish PSYF administrations. Sensitivity and specificity with exact (Clopper–Pearson) 95% confidence intervals were also calculated for all English BOSA versions (i.e., MV-T, MV-1, PSYF, F1, and F2; N = 98).
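To make the interval calculation concrete, the following is a minimal Python sketch of sensitivity and specificity with exact (Clopper–Pearson) confidence intervals. It is not the authors' code (the analyses were run in R); the function names are our own, the interval bounds are found by bisection on the binomial distribution rather than with a statistics library, and the counts in the example are illustrative, since the article reports proportions rather than the underlying 2 × 2 counts.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, tol=1e-10):
    """Root of a monotone function f on [lo, hi] with a sign change."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == (f_lo > 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided CI for a proportion with k successes out of n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2 (0 when k == 0).
    lower = 0.0 if k == 0 else _bisect(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2 (1 when k == n).
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (estimate, (lower, upper)) for sensitivity and for specificity."""
    sens = (tp / (tp + fn), clopper_pearson(tp, tp + fn))
    spec = (tn / (tn + fp), clopper_pearson(tn, tn + fp))
    return sens, spec


# Illustrative counts only (hypothetical): 16 autistic and 3 non-autistic cases.
sens, spec = sensitivity_specificity(tp=15, fn=1, tn=3, fp=0)
print(f"sensitivity {sens[0]:.2f} ({sens[1][0]:.2f}-{sens[1][1]:.2f})")
print(f"specificity {spec[0]:.2f} ({spec[1][0]:.2f}-{spec[1][1]:.2f})")
```

With only three cases in the non-autism group, the exact interval for specificity spans most of the unit interval even when every case is classified correctly, which illustrates why estimates from small subgroups should be read cautiously.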
Multilevel logistic regression models were then fit separately by language to assess the association between a positive BOSA screen and an autism diagnosis across BOSA versions. Models allowing both random intercepts and random slopes for module were initially evaluated, but because of the small number of modules and limited sample sizes within BOSA versions, these models did not provide stable estimates of between-module heterogeneity. Therefore, results from hierarchical models including random intercepts only were retained.
To assess systematic differences in positive classifications between English and Spanish BOSA administrations, exact McNemar’s tests were conducted for the 42 participants who completed BOSAs in both languages. Subsequently, agreement between English and Spanish screener classifications within each BOSA version was evaluated using Cohen’s kappa. Finally, for participants with both Spanish and English BOSAs, we calculated the percentage of participants with discrepant cutoff classifications across languages and descriptively examined demographic characteristics, including language proficiency, to explore whether these characteristics may have contributed to the discrepancies. All analyses were conducted in R (R Core Team, 2023).
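The two paired-classification statistics named above can be illustrated with a short Python sketch. It is a simplified stand-in for the R analyses, not a reproduction of them: the function names and the example data are hypothetical, and the sketch omits the significance test for kappa.

```python
from math import comb


def paired_table(english, spanish):
    """Count the 2x2 cells for paired 0/1 classifications.

    a: positive in both, b: English only, c: Spanish only, d: negative in both.
    """
    pairs = list(zip(english, spanish))
    a = sum(1 for e, s in pairs if e == 1 and s == 1)
    b = sum(1 for e, s in pairs if e == 1 and s == 0)
    c = sum(1 for e, s in pairs if e == 0 and s == 1)
    d = sum(1 for e, s in pairs if e == 0 and s == 0)
    return a, b, c, d


def exact_mcnemar_p(b, c):
    """Two-sided exact McNemar p value based on the discordant pairs only."""
    n = b + c
    if n == 0:
        return None  # no discordant pairs, so the test is not applicable
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2**n
    return min(1.0, 2 * tail)


def cohens_kappa(a, b, c, d):
    """Chance-corrected agreement between the two classifications."""
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    if expected == 1:
        return None  # kappa is undefined when chance agreement is perfect
    return (observed - expected) / (1 - expected)


# Hypothetical paired screen results (1 = met cutoff) for 13 participants.
english = [1] * 8 + [1, 1] + [0] + [0, 0]
spanish = [1] * 8 + [0, 0] + [1] + [0, 0]
a, b, c, d = paired_table(english, spanish)
print(a, b, c, d)                        # 8 2 1 2
print(round(cohens_kappa(a, b, c, d), 2))  # 0.42
print(exact_mcnemar_p(b, c))             # 1.0
```

McNemar's test uses only the discordant pairs, so it asks whether one language flags more participants than the other, whereas kappa summarizes overall agreement beyond chance; the two can therefore lead to different impressions from the same table.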

Results

Sensitivity and Specificity

For the Spanish BOSA, the MV-T, MV-1, and F1 showed good sensitivity and adequate specificity, whereas F2 showed great sensitivity but poor specificity. For the English sample, the cutoff scores resulted in good discrimination between autism and non-autism groups across most versions, except for F1, which had the lowest sensitivity (see Table 2). Formal statistical comparisons were not computed due to the small sample sizes. Sensitivity and specificity for the English BOSAs from the subset of bilingual participants were also computed and were comparable to the full English sample, thus only the results for the full sample were reported.
Table 2
Sensitivity and Specificity of the BOSA in English and Spanish with 95% Confidence Intervals

| BOSA version | English n (N = 98) | English sensitivity, proportion (95% CI) | English specificity, proportion (95% CI) | Spanish n (n = 42) | Spanish sensitivity, proportion (95% CI) | Spanish specificity, proportion (95% CI) |
|---|---|---|---|---|---|---|
| MV-T | 19 | 0.94 (0.70–1.00) | 1.00 (0.29–1.00) | 9 | 0.83 (0.36–1.00) | 0.67 (0.09–1.00) |
| MV-1 | 19 | 0.94 (0.70–1.00) | 1.00 (0.29–1.00) | 8 | 0.80 (0.28–1.00) | 0.67 (0.09–1.00) |
| PSYF | 14 | 0.91 (0.59–1.00) | 0.67 (0.09–0.99) | N/A | 0.50 (0.01–0.99) | Not estimable |
| F1 | 34 | 0.63 (0.44–0.80) | 0.75 (0.19–0.99) | 13 | 0.80 (0.44–0.98) | 1.00 (0.29–1.00) |
| F2 | 12 | 0.89 (0.52–1.00) | 0.67 (0.09–0.99) | 12 | 1.00 (0.66–1.00) | 0.33 (0.01–0.91) |

Note. PSYF Spanish is not included here due to the limited number of participants with Spanish videos.

Multilevel Logistic Regression Models

Results from the multilevel logistic regression models showed that across both English and Spanish administrations, screening positive on the BOSA was strongly associated with an autism diagnosis. After accounting for clustering by BOSA version, participants who screened positive had higher odds of receiving an autism diagnosis in both English (odds ratio ≈ 23) and Spanish (odds ratio ≈ 11). In both languages, the predictive value of a positive screen was consistent across BOSA versions. Results are presented in Table 3.
Table 3
Multilevel Logistic Regression Models

| Language | Effect | Estimate (log-odds) | SE | Odds ratio (OR) | 95% CI (OR) |
|---|---|---|---|---|---|
| English | Intercept | −0.003 | 0.50 | 1.00 | 0.37–2.66 |
| English | Screen positive | 3.14 | 0.78 | 23.1 | 5.00–107.8 |
| Spanish | Intercept | −0.47 | 0.57 | 0.63 | 0.20–1.91 |
| Spanish | Screen positive | 2.38 | 0.78 | 10.8 | 2.3–50.04 |

Note. Estimates are from multilevel logistic regression models with random intercepts for BOSA version, fit separately by language. Odds ratios and 95% confidence intervals were derived from Wald standard errors.
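As a worked illustration of how an odds ratio and its Wald interval follow from a log-odds estimate and its standard error, the Python sketch below applies the standard transformation exp(estimate ± z × SE) to the rounded English "Screen positive" values from Table 3. The function name is our own, and the sketch does not fit the multilevel model itself.

```python
from math import exp
from statistics import NormalDist


def wald_odds_ratio(log_odds, se, level=0.95):
    """Odds ratio with a Wald confidence interval from a log-odds estimate."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return exp(log_odds), exp(log_odds - z * se), exp(log_odds + z * se)


# Rounded estimate and SE for the English 'Screen positive' effect in Table 3.
odds_ratio, lower, upper = wald_odds_ratio(3.14, 0.78)
print(f"OR = {odds_ratio:.1f}, 95% CI {lower:.2f} to {upper:.1f}")
# OR = 23.1, 95% CI 5.01 to 106.6
```

The small differences from the tabled interval (5.00–107.8) most likely arise because the published estimate and standard error are rounded before this back-calculation.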

Agreement and Positive Classifications Between English and Spanish

The BOSA demonstrated moderate agreement between the Spanish and English administrations across the different versions (κ = 0.43–0.53), indicating reasonable consistency between languages (see Table 4). The PSYF had a very small number of paired subjects and produced unstable estimates. Only F1 approached statistical significance (p = 0.053). Overall, these findings support moderate cross-language agreement of the screener, while highlighting variability due to limited data in some BOSA versions. McNemar's tests showed no evidence of asymmetric classification between English and Spanish versions across modules (all p > .05). Within the PSYF version, there were no discordant English/Spanish classifications and therefore the test was not applicable. Cutoff classifications between the Spanish and English BOSAs were consistent for 79% of bilingual participants. Among the nine participants with discrepant scores, five were within one point of the cutoff on one version, suggesting minimal differences in those cases. Whether participants met the cutoff in one language versus the other did not appear to be influenced by language dominance. The observed variability may instead be due to other unexplored factors, including differences in interactants or day-to-day fluctuations in behavior.
Table 4
Cohen’s Kappa across BOSA versions

| BOSA version | N (paired subjects) | Cohen’s κ | Interpretation | z | p value |
|---|---|---|---|---|---|
| MV-T | 9 | 0.50 | Moderate | 1.5 | 0.134 |
| MV-1 | 8 | 0.47 | Moderate | 1.32 | 0.187 |
| PSYF | 2 | 0 | NA | NA | NA |
| F1 | 13 | 0.53 | Moderate | 1.94 | 0.05 |
| F2 | 12 | 0.43 | Moderate | 1.81 | 0.07 |

Discussion

This study provided initial evidence regarding the psychometric performance of the Brief Observation of Symptoms of Autism (BOSA) with a culturally and linguistically diverse Latinx sample. Our sample was unique in that all participants were Latinx and fluent in English, and 42 of the 98 participants were English–Spanish bilingual and completed BOSAs in both languages. The BOSA and other similar remote assessments have been proposed as alternatives to gold-standard, in-person observational tools, or as screeners to triage individuals and determine whether a comprehensive evaluation is warranted. The BOSA is not intended as a replacement for comprehensive diagnostic evaluations, as supported by the psychometric evidence from the current study, but rather as a complementary tool to support and inform clinical decision-making. The BOSA may be particularly valuable in settings where traditional autism assessment tools are less feasible, for instance because of language-related or geographical barriers, though additional research and potential refinements are still needed.
Overall, the BOSA demonstrated adequate psychometric properties with our Latinx sample for some age and language level groups, but not for all. Sensitivity and specificity findings were generally consistent with those reported in the original validation study, particularly for minimally verbal individuals and those speaking in simple phrases (Dow et al., 2021). Consistent with these findings, multilevel logistic regression analyses showed that screen-positive status was strongly associated with autism diagnosis across languages, even after accounting for clustering by BOSA version. However, sensitivity was notably lower for individuals with fluent speech, with similar patterns observed in the original study (Dow et al., 2021). This suggests that the BOSA may be less robust in detecting autism symptoms in specific sub-populations, and that particular care should be taken in screening older and more verbal individuals.
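For readers less familiar with these metrics, the sketch below shows how sensitivity and specificity are derived from screen results compared against a reference diagnosis. The counts are hypothetical and chosen only to illustrate a screener with lower sensitivity than specificity; they are not taken from this study, and the function name is illustrative.

```python
def screen_accuracy(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from counts against the reference diagnosis.

    tp/fn: autistic participants who screened positive/negative.
    tn/fp: non-autistic participants who screened negative/positive.
    """
    sensitivity = tp / (tp + fn)  # share of autistic participants who screen positive
    specificity = tn / (tn + fp)  # share of non-autistic participants who screen negative
    return sensitivity, specificity

# Hypothetical counts (NOT the study's data).
sens, spec = screen_accuracy(tp=12, fn=8, tn=9, fp=1)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this illustration, 8 of 20 autistic participants are missed by the screener, which is the pattern of concern when sensitivity is low in a subgroup.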
Furthermore, the English and Spanish versions across the different modules of the BOSA performed similarly, as evidenced by their moderate agreement (Cohen’s κ) and comparable classification rates (McNemar’s test), with some notable nuances. Specifically, the Spanish version of the BOSA-F1 (younger individuals with fluent language) showed better psychometrics than the English version, with both higher sensitivity and higher specificity (Table 2). One possible explanation for this finding is the difference in interactants across the English and Spanish BOSAs within this group. In the English version of the BOSA-F1, clinicians were more likely to serve as the social partner, whereas caregivers were more commonly the interactants in the Spanish administrations. Unlike the ADOS-2, for which clinicians are held to standards of skill and protocol adherence, the BOSA can be done with parents or other familiar adults whose behaviors may be more varied (though there are attempts to keep instructions and ways of conveying expectations as clear as possible across all interactants). Future research should aim to hold the interactant constant across both language administrations. This was not feasible in our present sample, as many parents were monolingual Spanish speakers and therefore could not complete the BOSA in English; however, obtaining a parent–child interaction, when possible, even in just one language, was still considered valuable.
In relation to the BOSA-F2 group (older individuals with fluent language), the Spanish BOSA showed high sensitivity but specificity below the expected threshold, meaning it produced a high number of false positives. One explanation may be that most individuals administered the BOSA-F2 were more fluent in English than in Spanish, which may have hindered social communication between the participant and the interactant and complicated the interpretation of Spanish-language administrations. While these findings should be interpreted cautiously due to the small sample size, they highlight the need for additional research on how bilingualism may affect brief observational assessments for adults. Bilingualism could introduce complexity in the interpretation of behavior, especially in brief assessments, where social behaviors related to the use of a second language, such as code switching or social hesitancy, may resemble features of autism (Fombonne, 2020). These findings suggest that this may be a particular issue for individuals with more fluent language, as opposed to those who primarily speak in single words or short phrases.
While our small sample size limits our ability to draw firm conclusions regarding BOSA performance with this population, our findings indicate that the BOSA shows promise as a tool for addressing several gaps in autism assessment, pending further validation. The BOSA may serve as an effective observational assessment within autism screening and diagnostic procedures, especially when used in combination with other tools such as medical chart review and parent interviews and questionnaires, particularly for less verbally fluent individuals and toddlers. Its potential for administration by non-specialists in early intervention and school-based settings could help reduce diagnostic delays that disproportionately affect families of color and non-English-speaking households (Aylward et al., 2021). Future research should evaluate the BOSA’s sensitivity, specificity, and predictive value in this role, especially when used by clinicians with varying levels of ADOS-2 training.
Further, the BOSA has the potential to expand access to autism assessments and address the evaluation waitlist crisis if used with additional tools (Kanne & Bishop, 2021). The BOSA kits can be shipped anywhere, including to families’ homes, to avoid travel and minimize barriers to access. Interactants can be coached through the interaction via telehealth, or they can send video recordings of the BOSA interactions to the clinician via a secure platform if preferred. Observational tools like the BOSA that can be administered in naturalistic or community settings are particularly valuable for reducing barriers to early detection in underserved populations (Dow et al., 2019; McCarty & Frye, 2020). Moreover, few brief observational tools have been validated for linguistically diverse populations across the lifespan, which positions the BOSA as uniquely valuable in addressing diagnostic disparities (Dow et al., 2021; Zander et al., 2015).
Finally, BOSA administrations (when video-recorded at multiple timepoints using the same materials) have been used to measure change over time in response to intervention. A coding scheme designed for this purpose, the Brief Observation of Social Communication Change (BOSCC), has been validated using adult–child interaction videos in the same format as the BOSA (Byrne et al., 2023, 2024; Grzadzinski et al., 2016; Reszka et al., 2024). This multipurpose kit enhances clinical utility and may help reduce overall costs for providers. Although the BOSCC has been administered in several languages, including German, French, Hindi, Korean, Dutch, and Spanish, it has not yet been formally validated with multilingual or non-English-speaking populations, and systematic validation in these languages is still needed.

Limitations and Future Directions

The results should be interpreted with caution given the small sample size, especially for the Spanish group. The statistical power is insufficient to draw firm conclusions, and larger samples are needed to ensure stable parameter estimates and more reliable cross-language comparisons. Despite this limitation, the study provides preliminary evidence regarding both the utility of the BOSA, with different methodologies leading to the same results, and its limitations. While the BOSA offers several advantages, including a modular standardized structure, brief administration time, and use of relatively inexpensive materials, its current requirement that coders be familiar with the ADOS-2 may limit scalability in community settings and among clinicians without formal ADOS-2 training. Because the BOSA relies on clinicians experienced in ADOS-2 administration and scoring, an important next step will be to evaluate whether it can be used effectively by clinicians who do not use the ADOS-2 in their clinical practice. Findings from this study also suggest that further adaptation may be needed when using the BOSA as a screener for individuals with more fluent language abilities (i.e., versions F1 and F2 of the BOSA), given the lower sensitivity and specificity in this subgroup. Therefore, additional research is needed to optimize performance among older and more verbally fluent populations.
Looking ahead, a key goal is to continue refining the BOSA and to evaluate its role within diagnostic referral and evaluation pathways. In applying the BOSA to bilingual and multilingual populations, it will also be important to further examine factors that may influence the validity and interpretability of the BOSA assessment, such as the role of language dominance and the identity of the social partner (e.g., caregiver vs. clinician). Recruiting larger samples will be critical to addressing these questions and to determining whether cutoff scores need to be adjusted to improve psychometric performance.
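The trade-off involved in adjusting a cutoff can be illustrated with a small sketch. It assumes, for illustration only, that a total score at or above the cutoff counts as screen-positive; the scores, diagnoses, and candidate cutoffs are hypothetical and do not correspond to actual BOSA algorithms or to this study's data.

```python
def accuracy_at_cutoff(scores, diagnoses, cutoff):
    """Sensitivity and specificity when totals >= cutoff count as screen-positive.

    The '>= cutoff' rule is an assumption made for this illustration.
    """
    pairs = list(zip(scores, diagnoses))
    tp = sum(1 for s, dx in pairs if dx and s >= cutoff)
    fn = sum(1 for s, dx in pairs if dx and s < cutoff)
    tn = sum(1 for s, dx in pairs if not dx and s < cutoff)
    fp = sum(1 for s, dx in pairs if not dx and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical total scores and reference diagnoses (True = autism).
scores = [3, 5, 6, 7, 8, 9, 10, 12, 4, 6, 7, 2]
diagnoses = [False, False, True, False, True, True,
             True, True, False, True, False, False]

for cutoff in (6, 7, 8):
    sens, spec = accuracy_at_cutoff(scores, diagnoses, cutoff)
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy data set, raising the cutoff removes false positives at the cost of missed cases, which is why any revised cutoff would need to be checked in a larger sample before use.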

Conclusion

Equitable access to autism evaluations remains a critical global need (Brinster et al., 2023; Divan et al., 2021). Observational assessments play a central role in identifying autism-related behaviors, particularly for individuals whose symptoms may not be fully reflected in parent-report measures (Zander et al., 2015). Mischaracterization of autism, whether through under- or over-identification, can have lasting consequences for individuals and families, including inappropriate service provision, increased stigma, and persistent unmet support needs. The BOSA can help address this diagnostic equity gap by providing a standardized yet flexible observational tool that is more feasible to implement in under-resourced, multilingual communities and clinically informative when used with additional tools. Beyond these contexts, the BOSA also holds broader clinical value, as it enables the collection of more naturalistic behavioral samples (for example, of a child with a parent) to supplement in-person evaluations with a range of social partners not limited to the clinician, and helps reduce access barriers (such as transportation) by allowing for remote, video-based administration. Continued refinement and validation in real-world settings will be essential to ensuring the BOSA’s utility in promoting culturally and linguistically responsive autism diagnostic practices.

Acknowledgments

We would like to thank all of the families who participated in this study and all community-based organizations that made recruitment possible.

Declarations

Conflict of Interest

Catherine Lord receives royalties from Western Psychological Services (WPS) for the sales of the ADOS-2, SCQ, and ADI-R.

Ethical approval

The BOSA is copyrighted by WPS due to its overlap with the ADOS and BOSCC, but it is currently available without cost with permission from WPS for researchers. This study received ethical approval from the University of California, Los Angeles Institutional Review Board (#23–000535) in May 2023. All study procedures were conducted in compliance with the ethical standards of the institutional research ethics committee. Written informed consent was obtained from all adult participants, or from parents or legal guardians for minor participants and those unable to provide consent.

Author Contributions

Data collection was performed by all authors. M.T., J.L., and C.L. contributed to the study conception and design. Material preparation and analysis were performed by the first author. M.T. and J.L. wrote the original draft, and all authors reviewed and edited the manuscript and approved the final submission.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Title
Bridging Languages, Broadening Access: Examining an Observation-Based Autism Assessment with a Latinx Sample
Authors
Maira Tafolla
Juliette Lerner
So Hyun Kim
Catherine Lord
Publication date
09-02-2026
Publisher
Springer US
Published in
Journal of Autism and Developmental Disorders
Print ISSN: 0162-3257
Electronic ISSN: 1573-3432
DOI
https://doi.org/10.1007/s10803-026-07237-z
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). American Psychiatric Publishing. https://doi.org/10.1176/appi.books.9780890425596
Aylward, B. S., Gal-Szabo, D. E., & Taraman, S. (2021). Racial, ethnic, and sociodemographic disparities in diagnosis of children with autism spectrum disorder. Journal of Developmental and Behavioral Pediatrics, 42(8), 682–689.
Bal, V. H., Maye, M., Salzman, E., Huerta, M., Pepa, L., Risi, S., & Lord, C. (2020). The adapted ADOS: A new module set for the assessment of minimally verbal adolescents and adults. Journal of Autism and Developmental Disorders, 50(3), 719–729.
Brinster, M. I., Brukilacchio, B. H., Fikki-Urbanovsky, A., Shahidullah, J. D., & Ravenscroft, S. (2023). Improving efficiency and equity in early autism evaluations: The (S)TAAR model. Journal of Autism and Developmental Disorders, 53(1), 275–284. https://doi.org/10.1007/s10803-022-05425-1
Byrne, K., & Lord, C. (2023). The brief observation of social communication change (BOSCC): Procedures, strengths, limitations, and future directions. Archives of Paediatric & Developmental Pathology, 6(1), 1–3. https://doi.org/10.47739/2641-774x.paediatricpathology.1026
Byrne, K., Sterrett, K., Holbrook, A., Kim, S. H., Grzadzinski, R., & Lord, C. (2024). Extending the usefulness of the Brief Observation of Social Communication Change (BOSCC): Validating the phrase speech and young fluent version. Journal of Autism and Developmental Disorders, 54(3), 1009–1023.
Chavez, A. E., Feldman, M. S., Carter, A. S., Eisenhower, A., Mackie, T. I., Ramella, L., Hoch, N., & Sheldrick, R. C. (2021). Delays in autism diagnosis for U.S. Spanish-speaking families: The contribution of appointment availability. Evidence-Based Practice in Child and Adolescent Mental Health, 7(2), 275–293.
Corona, L. L., Weitlauf, A. S., Hine, J., Berman, A., Miceli, A., Nicholson, A., Stone, C., Broderick, N., Francis, S., Juárez, A. P., Vehorn, A., Wagner, L., & Warren, Z. (2021). Parent perceptions of caregiver-mediated telemedicine tools for assessing autism risk in toddlers. Journal of Autism and Developmental Disorders, 51(2), 476–486.
Divan, G., Bhavnani, S., Leadbitter, K., Ellis, C., Dasgupta, J., Abubakar, A., Elsabbagh, M., Hamdani, S. U., Servili, C., Patel, V., & Green, J. (2021). Annual research review: Achieving universal health coverage for young children with autism spectrum disorder in low- and middle-income countries: A review of Reviews. Journal of Child Psychology and Psychiatry, 62(5), 514–535. https://doi.org/10.1111/jcpp.13404
Dow, D., Day, T. N., Kutta, T. J., Nottke, C., & Wetherby, A. M. (2019). Screening for autism spectrum disorder in a naturalistic home setting using the systematic observation of red flags (SORF) at 18–24 months. Autism Research, 13(1), 122–133.
Dow, D., Holbrook, A., Toolan, C., McDonald, N., Sterrett, K., Rosen, N., Kim, S. H., & Lord, C. (2021). The brief observation of symptoms of autism (BOSA): Development of a new adapted assessment measure for remote telehealth administration through COVID-19 and beyond. Journal of Autism and Developmental Disorders, 52(12), 5383–5394. https://doi.org/10.1007/s10803-021-05395-w
Fombonne, E. (2020). Challenges in estimating the prevalence of autism spectrum disorders. Current Directions in Psychological Science, 29(6), 509–515.
Granana, N., Astorino, F., Richaudeau, A., Costa, L., de Fernanz Carrera, E., Nanclares, V., ECHO PROTECTEA BOSA CONSORTIUM. (2025). The brief observation of symptoms of autism: Validation study in a Latin American sample. Autism, 29(4), 896–906.
Grzadzinski, R., Carr, T., Colombi, C., McGuire, K., Dufek, S., Pickles, A., & Lord, C. (2016). Measuring changes in Social Communication Behaviors: Preliminary development of the brief observation of Social Communication Change (BOSCC). Journal of Autism and Developmental Disorders, 46(7), 2464–2479. https://doi.org/10.1007/s10803-016-2782-9
Imanpour, S. (2024). System experiences of mothers who have limited English proficiency and preschoolers with autism. Journal of Child and Family Studies, 33(8), 2637–2645. https://doi.org/10.1007/s10826-024-02882-3
Kanne, S. M., & Bishop, S. L. (2021). Editorial perspective: The autism waitlist crisis and remembering what families need. Journal of Child Psychology and Psychiatry, 62(2), 140–142. https://doi.org/10.1111/jcpp.13254
Khowaja, M. K., Robins, D. L., & Hazzard, A. P. (2018). Utilizing two-tiered screening for early detection of autism spectrum disorder. Autism, 22(7), 881–890. https://doi.org/10.1177/1362361317712649
Lim, N., O’Reilly, M., Sigafoos, J., Lancioni, G. E., & Sanchez, N. J. (2020). A review of barriers experienced by immigrant parents of children with autism when accessing services. Review Journal of Autism and Developmental Disorders, 8(3), 366–372.
Lord, C., Rutter, M., DiLavore, P. C., Risi, S., Gotham, K., & Bishop, S. L. (2012). Autism Diagnostic Observation Schedule, Second Edition (ADOS-2). Western Psychological Services.
McCarty, P., & Frye, R. E. (2020). Early detection and diagnosis of autism spectrum disorder: Why is it so difficult? Seminars in Pediatric Neurology, 35, 100831. https://doi.org/10.1016/j.spen.2020.100831
Miller, L. E., Perkins, K. A., Dai, Y. G., & Fein, D. A. (2017). Comparison of parent report and direct assessment of child skills in toddlers. Research in Autism Spectrum Disorders, 41–42, 57–65. https://doi.org/10.1016/j.rasd.2017.08.002
Nordahl-Hansen, A., Kaale, A., & Ulvund, S. E. (2014). Language assessment in children with autism spectrum disorder: Concurrent validity between report-based assessments and direct tests. Research in Autism Spectrum Disorders, 8(9), 1100–1106. https://doi.org/10.1016/j.rasd.2014.05.017
Pellegrini, A. D. (2001). The role of direct observation in the assessment of young children. Journal of Child Psychology and Psychiatry, 42(7), 861–869.
Peña, E. D., Gutiérrez-Clellen, V. F., Iglesias, A., Goldstein, B. A., & Bedore, L. M. (2018). Bilingual English Spanish Assessment (BESA). Baltimore, MD: Brookes.
R Core Team. (2023). R: A language and environment for statistical computing [Software]. R Foundation for Statistical Computing. https://www.R-project.org/
Reszka, S. S., Wallisch, A., Boyd, B. A., Watson, L. R., & Grasley-Boy, N. (2024). Initial examination of use of the Brief Observation of Social-Communication Change (BOSCC) across home and school contexts. Infant and Child Development, 33, e2547.
Rutter, M., Le Couteur, A., & Lord, C. (2003). Autism Diagnostic Interview-Revised. Los Angeles, CA: Western Psychological Services.
Sánchez-García, A. B., Galindo-Villardón, P., Nieto-Librero, A. B., Martín-Rodero, H., & Robins, D. L. (2019). Toddler screening for autism spectrum disorder: A meta-analysis of diagnostic accuracy. Journal of Autism and Developmental Disorders, 49(5), 1837–1852. https://doi.org/10.1007/s10803-018-03865-2
Shaw, K. A. (2025). Prevalence and early identification of autism spectrum disorder among children aged 4 and 8 years—Autism and Developmental Disabilities Monitoring Network, 16 sites, United States, 2022. MMWR. Surveillance Summaries. https://doi.org/10.15585/mmwr.ss7402a1
Smith, C. J. M., Rozga, A., Matthews, N., Oberleitner, R., Nazneen, N., & Abowd, G. D. (2017). Investigating the accuracy of a novel telehealth diagnostic approach for autism spectrum disorder. Psychological Assessment, 29(3), 245–252. https://doi.org/10.1037/pas0000317
Stone, W. L., Coonrod, E. E., Turner, L. M., & Pozdol, S. L. (2004). Psychometric properties of the STAT for early autism screening. Journal of Autism and Developmental Disorders, 34(6), 691–701.
Tafolla, M., Benrey, N., Rosen, N., Lerner, J., & Lord, C. (2025). Autism assessment with English-Spanish bilingual individuals in the United States. Journal of Autism and Developmental Disorders.
Wagner, L., Corona, L. L., Weitlauf, A. S., Marsh, K. L., Berman, A. F., Broderick, N. A., Francis, S., Hine, J., Nicholson, A., Stone, C., & Warren, Z. (2021). Use of the TELE-ASD-PEDS for autism evaluations in response to COVID-19: Preliminary outcomes and clinician acceptability. Journal of Autism and Developmental Disorders, 51(9), 3063–3072. https://doi.org/10.1007/s10803-020-04767-y
Wallis, K. E., & Guthrie, W. (2024). Screening for autism: A review of the current state, ongoing challenges, and novel approaches on the horizon. Pediatric Clinics, 71(2), 127–155.
Wetherby, A. M., Guthrie, W., Hooker, J. L., Delehanty, A., Day, T. N., Woods, J., Pierce, K., Manwaring, S. S., Thurm, A., Ozonoff, S., Petkova, E., & Lord, C. (2021). The early screening for autism and communication disorders: Field-testing an autism-specific screening tool for children 12 to 36 months of age. Autism, 25(7), 2112–2123.
Wigham, S., Rodgers, J., Berney, T., Le Couteur, A., Ingham, B., & Parr, J. R. (2019). Psychometric properties of questionnaires and diagnostic measures for autism spectrum disorders in adults: A systematic review. Autism, 23(2), 287–305.
Young, R. L., & Nah, Y. H. (2016). Examining autism detection in early childhood (ADEC) in the early identification of young children with autism spectrum disorders (ASD). Australian Psychologist, 51(4), 261–271.
Zander, E., Willfors, C., Berggren, S., Choque-Olsson, N., Coco, C., Elmund, A., Moretti, Å. H., Holm, A., Jifält, I., Kosieradzki, R., Linder, J., Nordin, V., Olafsdottir, K., Poltrago, L., & Bölte, S. (2015). The objectivity of the Autism Diagnostic Observation Schedule (ADOS) in naturalistic clinical settings. European Child and Adolescent Psychiatry, 25(7), 769–780. https://doi.org/10.1007/s00787-015-0793-2
Zuckerman, K. E., Lindly, O. J., Reyes, N. M., Chavez, A. E., Macias, K., Smith, K. N., & Reynolds, A. (2017). Disparities in diagnosis and treatment of autism in Latino and Non-Latino white families. Pediatrics. https://doi.org/10.1542/peds.2016-3010