Rates of autism spectrum disorder (ASD) are higher among those with Attention-Deficit/Hyperactivity Disorder (ADHD) than in the general population (Ottosen et al., 2019). It has been estimated that up to 60% of people with ADHD show features of ASD (Ros & Graziano, 2018), while 21% of those with ADHD also meet diagnostic criteria for co-occurring ASD (Hollingdale et al., 2020). High rates of co-occurring disorders could be explained by shared genetic underpinnings (Elia et al., 2010; Ghirardi et al., 2018) and similar behavioral phenotypes (Antshel & Russo, 2019). Among those with ADHD, rates of ASD are especially high among boys (Ottosen et al., 2019) and those with below average intelligence quotient (IQ) (Al-Khudairi et al., 2019).
Challenges in Diagnostic Assessment of ASD in ADHD
In children diagnosed with ADHD, there is a unique challenge in assessing for the presence of co-occurring ASD. Mainly, there is considerable overlap in symptoms between the two disorders (Grzadzinski et al., 2011). Both disorders are associated with social/communication differences (Geurts et al., 2004; Mikami et al., 2019; Staikova et al., 2013), reduced emotion recognition (Sinzig et al., 2008), and sensory processing differences (Dellapiazza et al., 2021; Kern et al., 2015). Those with ADHD are more likely to have difficulty applying their knowledge of social norms in the moment and to engage inappropriately with peers (Cervantes et al., 2013), thereby emulating some of the hallmark symptoms of ASD. Indeed, many behaviors observed in ASD can also be construed as ADHD symptoms. For example, difficulty holding a conversation could indicate either challenges with social-emotional reciprocity (ASD symptom) or not listening when spoken to and frequently interrupting others (ADHD symptom).
Among those with ASD + ADHD, an ADHD diagnosis tends to precede an ASD diagnosis (Sainsbury et al., 2023), so clinicians have the difficult task of discerning whether symptoms are in excess of what is expected for ADHD and qualify for a diagnosis of ASD. However, there is little evidence of how well ASD diagnostic and screening tools can accurately rule out ASD among those with ADHD (without ASD). Previous reports have shown that clinicians must be cautious when using ASD screening measures in individuals with behavioral and emotional problems, as their specificity, or ability to identify those who do not have ASD, is lower among this population (Havdahl et al., 2016). In other words, the presence of ADHD increases the risk of misclassification of ASD. In a study of the Social Responsiveness Scale (SRS; Constantino & Gruber, 2012), greater externalizing behaviors predicted high raw scores (indicating greater level of ASD symptoms) among unaffected siblings of children with ASD (Hus et al., 2013). Sub-threshold and threshold autism symptoms are often reported when using common diagnostic and screening measures, such as the Social Communication Questionnaire (SCQ; Ghaziuddin et al., 2010; Mouti et al., 2019), Childhood Autism Rating Scale (CARS; Mayes et al., 2012), and the Autism Diagnostic Observation Schedule (De Giacomo et al., 2021; Greene et al., 2022), among those with ADHD. Together, these findings have prompted researchers to examine the extent to which ASD is being misclassified in those with ADHD (Aiello et al., 2021; Ghaziuddin et al., 2010; Greene et al., 2022).
Diagnosing ASD in females with ADHD may be particularly difficult. Females differ in symptom presentation of ASD, which may not be fully represented on many diagnostic measures. Specifically, girls with ASD may have fewer social difficulties as young children and fewer apparent restricted and repetitive behaviors when compared to males (Estrin et al., 2021). This presentation can make differentiating ASD symptoms from social differences due to ADHD more difficult.
Gaps in Literature
Despite the evidence suggesting low diagnostic accuracy in ASD measures, no study has systematically reviewed the literature on the validity of these measures in effectively ruling out ASD in children with ADHD. Studies have reviewed the quality of assessment tools in adults (Wigham et al., 2019) and toddlers (Sánchez-García et al., 2019) with ASD, but not school-aged children. School age may bring to light executive functioning impairments that were previously undetected, as it is associated with increased demands to comply in a structured environment (Wolraich et al., 2014). Moreover, no study to date has reviewed instruments’ performance specifically in those with ADHD. Instead, studies aggregate clinical comparison groups, such as intellectual disability (ID), language disorder, and ADHD together (Norris & Lecavalier, 2010). Reviewing the performance of ASD measures within ADHD alone is critical for exposing diagnostic challenges unique to this population. We specifically chose to evaluate studies that examine ASD measures’ accuracy in ruling out ASD among those with confirmed ADHD (without ASD). The proposed study focused on school-aged children, as that is when a diagnosis of ADHD tends to become reliable (APA, 2013).
Importance and Relevance of Work
In alignment with future directions proposed by Antshel and Russo (2019), our review will shed light on how the presence of ADHD symptoms affects assessment and diagnosis of ASD. Because there is substantial symptom overlap between ASD and ADHD, a tool with high specificity when used on a sample of children with ADHD would bolster confidence in a measure to inform an accurate clinical decision. Accuracy of these assessment tools can decrease the rate of misdiagnosis, which in turn could serve to decrease the wait list to receive ASD-specific interventions and improve accuracy of diagnostic assessments for both neurodevelopmental disorders.
Method
A literature review was conducted on four databases, PubMed, PsycINFO, ERIC, and Web of Science, using the following search terms:
- Abstract: (ADHD OR “attention deficit*” OR ADD OR “disruptive behavior*”) AND
- Title: (autis* OR ASD OR “autism spectrum disorder” OR Asperger* OR PDD* OR “pervasive developmental disorder”) AND
- All Text: (screen* OR diagnos* OR measur* OR tool OR instrument OR sensitivity OR specificity OR “receiver operating characteristics” OR “area under curve” OR ROC OR AUC OR accuracy OR “differentia* validity” OR “utility” OR psychometric* OR “positive predictive value” OR PPV OR “negative predictive value” OR NPV OR “diagnostic odds ratio” OR “false positive” OR “false negative” OR “true negative” OR “true positive” OR “likelihood ratio” OR “Youden’s index” OR mean* OR score* OR correlation* OR regression* OR validat*)
Autism-related key terms were searched from the title, and ADHD- and measurement-related key words were searched from the full record.
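The three field-restricted term blocks are combined with AND, and the terms within each block with OR. A minimal sketch of assembling such a Boolean string is given below. The function names and the `Field: (...)` output syntax are illustrative assumptions, since each database uses its own field tags, and only a subset of the measurement-related terms is listed for brevity.

```python
def or_block(terms):
    """Join terms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(blocks):
    """Join field-restricted OR-blocks with AND.

    `blocks` maps a field label (hypothetical syntax) to its list of terms.
    """
    return " AND ".join(
        f"{field}: {or_block(terms)}" for field, terms in blocks.items()
    )


# Terms copied from the search strategy; the "All Text" list is abbreviated.
SEARCH_BLOCKS = {
    "Abstract": ["ADHD", "attention deficit*", "ADD", "disruptive behavior*"],
    "Title": [
        "autis*", "ASD", "autism spectrum disorder",
        "Asperger*", "PDD*", "pervasive developmental disorder",
    ],
    "All Text": ["screen*", "diagnos*", "sensitivity", "specificity", "area under curve"],
}

query = build_query(SEARCH_BLOCKS)
print(query)
```

In practice the string would need to be adapted to the field tags and wildcard rules of each database before use.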
Inclusion criteria were as follows:
1. Empirical study that included a quantitative analysis (i.e., excluding case studies, reviews, or recommendations).
2. Study consisted of children with a mean age between 4 and 18 years with a confirmed diagnosis of ADHD.
3. Caregivers or clinicians completed an autism-specific clinical screening or diagnostic tool that was validated in English.
   a. Reported a psychometric property to assess differential validity or performance within ADHD. Some examples include:
      i. Specificity: rate of correct identification of individuals without a disorder.
      ii. Area under the curve (AUC): the ability of a measure to distinguish between a true positive and a true negative.
      iii. Negative predictive value (NPV): probability that following a negative test result, an individual will truly not have the disorder.
      iv. Proportion of ADHD sample that met threshold on ASD measure.
4. Published in English in a peer-reviewed academic journal between 2000 and December 2023.
5. Measure was published after the publication of the Diagnostic and Statistical Manual of Mental Disorders, 4th edition, Text Revision (DSM-IV TR; 2000).
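The psychometric properties named in the inclusion criteria can be illustrated with a short sketch. All counts and scores below are hypothetical and are not drawn from any included study. AUC is computed here as the probability that a randomly chosen child with ASD scores higher than a randomly chosen child with ADHD, and treating a score at or above the cutoff as "meeting threshold" is an assumption, since conventions differ by measure.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ConfusionCounts:
    true_pos: int   # ASD, flagged by the measure
    false_neg: int  # ASD, not flagged
    true_neg: int   # ADHD without ASD, not flagged
    false_pos: int  # ADHD without ASD, flagged


def specificity(c: ConfusionCounts) -> float:
    """Share of children without ASD who are correctly not flagged."""
    return c.true_neg / (c.true_neg + c.false_pos)


def negative_predictive_value(c: ConfusionCounts) -> float:
    """Probability that a child with a negative result truly does not have ASD."""
    return c.true_neg / (c.true_neg + c.false_neg)


def proportion_above_cutoff(scores: Sequence[float], cutoff: float) -> float:
    """Share of a sample scoring at or above the cutoff (assumed convention)."""
    return sum(s >= cutoff for s in scores) / len(scores)


def auc(asd_scores: Sequence[float], adhd_scores: Sequence[float]) -> float:
    """Pairwise AUC: ties between an ASD and an ADHD score count as 0.5."""
    wins = 0.0
    for a in asd_scores:
        for b in adhd_scores:
            wins += 1.0 if a > b else 0.5 if a == b else 0.0
    return wins / (len(asd_scores) * len(adhd_scores))


# Hypothetical data for illustration only.
counts = ConfusionCounts(true_pos=40, false_neg=10, true_neg=80, false_pos=20)
asd = [18, 22, 15, 30]
adhd = [5, 12, 16, 9]

print(specificity(counts))
print(negative_predictive_value(counts))
print(proportion_above_cutoff(adhd, cutoff=15))
print(auc(asd, adhd))
```

Note that the proportion of an ADHD-only sample exceeding the cutoff is the complement of specificity within that sample, which is why studies without an ASD comparison group could still contribute to the review.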
Exclusion criteria were as follows:
1. Studies that used a mixed clinical group in their measurement of diagnostic validity were excluded, as the purpose of the study was to isolate the performance of measures in ADHD alone.
2. Studies that analyzed ADHD and ADHD + ASD in the same group were excluded as the purpose is to understand how children with ADHD without ASD perform on these ASD measures. Further, in order to reduce ambiguity in the samples, studies were excluded if they did not explicitly state ASD as an exclusionary criterion in the ADHD group.
3. Studies that examined outdated versions of measures were excluded (e.g., ADOS-Generic).
4. Studies that solely reported means, standard deviations (SDs), or regressions were excluded, as they did not inform the accuracy or diagnostic utility of a measure.
5. Studies that only analyzed a subset of the items of a measure were excluded.
6. Studies that used machine learning to evaluate performance in ASD measures were excluded, as they often used a subset of items from a given measure.
The following information from each study was collected and coded: participants’ demographic information (including age, gender, and cognitive ability), diagnostic groups, sample size, type (diagnostic v. screening) and name of instrument, informant, diagnostic process, and psychometric properties (e.g., Specificity, AUC, NPV, means/SDs, group differences). Specificity, AUC, and NPV were only reported if they reflected data from ADHD and ASD samples. Two of the authors (MU and LB) independently screened titles/abstracts and reviewed full texts of each study. The outcome variables (psychometric properties and quality assessment) were coded independently by both reviewers, while the remaining variables (demographic characteristics, measurement information, diagnostic process) were coded by the primary reviewer. Reliability between the two coders during the training phase was 89%. Discrepancies about inclusion were resolved through discussion between the two coders. If a consensus was not obtained, the third author (LL) resolved the discrepancy. Data were synthesized narratively. Although the psychometric standards of any given measure depend on several factors (e.g., sample composition, type of measure, prevalence of condition, etc.), measures were judged to demonstrate utility in this population if their Specificity or AUC was approximately ≥ 0.8 or if ≤ 20% of the sample exceeded the cutoff (Glascoe, 2005; Sonnander, 2000). Findings were reported according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses Diagnostic Test Accuracy (PRISMA-DTA) standards (Salameh et al., 2020).
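The utility benchmark described above can be expressed as a simple decision rule. The sketch below is one possible reading, not the authors' procedure: the function name is hypothetical, the three criteria are treated as alternatives (any one suffices), and the thresholds are applied strictly, so the "approximately" in the text is not modeled.

```python
from typing import Optional

# Benchmarks taken from the text; strict comparison is an assumption.
MIN_SPECIFICITY_OR_AUC = 0.8
MAX_PCT_ABOVE_CUTOFF = 20.0


def demonstrates_utility(
    specificity: Optional[float] = None,
    auc: Optional[float] = None,
    pct_above_cutoff: Optional[float] = None,
) -> Optional[bool]:
    """Return True if any reported property meets its benchmark.

    Returns None when a report provides none of the three properties.
    """
    checks = []
    if specificity is not None:
        checks.append(specificity >= MIN_SPECIFICITY_OR_AUC)
    if auc is not None:
        checks.append(auc >= MIN_SPECIFICITY_OR_AUC)
    if pct_above_cutoff is not None:
        checks.append(pct_above_cutoff <= MAX_PCT_ABOVE_CUTOFF)
    if not checks:
        return None
    return any(checks)


# Inputs as reported in Table 2.
print(demonstrates_utility(specificity=0.81, auc=0.89))  # ADI-R, Hoffmann et al., 2013
print(demonstrates_utility(pct_above_cutoff=39))         # SCQ, Cooper et al., 2014
print(demonstrates_utility(pct_above_cutoff=3))          # SCQ, Rich et al., 2009
```

Because the rule is disjunctive under this reading, a report with low specificity but adequate AUC would still pass, so the individual properties should be inspected alongside the overall judgment.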
Quality Ratings
Quality ratings were assigned to each study using the Quality Assessment of Diagnostic Accuracy Studies-2nd ed. (QUADAS-2; Whiting et al., 2011), an instrument used in systematic reviews that measures the quality of diagnostic test accuracy studies. QUADAS-2 criteria were tailored to the aims of the current study, as other review papers on diagnostic test accuracy have done (Vllasaliu et al., 2016). The QUADAS-2 has two core categories: Risk of Bias and Applicability Concerns. Risk of bias refers to the presence of systematic flaws in the design or conduct of a study that may distort the results, and Applicability Concerns indicate that any aspect of a study limits its ability to answer the review question. Each of these categories is judged by the following areas: Participant Selection, Index Test, and Method of Diagnosis. Risk of Bias also includes a domain for Flow and Timing. Ratings were indicated as “low” or “high” risk of bias. If insufficient data precluded the reviewers’ ability to make a judgment on more than one criterion, it was indicated as “high risk” for that domain.
Participant selection assessed whether studies represented gender (at least 20% girls) and intellectual abilities (at least 10% with IQ < 70) and had a sample size > 30. Index test evaluated whether the measure in question was administered and interpreted by someone not privy to official diagnosis, was appropriate for patients with ADHD, and was administered and scored (e.g., published cutoff) according to standardization. The only exceptions to this latter criterion were if studies attempted to validate a measure in a different language or if studies used a lower cutoff on the SCQ, as evidence has suggested that the published cutoff on the SCQ has low sensitivity, especially for young children or those without ID (Corsello et al., 2007; Eaves et al., 2006). Reference standard examined whether the diagnostic procedure for evaluating “true” diagnoses was adequate (e.g., used best estimate clinical diagnosis, incorporated variety of measures and informants), consistent across the entire sample, and administered/interpreted by someone not aware of the results of the index test. Flow and timing measured whether all participants’ available data were included in analyses and whether measures were not administered far apart.
Results
In total, 4,965 articles were identified after removal of duplicates. After screening for titles and abstracts, 189 full-text studies were assessed for eligibility, and 20 studies met inclusion criteria. Fig. 1 shows the study selection process. A summary description of the eight diagnostic and screening instruments from the included studies is provided in Table 1.
Fig. 1
PRISMA Flow Diagram. Note. PRISMA = Preferred Reporting Items for Systematic reviews and Meta-Analyses
Table 1
Summary description of instruments
Diagnostic Measures | Summary of Measure Characteristics |
---|---|
ADOS-2 | The Autism Diagnostic Observation Schedule-2nd edition (ADOS-2) is a semi-structured, standardized assessment of ASD (Lord et al., 2012). It includes play-based activities designed to measure communication, reciprocal social interactions, and restricted and repetitive behaviors. There are four modules, depending on the individual’s language abilities (Modules 1 and 2 are intended for those with limited language, while Modules 3 and 4 are designed for those with fluent language). |
ADI-R | The Autism Diagnostic Interview-Revised (ADI-R) is a semi-structured interview conducted with caregivers that assesses current ASD presentation and developmental history. The ADI-R measures social-communication as well as restricted and repetitive behaviors (Rutter et al., 2003). |
CASD | The Checklist for Autism Spectrum Disorder (CASD) comprises a list of 30 ASD symptoms, which are scored as present (either current or past) or absent based on a semi-structured interview with the caregiver, information from the child’s teacher or childcare provider, direct observations, and other available records (Mayes, 2012). The 30 symptoms are grouped into six domains: problems with social interaction, perseveration, somatosensory disturbance, atypical communication and development, mood disturbance, and problems with attention and safety. |
3Di | The Developmental, Dimensional, and Diagnostic Interview (3Di) is a standardized computer-based diagnostic interview designed for caregivers of children ages 3 and up with suspected ASD (Skuse et al., 2004). Questions assess severity of features associated with ASD in three domains: reciprocal social interaction, communication and restricted and repetitive behaviors. |
Screening Measures | |
AQ | The Autism Spectrum Quotient (AQ)-Child (ages 4–11) and AQ-Adolescent (ages 9–15) are both 50-item parent-completed questionnaires that assess social skills, communication, imagination, attention switching, and attention to details (Auyeung et al., 2008). |
ASSQ | The Autism Spectrum Screening Questionnaire (ASSQ) is a 27-item parent questionnaire that assesses symptoms of ASD among children ages 7–16 who do not have ID. |
SCQ | The Social Communication Questionnaire (SCQ) consists of two versions: Lifetime and Current. The Lifetime version focuses on behaviors observed across the child’s lifespan. The Current version asks about behaviors observed within the past 3 months. |
SRS-2 | The Social Responsiveness Scale-2nd edition (SRS-2) is a 65-item parent rating scale that consists of the following subscales: Social Awareness, Social Cognition, Social Communication, Social Motivation, and Restricted Interests and Repetitive Behaviors (Constantino & Gruber, 2012). |
Table 2
Summary of Psychometric Properties
Participants | Measure | Psychometric Properties | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Region | Age (years) | IQ: Mean (SD); range | % Male | Sample Groups (n) | Sp | AUC | % cutoff | Cutoff score used | Mean(SD) | ||||
Diagnostic Measures | |||||||||||||
Grzadzinski et al., 2016 | USA | 9(4); 4–18 | IQ ≥ 70 | 79 | ASD (48) v. ADHD (164) | ADOS-2 | 21 | 2.7(1.6); ASD > ADHD | |||||
ADI-R | 30 | Soc = 12.1(7.1), Comm = 10.4(6.3), RRB = 4.6(3.2); ASD > ADHD | |||||||||||
Hoffmann et al., 2013 | Germany | 9.3(2.9); 5–18 | 103(19); IQ > 80 | 93 | ASD (62) v. ADHD (43) | ADI-R | 0.81# | 0.89# | NR | ||||
Mayes, 2018 | USA | 6.3(2.2); 2–12 | 104(16); 55–133 | 59 | ASD (336) v. ADHD (61) | CASD-SF | 0.90 | 10 | 3 | 1.0(1.1) | |||
Mayes & Lockridge, 2018 | USA | 6.1(2.2); 2–11 | 102(17); 55–130 | 58 | ASD (168) v. ADHD (40) | CASD (Mother) | NPV = 0.37 | 13 | 15 | 7.9(4.3); ASD > ADHD | |||
Lai et al., 2015 | Hong Kong | 8.4(1.5); 6–12 | IQ ≥ 70 | 82 | ASD (44) v. ADHD (76) v. ASD + ADHD (49) | 3Di | RSI | 0.52 | 0.82 | 22 | ASD = ASD + ADHD > ADHD > TD | ||
LCS | 0.60 | 0.85 | |||||||||||
RSB | 0.92 | 0.80 | |||||||||||
Screening Measures | |||||||||||||
Aiello et al., 2021 | Italy | 8.5; 6–14 | 101; 70–130 | 100 | ASD (77) v. ADHD (33) v. ASD + ADHD (24) | AQ-Child | 0.73# | 0.80# | 77 | 68.7(NR); ASD + ADHD = ASD > ADHD | |||
Wong et al., 2021 | Hong Kong | Child: 8.4(1.4),4–11 | ID excluded | 78 (Child) | ADHD (82) v. ASD (124) | AQ-Child | 0.87 | 76 | 65.1(11.1); ASD > ADHD; Cohen’s d = 1.5 | ||||
Adol: 14.2(2), 12–17 | 100 (Adol) | ADHD (51) v. ASD (78) | AQ-Adol | 0.83# | 76 | 69.9(12.1); ASD > ADHD; Cohen’s d = 1.33 | |||||||
Matsuura et al., 2014 | Japan | 10.8(1.8); 10–15 | 104(15); IQ ≥ 75 | 87 | ASD (11) v. ADHD (15) v. TD (19) | Brief AQ | 40 | 6 | 5(2.6); 14.1(7.5); ASD > ADHD | ||||
ASSQ | 20 | 19 | |||||||||||
Kim et al., 2018 | S. Korea | 7.8(0.9) | NR | 71 | ASD* (67) v. ADHD (200) v. TD (253) | ASSQ | 21# | 15# | 8.9(7.8), ASD > ADHD | ||||
Liu et al., 2023 | China | 8.8(1.7); 6–16 | 93(10); IQ ≥ 70 | 85 | ADHD(172) v. TD(44) | ASSQ | 38# | 12# | 10.4(8.1); ADHD > TD | ||||
Tahillioğlu et al., 2023 | Turkey | 10(1.8); 6–14 | ID excluded | 72 | ADHD(147) | ASSQ | 33# | 16# | NR | ||||
Kröger et al., 2011 | Germany | 9.7(1.8);6–13 | 101(11); IQ > 70 | 84 | ASD (53) v. ADHD (205) | SCQ | 0.83# | 0.89# | 11 | 8.6(3.8) | |||
Mouti et al., 2019 | Australia | 11.8(3.2); 6–17 | IQ > 70 | 87 | ASD (27) v. ADHD (46) | SCQ-L | 0.87# | 0.96# | 13 | 7.1(4.8); ADHD + ASD = ASD > ADHD | |||
Schwenck & Freitag, 2014 | Germany | 12.5(2.6) | 103(13); IQ > 70 | 87 | ASD (25) v. ADHD (62) | SCQ-L | 0.92# | 0.86# NPV = 0.88 | 14 | 7.9(4.2); ASD + ADHD = ASD > ADHD | |||
Cooper et al., 2014 | UK | 10.3(2.9); 5–18 | 84(14); 43–124 | 84 | ADHD (711) | SCQ | 39 | 15 | 13(6.6) | ||||
Craig et al., 2015 | Italy | 8.5(3.9); 7–9 | 85(20); IQ > 35 | 86 | ASD (43) v. ASD + ADHD (31) v. ADHD (51) | SCQ | 26 | 15 | 11.4(4.9); ASD + ADHD > ASD > ADHD | ||||
Kochhar et al., 2014 | UK | 12.5(1.8);9–15 | 105(12); IQ > 70 | 97 | ADHD (30) v. TD (30) | SCQ-L | 28 | 15 | 11.6(5.5); ADHD > TD | ||||
Rich et al., 2009 | USA | 10.4(3.2); 5–18 | 108(16); IQ > 70 | 71 | ADHD (58) | SCQ | 3 | 15 | NR | ||||
Guttentag et al., 2022 | USA | 8.2(1.6); 6–11 | IQ > 70 | 76 | ASD* (74) v. ADHD (102) | SCQ-L | 0.91 | 0.85 | 15 | NR | |||
SRS-2 | 0.78 | 0.78 | 65 | ||||||||||
Mellahn et al., 2022 | Australia | 60% ages 6–10; 33% ages 11–15; 2–18 | < 1% of ADHD group had ID | 74 | ADHD (239) v. ASD(117) v. ADHD + ASD (149) | SRS-2 | 59 RRB; 48 SCI | 65 | NR |
Summary of Psychometric Properties
Twenty studies were identified to report on eight diagnostic and screening instruments. The diagnostic measures included the ADOS-2, Autism Diagnostic Interview-Revised (ADI-R), Checklist for Autism Spectrum Disorder (CASD), and Developmental, Dimensional, and Diagnostic Interview (3Di). The screening measures included the Autism Spectrum Quotient (AQ), Autism Spectrum Screening Questionnaire (ASSQ), SCQ, and SRS-2. Main findings reported by each study are presented in Table 2. Of the 20 studies, four reported on two measures, resulting in 24 total reports. Henceforth, the term ‘study’ will refer to an article and a ‘report’ will refer to findings from any given measure.
There were only six reports that evaluated diagnostic ASD measures. Of these, one evaluated the ADOS-2, two evaluated the ADI-R, two examined the CASD, and one examined the 3Di. Eighteen reports evaluated ASD screening measures, of which four evaluated the AQ, four evaluated the ASSQ, eight examined the SCQ, and two evaluated the SRS-2. The SCQ had the highest number of reports evaluating its psychometric properties. Of the reports that measured specificity, all but three measures (3Di, AQ-Child, and SRS-2) were found to have specificity > 0.8. Additionally, AUC was consistently around 0.8 or higher across all diagnostic and screening measures. The proportion of those who exceeded the cutoff across measures was widely variable, especially for screening instruments (3–59%).
Summary of Participant Characteristics
The average age across all reports was 9.6 years (Mage = 7.8 years across reports of diagnostic measures; Mage = 10.2 years across screening measures). Most studies had a larger proportion of males to females (M% males = 81%), with two studies excluding females. Further, fourteen of the 20 studies (70%) explicitly excluded ID. Only five studies (25%) included ID (Cooper et al., 2014; Craig et al., 2015; Mayes, 2018; Mayes & Lockridge, 2018; Mellahn et al., 2022), although ID was only a small portion of their ADHD samples and was not exclusively analyzed.
All studies had a sample of individuals with ADHD, who were diagnosed with some combination of parent interviews (55%), parent questionnaires (55%), teacher report (40%), clinician observation (35%), cognitive assessment (20%), medical record review (20%), use of previous psychology evaluation report (10%), parent-indicated diagnosis (5%), and child interview (5%). Sixty percent of the studies used a combination of two or more sources of data to ascertain ADHD diagnosis. Alongside their ADHD group, studies also included comparison groups such as ASD, typical development (TD), and ADHD + ASD. In those with comparison groups, ASD was diagnosed with the ADOS-2 (47%), the ADI-R (33%), clinician observation (33%), record review (33%), other parent interview (27%), teacher report (27%), parent questionnaires (40%), and psychology evaluation report (13%). Sixty-seven percent of studies that included an ASD group used more than one source of data to ascertain an ASD diagnosis.
Profiles of False Positives & True Negatives
Only three studies compared the characteristics of those with ADHD who exceeded cutoffs versus those who did not on the ASD measures. On the ASSQ, there were no differences in the distribution of ADHD subtypes between those who did and did not meet the threshold (Tahıllıoğlu et al., 2024). Those with ADHD who met cutoff on the ASSQ had greater hyperactivity symptoms than those with ADHD who did not meet cutoff, whereas no difference was found for inattention (Liu et al., 2023). On both the SCQ and SRS-2, those considered true negatives (n = 65, 89%) and those considered false positives (n = 8, 11%) did not significantly differ on demographics, IQ, psychiatric conditions, psychoactive medication use, or severity of psychopathology (Guttentag et al., 2022).
Co-Occurring ASD and ADHD
A total of five studies reported on the profiles of children with co-occurring ASD + ADHD. Those diagnosed with ASD + ADHD had lower mean IQ and greater ASD severity on the SCQ than those with ADHD or ASD alone (Craig et al., 2015; Mellahn et al., 2022). In contrast, 3Di scores were similar across the ADHD + ASD and ASD-only groups (Lai et al., 2015). Additionally, only two studies reported the ability of a measure to accurately differentiate ADHD from ADHD + ASD. Both studies evaluated the SCQ: one found 13 to be an optimal cutoff for the English version (Mouti et al., 2019; Sensitivity = 0.87, Specificity = 0.85, AUC = 0.93), while the other found 15 to be adequate for the German version (Schwenck & Freitag, 2014; Sensitivity = 0.91, Specificity = 0.95, AUC = 0.98, NPV = 0.95).
Quality Ratings
Quality ratings of each study are summarized in Table 3. The two coders obtained 95% consensus, and the remaining 5% was resolved after discussion. On average, studies were at high risk on approximately two of the seven domains (range: 0–4). Method of Diagnosis was consistently rated at high Risk of Bias. The most frequent reasons for the high risk of bias included: (a) studies failed to use best estimate clinical diagnosis (e.g., use variety of sources of information, such as parent and teacher report, psychological assessment, etc.) to rule out ASD and confirm the presence of ADHD within their ADHD sample (60%); (b) results of the index test were not blinded to those interpreting or scoring the reference standard (15%); or (c) the diagnostic procedure was not consistent across the ADHD sample (15%). One domain frequently at high risk for Applicability concerns was Participant Selection. The most frequent reasons for this high risk included studies not recruiting a representative sample of girls (60%) or people with ID (75%), or not having an adequate sample size of at least 30 participants with ADHD (10%).
Table 3
Quality ratings assessment
Index test | Study | Risk of bias: Participant selection | Risk of bias: Index test | Risk of bias: Method of diagnosis | Risk of bias: Flow and timing | Applicability concerns: Participant selection | Applicability concerns: Index test | Applicability concerns: Method of diagnosis |
---|---|---|---|---|---|---|---|---|
Diagnostic Measures | | | | | | | | |
ADOS-2 | Grzadzinski et al., 2016 | Low | Low | Low | Low | High | Low | Low |
ADI-R | Hoffmann et al., 2013 | Low | Low | Low | Low | High | High | Low |
CASD | Mayes, 2018 | Low | Low | Low | Low | Low | Low | Low |
 | Mayes & Lockridge, 2018 | Low | Low | Low | Low | Low | Low | Low |
3Di | Lai et al., 2015 | Low | Low | High | Low | High | Low | Low |
Screening Measures | | | | | | | | |
AQ | Aiello et al., 2021 | Low | Low | Low | Low | High | High | Low |
 | Wong et al., 2021 | Low | Low | High | Low | High | High | High |
 | Matsuura et al., 2014 | Low | Low | High | Low | High | Low | High |
ASSQ | Kim et al., 2018 | Low | Low | High | Low | Low | Low | High |
 | Liu et al., 2023 | Low | Low | High | Low | High | Low | Low |
 | Tahillioğlu et al., 2023 | Low | Low | High | Low | High | Low | High |
SCQ | Cooper et al., 2014 | High | Low | High | High | High | Low | Low |
 | Craig et al., 2015 | Low | Low | Low | Low | High | Low | Low |
 | Kochhar et al., 2011 | Low | Low | High | Low | High | Low | Low |
 | Kröger et al., 2011 | Low | Low | High | High | High | Low | Low |
 | Mouti et al., 2019 | Low | Low | High | Low | High | High | High |
 | Rich et al., 2009 | Low | Low | High | High | High | Low | Low |
 | Schwenck & Freitag, 2014 | Low | Low | High | Low | High | High | Low |
 | Guttentag et al., 2022 | Low | Low | Low | Low | High | Low | Low |
SRS-2 | Mellahn et al., 2022 | Low | Low | High | High | High | Low | High |
Discussion
Key Findings
The current review revealed that there is limited research examining the performance of ASD measures, particularly diagnostic measures, in school-aged children with ADHD. As such, our ability to draw conclusions about most measures was limited. Of the studies that were included, preliminary evidence revealed good specificity and AUC (around 0.8 or higher) across most measures that reported them (with the exception of the 3Di and AQ-Child). Across studies, between 3 and 59% exceeded the cutoff on screening measures and between 10 and 30% exceeded the cutoff on diagnostic measures. Of all the measures reviewed in the current paper, the SCQ had the most studies (n = 8) evaluating its performance. Across these studies, the SCQ was found to have good specificity and AUC for cutoff scores of 11 or higher, with higher cutoffs yielding higher specificity. We recommend using it with school-aged children and progressing from the screening stage to a diagnostic evaluation based on its results.
Gaps in the Literature & Limitations
The current systematic review revealed several gaps in the literature and limitations in the methodology of included studies, which limited the interpretability of findings. First, there were very few studies examining performance of ASD measures, specifically diagnostic measures, in an ADHD sample. Most notably, only one study examined the ADOS-2 and two studies examined the ADI-R (Grzadzinski et al., 2016; Hoffmann et al., 2013), both considered “gold standard” tools for diagnosing ASD (Falkmer et al., 2013). Findings from Grzadzinski et al. (2016), in which 21% of their school-aged ADHD sample met cutoff on the ADOS-2, are consistent with findings in other clinical populations (Lebersfeld et al., 2021) and age ranges (e.g., adulthood; Hayashi et al., 2022). Additionally, this study used a well-characterized sample of individuals with best estimate clinical diagnosis of ADHD (without co-occurring ASD) and blind administration of the ADOS-2. We are cautiously optimistic about the ability of the ADOS-2 to reliably classify children with ADHD, recognizing that we located only one study examining the question.
In addition, only two studies examined the ADI-R. In one study using the German translation, adequate psychometric properties were reported (Hoffmann et al., 2013). However, in another study using the original English version, 30% of the sample without ASD exceeded the cutoff (Grzadzinski et al., 2016). These findings suggest that the German version of the ADI-R is acceptable to use in this population. However, the latter findings warrant some caution in using the English version of the ADI-R alone to rule out ASD, as 30% exceeds the threshold of what is acceptable.
Another important gap is our lack of understanding of how co-occurring ADHD + ID influences performance on ASD measures, as most studies in the review excluded ID. Even among studies that did include children with ADHD + ID, sample sizes were small, and thus subgroup analyses could not be conducted. Unfortunately, this is the case despite the evidence indicating strong associations between lower IQ and greater ASD and ADHD symptomatology (Cooper et al., 2014; Rommel et al., 2015). Because there is significant overlap in clinical features between ADHD + ID and ASD (APA, 2013) and as previous research has found lower specificity in ASD measures like the ADOS (Sappok et al., 2013), we surmise that the presence of ID would decrease specificity of ASD measures. Further, of the 20 studies included in the current review, none evaluated the accuracy of ASD measures across genders. Because girls with ASD may present with different symptom profiles, it is important to understand which ASD measures are best at accurately ruling out girls with ADHD without ASD, with some research suggesting the need for gender-specific norms and cutoffs (Lundström et al., 2019).
Finally, studies differed in their sample composition, diagnostic algorithms, and language translations, which precluded our ability to aggregate findings across studies to make meaningful conclusions. Specifically, studies failed to use best estimate clinical diagnosis to rule in ADHD and rule out ASD, did not blind results of the reference standard to the study administrators, and reported inconsistent diagnostic procedures within their sample. Further, nine studies used a diagnostic algorithm that was different from the published version, and various language translations were utilized across measures. Thus, the representativeness of the sample, limitations surrounding the diagnostic process, and differences in language and cultural translations could confound generalizability and aggregability of study findings.
Conclusions and Implications
With the rising rates of ASD, it is imperative to use instruments that can accurately rule out ASD among its most commonly occurring differential diagnoses, including ADHD (De Giacomo et al., 2021). Doing so can reduce the number of referrals to autism clinics, which currently have very long waitlists. Further, accurate diagnosis has implications for appropriate provision of treatment. Delayed diagnosis or misdiagnosis leads to inappropriate or delayed access to services, thus increasing the likelihood of poorer outcomes.
Future directions for research involve more studies evaluating the accuracy of ASD measures (especially diagnostic measures) among individuals with ADHD. Additionally, future studies should evaluate how co-occurring conditions (such as ID) and gender play a role in the performance of these measures. In doing so, studies should include well-characterized samples that use best estimate clinical diagnosis, blind administration and interpretation, and representative sample compositions to enhance interpretability and generalizability of findings.
Author Contributions
MU conceptualized the study, reviewed and coded articles, and contributed to the writing of the manuscript. LB reviewed and coded studies and contributed to the writing. LL conceptualized the study, resolved discrepancies between coders, and contributed to the writing and editing of the manuscript.
Declarations
Conflict of Interest
The authors have no known conflict of interest to disclose.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.