Introduction
Autistic advocates and their families prioritize research and clinical services focused on outcomes [
3‐
6]. Prominent among these outcomes is quality of life (QoL), which the World Health Organization defines as “an individual’s perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards, and concerns” [
7]. Maximizing QoL among autistic individuals is often identified as a goal across a wide range of disciplines in autism research [
8]. Agencies and organizations at national (e.g., the Interagency Autism Coordinating Committee) and international (e.g., the World Health Organization) levels have also emphasized large gaps in our current understanding of how support services and care models can increase QoL among autistic people [
9]. However, to maximize QoL among autistic people, researchers need to ensure that they are using psychometrically reliable and valid tools to measure autistic QoL.
Psychometric validation of QoL measures is critical when considering their use in autistic populations. If a measure’s functioning has not been tested within autistic populations, autism researchers cannot assume they are accurately measuring QoL. Assuming without evidence that a measure functions the same way across populations threatens the measure’s validity and impedes its clinical utility [
10]. In qualitative studies, autistic individuals and their parents discussed the items of popular QoL measures and indicated multiple themes impacting the validity of their use within autistic populations, including emotional vocabulary and misinterpretation of items [
11]. Thus, there is a clear need to test the psychometric properties of QoL instruments developed for use in the general population in order to confidently use these measures in autistic populations.
Literature examining psychometric properties of QoL measures among autistic individuals is largely focused on adults [
12]. Findings from this nascent literature demonstrate potentially questionable psychometric support for popular QoL measures developed for the general population. For example, the World Health Organization Quality of Life-Brief Version (WHOQOL-BREF) [
13] demonstrated mixed fit indices in a sample of autistic adults and required four iterations of post-hoc modifications to achieve acceptable fit [
14]. Psychometric support for the autism-specific supplement to the WHOQOL-BREF, the Autism Spectrum Quality of Life [
14,
15], has also been mixed [
16]. The Patient-Reported Outcomes Measurement Information System Global Health survey (PROMIS Global-10) demonstrated promising psychometric properties and minimal differential item functioning across subgroups of autistic adults [
17]. These results provide the evidence necessary to confidently use the PROMIS Global-10 as a reliable and valid indicator of QoL in autistic adults. However, this work did not explore whether such a measure performed similarly across autistic and general population groups. Therefore, for research seeking to compare QoL scores across groups of participants, open questions remain about whether the PROMIS scales demonstrate measurement invariance across autistic and general population groups.
When comparing scores from any given scale across two or more groups, researchers make an implicit, yet consequential, assumption that the scale measures the same construct in the same way across the groups. This assumption is known as measurement invariance, which is defined as “whether or not, under different conditions of observing and studying phenomena, measurement operations yield measures of the same attribute” [
18]. Comparisons and interpretations of scores between groups are only meaningful if the scale demonstrates measurement invariance [
19,
20]. Research has explored measurement invariance across autistic and general population groups for domains such as cognitive functioning [
21‐
23], behavioral concerns [
24], sleep impairment [
25], and depression [
26,
27]. Given the burgeoning emphasis on incorporating QoL outcomes into autism research, it is crucial to evaluate measurement invariance in QoL measures.
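The nested levels of invariance examined in a multi-group CFA can be written compactly. The formulation below is a standard textbook sketch for continuous indicators, not the specification used in the analyses reported here; the symbols (x, τ, Λ, η, ε) are notation introduced only for this illustration. For ordinal items, thresholds typically play the role that intercepts play below.

```latex
% Linear factor model for respondent i in group g
% (g = 1: autistic; g = 2: general population; labels are illustrative)
\begin{align*}
\mathbf{x}_{ig} &= \boldsymbol{\tau}_g + \boldsymbol{\Lambda}_g \boldsymbol{\eta}_{ig} + \boldsymbol{\varepsilon}_{ig} \\
\text{Configural:} &\quad \text{same pattern of fixed and free loadings in } \boldsymbol{\Lambda}_1 \text{ and } \boldsymbol{\Lambda}_2 \\
\text{Metric:} &\quad \boldsymbol{\Lambda}_1 = \boldsymbol{\Lambda}_2 \\
\text{Scalar:} &\quad \boldsymbol{\Lambda}_1 = \boldsymbol{\Lambda}_2 \ \text{and} \ \boldsymbol{\tau}_1 = \boldsymbol{\tau}_2 \\
\text{Under scalar invariance:} &\quad \mathrm{E}[\mathbf{x}_{ig}] = \boldsymbol{\tau} + \boldsymbol{\Lambda}\, \mathrm{E}[\boldsymbol{\eta}_{ig}]
\end{align*}
```

Under the scalar constraints, the groups share intercepts and loadings, so any difference in expected observed scores can only come from a difference in latent means. This is why group mean comparisons require scalar invariance.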
To measure autistic QoL, Graham Holmes et al. (2020) leveraged patient-reported outcome measures from the National Institutes of Health (NIH) [
28] to curate a specialized battery for assessing QoL across the lifespan for individuals on the autism spectrum: the PROMIS Autism Battery – Lifespan (PAB-L). Each measure includes scales assessing various domains of QoL, including health, emotional distress, subjective well-being, and social functioning. The PAB-L demonstrated good reliability, feasibility, and acceptability among a large (N = 912) sample of autistic individuals and their families [
29]. Despite these strengths, it remains unclear whether items on the PAB-L contribute to the measurement of QoL in the same way across autistic and general populations.
The primary objective of the current study was to evaluate the measurement invariance of the PAB-L Emotional Distress (Depression, Anger, Anxiety, Psychological Stress) and Subjective Well-Being (Life Satisfaction, Positive Affect, and Meaning & Purpose) scales among autistic and general population teens. Using existing data from the original PAB-L study [
29] and from publicly available PROMIS pediatric scores from a nationally representative sample of participants [
30], we conducted secondary data analyses using multi-group confirmatory factor analyses (CFA) to assess measurement invariance across autistic and general population adolescents. We focused our analyses on self-report because we were primarily interested in the measurement of lived experience of QoL, which is best captured by self-report.
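The stepwise logic of a multi-group invariance test can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study. The function name `highest_invariance_level` is hypothetical and the CFI values are made up. The decision rule (a drop in CFI of no more than .01 between nested models, with an assumed minimum CFI for the configural model) is a commonly used heuristic that this section does not confirm as the criterion applied here.

```python
from typing import Dict, Optional

# Nested models are tested in this order; each adds equality constraints across groups.
LEVELS = ["configural", "metric", "scalar"]


def highest_invariance_level(
    cfi: Dict[str, float],
    min_configural_cfi: float = 0.90,  # assumed threshold, for illustration only
    max_cfi_drop: float = 0.01,        # assumed heuristic, for illustration only
) -> Optional[str]:
    """Return the most restrictive level supported, or None if configural fit fails."""
    if cfi["configural"] < min_configural_cfi:
        return None
    supported = "configural"
    for previous, current in zip(LEVELS, LEVELS[1:]):
        # Stop at the first step where the added constraints worsen fit too much.
        if cfi[previous] - cfi[current] > max_cfi_drop:
            break
        supported = current
    return supported


# Made-up CFI values for three hypothetical scales (not results from the study).
examples = {
    "scale_a": {"configural": 0.985, "metric": 0.982, "scalar": 0.978},
    "scale_b": {"configural": 0.980, "metric": 0.975, "scalar": 0.950},
    "scale_c": {"configural": 0.970, "metric": 0.940, "scalar": 0.935},
}

for name, fit in examples.items():
    print(name, highest_invariance_level(fit))
```

The loop stops at the first failed step because the models are nested: a scale that fails metric invariance is not tested further for scalar invariance.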
Discussion
The current study is the first to document the varying levels of measurement invariance between the self-reported QoL of autistic and general population teenagers across seven PROMIS pediatric scales. Testing measurement invariance is a critical step towards assuring that measures that were developed and validated on non-autistic populations function in the same way for autistic individuals. Our focus on the self-report of teenagers emphasizes the lived experience of QoL across the community-prioritized domains of Emotional Distress (Depression, Anger, Anxiety, and Psychological Stress) and Subjective Well-Being (Life Satisfaction, Positive Affect, and Meaning & Purpose). Such efforts are essential for validating patient-reported outcomes for use in autism research [
55]. Taken together, the current study addresses an important gap in the literature on autistic QoL by providing the psychometric validation necessary to confidently use these scales to measure various dimensions of QoL among autistic youth.
Our results highlight varying degrees of measurement invariance on the Emotional Distress and Subjective Well-Being self-report scales for autistic and general population teens. The Depression and Positive Affect scales demonstrated configural, metric, and scalar invariance between the two groups. These results suggest that the constructs are measured in a similar fashion among autistic teens as they are in the general population and that the scores from these scales can be compared meaningfully across autistic and general population groups.
The Anger and Psychological Stress scales both demonstrated configural and metric, but not scalar, invariance between the self-report of autistic teens and general population teens. These results suggest that the scales’ items capture the respective latent constructs across both groups, but researchers should not compare means between autistic and general populations on these self-report scales. The scales demonstrate a degree of measurement bias such that equivalent scores on the scales do not necessarily imply equal levels of anger or psychological stress in autistic versus non-autistic teens. Finally, the Anxiety, Life Satisfaction, and Meaning & Purpose teen self-report scales did not demonstrate metric or scalar invariance between the groups, indicating that these PROMIS scales may function differently between autistic and general population teens. These results suggest that the scales do not capture the intended constructs in autistic teens in the same way as they do in non-autistic teens.
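To make the consequence of metric without scalar invariance concrete, the sketch below computes expected scale scores under a simple linear factor model. All parameter values are invented for illustration and are not estimates from the PAB-L data; `expected_scale_score` is a hypothetical helper.

```python
from typing import Sequence


def expected_scale_score(
    intercepts: Sequence[float],
    loadings: Sequence[float],
    latent_mean: float,
) -> float:
    """Expected sum score: sum over items of (intercept + loading * latent mean)."""
    return sum(t + l * latent_mean for t, l in zip(intercepts, loadings))


# Metric invariance holds: both groups share the same loadings (invented values).
loadings = [0.8, 0.7, 0.9]

# Scalar invariance fails: one item has a higher intercept in group B (invented values).
intercepts_a = [2.0, 2.0, 2.0]
intercepts_b = [2.0, 2.5, 2.0]

# Both groups have the same latent mean, i.e., the same true level of the construct.
latent_mean = 0.0

score_a = expected_scale_score(intercepts_a, loadings, latent_mean)
score_b = expected_scale_score(intercepts_b, loadings, latent_mean)

# The observed gap comes entirely from the intercept difference, not from the construct.
print(score_a, score_b, score_b - score_a)
```

In this toy setup the two groups have identical latent means, yet their expected scores differ by the size of the intercept shift. Within either group, a one-unit change in the latent mean still changes the expected score by the same amount (the sum of the shared loadings), which is why within-group use can remain defensible when only metric invariance holds.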
Our finding of varying levels of measurement invariance of the PROMIS scales between autistic and general population teens is not surprising given that these measures were not specifically developed to measure QoL among autistic teens. The PROMIS pediatric Depression, Anxiety, and Anger scales were developed and standardized on a large sample of children and teens recruited from pediatric clinics and school settings [
36‐
38]. Similarly, the PROMIS pediatric Psychological Stress [
32], Life Satisfaction [
33], Positive Affect [
35], and Meaning & Purpose [
34] scales were developed and standardized with samples of children from opt-in online panels, school districts, and hospital clinic settings. These samples included children with chronic health conditions, such as asthma, Attention-Deficit/Hyperactivity Disorder, and gastrointestinal disorders; however, autistic children were not reported as part of the measures’ development samples. The fact that autistic individuals were not knowingly included in these development samples should not negate the tremendous amount of work that goes into creating and validating important patient-reported outcome measures to capture different dimensions of QoL in pediatric populations. However, researchers and clinicians using these measures as outcomes in autism research and/or clinical work should do so with the knowledge that most of the PAB-L scales are not specifically validated for use with autistic teens.
To our knowledge, the present investigation is the first study to investigate the factorial validity of the self-report of autistic teens using the PROMIS scales. While previous research has investigated measurement invariance of PROMIS scales between parent-proxy reports for autistic and non-autistic children [
25], self-report of QoL domains, particularly among autistic individuals, is likely a better estimate of an individual’s lived experience of QoL. Future research will benefit from investigations of how PROMIS scales function similarly or differently depending on reporter (e.g., parent-proxy vs. self-report) in capturing autistic QoL. Previous research has highlighted ways in which the measurement of subjective experiences differs between parent-report and autistic adolescent self-report, including social anxiety [
56] and sensory sensitivity [
57]. Thus, psychometric investigations of QoL measures which include autistic self-report are likely to improve our measurement, and therefore understanding, of autistic QoL.
Beyond the individual scale findings presented, the current investigation highlights an important gap in the field of autism research. Put simply, the rigor with which we measure QoL in autism lags significantly behind the state of measurement science [
55]. Specifically, standard psychometric practices, including testing whether a measure that was developed using non-autistic samples functions similarly in autistic populations, are infrequently applied before utilizing the scale in autism research. Such practices are particularly crucial in the adoption of patient-reported outcome measures, including QoL, as these constructs tap into subjective experiences, rather than objective or observable indicators. Researchers using the PROMIS scales have the advantage of a rigorous development and validation process in the general population or for clinical groups that were included in the initial construction. However, this rigorous development process is only advantageous to autism researchers if the scale functions in the same way in autistic populations.
Clinical implications
The degree to which a violation of measurement invariance is problematic depends on how the instrument is used across groups [
58]. Some research seeks to compare autistic individuals to non-autistic individuals by comparing scores on a single measure and drawing conclusions based on differences (or lack of differences) between group means. To accurately compare group means, the instrument should demonstrate scalar invariance between the two groups. In the present study, the Depression and Positive Affect scales of the PROMIS self-report demonstrated scalar invariance between autistic and general population teens. Put differently, our results demonstrate that similar scores on the PROMIS self-report Depression and Positive Affect subscales imply similar levels of depression and positive affect between autistic and non-autistic teens. As such, these scales can be validly used to compare these groups.
In contrast, other studies may use a teen’s self-report as an outcome within an autistic population, rather than compare across autistic and non-autistic teenagers. In this case, an instrument demonstrating metric (but not scalar) invariance may be sufficient given the research question. Our finding that the PROMIS Anger and Psychological Stress scales demonstrated invariance at the metric level suggests that these scales function similarly in the measurement of anger and psychological stress across groups. However, these scales did not demonstrate scalar invariance, suggesting test bias such that equal observed scores on the PROMIS Anger and Psychological Stress scales do not necessarily imply equal levels of anger or psychological stress in autistic versus non-autistic teens.
Limitations
Results of our study are limited by several factors related to our samples of autistic and general population teens. First, given that this was a secondary data analysis based on virtual surveys, limited information was available regarding the cognitive functioning, verbal abilities, or adaptive behavior skills of our samples. Future work that characterizes participants along these dimensions would help establish the generalizability of the results presented here. Relatedly, our current study did not include a representative sample of autistic teens who were reported to be minimally verbal or have co-occurring intellectual disability. It is unknown to what extent our current findings would extend to autistic teens with co-occurring intellectual disability. Given the high prevalence rate of intellectual disability among autistic individuals [
59], additional research is necessary to understand how to reliably and validly measure QoL among this population across the lifespan. Finally, autism diagnostic status was not reported in the general population sample data, so it is possible that the general population sample contained some autistic adolescents.
As is common in autism research given the imbalanced sex and gender ratio of individuals diagnosed with autism, our autistic sample had a higher proportion of teens identifying as male than female or transgender. In addition, since only the autistic group was asked about transgender identity, there is no comparison group for transgender adolescents in the general population sample. Future work may benefit from oversampling autistic women and transgender and nonbinary individuals to characterize their lived experience of QoL. Additionally, both autistic and general population groups were more white and non-Hispanic/Latino than is reflected in the current United States population. Increasing racial and ethnic diversity is widely recognized as a high-priority area of growth across a wide range of autism research [
60,
61].
Conclusion
The results of this study provide evidence that the Depression and Positive Affect scales of the PROMIS can be used confidently in research and clinical work based on the self-report of autistic and non-autistic teens. Scores on the PROMIS Depression and Positive Affect scales can also be compared across these two groups given the psychometric properties of the scales. The PROMIS Anger and Psychological Stress scales demonstrated metric invariance and can be used to measure these QoL constructs in autistic teens, though caution should be used in research seeking to compare autistic and general population teens on these measures. Finally, neither scalar nor metric invariance was indicated for the Anxiety, Life Satisfaction, and Meaning & Purpose scales, suggesting that these scales measure these constructs differently in autistic and non-autistic teens. The inclusion of QoL outcomes is a welcome advancement in autism research; however, validating QoL measures for use in autistic populations is an ongoing process necessary to support such research. We believe the past and current work on the PAB-L exemplifies this process: the PAB-L was built from rigorously developed PROMIS scales, selected in consultation with autistic individuals, their families, and care providers to reflect constructs most meaningful for this population. Feasibility, acceptability, and reliability of the PAB-L were favorable [
29], and the current work extends this process to examine the psychometric properties of the scales for autistic teenagers as compared with general population teenagers. Multi-group CFAs that demonstrate measurement invariance between autistic and general populations help researchers make empirically supported decisions regarding the measures they use to capture QoL outcomes in their research.