Uniform Differential Item Functioning Across Gender, Grade Level and Racial Groups: A MIMIC Investigation of the Non-clinical Parent Ratings of the Pediatric Symptom Checklist-17
The Pediatric Symptom Checklist-17 (PSC-17) is a screening instrument designed to measure children’s behavioral and emotional problems. This study investigated the factor structure of the PSC-17 and the uniform differential item functioning (DIF) of the PSC-17 item scores as a function of school children’s gender, grade level, and racial/ethnic group. Parent ratings of 1,305 children from pre-K to Grade 1 were used. Confirmatory factor analysis (CFA) of the scale’s factor structure indicated that parent ratings of the PSC-17 were composed of three subscales: Externalizing Problems, Internalizing Problems, and Attention Problems. A multiple indicators multiple causes (MIMIC) analysis showed that four of the PSC-17 items exhibited statistically significant uniform DIF as a function of race/ethnicity but not as a function of gender or grade level. Uniform DIF had little impact on latent mean differences in Internalizing, Attention, and Externalizing Problems across gender, race/ethnicity, and grade-level groups. These results imply that teachers and schools should be cautious when comparing racial/ethnic groups at the item level. However, they can compare children’s subscale scores across gender, race/ethnicity, and grade levels with parent ratings of the PSC-17.
It is estimated that approximately 15% of children under the age of six have clinically significant mental health problems (Kann, 2016). This is concerning, as early social, emotional and behavioral (SEB) problems are associated with later SEB development (Treyvaud et al., 2012; Waller et al., 2017; Wang et al., 2022), academic performance (Hammer et al., 2017; Polderman et al., 2010), school readiness (Denham & Brown, 2010; Gartstein et al., 2016; Shala, 2013), school engagement (Olivier et al., 2020), peer relationships (Kochel et al., 2012; Waller et al., 2017), and criminal activity and drug abuse (Halle & Darling-Churchill, 2016; Jones et al., 2015). Given the potential negative consequences of mental health problems, it is important to identify and treat these difficulties early in children’s lives.
Schools play an important role in identifying children with SEB problems (Bringewatt & Gershoff, 2010). Some schools have increased their early identification and screening procedures to identify at-risk students (Lane et al., 2012). Some school-based services, such as the multi-tiered system of support (MTSS), function well in preventing the occurrence of potential problems and reducing the risk level of targeted students (Fazel et al., 2014; Olubiyi et al., 2019; Sanchez et al., 2018). When early identification is combined with early prevention and intervention services, it can reduce the possibilities of students’ academic failures and future life difficulties (e.g., Lane & Menzies, 2003; Walker & Shinn, 2002) and minimize the impact of risk factors, as well as prevent further development of behavioral and social-emotional difficulties (Glover & Albers, 2007).
School-based universal mental health screening assessments have emerged as an efficient method to identify children who are at risk for SEB problems and may benefit from relevant interventions (Hunter et al., 2005; Kettler et al., 2014; Severson & Walker, 2002). However, only roughly 12% of schools use universal screening instruments to assess students’ SEB problems (Bruhn et al., 2014). The limited number of schools using universal screening instruments might be due to the short history of universal screening for young children’s SEB problems (Greenwood et al., 2011; Steed & Banerjee, 2016).
While screening young children for SEB problems is not yet widespread, it has been advocated due to the high rate of mental health problems among youth (Essex et al., 2009) and the benefits provided to both schools and students. First, all students, regardless of whether they exhibit risky behaviors, have an equal opportunity to be screened and receive possible support, thus reducing the likelihood of missing students who need intervention (Lane et al., 2012). Second, screening is recommended as an initial step for identifying risky behaviors and for prevention and intervention services in schools, reducing the negative effects of risky behaviors on students’ long-term outcomes (Walker et al., 2009; Webster-Stratton et al., 2011). Third, screening can help schools make better use of limited school resources such as counseling, prosocial skills training, and academic support (Walker et al., 2005). As many decisions may be made using a screening instrument, identifying a valid and reliable tool for effective universal screening in school is essential.
The utility and precision of screening assessments are based on their appropriateness for intended use, technical adequacy, and usability (Glover et al., 2007). Dowdy et al. (2010) recommended the following instruments for universal school-based mental health screening: the Strengths and Difficulties Questionnaire (SDQ; Goodman, 1997); the Pediatric Symptom Checklist (PSC; Gardner et al., 1999); the Systematic Screening for Behavior Disorders (SSBD; Walker & Severson, 1992); the Behavior Assessment System for Children-Second Edition (BASC-2; Kamphaus & Reynolds, 2007); and the Social, Academic, and Emotional Behavioral Risk Screener (SAEBRS; Kilgus et al., 2013). Among these instruments, our focus is the Pediatric Symptom Checklist-17 (PSC-17). The screener is the shortened version of the full (35-item) PSC form and was originally a caregiver-completed questionnaire for children aged 4–16 years (Gardner et al., 1999). Compared to other screening tools, which can be lengthy or expensive, the PSC-17 has the advantages of free public access and a brief design with just 17 items, saving schools both time and money. Its online availability (http://www.massgeneral.org/psychiatry/assets/PSC-17_English.pdf) makes it easier for school personnel to administer the assessment, including collecting, managing, and interpreting the assessment data. Further, PSC-17 scores have demonstrated strong correlations with other screening instruments such as the Attention, Behavior, Language, and Emotion scale (ABLE; Barbarin, 2007), the BESS, and the SDQ (DiStefano et al., 2017), indicating sound validity evidence for the PSC-17. Therefore, the PSC-17 should be an acceptable option for schools conducting universal screening of students’ behavioral and emotional problems.
Factor Structure and Measurement Invariance of the PSC-17
The PSC-17 has been validated and successfully used in clinical environments (Blucker et al., 2014; Borowsky et al., 2003; Chaffin et al., 2017; Gardner et al., 2007; Murphy et al., 2016; Stoppelbein et al., 2012). Three subscales underlying the PSC-17, Externalizing Problems, Internalizing Problems, and Attention Problems, were identified with exploratory factor analysis (EFA) by Gardner et al. (1999) and confirmed with confirmatory factor analysis (CFA) (Chaffin et al., 2017; Murphy et al., 2016; Stoppelbein et al., 2012). Partial metric invariance was established across majority and minority groups (Stoppelbein et al., 2012), and Bergmann et al. (2018) identified scalar invariance across gender. One study of the Externalizing Problems subscale of the PSC-17 identified one item (Item 4, “Refuses to share”) with uniform differential item functioning (DIF) across genders and two items (Item 10, “Blames others,” and Item 14, “Teases others”) demonstrating DIF across races (Studts et al., 2017). However, all these studies were conducted with parent ratings of the PSC-17 in primary care settings.
Recently, the PSC-17 has been used in school settings. DiStefano et al. (2017) identified a second-order factor structure of the PSC-17 with two cross-loading items when used by teachers to rate preschool children; the first-order factors were Externalizing Problems, Internalizing Problems, and Attention Problems, and the second-order factor was Maladaptive Behavior. With a sample of teacher ratings of lower-grade elementary school children, Liu et al. (2020b) identified a three-factor structure with one item loading on a different latent factor. Liu et al. (2020a) identified high internal consistency and strong criterion-related validity for the PSC-17, indicating that the instrument is a high-quality universal screening tool for use in school settings. Gender invariance of teacher ratings of the PSC-17 was established among preschool children, with boys exhibiting more maladaptive behaviors than girls (Liu et al., 2018). Invariance of the PSC-17 was also established across gender, race, and grade level for children from the first grade to the second grade; boys demonstrated higher levels of externalizing problems and attention problems than girls, while children from different racial groups or grade levels did not differ in their SEB functioning (Liu et al., 2020b). Measurement invariance of the PSC-17 was further established across teacher and parent respondents: teachers perceived children as demonstrating a lower risk level of externalizing problems relative to parents, whereas parents and teachers did not differ in their perceptions of children’s internalizing and attention problems (Gao et al., 2022). However, these studies, except Gao et al. (2022), used only teacher-rated samples. To our knowledge, no study has examined the measurement equivalence of the PSC-17 rated by parents in non-clinical settings.
To contribute to the research in measurement invariance of the non-clinical parent ratings of the PSC-17, we focused on testing measurement invariance in the PSC-17 across gender, grade level, and race/ethnicity.
Previous investigations of the PSC-17 in school settings have used teacher ratings of children’s SEB problems. However, as children behave differently in different settings, such as in school and at home (American Psychiatric Association, 2013), a multi-informant approach is important for assessing children’s behavioral problems (Wong et al., 2020). In addition, different informants may have unique perspectives on a child’s behavior (Achenbach, 2006; De Los Reyes & Kazdin, 2005). Teachers are more likely to take a normative approach in judging a child’s behavior, while parents may be a better judge of a child’s behavior from a more critical perspective (Konold et al., 2004). As only low to moderate correlations between parent and teacher ratings of children’s mental health have been identified, collecting multiple informants’ ratings provides rich information about children’s behavioral problems (De Los Reyes et al., 2015). Therefore, the examination of non-clinical parent ratings on the PSC-17 is warranted.
Primary Research Questions
The current study aimed to investigate the factor structure of the PSC-17 and the uniform differential item functioning (DIF) of the PSC-17 item scores as a function of school children’s gender, grade level, and racial/ethnic group. Specifically, the present study was designed to address the following research questions:
(1) What is the latent factor structure of the parent-rated PSC-17?
(2) Does the parent-rated PSC-17 have uniform DIF items as a function of grade level, gender, or race/ethnicity?
(3) Do the uniform DIF items affect the extent to which latent factor means differ across grade level, gender, or race/ethnicity?
Method
Instrument
The Pediatric Symptom Checklist-17 (PSC-17) is the shortened form of the full PSC instrument (Gardner et al., 1999) designed to assess a child’s overall psychosocial functioning. The PSC-17 consists of 17 items that assess three sub-dimensions of maladaptive behavior: Internalizing Problems (e.g., “Feels sad, unhappy”), Externalizing Problems (e.g., “Fights with other children”), and Attention Problems (e.g., “Has trouble concentrating”). Respondents rate each symptom on a scale of 0 (never), 1 (sometimes), or 2 (often). The 17 items are summed to produce a total score ranging from 0 to 34, with a higher score reflecting greater risk; a total score above 15 indicates an overall psychosocial health risk. The Internalizing Problems subscale consists of five items assessing children’s internalizing issues, such as depression and anxiety; a score above five indicates risk on this subscale. The Attention Problems subscale consists of five items assessing whether children have trouble concentrating, and the Externalizing Problems subscale consists of seven items measuring children’s disruptive behavior, such as aggression and hyperactivity. For both of these subscales, sum scores greater than seven indicate risk.
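As an illustration, the scoring rules described above can be sketched in Python. This is a minimal sketch, not an official scoring implementation: the subscale item assignments follow those reported in this study’s Table 3, the cutoffs are those stated in the text, and the function name is our own.

```python
# Item-to-subscale assignments as reported in this study's Table 3.
INTERNALIZING = [2, 6, 9, 11, 15]
ATTENTION = [1, 3, 7, 13, 17]
EXTERNALIZING = [4, 5, 8, 10, 12, 14, 16]

def score_psc17(ratings):
    """Score the PSC-17 from a dict mapping item number (1-17) to a
    0 (never) / 1 (sometimes) / 2 (often) rating."""
    scores = {
        "total": sum(ratings.values()),
        "internalizing": sum(ratings[i] for i in INTERNALIZING),
        "attention": sum(ratings[i] for i in ATTENTION),
        "externalizing": sum(ratings[i] for i in EXTERNALIZING),
    }
    # Risk flags use the cutoffs described in the text:
    # total > 15; Internalizing > 5; Attention and Externalizing > 7.
    scores["at_risk"] = {
        "overall": scores["total"] > 15,
        "internalizing": scores["internalizing"] > 5,
        "attention": scores["attention"] > 7,
        "externalizing": scores["externalizing"] > 7,
    }
    return scores
```

For example, a child rated 0 on every item would receive a total of 0 and no risk flags, whereas a child rated 2 on every item would receive the maximum total of 34 and be flagged on all subscales.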
Sample
A cluster sampling method was used in this study. Data were collected from children in nine public schools and child development centers in two states (California and South Carolina) involved in a funded grant project investigating universal screening. Parents provided consent for participation in the project, although the method of consent varied by site: California sites asked parents to give active informed consent, while South Carolina sites used passive consent procedures. Both sites sent hard copies of the forms home with children along with their classroom work.
The children were nested within 91 classrooms. Hard copies of the PSC-17 forms were sent home for parents to complete. Forms were distributed in Spanish or English, depending on the home language. Parents were not compensated for participation; however, small incentives (e.g., pencils and stickers) were given to children. Families were not given individual feedback but were encouraged to contact the project investigators with any questions or for individualized information. We had bilingual project staff to assist Hispanic families with questions and concerns.
PSC-17 parent ratings were combined across three academic years (2016–17, 2017–18, 2018–19). The sample consisted of 1,305 ratings of children aged 3 to 6. The sample size is adequate given the recommended range of 300 to 460 cases, considering the number of indicators and factors, the magnitude of factor loadings and path coefficients, and the amount of missing data (Wolf et al., 2013). Institutional Review Board approval and informed consent were obtained before data collection, and ethical treatment of subjects was followed during data collection and analysis procedures.
Female children (n = 591, 47.1%) and male children (n = 664, 52.9%) were roughly evenly distributed in the sample. The children rated by parents were predominantly Hispanic (53.6%), followed by White (30.6%), African American (12.0%), and other racial groups, including Asian American, Pacific Islander/Native Hawaiian, American Indian/Alaska Native, and multiracial backgrounds (3.8%). The sample included children from different grade levels: pre-kindergarten (50.1%), 5-year-old kindergarten (40.4%), and Grade 1 (9.5%). Grade level was coded as a dichotomous variable: approximately 50.1% of sampled children (n = 618) were from pre-kindergarten, and 49.9% (n = 615) were from kindergarten to Grade 1.
Statistical Analysis
All analyses were conducted using Mplus 8.4 software (Muthén & Muthén, 1998–2015). The weighted least squares with mean and variance adjusted (WLSMV) estimation method was chosen to accommodate the categorical nature of the PSC-17 data and to address non-normality in the data (Finney & DiStefano, 2013); this method is the Mplus default for categorical data (Muthén & Muthén, 1998–2015). Missing data ranged from 0.3% to 2.9% across the 17 items. Given the low proportion of missing data, pairwise deletion was used, meaning that cases were excluded only when they had missing data on the variables involved in a given computation. This approach is recommended for categorical data analysis as it maximizes case inclusion (DiStefano et al., 2017). Additionally, the nested structure of the data (i.e., students nested within classrooms) was accounted for using a design effect to provide more accurate standard errors for parameter estimates (Stapleton, 2013).
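As a rough illustration of the design-effect idea (not the exact adjustment Mplus applies), the inflation of standard errors due to clustering is often approximated as deff = 1 + (average cluster size - 1) × ICC. The sketch below uses this study’s sample sizes (1,305 children in 91 classrooms); the intraclass correlation (ICC) of 0.05 is a hypothetical value for illustration only.

```python
import math

def design_effect(n_total, n_clusters, icc):
    """Approximate design effect for cluster-sampled data:
    deff = 1 + (average cluster size - 1) * ICC."""
    avg_cluster_size = n_total / n_clusters
    return 1 + (avg_cluster_size - 1) * icc

# 1,305 children nested in 91 classrooms, as in this study;
# the ICC of 0.05 is a hypothetical value for illustration.
deff = design_effect(1305, 91, icc=0.05)
se_inflation = math.sqrt(deff)  # factor by which naive SEs would be understated
```

The square root of the design effect gives the factor by which standard errors computed under simple random sampling would understate the true sampling variability.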
As the three-factor structure of the PSC-17 in school settings had been previously identified, we did not perform exploratory factor analysis. Confirmatory factor analysis (CFA) was conducted to test whether the three-factor solution was appropriate in the current study, consistent with previously established CFA models for teacher ratings of the PSC-17 in school settings (DiStefano et al., 2017; Liu et al., 2020b; Gao et al., 2022) and for the parent-rated PSC-17 in clinical settings (Chaffin et al., 2017; Murphy et al., 2016; Stoppelbein et al., 2012). The proposed model comprises three intercorrelated subscales: Internalizing Problems, Attention Problems, and Externalizing Problems.
To identify uniform differential item functioning (DIF) items, a multiple indicators multiple causes (MIMIC) model, defined as a CFA model with covariates (Brown, 2015), was used. This model comprises a measurement model defining the relationship between indicators and latent variables (established at the CFA stage) and a structural model specifying the direct effects of covariates on item responses and latent factors (Jöreskog & Sörbom, 1996). MIMIC modeling assesses measurement invariance by allowing direct paths from grouping variables to observed variables (Kim et al., 2012). Uniform DIF occurs when the focal group consistently performs differently from the reference group after controlling for the level of the latent trait (Scott et al., 2009). Potential uniform DIF items in the parent-rated PSC-17 were identified using children’s gender (0 = male, 1 = female), grade level (3 K to 4 K = 0, 5 K to 1st grade = 1), and three race/ethnicity indicators, African American (0 = no, 1 = yes), Hispanic (0 = no, 1 = yes), and other racial/ethnic groups (0 = no, 1 = yes), as covariates. Male children, children in 3 K to 4 K, and White children served as the reference categories. Item responses were regressed onto the grouping variables to determine whether members of different groups varied in the probability of endorsing an item response option after controlling for their level on the latent variables (Finch, 2005). MIMIC models were also used to detect latent trait group differences by regressing the latent traits onto the covariates while assuming that the hypothesized structure was invariant across groups (Green & Thompson, 2012). If the invariance test identified items exhibiting uniform DIF, MIMIC modeling was used to examine whether latent mean comparisons across groups might be biased.
Following the approach used by Kim et al. (2012), a baseline MIMIC model was first constructed. In this baseline model, all latent variables identified in the CFA were simultaneously regressed on all covariates, with no PSC-17 items regressed on covariates (i.e., the direct effects of covariates on all items’ difficulty were constrained to zero). Next, the baseline model was compared with a series of relaxed MIMIC models, in each of which one direct effect of a covariate on a PSC-17 item was added (i.e., freely estimated) while retaining the covariates’ direct effects on each latent variable. To compare these nested models, the WLSMV model chi-square difference test was conducted using the Mplus DIFFTEST feature. A significant chi-square difference with one degree of freedom between the baseline model and the less constrained model would indicate uniform DIF for the given item.
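The nested-model comparison above can be sketched in Python. Note that under WLSMV the chi-square values cannot simply be subtracted (Mplus’s DIFFTEST computes an adjusted difference), so this sketch only illustrates the generic one-degree-of-freedom chi-square difference test; the chi-square values and function names are hypothetical.

```python
import math

def chi2_sf_1df(x):
    """p-value (survival function) of a chi-square statistic with 1 df:
    P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))

def diff_test(chi2_constrained, chi2_relaxed, alpha=0.05):
    """Generic nested-model chi-square difference test for models differing
    by one degree of freedom. Under WLSMV the raw values would first need
    the DIFFTEST adjustment; this sketch shows the logic only."""
    delta = chi2_constrained - chi2_relaxed
    p = chi2_sf_1df(delta)
    return delta, p, p < alpha

# Hypothetical chi-square values for a baseline vs. one relaxed model:
delta, p, flags_dif = diff_test(650.0, 640.0)
```

A significant result (p below alpha) would flag the freed item-covariate path, i.e., uniform DIF for that item, before any Type I error adjustment is applied.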
As high Type I error rates have been reported when MIMIC modeling is used to identify noninvariant variables (Kim et al., 2012), the Oort adjustment to the chi-square difference test was used to control Type I error inflation (i.e., false identification of uniform DIF for invariant items in the model comparison; Kim et al., 2012; Oort, 1998). Oort’s formula, \(K' = [\chi_0^2/(K + df_0 - 1)] \times K\), adjusts the critical chi-square value to account for potential model misspecification in the baseline MIMIC model when analyzing categorical items. In the formula, K′ is the adjusted critical value for the chi-square difference test, and K is the unadjusted critical value (e.g., 3.84 for 1 df at the 0.05 level of significance); χ0² is the chi-square value for the baseline model, and df0 is the degrees of freedom for the baseline model. This method not only helps control Type I error rates at or below the nominal level but also maintains high power across different study conditions (Kim et al., 2012).
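Applying Oort’s formula to this study’s baseline MIMIC model (χ0² = 682.006, df0 = 186, reported in Table 2) reproduces the adjusted critical value used in the Results. A minimal sketch, with a function name of our own:

```python
def oort_adjusted_critical(chi2_baseline, df_baseline, critical=3.84):
    """Oort-adjusted critical value for a 1-df chi-square difference test:
    K' = [chi2_0 / (K + df_0 - 1)] * K, where K is the unadjusted
    critical value (3.84 for 1 df at alpha = 0.05)."""
    return (chi2_baseline / (critical + df_baseline - 1)) * critical

# Values from this study's baseline MIMIC model (Table 2):
k_adj = oort_adjusted_critical(682.006, 186)  # ~13.87, matching the 13.86 used in the Results
```

Note that when the baseline model fits perfectly by this criterion (χ0² = K + df0 - 1), the adjustment leaves the critical value unchanged; the worse the baseline fit, the larger the adjusted threshold becomes.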
If uniform DIF items were identified, the effects of gender, grade level, and race/ethnicity on the latent factor means were examined using MIMIC models to determine if latent factor means exhibit bias. The first MIMIC model was the baseline MIMIC model which only includes direct paths from covariates on latent factors. The second MIMIC model was the model in which the direct effects of the covariates on the identified DIF items, as well as latent factors, were included.
The CFA and MIMIC models were evaluated using the following indices commonly used with categorical data analysis (Finney & DiStefano, 2013): (a) the chi-square statistic, (b) the comparative fit index (CFI), (c) the root mean squared error of approximation (RMSEA), and (d) the standardized root mean square residual (SRMR). As chi-square statistics are sensitive to sample size, the values were reported for model comparison purposes only. CFI ≥ 0.90, RMSEA ≤ 0.08, and SRMR ≤ 0.10 indicated acceptable model fit; CFI ≥ 0.95, RMSEA ≤ 0.05, and SRMR ≤ 0.08 suggested good fit (Hu & Bentler, 1999). In addition, the 90% confidence interval (CI) around the RMSEA point estimate should contain 0.05 to indicate a possible close fit (Browne & Cudeck, 1993).
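The cutoffs above can be written as a simple screening helper. The thresholds are those from Hu and Bentler (1999) as cited here; the function name and the three-way labels are our own.

```python
def classify_fit(cfi, rmsea, srmr):
    """Classify global model fit using the cutoffs cited in the text:
    good:       CFI >= .95, RMSEA <= .05, SRMR <= .08
    acceptable: CFI >= .90, RMSEA <= .08, SRMR <= .10
    poor:       otherwise."""
    if cfi >= 0.95 and rmsea <= 0.05 and srmr <= 0.08:
        return "good"
    if cfi >= 0.90 and rmsea <= 0.08 and srmr <= 0.10:
        return "acceptable"
    return "poor"

# The 3-factor CFA reported in Table 2: CFI = .948, RMSEA = .054, SRMR = .077
fit = classify_fit(0.948, 0.054, 0.077)  # "acceptable"
```

Applying the same helper to the single-factor model in Table 2 (CFI = .854, RMSEA = .089, SRMR = .124) returns "poor", matching the conclusion drawn in the Results.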
Beyond global fit, local fit was examined, as global model fit can be affected by the magnitude of the factor loadings and the number of items (Greiff & Heene, 2017; McNeish et al., 2018). Residual values, which show the difference between the observed and model-implied covariances, were evaluated; large standardized residuals (e.g., absolute values greater than 3.0) suggest possible local model misfit (Raykov & Marcoulides, 2012). The interpretability of parameter estimates was also examined.
Results
Descriptive Statistics
Chi-square tests were conducted to examine the association between the PSC-17 item responses and children’s gender, grade, and race/ethnicity. Applying a Bonferroni correction to adjust the alpha level for these comparisons (0.05/17 ≈ 0.003), results revealed significant gender differences in parental ratings for three of the five items in Attention Problems and four of the seven items in Externalizing Problems (p < 0.001). Preschool children were rated differently from 5 K to Grade 1 students on two items in Internalizing Problems and one item in Attention Problems (p < 0.001). Furthermore, parent ratings varied among children from different racial backgrounds on one item within Internalizing Problems, three within Attention Problems, and two within Externalizing Problems (p < 0.001).
T-tests were conducted to examine differences in subscale scores (i.e., sum scores of the items on each scale) across grade levels and genders. In addition, an ANOVA was performed to assess variation in subscale scores across racial groups. The Bonferroni-adjusted alpha level for these group comparisons was set at 0.017. T-test results indicated that parents rated 5 K children and boys as exhibiting a higher risk level of Internalizing Problems. The ANOVA revealed differences in parental ratings among children from different racial groups. Specifically, post-hoc analysis using the Tukey method suggested that Hispanic children were rated as having a higher risk level of Attention Problems and Externalizing Problems compared to White children. Taken together, these findings underscore the impact of gender, grade level, and race/ethnicity on parent ratings of children’s psychosocial functioning as assessed by the PSC-17 (Table 1).
Table 1
Parent-rated PSC-17 items and subscales by gender, grade and race/ethnicity
Items
Grade Level
Gender
Race/Ethnicity
χ2/p-value
χ2/p-value
χ2/p-value
Internalizing Problems
Item2: Feels sad, unhappy
35.91(< .001)
Item9: Is down on him or herself
23.82(< .001)
Item15: Worries a lot
21.60(< .001)
Attention Problems
52.95(< .001)
Item 1: Fidgety, unable to sit still
26.67(< .001)
75.31(< .001)
Item3: Daydreams too much
15.04(< .001)
Item7: Has trouble concentrating
38.20(< .001)
51.13(< .001)
Item13: Acts as if driven by a motor
60.20(< .001)
Item 17: Distracted easily
31.52(< .001)
Externalizing Problems
Item4: Refuses to share
14.27(< .001)
Item5: Does not understand other people’s feelings
23.09(< .001)
Item8: Fights with other children
15.75(< .001)
Item12: Does not listen to rules
23.26(< .001)
Item14: Teases others
16.45(< .001)
Item16: Takes things that do not belong to him or her
41.43(< .001)
Subscales
F value/p-value
F value/p-value
F value/p-value
Internalizing Problems
18.22 (< .001)
8.621 (.003)
Attention Problems
3.90(.009)
Externalizing Problems
4.48(.004)
Only significant results were reported
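The Bonferroni adjustments used for the item-level and subscale-level comparisons above amount to dividing the nominal alpha by the number of tests; a minimal sketch:

```python
def bonferroni_alpha(alpha, n_tests):
    """Bonferroni-adjusted per-comparison alpha: alpha / number of tests."""
    return alpha / n_tests

# 17 item-level chi-square tests and 3 subscale-level tests, as in this study:
item_alpha = bonferroni_alpha(0.05, 17)     # ~0.0029
subscale_alpha = bonferroni_alpha(0.05, 3)  # ~0.017
```

Because all of the item-level results flagged above reached p < 0.001, they remain significant under the adjusted per-comparison alpha.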
Confirmatory Factor Analysis
Single-factor and three-factor models were tested to evaluate the adequacy of the hypothesized three-factor solution for PSC-17 item scores. The single-factor model allowed all items to load on one latent factor. In the three-factor confirmatory factor analysis (CFA) model, each item was freely estimated to load on one of the three factors, and all factors were intercorrelated. Model fit indices are presented in Table 2.
Table 2
Fit statistics of estimated CFA and MIMIC models for parent-rated PSC-17
| Models | χ² | df | CFI | RMSEA (90% CI) | SRMR |
|---|---|---|---|---|---|
| Single-factor CFA model | 1302.676 | 119 | 0.854 | 0.089 (0.085–0.094) | 0.124 |
| 3-factor CFA model | 540.308 | 116 | 0.948 | 0.054 (0.050–0.059) | 0.077 |
| Baseline MIMIC model (without DIF items) | 682.006 | 186 | 0.930 | 0.047 (0.044–0.051) | 0.075 |
| Final MIMIC model (with DIF items) | 599.635 | 182 | 0.941 | 0.044 (0.040–0.048) | 0.070 |
χ2 Chi-square test statistic, df degree of freedom, CFI comparative fit index, RMSEA root mean squared error of approximation, SRMR standardized root mean square residual, CI confidence interval
Based on the criteria of model fit, small residuals, and interpretability of parameter estimates, the single-factor model exhibited poor fit, as all fit indices fell outside the recommended boundaries (χ2(119) = 1302.676, RMSEA = 0.089, 90% CI [0.085, 0.094]; CFI = 0.854; SRMR = 0.124); this model was not considered further. The three-factor solution yielded an acceptable fit, with all fit indices within the recommended bounds (χ2(116) = 540.308, RMSEA = 0.054, 90% CI [0.050, 0.059]; CFI = 0.948; SRMR = 0.077). Figure 1 shows a path diagram of the 3-factor model.
Standardized factor loadings indicated positive and significant associations of all items with their stated factors. The loading values ranged from 0.45 to 0.89, indicating that each subscale sufficiently captured the variance of the related items. All latent factors were significantly correlated: 0.41 (Attention Problems with Internalizing Problems), 0.57 (Externalizing Problems with Internalizing Problems), and 0.66 (Externalizing Problems with Attention Problems), suggesting that children at a higher risk level in one of the three SEB dimensions were more likely to develop problems in the other two. Standardized parameter estimates are provided in Table 3.
Table 3
PSC-17 3-factor CFA results: standardized estimates
| Items | Factor | Standardized Loading |
|---|---|---|
| Item2: Feels sad, unhappy | Internalizing Problems | 0.65 |
| Item6: Feels hopeless | Internalizing Problems | 0.76 |
| Item9: Is down on him/herself | Internalizing Problems | 0.82 |
| Item15: Worries a lot | Internalizing Problems | 0.66 |
| Item11: Seems to be having less fun | Internalizing Problems | 0.70 |
| Item3: Daydreams too much | Attention Problems | 0.59 |
| Item1: Fidgety, is unable to sit still | Attention Problems | 0.82 |
| Item17: Distracted easily | Attention Problems | 0.85 |
| Item7: Has trouble concentrating | Attention Problems | 0.89 |
| Item13: Acts as if driven by a motor | Attention Problems | 0.45 |
| Item8: Fights with other children | Externalizing Problems | 0.73 |
| Item12: Does not listen to rules | Externalizing Problems | 0.76 |
| Item5: Does not understand other people’s feelings | Externalizing Problems | 0.52 |
| Item14: Teases others | Externalizing Problems | 0.72 |
| Item10: Blames others for his/her troubles | Externalizing Problems | 0.70 |
| Item4: Refuses to share | Externalizing Problems | 0.67 |
| Item16: Takes things that do not belong to him/her | Externalizing Problems | 0.74 |
The coefficient omega values for Internalizing Problems, Attention Problems, and Externalizing Problems were 0.87, 0.82, and 0.86, respectively, suggesting high internal consistency for the parent responses.
MIMIC Models
The baseline model is illustrated in Fig. 2, excluding the bolded path to Item 9. This baseline model showed a significant model chi-square value (χ2(186) = 682.006, p < 0.001); the other fit indices indicated adequate model fit, with all values within the recommended bounds (RMSEA = 0.047, 90% CI [0.044, 0.051]; CFI = 0.930; SRMR = 0.075). Subsequently, the baseline MIMIC model was compared to multiple, less constrained MIMIC models (see Fig. 2, where the bolded path to Item 9 serves as an example).
The WLSMV chi-square difference test was conducted for each model comparison, using an adjusted critical chi-square value of 13.86 for a nominal alpha of 0.05. As a result, four PSC-17 items with uniform DIF related to children’s racial background were identified. After identifying these items, the baseline MIMIC model was revised by simultaneously adding the direct effects of race/ethnicity on the four DIF items. The final MIMIC model with DIF items yielded a good model fit (χ2(182) = 599.635, RMSEA = 0.044, 90% CI [0.040, 0.048]; CFI = 0.941; SRMR = 0.070). Fit statistics for both the baseline MIMIC model and the final MIMIC model with DIF items are shown in Table 2.
The final MIMIC model indicated uniform DIF for four PSC-17 items across racial/ethnic groups. The unstandardized parameter estimates for the direct effects of race/ethnicity on the items showing DIF are shown in Table 4. Unstandardized path coefficients were reported, as they are preferred when covariates are categorical (Kline, 2023). Parents were more likely to rate African American children higher than White children on Item 14 (“Teases others”) in Externalizing Problems. Compared to White children, Hispanic children were more likely to be rated lower by parents on Item 1 (“Fidgety, unable to sit still”) and Item 7 (“Has trouble concentrating”) but higher on Item 13 (“Acts as if driven by a motor”). No significant relationship was found between the PSC-17 items and children’s gender or grade level, indicating that the parent-rated PSC-17 functioned equivalently across gender and grade level.
Table 4
Parameter estimates for the direct effect of race/ethnicity on PSC-17 items with DIF
| DIF Items | Unstandardized b | SE |
|---|---|---|
| Effect of African American race/ethnicity on PSC-17 items | | |
| Item14: Teases others | 0.43*** | 0.10 |
| Effect of Hispanic race/ethnicity on PSC-17 items | | |
| Item1: Fidgety, unable to sit still | −0.45*** | 0.07 |
| Item7: Has trouble concentrating | −0.17* | 0.07 |
| Item13: Acts as if driven by a motor | 0.78*** | 0.08 |
The reference group for race/ethnicity was White
The associations between the covariates (i.e., gender, grade level, and race/ethnicity) and the means of the three latent factors (i.e., Internalizing Problems, Attention Problems, and Externalizing Problems) were examined to determine the extent to which group comparisons at the latent factor level might be biased by items with DIF. The estimated direct effects of the covariates on the three latent factors from the baseline MIMIC model (i.e., the model without the direct item effects) and those from the final MIMIC model (i.e., the model with the four DIF items included) were compared, as shown in Table 5. The comparison showed little difference between the two models, indicating that the identified uniform DIF of the four PSC-17 items did not substantially affect the relationships between the covariates and the latent factor means. Hence, the latent mean differences can be interpreted as true group differences.
Table 5
Summary of relations between covariates and PSC-17 latent factors

                            Baseline MIMIC model       Final MIMIC model
                            (without DIF items)        (with DIF items)
Covariates                  b           SE             b           SE
Internalizing Problems
  Gender                    −0.07       0.06           −0.07       0.06
  Grade level               0.22**      0.04           0.22**      0.08
  Race/African American     −0.46**     0.14           −0.46**     0.14
  Race/Hispanic             −0.21*      0.10           −0.21*      0.10
  Race/Other                −0.18       0.18           −0.18       0.18
Attention Problems
  Gender                    −0.36***    0.06           −0.36***    0.06
  Grade level               0.18**      0.04           0.17**      0.06
  Race/African American     −0.17       0.11           −0.16       0.11
  Race/Hispanic             −0.30***    0.08           −0.25**     0.08
  Race/Other                0.10        0.15           0.10        0.15
Externalizing Problems
  Gender                    −0.21***    0.04           −0.21***    0.04
  Grade level               −0.10*      0.02           −0.10*      0.05
  Race/African American     −0.25**     0.09           −0.25**     0.09
  Race/Hispanic             −0.26***    0.05           −0.26***    0.05
  Race/Other                −0.08       0.14           −0.08       0.14

Note. The reference groups for race, gender, and grade level are White, male, and preschool children, respectively; b = unstandardized coefficient; DIF = differential item functioning
***p < 0.001; **p < 0.01; *p < 0.05
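The rationale for comparing the baseline and final models can be seen in a toy example: when a composite includes an item carrying uniform DIF, the apparent group gap shifts, while setting the DIF item aside recovers the trait-driven gap. The numbers below (a latent group difference of −0.2 and a DIF shift of +0.5) are hypothetical, and simple mean scores stand in for the latent factor means estimated in the MIMIC models.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000
group = rng.integers(0, 2, n)

# True trait difference of -0.2 for the focal group, plus one item
# carrying a uniform DIF shift of +0.5 (hypothetical values).
theta = rng.normal(-0.2 * group, 1.0)
dif_effects = np.array([0.5, 0.0, 0.0, 0.0])
items = (0.8 * theta[:, None] + dif_effects * group[:, None]
         + rng.normal(0.0, 0.5, (n, 4)))

def group_gap(scores):
    return scores[group == 1].mean() - scores[group == 0].mean()

naive = group_gap(items.mean(axis=1))            # DIF item included
adjusted = group_gap(items[:, 1:].mean(axis=1))  # DIF item set aside
print(f"naive gap = {naive:.2f}, DIF-adjusted gap = {adjusted:.2f}")
```

In this sketch the DIF shift pulls the naive gap toward zero, masking part of the true difference; accounting for the DIF item, as the final MIMIC model does via direct effects, restores the trait-driven gap.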
Boys exhibited more Attention and Externalizing Problems than girls (p < 0.001). However, boys and girls did not significantly differ in mean Internalizing Problems. Preschool children had significantly lower means on Internalizing Problems (p < 0.01) and Attention Problems (p < 0.01) but higher means on Externalizing Problems (p < 0.05) than children in 5K and Grade 1. Relative to White children, African American children had significantly lower means on Internalizing Problems (p < 0.01) and Externalizing Problems (p < 0.01); however, no differences in Attention Problems were identified between the two groups. Compared with White children, Hispanic children had lower mean scores on Internalizing Problems (p < 0.05), Attention Problems (p < 0.001), and Externalizing Problems (p < 0.001). Children from other racial/ethnic groups did not differ significantly from White children in their behavioral problems.
Discussion
To better understand children’s SEB problems, caregivers of children, including teachers and parents, need reliable and valid screening instruments that function consistently across conditions. The PSC-17 has been used as a screening instrument for assessing children's psychosocial functioning, yet prior work has largely relied on parent ratings in clinical settings rather than school settings. When the PSC-17 is used in school settings for universal screening, parents are often involved because different informants provide unique perspectives on a child's behavior (Achenbach, 2006; Ferdinand et al., 2003) and help provide a comprehensive view of students' psychosocial functioning across contexts (Hunsley & Mash, 2007). As the parent-rated PSC-17 comes into use in school settings, it is important to identify item-level bias, or DIF, which occurs when a given item functions differently across subgroups independent of the level of the construct being measured. Hence, the goal of this study was to identify DIF items on the parent-rated PSC-17 as a function of children’s gender, grade level, and race/ethnicity.
Factor Structure of the Parent Ratings of the PSC-17
The CFA results confirmed the underlying 3-factor structure of the PSC-17 (i.e., Internalizing Problems, Externalizing Problems, Attention Problems) identified in previous studies using the PSC-17 in school settings (DiStefano et al., 2017; Liu et al., 2020b; Gao et al., 2022) and clinical settings (Chaffin et al., 2017; Murphy et al., 2016; Stoppelbein et al., 2012; Wagner et al., 2015). These findings support the multidimensionality of the parent-rated PSC-17. Therefore, school practitioners should move beyond the traditional classification of children as simply normal or abnormal in their psychosocial functioning. Instead, they can use the PSC-17 to assess specific areas of children’s psychosocial functioning, such as internalizing, externalizing, and attention problems, and target interventions that appropriately address these specific areas.
Uniform DIF as a Function of Gender, Race/Ethnicity, and Grade Level
All parent-rated PSC-17 items performed consistently across child gender and grade level, indicating that school practitioners can use the parent-rated PSC-17 to compare children’s psychosocial functioning across these demographic groups. However, 4 of the 17 items exhibited DIF across racial/ethnic groups. Specifically, parents of African American children were more likely to choose higher response options for Item 14 (“Teases others”), a finding consistent with Studts et al. (2017). Compared to parents of White children, parents of Hispanic children were more inclined to select higher responses for Item 13 (“Acts as if driven by a motor”) but lower responses for Item 1 (“Fidgety, unable to sit still”) and Item 7 (“Has trouble concentrating”). These results suggest that scores on these four items do not carry the same meaning for parents of White children and parents of minority children. According to Wainer (1995), the presence of DIF items in a scale can affect scale-level measurement in different ways. Therefore, when school practitioners implement universal screening with the parent-rated PSC-17, they should consider revising or deleting these DIF items to enhance the sensitivity and specificity of identifying behavioral problems among diverse young children. In particular, school practitioners should be cautious when making comparisons across children from different racial/ethnic groups at the item level.
Latent Mean Differences Across Gender, Race/Ethnicity, and Grade Level
The DIF items did not affect the latent mean differences across child gender, grade level, or race/ethnicity. This might be because the uniform DIF of individual items did not consistently favor one group over the others (Wiesner et al., 2015). The findings indicated that the DIF items did not bias the scale-level measurement of children's psychosocial functioning with the parent-rated PSC-17. Thus, school practitioners can use the parent-rated PSC-17 to compare differences in children’s internalizing, externalizing, and attention problems across gender, grade level, and racial/ethnic groups.
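The cancellation argument can be made concrete with a toy example: if two items carry uniform DIF of equal size in opposite directions, the shifts offset at the composite level. This numpy sketch uses hypothetical values (±0.3 DIF shifts, four items, a raw sum score in place of latent means) purely to illustrate the mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
group = rng.integers(0, 2, n)
theta = rng.normal(0.0, 1.0, n)  # equal trait means across groups

# Four items; two carry uniform DIF in opposite directions for the
# focal group (+0.3 and -0.3), two are DIF-free (hypothetical values).
dif_effects = np.array([0.3, -0.3, 0.0, 0.0])
items = (0.8 * theta[:, None]
         + dif_effects * group[:, None]
         + rng.normal(0.0, 0.5, (n, 4)))

# Composite (sum) score: the opposite-signed DIF shifts offset, so the
# group gap stays close to the true trait gap of zero.
scale = items.sum(axis=1)
gap = scale[group == 1].mean() - scale[group == 0].mean()
print(f"composite-score group gap = {gap:.2f}")
```

When the DIF effects do not consistently favor one group, as in this sketch and in the present results, the scale-level comparison remains essentially unbiased.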
Latent mean comparisons across gender showed that boys exhibited a higher risk of attention and externalizing problems than girls, which is consistent with previous findings (Liu et al., 2020b; Chi & Cui, 2020; Nigg & Nikolas, 2008; Reid et al., 2000). This finding indicates that schools and families should provide extra support to male students in managing attention and externalizing behaviors. In terms of racial/ethnic group variation, African American children demonstrated a lower risk of internalizing and externalizing problems than White children. Hispanic children also showed lower risk levels of internalizing, attention, and externalizing problems compared to White children. These findings are supported by previous studies indicating that minority parents are less likely to identify their children’s maladjustment (Roberts et al., 2005) and that White parents are more sensitive to their children’s maladjustment than minority parents (Lau et al., 2004). As parents play a crucial role in helping children develop behavioral skills at home, schools could organize training sessions for parents of diverse racial/ethnic groups on how to effectively identify and support their children’s psychosocial functioning. Children in 5K and the first grade exhibited a higher risk for internalizing and attention problems but a lower risk for externalizing problems relative to preschoolers. This finding aligns with previous studies showing that internalizing problems tend to increase as children age (Achenbach et al., 1991; Gilliom & Shaw, 2004) and that externalizing problems decrease throughout childhood for most children (Fanti & Henrich, 2007; Shaw et al., 2003). Regarding attention problems, the finding is consistent with previous results showing that ADHD symptoms that develop during the preschool years become increasingly impairing during elementary school (APA, 2013).
These findings imply that schools should provide universal screening for children at the preschool stage for early identification and intervention to reduce the possibility of further development of psychosocial difficulties at an older age.
Limitations
The present study has several limitations. It was conducted with a sample of children from pre-K to the first grade in public schools in two US states. The findings may not generalize to clinical samples, older children, or children from other geographical regions. Hispanic children were overrepresented in the current sample (53.6% of participants). Large national studies with children from diverse backgrounds in different parts of the country should be conducted to determine whether the present results can be replicated. Additionally, the current study examined uniform DIF in the PSC-17 as a function of student gender, race/ethnicity, and grade level; future research should consider factors such as socioeconomic status and language background for a more comprehensive view.
Conclusion
The present study confirmed the 3-factor structure of the PSC-17 (i.e., Internalizing Problems, Externalizing Problems, Attention Problems) when rated by parents in school settings. The findings supported the scale’s underlying multifactor structure with parent raters and informed researchers and school practitioners that the structure of the PSC-17 matched the underlying theory. While four items exhibited uniform DIF, this had little impact on latent mean differences in internalizing, attention, and externalizing problems. The findings imply that practitioners can use parent ratings of the PSC-17 to compare children’s internalizing, externalizing, and attention problems across gender, race/ethnicity, and grade levels. In addition, they can compare children across gender and grade-level groups at the item level. Three items on Attention Problems (i.e., “Acts as if driven by a motor”, “Fidgety, unable to sit still”, “Has trouble concentrating”) and one item on Externalizing Problems (i.e., “Teases others”) varied in the degree to which they measured children’s attention and externalizing problems across racial/ethnic groups. Overall, the findings support the use of the PSC-17 as a reliable screening instrument for schools within the MTSS framework to help parents assess SEB problems in preschool-aged and lower-grade children. As four PSC-17 items exhibited statistically uniform DIF as a function of race/ethnicity but not of gender or grade level, school practitioners should be cautious when making comparisons across racial/ethnic groups at the item level, particularly for the items exhibiting DIF.
Acknowledgements
The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305A150152 to the University of South Carolina. The opinions expressed are those of the authors and do not represent the views of the Institute or the U.S. Department of Education.
Declarations
Conflict of Interest
Ruiqin Gao, Christine Distefano, Jin Liu, Ning Jiang, Fred Greer, Erin Dowdy declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Authors
Ruiqin Gao
Christine Distefano
Jin Liu
Ning Jiang
Fred Greer
Erin Dowdy
Achenbach, T. M. (2006). As others see us: Clinical and research implications of cross-informant correlations for psychopathology. Current Directions in Psychological Science, 15(2), 94–98. https://doi.org/10.1111/j.0963-7214.2006.00414.x
Achenbach, T. M., Howell, C. T., Quay, H. C., Conners, C. K., & Bates, J. E. (1991). National survey of problems and competencies among four- to sixteen-year-olds: Parents' reports for normative and clinical samples. Monographs of the Society for Research in Child Development, i–130. https://doi.org/10.2307/1166156
Bergmann, P., Lucke, C., Nguyen, T., Jellinek, M., & Murphy, J. M. (2018). Identification and utility of a short form of the Pediatric Symptom Checklist-Youth self-report (PSC-17-Y). European Journal of Psychological Assessment. https://doi.org/10.1027/1015-5759/a000486
Blucker, R. T., Jackson, D., Gillaspy, J. A., Hale, J., Wolraich, M., & Gillaspy, S. R. (2014). Pediatric behavioral health screening in primary care: A preliminary analysis of the pediatric symptom checklist-17 with functional impairment items. Clinical Pediatrics, 53(5), 449–455. https://doi.org/10.1177/0009922814527498
Bruhn, A. L., Woods-Groves, S., & Huddle, S. (2014). A preliminary investigation of emotional and behavioral screening practices in K–12 schools. Education and Treatment of Children, 37(4), 611–634. https://doi.org/10.1353/etc.2014.0039
Chaffin, M., Campbell, C., Whitworth, D. N., Gillaspy, S. R., Bard, D., Bonner, B. L., & Wolraich, M. L. (2017). Accuracy of a pediatric behavioral health screener to detect untreated behavioral health problems in primary care settings. Clinical Pediatrics, 56(5), 427–434. https://doi.org/10.1177/0009922816678412
Chi, X., & Cui, X. (2020). Externalizing problem behaviors among adolescents in a southern city of China: Gender differences in prevalence and correlates. Children and Youth Services Review, 119, 105632. https://doi.org/10.1016/j.childyouth.2020.105632
De Los Reyes, A., & Kazdin, A. E. (2005). Informant discrepancies in the assessment of childhood psychopathology: A critical review, theoretical framework, and recommendations for further study. Psychological Bulletin, 131(4), 483. https://doi.org/10.1037/0033-2909.131.4.483
De Los Reyes, A., Augenstein, T. M., Wang, M., Thomas, S. A., Drabick, D. A., Burgers, D. E., & Rabinowitz, J. (2015). The validity of the multi-informant approach to assessing child and adolescent mental health. Psychological Bulletin, 141(4), 858. https://doi.org/10.1037/a0038498
Denham, S. A., & Brown, C. (2010). “Plays nice with others”: Social–emotional learning and academic success. Early Education and Development, 21(5), 652–680. https://doi.org/10.1080/10409289.2010.497450
DiStefano, C., Liu, J., & Burgess, Y. (2017). Investigating the structure of the pediatric symptoms checklist in the preschool setting. Journal of Psychoeducational Assessment, 35(5), 494–505. https://doi.org/10.1177/0734282916647648
Dowdy, E., Furlong, M., Eklund, K., Saeki, E., & Ritchey, K. (2010). Screening for mental health and wellness: Current school-based practices and emerging possibilities. Handbook of Youth Prevention Science, pp. 70–95. https://doi.org/10.4324/9780203866412.ch4
Essex, M. J., Kraemer, H. C., Slattery, M. J., Burk, L. R., Thomas Boyce, W., Woodward, H. R., & Kupfer, D. J. (2009). Screening for childhood mental health problems: Outcomes and early identification. Journal of Child Psychology and Psychiatry, 50(5), 562–570. https://doi.org/10.1111/j.1469-7610.2008.02015.x
Fanti, K. A., & Henrich, C. C. (2007). The relation of home and childcare/school environment to differential trajectories of externalizing problems. International Journal About Parents in Education, 1, 117–123. https://doi.org/10.54195/ijpe.18257
Ferdinand, R. F., Hoogerheide, K. N., Van Der Ende, J., Visser, J. H., Koot, H. M., Kasius, M. C., & Verhulst, F. C. (2003). The role of the clinician: Three-year predictive value of parents’, teachers’, and clinicians’ judgment of childhood psychopathology. Journal of Child Psychology and Psychiatry, 44(6), 867–876. https://doi.org/10.1111/1469-7610.00171
Finch, H. (2005). The MIMIC model as a method for detecting DIF: Comparison with Mantel-Haenszel, SIBTEST, and the IRT likelihood ratio. Applied Psychological Measurement, 29(4), 278–295. https://doi.org/10.1177/0146621605275728
Finney, S., & DiStefano, C. (2013). Dealing with nonnormality and categorical data in structural equation modeling. A second course in structural equation modeling. Greenwich, CT: Information Age.
Gao, R., Raygoza, A., Distefano, C., Greer, F., & Dowdy, E. (2022). Assessing measurement equivalence of PSC-17 across teacher and parent respondents. School Psychology International, 43(5), 477–495. https://doi.org/10.1177/01430343221108874
Gardner, W., Murphy, J. M., Childs, G., Kelleher, K., Pagano, M., Jellinek, M., McInerny, T. K., Wasserman, R. C., Nutting, P., & Chiappetta, L. (1999). The PSC-17: A brief Pediatric Symptom Checklist with psychosocial problem subscales. A report from PROS and ASPN. Ambulatory Child Health, 5, 225–236.
Gardner, W., Lucas, A., Kolko, D. J., & Campo, J. V. (2007). Comparison of the PSC-17 and alternative mental health screens in an at-risk primary care sample. Journal of the American Academy of Child & Adolescent Psychiatry, 46(5), 611–618. https://doi.org/10.1097/chi.0b013e318032384b
Glover, T. A., & Albers, C. A. (2007). Considerations for evaluating universal screening assessments. Journal of School Psychology, 45(2), 117–135. https://doi.org/10.1016/j.jsp.2006.05.005
Greenwood, C. R., Bradfield, T., Kaminski, R., Linas, M., Carta, J. J., & Nylander, D. (2011). The response to intervention (RTI) approach in early childhood. Focus on Exceptional Children, 43(9), 1–22. https://doi.org/10.17161/fec.v43i9.6912
Greiff, S., & Heene, M. (2017). Why psychological assessment needs to start worrying about model fit. European Journal of Psychological Assessment, 33(5), 313–317. https://doi.org/10.1027/1015-5759/a000450
Halle, T. G., & Darling-Churchill, K. E. (2016). Review of measures of social and emotional development. Journal of Applied Developmental Psychology, 45, 8–18.
Hammer, D., Melhuish, E., & Howard, S. J. (2017). Do aspects of social, emotional and behavioural development in the pre-school period predict later cognitive and academic attainment? Australian Journal of Education, 61(3), 270–287. https://doi.org/10.1177/0004944117729514
Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. https://doi.org/10.1080/10705519909540118
Hunsley, J., & Mash, E. J. (2007). Evidence-based assessment. Annual Review of Clinical Psychology, 3, 29–51.
Hunter, L., Hoagwood, K. E., Evans, S., Weist, M., Smith, C., Paternite, C., & Horner, R. (2005). Working Together to Promote Academic Performance, Social and Emotional Learning, and Mental Health for All Children. New York, Columbia University.
Jones, D. E., Greenberg, M., & Crowley, M. (2015). Early social-emotional functioning and public health: The relationship between kindergarten social competence and future wellness. American Journal of Public Health, 105(11), 2283–2290. https://doi.org/10.2105/ajph.2015.302630
Jöreskog, K. G., & Sörbom, D. (1996). LISREL 8: User’s reference guide. Scientific Software International.
Kamphaus, R. W., & Reynolds, C. R. (2007). Behavior assessment system for children—second edition (BASC-2): Behavioral and Emotional Screening System (BESS). Bloomington, MN: Pearson.
Kettler, R. J., Glover, T. A., Albers, C. A., & Feeney-Kettler, K. A. (2014). An introduction to universal screening in educational settings. In R. J. Kettler, T. A. Glover, C. A. Albers, & K. A. Feeney-Kettler (Eds.), Universal screening in educational settings: Evidence-based decision making for schools (pp. 3–16). American Psychological Association.
Kilgus, S. P., Chafouleas, S. M., & Riley-Tillman, T. C. (2013). Development and initial validation of the Social and Academic Behavior Risk Screener for elementary grades. School Psychology Quarterly, 28(3), 210. https://doi.org/10.1037/spq0000024
Kim, E. S., Yoon, M., & Lee, T. (2012). Testing measurement invariance using MIMIC: Likelihood ratio test with a critical value adjustment. Educational and Psychological Measurement, 72(3), 469–492. https://doi.org/10.1177/0013164411427395
Kline, R. B. (2023). Principles and practice of structural equation modeling. Guilford Publications.
Kochel, K. P., Ladd, G. W., & Rudolph, K. D. (2012). Longitudinal associations among youth depressive symptoms, peer victimization, and low peer acceptance: An interpersonal process perspective. Child Development, 83(2), 637–650. https://doi.org/10.1016/j.ypsy.2012.08.064
Konold, T. R., Walthall, J. C., & Pianta, R. C. (2004). The behavior of child behavior ratings: Measurement structure of the Child Behavior Checklist across time, informants, and child gender. Behavioral Disorders, 29(4), 372–383. https://doi.org/10.1177/019874290402900405
Lane, K. L., Menzies, H. M., Oakes, W. P., & Kalberg, J. R. (2012). Systematic screenings of behavior to support instruction. Guilford.
Lane, K. L., & Menzies, H. M. (2003). A school-wide intervention with primary and secondary levels of support for elementary students: Outcomes and considerations. Education and Treatment of Children,26, 431–451.
Lau, A. S., Garland, A. F., Yeh, M., Mccabe, K. M., Wood, P. A., & Hough, R. L. (2004). Race/ethnicity and inter-informant agreement in assessing adolescent psychopathology. Journal of Emotional and Behavioral Disorders, 12(3), 145–156. https://doi.org/10.1177/10634266040120030201
Liu, J., DiStefano, C., Burgess, Y., & Wang, J. (2018). Pediatric symptom checklist-17: Testing measurement invariance of a higher-order factor model between boys and girls. European Journal of Psychological Assessment, 1(1), 1–7.
Liu, J., Burgess, Y., DiStefano, C., Pan, F., & Jiang, N. (2020a). Validating the pediatric symptoms checklist–17 in the preschool environment. Journal of Psychoeducational Assessment,38(4), 460–474. https://doi.org/10.1177/0734282919828234
Liu, J., Guo, S., Gao, R., & DiStefano, C. (2020b). Investigating school children’s behavioral and emotional problems using pediatric symptoms checklist-17 in a structural equation modeling framework. School Psychology International, 41(3), 257–275. https://doi.org/10.1177/0143034320912301
McNeish, D., An, J., & Hancock, G. R. (2018). The thorny relation between measurement quality and fit index cutoffs in latent variable models. Journal of Personality Assessment, 100(1), 43–52. https://doi.org/10.1080/00223891.2017.1281286
Murphy, J. M., Bergmann, P., Chiang, C., Sturner, R., Howard, B., Abel, M. R., & Jellinek, M. (2016). The PSC-17: Subscale scores, reliability, and factor structure in a new national sample. Pediatrics, 138(3), 1–8. https://doi.org/10.1542/peds.2016-0038
Muthén, L. K. & Muthén, B. O. (1998–2015). Mplus user’s guide. Seventh edition. Los Angeles, CA: Muthén & Muthén.
Nigg, J., & Nikolas, M. (2008). Attention-deficit/hyperactivity disorder. In T. P. Beauchaine & S. P. Hinshaw (Eds.), Child and adolescent psychopathology (pp. 301–334). John Wiley & Sons.
Olivier, E., Morin, A. J., Langlois, J., Tardif-Grenier, K., & Archambault, I. (2020). Internalizing and externalizing behavior problems and student engagement in elementary and secondary school students. Journal of Youth and Adolescence, 49(11), 2327–2346. https://doi.org/10.1007/s10964-020-01295-x
Olubiyi, O., Futterer, A., & Kang-Yi, C. D. (2019). Mental health care provided through community school models. The Journal of Mental Health Training, Education and Practice, 14(5), 297–314. https://doi.org/10.1108/jmhtep-01-2019-0006
Oort, F. J. (1998). Simulation study of item bias detection with restricted factor analysis. Structural Equation Modeling: A Multidisciplinary Journal, 5(2), 107–124. https://doi.org/10.1080/10705519809540095
Polderman, T. J. C., Boomsma, D. I., Bartels, M., Verhulst, F. C., & Huizink, A. C. (2010). A systematic review of prospective studies on attention problems and academic achievement. Acta Psychiatrica Scandinavica, 122(4), 271–284. https://doi.org/10.1111/j.1600-0447.2010.01568.x
Reid, R., Riccio, C. A., Kessler, R. H., Dupaul, G. J., Power, T. J., Anastopoulos, A. D., ... & Noll, M. B. (2000). Gender and ethnic differences in ADHD as assessed by behavior ratings. Journal of Emotional and Behavioral Disorders, 8(1), 38–48. https://doi.org/10.1177/106342660000800105
Roberts, R. E., Alegria, M., Roberts, C. R., & Chen, I. G. (2005). Concordance of reports of mental health functioning by adolescents and their caregivers: A comparison of European, African and Latino Americans. The Journal of Nervous and Mental Disease, 193(8), 528–534. https://doi.org/10.1097/01.nmd.0000172597.15314.cb
Sanchez, A. L., Cornacchio, D., Poznanski, B., Golik, A. M., Chou, T., & Comer, J. S. (2018). The effectiveness of school-based mental health services for elementary-aged children: A meta-analysis. Journal of the American Academy of Child & Adolescent Psychiatry, 57(3), 153–165. https://doi.org/10.1016/j.jaac.2017.11.022
Scott, N. W., Fayers, P. M., Aaronson, N. K., Bottomley, A., de Graeff, A., Groenvold, M., ... & Quality of Life Cross-Cultural Meta-Analysis Group. (2009). A simulation study provided sample size guidance for differential item functioning (DIF) studies using short scales. Journal of Clinical Epidemiology, 62(3), 288–295. https://doi.org/10.1016/j.jclinepi.2008.06.003
Severson, H. H., & Walker, H. M. (2002). Proactive approaches for identifying children at risk for sociobehavioral problems. Interventions for Children With or at Risk for Emotional and Behavioral Disorders, 1, 33–53.
Stapleton, L. M. (2013). Multilevel structural equation modeling with complex sample data. In G. R. Hancock & R. O. Mueller (Eds.), Quantitative methods in education and the behavioral sciences: Issues, research, and teaching. Structural equation modeling: A second course (p. 521–562). IAP Information Age Publishing.
Steed, E. A., & Banerjee, R. (2016). Assessment and early identification of young children with social emotional difficulties and behavioral challenges. Journal of Intellectual Disability-Diagnosis and Treatment, 3, 198–204. https://doi.org/10.6000/2292-2598.2015.03.04.5
Stoppelbein, L., Greening, L., Moll, G., Jordan, S., & Suozzi, A. (2012). Factor analyses of the Pediatric Symptom Checklist-17 with African-American and Caucasian pediatric populations. Journal of Pediatric Psychology, 37(3), 348–357. https://doi.org/10.1093/jpepsy/jsr103
Studts, C. R., Polaha, J., & van Zyl, M. A. (2017). Identifying unbiased items for screening preschoolers for disruptive behavior problems. Journal of Pediatric Psychology, 42(4), 476–486. https://doi.org/10.1093/jpepsy/jsw090
Green, S. B., & Thompson, M. S. (2012). A flexible structural equation modeling approach for analyzing means. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 393–416). Guilford Press.
Treyvaud, K., Doyle, L. W., Lee, K. J., Roberts, G., Lim, J., Inder, T. E., & Anderson, P. J. (2012). Social–emotional difficulties in very preterm and term 2 year olds predict specific social–emotional problems at the age of 5 years. Journal of Pediatric Psychology, 37(7), 779–785. https://doi.org/10.1093/jpepsy/jss042
Wagner, J. L., Guilfoyle, S. M., Rausch, J., & Modi, A. C. (2015). Psychometric validation of the Pediatric Symptom Checklist-17 in a pediatric population with epilepsy: A methods study. Epilepsy & Behavior, 51, 112–116. https://doi.org/10.1016/j.yebeh.2015.06.027
Wainer, H. (1995). Precision and differential item functioning on a testlet-based test: The 1991 Law School Admissions Test as an example. Applied Measurement in Education, 8(2), 157–186. https://doi.org/10.1207/s15324818ame0802_4
Walker, H. M., & Severson, H. H. (1992). Systematic screening for behavior disorders (SSBD). Longmont, CO: Sopris West.
Walker, B., Cheney, D., Stage, S., Blum, C., & Horner, R. H. (2005). Schoolwide screening and positive behavior supports: Identifying and supporting students at risk for school failure. Journal of Positive Behavior Interventions, 7(4), 194–204. https://doi.org/10.1177/10983007050070040101
Walker, H. M., & Shinn, M. R. (2002). Structuring school-based interventions to achieve integrated primary, secondary, and tertiary prevention goals for safe and effective schools. In M. R. Shinn, G. Stoner, & H. M. Walker (Eds.), Interventions for academic and behavior problems: Preventive and remedial approaches (pp. 1–26). Silver Spring, MD: National Association of School Psychologists.
Walker, H. M., Seeley, J. R., Small, J., Severson, H. H., Graham, B. A., Feil, E. G., ... & Forness, S. R. (2009). A randomized controlled trial of the First Step to Success early intervention: Demonstration of program efficacy outcomes in a diverse, urban school district. Journal of Emotional and Behavioral Disorders, 17(4), 197–212. https://doi.org/10.1177/1063426609341645
Waller, R., Hyde, L. W., Baskin-Sommers, A. R., & Olson, S. L. (2017). Interactions between callous unemotional behaviors and executive function in early childhood predict later aggression and lower peer-liking in late-childhood. Journal of Abnormal Child Psychology, 45(3), 597–609. https://doi.org/10.1007/s10802-016-0184-2
Wang, L., Chen, Y., Zhang, S., & Rozelle, S. (2022). Paths of social-emotional development before 3 years old and child development after 5 years old: Evidence from rural China. Early Human Development, 165, 105539. https://doi.org/10.1016/j.earlhumdev.2022.105539
Wiesner, M., Windle, M., Kanouse, D. E., Elliott, M. N., & Schuster, M. A. (2015). DISC Predictive Scales (DPS): Factor structure and uniform differential item functioning across gender and three racial/ethnic groups for ADHD, conduct disorder, and oppositional defiant disorder symptoms. Psychological Assessment, 27(4), 1324. https://doi.org/10.1037/pas0000101
Wolf, E. J., Harrington, K. M., Clark, S. L., & Miller, M. W. (2013). Sample size requirements for structural equation models: An evaluation of power, bias, and solution propriety. Educational and Psychological Measurement, 73(6), 913–934. https://doi.org/10.1177/0013164413495237
Wong, C. L., Ching, T. Y., Cupples, L., Leigh, G., Marnane, V., Button, L., ... & Gunnourie, M. (2020). Comparing parent and teacher ratings of emotional and behavioural difficulties in 5-year old children who are deaf or hard-of-hearing. Deafness & Education International, 22(1), 3–26. https://doi.org/10.1080/14643154.2018.1475956