Hierarchical multiple regression was used to: (1) control for cognitive variables that differed between interview conditions (Tables 1, 2); and (2) assess differences in performance across interview conditions (steps 1 and 2 of each regression respectively). For all regression analyses, key statistical checks (Durbin-Watson, tolerance and VIF statistics, Cook’s and Mahalanobis distances, standardised DFβs, leverage values, plots of standardised residuals and predicted standardised values, standardised residuals, partial plots) were carried out to ascertain that no individual cases had undue influence on the regressions (Field 2013). For error data, log transformations were performed, and proportion correct data were subject to an arcsine transformation prior to analyses (Cohen and Cohen 1983).
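The two transformations mentioned above can be illustrated with a minimal sketch. The exact forms used in the analyses are not given in the text, so the arcsine square-root form and the log10(x + 1) form below are assumptions, as are the function names; the example inputs are raw means taken from Table 3.

```python
import math


def arcsine_transform(proportion):
    """Arcsine square-root transform of a proportion in [0, 1] (assumed form)."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return math.asin(math.sqrt(proportion))


def log_transform(count, offset=1.0):
    """Log10 transform of an error count.

    The offset of 1 is an assumption; it keeps counts of zero defined.
    """
    if count + offset <= 0:
        raise ValueError("count + offset must be positive")
    return math.log10(count + offset)


# Illustrative inputs: a proportion correct of 0.82 and 2.67 incorrect details.
print(arcsine_transform(0.82))
print(log_transform(2.67))
```

Both functions are monotonic, so they change the shape of the distributions entering the regressions without changing the rank order of children's scores.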
Research Question 1: Did the Interview Interventions Improve Performance in Children with ASD?
Interview condition differences in performance were assessed for four dependent variables: total correct details; total incorrect details; total confabulations; and proportion of correct details (see Table 3 for mean raw scores). The three variables that differed between interview groups (receptive vocabulary, dual task attention, Recalling Sentences; see Table 1) were initially entered at Step 1 of each regression, but the only variable that ever related to interview performance was receptive vocabulary; therefore, the final models retain only this variable. Three dummy-coded interview condition variables, introduced at Step 2, assessed interview condition differences between the reference condition (Best-Practice interview) and each of the other three interview conditions. Table 4 gives full details of Step 2 from each regression.
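The dummy coding described above, with the Best-Practice interview as the reference condition, can be sketched as follows. This is an illustration only: the function name and the dictionary layout are hypothetical and not taken from the original analysis.

```python
# The four interview conditions named in the text; Best-Practice is the reference.
CONDITIONS = ("Best-Practice", "Verbal Labels", "Sketch-RC", "RI")


def dummy_code(condition, reference="Best-Practice", levels=CONDITIONS):
    """Return one 0/1 indicator per non-reference condition."""
    if condition not in levels:
        raise ValueError(f"unknown condition: {condition!r}")
    return {level: int(condition == level) for level in levels if level != reference}


# A child in the reference condition scores 0 on all three dummies, so each
# dummy's coefficient contrasts one intervention with Best-Practice.
print(dummy_code("Best-Practice"))
print(dummy_code("RI"))
```

With this coding, each Step 2 coefficient is the adjusted difference between one intervention and the Best-Practice interview, which is how the contrasts in the text are interpreted.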
Table 3
Mean (SD) raw scores for correct, incorrect, confabulated, and proportion accurate details in the investigative interview for children with ASD in each interview condition, as well as numbers of details in the six information categories (people, setting, actions, conversation, objects, general)
Measure | Best-Practice | Verbal Labels | Sketch-RC | RI |
Correct details | 20.06 (15.11) | 32.00 (16.56) | 27.17 (24.11) | 18.47 (16.61) |
Incorrect details | 2.67 (2.93) | 3.11 (3.48) | 2.17 (2.20) | 2.82 (3.38) |
Confabulations | 1.50 (2.15) | 3.67 (5.02) | 3.78 (4.89) | 1.94 (2.90) |
Proportion of correct details | 0.82 (0.12) | 0.81 (0.16) | 0.78 (0.20) | 0.79 (0.20) |
People | 7.83 (6.11) | 11.61 (6.77) | 8.61 (6.79) | 6.94 (6.71) |
Setting | 1.11 (1.28) | 2.17 (1.76) | 1.28 (1.94) | 0.41 (0.51) |
Actions | 4.39 (4.46) | 6.22 (4.55) | 5.72 (6.70) | 3.94 (4.49) |
Conversation | 0.50 (0.99) | 1.94 (2.90) | 1.44 (2.01) | 0.88 (1.58) |
Objects | 2.83 (2.66) | 5.22 (4.40) | 4.89 (4.66) | 3.65 (3.72) |
General | 3.39 (3.98) | 4.83 (4.03) | 5.22 (5.40) | 2.65 (3.46) |
Table 4
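Details of step 2 for regressions predicting investigative interview performance in children with ASD
Dependent variable | R² | R² change (Step 2) | β Receptive vocabulary | β Verbal Labels | β Sketch-RC | β RI |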
Total correct details | 0.30*** | 0.05 | 0.50*** | 0.13 | −0.04 | −0.14 |
Total incorrect details | 0.10 | 0.02 | 0.32* | −0.05 | −0.19 | −0.11 |
Total confabulations | 0.08 | 0.06 | 0.07 | 0.21 | 0.23 | 0.02 |
Proportion of correct details | 0.02 | 0.01 | 0.12 | −0.03 | −0.09 | −0.04 |
For total correct details, the full regression model was significant (F(4,66) = 7.22, p < .001) and accounted for 30.4% of the variance. Introducing the dummy coded interview condition variables at Step 2 resulted in no significant change in R² (5.0%), indicating no significant differences in performance across interview conditions (F Change (3,66) = 1.59, p = .20). Standardised β-values indicated that receptive vocabulary was significantly related to number of correct details recalled on the Investigative Interview (p < .001).
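The relationship between the reported R² values and the F statistics can be checked with the standard formula for the F test of an R² increment. The sketch below is an illustration, not the authors' procedure; the function names are hypothetical, and the inputs are the rounded percentages reported for total correct details, so small discrepancies from the published F values are expected.

```python
def f_change(r2_full, r2_change, n_added, df_residual):
    """F statistic for the increase in R² when n_added predictors are entered.

    F = (R² change / n_added) / ((1 - R² of full model) / residual df)
    """
    if not 0.0 <= r2_change <= r2_full < 1.0:
        raise ValueError("need 0 <= r2_change <= r2_full < 1")
    return (r2_change / n_added) / ((1.0 - r2_full) / df_residual)


def f_model(r2_full, n_predictors, df_residual):
    """Overall model F: the special case where every predictor is 'added'."""
    return f_change(r2_full, r2_full, n_predictors, df_residual)


# Total correct details (ASD): R² = .304, Step 2 change = .050, df = (3, 66).
print(round(f_change(0.304, 0.050, 3, 66), 2))
# Overall model with four predictors, df = (4, 66).
print(round(f_model(0.304, 4, 66), 2))
```

Run on the rounded values, the two results come out at roughly 1.58 and 7.21, close to the reported F Change of 1.59 and overall F of 7.22.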
For total incorrect details, the full regression model accounted for 9.7% of the variance and was not significant (F(4,66) = 1.77, p = .15), although Step 1 of the model was significant and showed an effect for receptive vocabulary (p = .02). Introducing dummy coded interview condition variables at Step 2 did not result in a significant change in R² (F Change (3,66) = 0.58, p = .63, 2.4% of the variance), indicating no significant interview condition differences. For total confabulations, the full regression model accounted for 7.6% of the variance and was not significant (F(4,66) = 1.36, p = .22). Introducing dummy coded interview condition variables at Step 2 did not result in a significant change in R² (F Change (3,66) = 1.30, p = .28, 5.5% of the variance), indicating no significant interview condition differences. For proportion of correct details, n = 59 because 12 children (distributed across the four interview conditions) recalled nothing in the Investigative Interviews; therefore, no proportion correct values could be calculated. The full regression model accounted for 1.5% of the variance and was not significant (F(4,54) = 0.21, p = .93). Introducing dummy coded interview condition variables at Step 2 did not result in a significant change in R² (F Change (3,54) = 0.10, p = .96), indicating no significant interview condition differences.
Correct details were also coded for type of information recalled (people, setting, actions, conversation, objects, general; see Table 3 for mean raw scores), and similar regressions were used to assess potential differences between interview conditions in each of these sub-categories. Alpha was set at p < .008 after Bonferroni corrections based on six additional regressions. Log transformations were applied for setting, actions, conversation and general details. The overall regression models were significant for all types of details except conversation details [people (F(4,66) = 3.43, p = .01), setting (F(4,66) = 6.06, p < .001), actions (F(4,66) = 5.51, p = .001), objects (F(4,66) = 4.95, p = .002) and general (F(4,66) = 7.43, p < .001); the regression model missed significance for conversation details (F(4,66) = 2.99, p = .02)].
Of particular interest was whether there were interview condition differences for any of these types of details, i.e. significant changes in R² at Step 2 of the models. In fact, there were no significant interview condition differences for people, action, conversation, object or general details, but there was a difference for setting details (F Change at step 2 (3,66) = 5.17, p = .003). Inspection of the standardised β-values indicated that the contrast between the Best-Practice interview and the RI interview was marginally significant (p = .03): children in the RI condition tended to recall fewer setting details (although note that numbers of setting details recalled were small across all interview conditions). In terms of other predictors, receptive vocabulary was a significant predictor of all types of details except conversation details [people (β = 0.34, p = .006), setting (β = 0.33, p = .005), actions (β = 0.47, p < .001), object (β = 0.44, p < .001), and general (β = 0.52, p < .001); although receptive vocabulary also related to conversation details (β = 0.34, p = .006), this result cannot be interpreted as the overall regression model was non-significant].
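The Bonferroni-adjusted threshold used for the six sub-category regressions can be reproduced with a short calculation. The sketch assumes a family-wise alpha of .05, which the text does not state explicitly but which gives the reported .008 when divided by six; the function name is illustrative.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


# Assumed family-wise alpha of .05 spread over six sub-category regressions.
alpha = bonferroni_alpha(0.05, 6)
print(round(alpha, 3))

# The Step 2 change for setting details (p = .003) falls below the threshold;
# the individual Best-Practice versus RI contrast (p = .03) does not.
print(0.003 < alpha)
print(0.03 < alpha)
```

This is consistent with the text describing the R² change for setting details as significant while describing the individual RI contrast only as marginally significant.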
Summary
For children with ASD, none of the interview interventions significantly improved overall number of correct details recalled, type of details recalled, or error rates compared to a Best-Practice interview. There was a marginally significant tendency for those in the RI condition to recall fewer correct setting details.
Research Question 2: Did the Interview Interventions Improve Performance in TD Children?
Similar regressions were carried out for correct, incorrect, confabulated and proportion correct details for the TD group (see Table 5 for mean raw scores). Variables that differed significantly between interview conditions were included at Step 1 to control for their effects (these were age and Brief Interview total correct; see Table 2): we also included IQ, which showed a marginally significant interview group difference. Table 6 gives details about step 2 of each regression.
Table 5
Mean (SD) raw scores for correct, incorrect, confabulated, and proportion accurate details in the investigative interview for TD children in each interview condition, as well as numbers of details in the six information categories (people, setting, actions, conversation, objects, general)
Measure | Best-Practice | Verbal Labels | Sketch-RC | RI |
Correct details | 30.04 (18.03) | 37.59 (15.31) | 32.40 (19.15) | 53.92 (18.85) |
Incorrect details | 3.35 (2.83) | 4.18 (3.47) | 3.48 (2.89) | 5.47 (3.20) |
Confabulations | 2.73 (4.07) | 5.52 (8.51) | 5.14 (6.80) | 3.84 (4.61) |
Proportion of correct details | 0.84 (0.10) | 0.79 (0.18) | 0.77 (0.18) | 0.84 (0.11) |
People | 10.57 (7.19) | 11.80 (6.63) | 9.64 (7.70) | 18.39 (7.36) |
Setting | 1.03 (1.20) | 2.07 (1.19) | 1.05 (1.40) | 1.66 (1.48) |
Actions | 5.51 (4.58) | 6.34 (4.19) | 5.48 (4.71) | 10.74 (6.00) |
Conversation | 1.27 (1.80) | 1.07 (1.45) | 1.12 (2.18) | 2.29 (2.73) |
Objects | 4.71 (3.59) | 7.66 (3.87) | 6.52 (4.92) | 10.29 (5.64) |
General | 6.96 (4.66) | 8.61 (4.76) | 8.60 (4.05) | 10.58 (4.59) |
Table 6
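Details of step 2 for regressions predicting investigative interview performance in TD children
Dependent variable | R² | R² change (Step 2) | β Age | β Brief Interview total correct | β IQ | β Verbal Labels | β Sketch-RC | β RI |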
Total correct details | 0.56*** | 0.11*** | 0.27*** | 0.38*** | 0.18*** | 0.18** | 0.08 | 0.38*** |
Total incorrect details | 0.22*** | 0.02 | 0.08 | 0.37*** | 0.00 | 0.07 | 0.02 | 0.16* |
Total confabulations | 0.07* | 0.04* | 0.20** | 0.00 | 0.05 | 0.19* | 0.19* | 0.07 |
Proportion of correct details | 0.05 | 0.03 | −0.07 | 0.08 | 0.09 | −0.13 | −0.17* | 0.02 |
For total correct details, the full regression model was significant (F(6, 192) = 40.62, p < .001). Introducing the dummy coded interview condition variables at Step 2 of the regression resulted in a significant change in R², indicating significant differences in performance across interview types (F Change (3, 192) = 15.86, p < .001). Inspection of the standardised β-values at Step 2 showed that children receiving the RI (p < .001) and Verbal Labels (p = .001) interviews recalled significantly more information than children receiving the Best-Practice interview. After accounting for the other variables, children in the RI interview recalled 18.96 more items of correct information than children in the Best-Practice interview (95% CI 13.43–24.49 items); and children in the Verbal Labels condition recalled 8.47 more items of correct information than children receiving a Best-Practice interview (95% CI 3.35–13.59 items). Age, Brief Interview total correct and IQ were also significantly related to Investigative Interview performance (ps < .001). The full model accounted for 55.9% of the variance, and the change in R² at Step 2 of the model was 10.9% (p < .001).
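The reported confidence intervals can be related back to the point estimates with a brief sketch. It assumes the intervals are symmetric around the unstandardised coefficient and uses a normal approximation to recover an approximate standard error; the original analysis would have used a t distribution, so the standard error is indicative only. The function name is hypothetical.

```python
from statistics import NormalDist


def ci_summary(lower, upper, level=0.95):
    """Return (midpoint, half-width, approximate SE) of a symmetric interval.

    The SE uses a normal critical value as an approximation (an assumption).
    """
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    midpoint = (lower + upper) / 2.0
    half_width = (upper - lower) / 2.0
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return midpoint, half_width, half_width / z_crit


# RI versus Best-Practice: 95% CI 13.43 to 24.49 items.
print(ci_summary(13.43, 24.49))
# Verbal Labels versus Best-Practice: 95% CI 3.35 to 13.59 items.
print(ci_summary(3.35, 13.59))
```

The midpoints of the two intervals recover the reported estimates of 18.96 and 8.47 items, and neither interval includes zero, in line with the significant contrasts.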
For total incorrect details, the full regression model was significant (F(6, 192) = 9.06, p < .001). Introducing the dummy coded interview condition variables at Step 2 did not result in a significant change in R² (F Change (3, 192) = 1.75, p = .16), indicating no significant interview condition differences. Inspection of the standardised β-values showed that Brief interview total correct score (p < .001) was significantly related to total incorrect details. Note: although the β-values showed a significant contrast between the Best-Practice and RI conditions, p = .03, the lack of an overall R² change at step 2 of the model means this result cannot be interpreted. The full model accounted for 22.0% of the variance, and the change in R² at Step 2 of the model was 2.1%.
For total confabulations, the full regression model was significant (F(6, 192) = 2.54, p = .02). Introducing the dummy coded interview condition variables at Step 2 resulted in a significant change in R² (F Change (3, 192) = 2.82, p = .04). Standardised β-values at Step 2 revealed that children in the Verbal Labels (p = .02) and Sketch-RC (p = .018) conditions made more confabulations than children in the Best-Practice condition. Age was significantly related to total confabulations (p = .01). The full model accounted for 7.4% of the variance, and the change in R² at Step 2 of the model was 4.1%.
For proportion of correct details, n = 193 as six children did not recall any correct details in the investigative interview (five in the Best-Practice interview, one in the Verbal Labels interview). The full regression model was not significant (F(6, 186) = 1.54, p = .17). Introducing the dummy coded interview condition variables at Step 2 of the regression did not result in a significant change in R² (F Change (3, 186) = 2.18, p = .09), indicating no significant interview condition differences. Note: although the contrast between the Best-Practice and Sketch-RC interviews was significant (p = .04), this cannot be interpreted as the overall regression model was not significant. The full model accounted for 4.7% of the variance, and the change in R² at Step 2 of the model was 3.3%.
Correct details were also coded for type of information recalled (people, setting, actions, conversation, objects, general; see Table 5 for mean raw scores), and similar regressions were used to assess interview condition differences in each of these sub-categories. Alpha was set at p < .008 after Bonferroni corrections based on six regressions. Log transformations were applied to setting and conversation data. The regression models were significant for all types of details [people (F(6, 192) = 19.98, p < .001); setting (F(6, 192) = 10.90, p < .001); actions (F(6, 192) = 21.61, p < .001); objects (F(6, 192) = 21.88, p < .001); conversation (F(6, 192) = 7.92, p < .001); and general (F(6, 192) = 16.69, p < .001)]. Of particular interest was whether there were interview condition differences for any of the types of details, i.e. significant changes in R² at Step 2 of the models. Such differences were found for all types of correct details except conversation details [people (F Change (3, 192) = 8.81, p < .001); setting (F Change (3, 192) = 10.57, p < .001); actions (F Change (3, 192) = 7.90, p < .001); objects (F Change (3, 192) = 14.33, p < .001); and general (F Change (3, 192) = 5.76, p = .001); R² change for conversation details was not significant (F Change (3, 192) = 2.25, p = .08)].
In order to interpret these interview condition differences, the β-values were inspected. These indicated that the following interview condition differences were present: (1) Children in the RI interview recalled significantly more details about people (β = 0.31, p < .001) and actions (β = 0.31, p < .001) than children in the Best-Practice interview. (2) Children in the Verbal Labels interview recalled significantly more details about setting than children in the Best-Practice interview (β = 0.37, p < .001). (3) Children in all three interview intervention conditions (Verbal Labels, Sketch-RC and RI) recalled significantly more details about objects (β’s = 0.28, p < .001; 0.18, p = .004; and 0.39, p < .001 respectively) and general information (β’s = 0.19, p = .006; 0.18, p = .008; and 0.25, p < .001 respectively) than children in the Best-Practice interview. In terms of other predictors of correct recall for types of details, Brief Interview total correct was a significant predictor for all types of details except setting [person (β = 0.38, p < .001), actions (β = 0.25, p < .001), conversation (β = 0.41, p < .001), objects (β = 0.25, p < .001), and general (β = 0.23, p = .001)]. Age was related to correct recall of all types of details except person and conversation [setting (β = 0.20, p = .004), actions (β = 0.33, p < .001), objects (β = 0.25, p < .001), and general (β = 0.28, p < .001)]. Finally, IQ related to performance on two types of details [objects (β = 0.18, p = .003) and general (β = 0.23, p < .001)].
Summary
For TD children, the RI and Verbal Labels interviews increased the number of correct details recalled compared to a Best-Practice interview. RI interviews showed the greater increase, without affecting error rates. In contrast, the Verbal Labels interview increased the number of confabulations. In terms of types of details recalled, all interview interventions led to at least some improvements: RI interviews increased the number of people, actions, objects and general details recalled; Verbal Labels interviews increased the number of setting, objects and general details recalled; and Sketch-RC interviews increased the numbers of objects and general details recalled.
Research Questions 3 and 4: Were There ASD/TD Group Differences in Performance on the Investigative Interview, and did the Interview Interventions Affect Performance Differently in the Two Samples of Children?
Hierarchical multiple regression was used to test these two research questions by including all participants in the same regressions. Step 1 reflected the variables included in the previous regressions, and for this we merged the background variables that differed between interview conditions in the ASD and TD samples (age, receptive vocabulary, Brief Interview correct details; we did not include IQ as this was strongly related to receptive vocabulary); plus the dummy-coded interview condition variables (with the Best-Practice interview acting as the reference condition in each case). To test for overall group differences in performance (Research question 3), group was entered at Step 2 of the model (ASD versus TD). To test whether group interacted with interview condition (Research question 4), three interaction variables (Jaccard et al. 1990) were entered at Step 3 (those between group and each of the dummy-coded interview condition variables).
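The construction of the Step 2 and Step 3 predictors can be sketched as follows. The numeric coding of group (0 = ASD, 1 = TD) is an assumption made for illustration, since the text does not report how group was coded; the function and key names are likewise hypothetical.

```python
# Non-reference interview conditions; Best-Practice is the reference.
INTERVENTIONS = ("Verbal Labels", "Sketch-RC", "RI")


def step3_terms(group, condition):
    """Build group, condition dummies and group-by-condition product terms.

    group: 0 = ASD, 1 = TD (assumed coding).
    """
    if group not in (0, 1):
        raise ValueError("group must be 0 (ASD) or 1 (TD)")
    if condition != "Best-Practice" and condition not in INTERVENTIONS:
        raise ValueError(f"unknown condition: {condition!r}")
    dummies = {name: int(condition == name) for name in INTERVENTIONS}
    row = {"group": group}
    row.update(dummies)
    # Each interaction variable is the product of group and one dummy.
    row.update({f"group x {name}": group * value for name, value in dummies.items()})
    return row


print(step3_terms(1, "RI"))
print(step3_terms(0, "RI"))
```

Under this coding, an interaction term is non-zero only for TD children in the corresponding intervention, so its coefficient captures how far that intervention's effect relative to Best-Practice differs between the two groups.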
For correct details, one multivariate outlier was identified, but as removing this case made no difference to the results, it was retained. The overall regression model for correct details was significant (F(10, 259) = 35.81, p < .001, accounting for 58% of the variance). The three control variables were significant at step 1 and remained so by step 3 (β age = 0.21, p < .001; β Brief Interview total correct = 0.46, p < .001; β receptive vocabulary = 0.19, p < .001). The significant interview condition contrasts exactly corresponded to those found for the TD sample (i.e. RI > Best-Practice; Verbal Labels > Best-Practice; Sketch-RC = Best-Practice)—these results were not surprising as TD children formed the majority of the combined sample.
At step 2, there was a significant change in R² (F Change (1,262) = 9.71, p = .002, 1.7% of the variance) with the entry of group (β = 0.15, p = .002), which initially suggested an overall ASD versus TD group difference in interview performance. However, there was also a significant change in R² (F Change (3, 259) = 8.65, p < .001, 4.2% of the variance) with the entry of the interaction terms at step 3, and, critically, at this final step the group effect became non-significant (β = 0.07, p = .38). This means that overall ASD versus TD group differences were different for different interview condition comparisons. The term reflecting the RI versus Best-Practice interview by group interaction was significant (β = 0.36, p < .001), which confirmed the separate sample analyses reported earlier showing that whilst RIs improved recall of correct details compared to a Best-Practice interview in TD children, this beneficial effect was not observed for children with ASD. The comparisons between the Best-Practice interview and the Verbal Labels and Sketch-RC interviews, respectively, did not interact with group (β’s = −0.08 and 0.01, ps = .38 and .91), indicating that ASD/TD group differences were not apparent for either of these interview contrasts: performance levels were no different between ASD and TD children.
For incorrect details, although the full regression model was significant (F(10, 259) = 7.55, p < .001), the only significant β-value at step 3 was for Brief Interview total correct (β = 0.35, p < .001). For confabulations, the full regression model was again significant (F(10, 259) = 1.99, p = .03), but no individual β-values were significant at step 3. The full regression model for proportion of accurate details was not significant. Hence, there were no group differences or interactions for error scores or proportion of accurate details.
Summary
Recall of correct details was significantly higher in the RI than the Best-Practice interview for TD children, but the beneficial effect of RIs was not observed for children with ASD. The other interview contrasts (Verbal Labels with Best-Practice; Sketch-RC with Best-Practice) did not interact with group: this indicated that children with ASD and TD children performed at the same level on these interviews and there were no significant group differences in their relative effects.