Dynamic prediction characteristics of the Child Abuse Potential Inventory☆
Introduction
Risk instruments are commonly used to assess outcomes of interventions designed to prevent occurrences or recurrences of child maltreatment (e.g., Acton & During, 1992; Haskett, Scott, & Fann, 1995; Kolko, 1996; Meezan & O’Keefe, 1998; Milner, Murphy, Valle, & Tolliver, 1998; Wolfe, Edwards, Manion, & Koverola, 1988). Although the primary goal of child maltreatment prevention and intervention efforts is to prevent future episodes of child maltreatment (e.g., Altepeter & Walker, 1992; Chalk & King, 1998; Williams, 1983), the majority of studies and program evaluations do not directly measure this bottom-line outcome (Chalk & King, 1998; Schellenbach, 1998). When evaluating outcomes, there are a number of possible reasons for assessing change using risk instruments or other proxy measures rather than actual incidents of maltreatment. If research or evaluation efforts inquire directly into ongoing abuse or neglect, there may be irreconcilable conflicts between the demands of mandatory child abuse reporting laws and ethical demands to protect the welfare of research participants and fully inform participants of the risks inherent in their participation. Participants who are fully informed about the potential consequences of answering questions about ongoing abuse or neglect (e.g., reports to child protective services, possible removal of their child from the home) may not report or may deny actual occurrences of maltreatment (Ammerman, 1998). Incidents of officially reported future abuse or neglect may be captured from administrative databases (i.e., state child abuse registries); however, this may require special permissions which researchers or evaluators may not be able to acquire. Even when it is possible to obtain access to official records, the nature of official CPS report data creates intrinsic problems.
Tracking future incidents of abuse usually requires following participants for an extended period of time, possibly up to several years following the intervention, at considerable cost in staff time and other resources. Among some populations, such as low-risk primary prevention populations, low incidence rates necessitate very large samples and long follow-up periods in order to test even the most basic hypotheses. Finally, official report data likely underestimate actual behavior and may be affected by case finding biases (Chalk & King, 1998).
Given these obstacles and limitations, it is not surprising that many research and program evaluation projects have utilized maltreatment risk instruments and changes on these instruments as a measure of intervention outcome. If reducing risk is considered equivalent to reducing future rates of maltreatment, it is not unreasonable to assume that score reductions on child abuse risk measures serve as proxies for actual changes in risk status. Furthermore, risk measures generally have the advantage of yielding continuous variable scores. Using continuous variable outcomes increases analytic power and solves many of the problems inherent in recidivism data such as requirements for lengthy follow-up and low base rates. Further, continuous variable risk measures may be obtained at several points in time, allowing analysis of change trajectories and maintenance of gains.
In addition, there may be theory-based reasons for using risk assessment instruments in evaluating program outcomes. Factors assessed by risk instruments and factors targeted by interventions may be synonymous. Broadly speaking, interventions and risk measures are drawn from the same theory base. For example, interventions may target child development knowledge, rigid childrearing attitudes, unrealistic expectations of children, or parenting stress, many of the same areas measured by risk instruments. Interventions target risk (and presumed causative) factors that are most easily assessed using risk assessment instruments measuring the same construct (Schellenbach, 1998). This is particularly true when the focus of the intervention is on behaviors (e.g., harsh disciplinary practice in the home) that occur infrequently or are difficult to measure or observe directly (Ammerman, 1998).
Although the logic of using risk instruments as change measures seems intuitive, and the practice is routine, research demonstrating the correspondence between changes on risk instruments and changes in the likelihood of future child maltreatment incidents is lacking. Even when an instrument has been shown to be a good predictor of future child maltreatment, its ability to predict future maltreatment is generally demonstrated only statically (e.g., Milner, Gold, Ayoub, & Jacewitz, 1984). That is, a score taken at some single point in time is found to predict subsequent maltreatment behavior. However, when measures are used to evaluate change (e.g., following intervention), exposure to repeated testing and treatment creates different conditions than were present during the normative development of the instrument (Kazdin, 1999). Thus, the implications of score change validity must be demonstrated dynamically.
Dynamic prediction can be modeled in different ways, but perhaps the best way to conceptualize dynamic prediction is to model the predictor as a time-dependent variable. That is, the values of the predictor variable can change for a given case over time. In this conceptualization, scores on the risk instrument that are taken at a later point in time should have better future predictive validity during the subsequent time period than scores taken at an earlier point in time. For example, scores taken at program completion are expected to be better predictors of post-program recidivism than scores taken at program enrollment. Although this hypothesis seems obvious and has intuitive appeal, it may not be true given certain scenarios. For example, if the risk instrument is based entirely on historical or other static variables (e.g., gender, childhood history of abuse, ever having a report to CPS), the instrument would not be sensitive to change even following extremely effective interventions. In this case, program enrollment and program completion scores would be approximately equivalent and consequently would have essentially the same predictive power for post-completion events. We may exclude this possibility for the CAP and other self-report child maltreatment risk instruments that are composed primarily of dynamic, changeable variables such as childrearing attitudes, emotional distress, or current parent-child problems, and that include relatively few static variables.
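As an illustration of what treating the predictor as a time-dependent variable can look like in practice, the following minimal Python sketch lays out one participant's follow-up in the start-stop (counting process) layout that time-dependent survival models commonly take as input. It is only a sketch under stated assumptions: the time origin (program enrollment), the rule that the pre-intervention score applies until the post-intervention assessment and the post-intervention score applies afterward, and all names (`Interval`, `to_intervals`, `post_day`, `end_day`) are hypothetical and are not taken from the study.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    start: float  # days since enrollment (assumed time origin)
    stop: float   # end of the interval, in days since enrollment
    score: float  # risk score treated as in force during (start, stop]
    event: bool   # True only if the failure event occurs at `stop`


def to_intervals(pre_score: float, post_score: float, post_day: float,
                 end_day: float, failed: bool) -> List[Interval]:
    """Split one participant's follow-up into start-stop rows.

    Assumption: the pre-intervention score applies from enrollment until the
    post-intervention assessment (post_day); the post-intervention score
    applies from then until end_day (failure or censoring).
    """
    if post_day <= 0 or end_day <= 0:
        raise ValueError("post_day and end_day must be positive")
    if end_day <= post_day:
        # Follow-up ended before the second assessment: one row, pre score only.
        return [Interval(0.0, float(end_day), float(pre_score), failed)]
    return [
        # The participant was still event-free when this interval closed.
        Interval(0.0, float(post_day), float(pre_score), False),
        Interval(float(post_day), float(end_day), float(post_score), failed),
    ]


# Hypothetical participant: score drops from 250 to 180 at day 120,
# and a failure event is recorded at day 726.
rows = to_intervals(250, 180, post_day=120, end_day=726, failed=True)
for row in rows:
    print(row)
```

Rows in this layout let a model use whichever score was current at each moment of follow-up, which is the sense in which a later score is expected to predict the subsequent period better than an earlier one.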
Another possible scenario in which the dynamic predictive validity of a risk assessment instrument may be questionable is when interventions produce superficial attitude changes or have demand characteristics that change the way participants answer questions on the risk instrument. Participants may report different attitudes or knowledge, but the reported changes may not translate into actual changes in behavior or decreases in the likelihood of future child maltreatment incidents (Ogles, Lambert, & Masters, 1996; Reppucci, Britner, & Woolard, 1997). Attitudes may be poor predictors of behavior, particularly when there are other influences on behavior or a less direct relationship between the attitude and the behavior (e.g., Kraus, 1995). For example, a parent may hold certain attitudes about parenting in general, but be unable to apply those parenting beliefs to their own child due to competing attitudes specific to their child.
Similarly, if a risk measure contains marker variables—variables associated with likelihood of child maltreatment only through their indirect association with actual causal factors—then changes on these markers, without corresponding changes in the causal factors, may be meaningless with respect to the actual likelihood of child maltreatment. By way of analogy, consider the use of earlobe creases as a marker for cardiovascular disease risk. Even if earlobe creases predict risk for future cardiovascular disease, an intervention that changes earlobe creases would not be expected to change actual cardiovascular disease risk.
Another possibility is that interventions might produce very transitory changes followed by rapid return toward baseline. Transitory improvement measured at post-intervention might not last long enough to actually be predictive of long-term behaviors. In this case, change on a risk measure might reflect changes in actual maltreatment risk only for a very brief period, and might not be valid beyond this. For example, support provided during the course of the intervention may temporarily decrease parental distress, which subsequently rebounds when the intervention ends and support is withdrawn. Finally, differential response bias might affect dynamic predictive validity. For example, individuals could “fake good” at post-intervention assessment (assuming they responded accurately at program enrollment) and give responses that were learned to be socially “correct” during the intervention, without corresponding changes outside of the intervention setting. Under any of the various scenarios previously described, decreases in risk as assessed by scores on a risk assessment tool may not accurately reflect changes in the actual likelihood of child maltreatment.
Change in scores on risk instruments can be analyzed using various algorithms, many of which may be valid if the instrument has strong dynamic prediction. Perhaps the greatest concerns arise in connection with analysis of what is sometimes called clinically significant change. Analysis of clinically significant change involves setting a priori criteria for the amount and types of score changes judged to reflect real-life or ecologically valid individual change, then grouping participants based upon the change algorithm. A simple indicator of clinically significant change might be a score change of some pre-set magnitude (often one-half to one standard deviation). For instruments that have established cut-off scores, clinically significant change might be defined as score changes that cross the established boundary score. Or a combination of both approaches might be used (i.e., either a large change, or a moderate change that crosses the cut-off score). Clinical change algorithms under certain scenarios, such as those discussed above, may yield counter-intuitive and misleading results. However, research demonstrating the relationship between change algorithms and future incidents of child maltreatment and thus the validity and utility of change algorithms is lacking.
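The three kinds of change algorithm described above can be sketched in a few lines of Python. This is a hedged illustration rather than the scoring rules used in any particular study: the function names, the default thresholds of one-half and one standard deviation, the treatment of a score equal to the cut-off as elevated, and the example standard deviation of 100 are all assumptions made for the sake of the example.

```python
def magnitude_improved(pre: float, post: float, sd: float,
                       min_sd_units: float = 0.5) -> bool:
    """Pre-set magnitude rule: the score dropped by at least
    min_sd_units standard deviations (threshold is an assumption)."""
    return (pre - post) >= min_sd_units * sd


def crossed_cutoff(pre: float, post: float, cutoff: float) -> bool:
    """Cut-off rule: elevated before, below the cut-off afterward.
    Treating pre == cutoff as elevated is an assumption."""
    return pre >= cutoff and post < cutoff


def combined_improved(pre: float, post: float, sd: float, cutoff: float,
                      large: float = 1.0, moderate: float = 0.5) -> bool:
    """Combination rule: either a large drop, or a moderate drop that
    also crosses the cut-off (both thresholds are assumptions)."""
    drop = pre - post
    return drop >= large * sd or (
        drop >= moderate * sd and crossed_cutoff(pre, post, cutoff)
    )


# Hypothetical scores; sd=100 and cutoff=215 are illustrative parameters.
print(magnitude_improved(250, 190, sd=100))             # moderate drop
print(crossed_cutoff(250, 190, cutoff=215))             # crosses the cut-off
print(combined_improved(400, 340, sd=100, cutoff=215))  # moderate drop, no crossing
```

Keeping the three rules as separate functions makes it easy to see that they can disagree for the same participant, which is one reason the validity of each grouping has to be checked against actual outcomes.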
The purpose of the present study is to examine the dynamic risk prediction characteristics of the most widely used and strongly supported child maltreatment risk self-report measure, the Child Abuse Potential Inventory (CAP; Milner, 1986). The CAP is a self-report screening instrument for child physical abuse consisting of 160 items in a forced-choice, agree/disagree format. The Abuse Scale, which is the main risk indicator on the CAP, consists of 77 items and 6 factor subscales (i.e., Distress, Rigidity, Unhappiness, Problems with Child and Self, Problems with Family, and Problems from Others). In addition, the CAP contains three validity scales (i.e., a lie scale, a random response scale, and an inconsistency scale), which form three response distortion indexes (i.e., faking-good, faking-bad, and the random response index). An ego-strength scale and a loneliness scale also have been developed (Milner, 1994). Internal consistency estimates for the Abuse Scale of the CAP range from .85 to .98 for physically abusive parents and general population groups. Test-retest reliabilities reported in the CAP technical manual for 1-day, 1-week, 1-month, and 3-month intervals are .91, .90, .83, and .75, respectively (Milner, 1986).
Substantial support for the construct, predictive, and concurrent validity of the CAP has been demonstrated (see review by Milner, 1994). Elevated CAP Abuse Scale scores are related to characteristics frequently reported in identified child physical abusers, including the childhood receipt or witnessing of physical abuse, low levels of social support, low self-esteem, increased physiological reactivity to stressful stimuli, and increased use of power-assertive disciplinary strategies. In a prospective study of at-risk parents (Milner et al., 1984), significant relationships were reported between CAP Abuse Scale scores and subsequent incidents of child physical abuse and child neglect.
Correct classification rates for physically abusive and matched comparison parents in the mid-80% to low-90% range have been reported (e.g., Milner, 1989a; Milner, Gold, & Wimberley, 1986; Milner & Robertson, 1990; Milner & Wimberley, 1980). Lower rates have been reported for parents with childhood histories of maltreatment or histories of maltreatment of their own children other than physical abuse (e.g., neglect; Caliso & Milner, 1992) and parents of children with certain medical problems (Milner, 1991). In general, higher overall correct classification rates have been reported using the 166-point cutoff score derived from signal detection theory as compared with the 215-point cutoff score recommended for clinical use (Milner et al., 1998). Milner and his associates (Milner, 1986, 1994; Milner et al., 1998) provide additional information supporting the reliability and validity of the CAP.
CAP scores appear to change over the course of treatment, and the CAP has been widely used as a treatment outcome measure. For example, 81% of 307 administrators, direct service providers, and researchers surveyed in the United States and Canada indicated using the CAP for purposes of treatment evaluation (Milner, 1989b). Statistically significant changes in CAP Abuse Scale and/or subscale scores have been observed in studies using pre- and post-assessments or pre- and follow-up assessments (e.g., Acton & During, 1992; Barth, 1989; Black et al., 1994; Ethier, Couture, Lacharite, & Gagnier, 2000; Fulton, Murphy, & Anderson, 1991; Meezan & O’Keefe, 1998; National Committee for Prevention of Child Abuse, 1992; Thomasson et al., 1981; Vogel, 1987; Wilson & St. Pierre, 1990; Wolfe et al., 1988), which supports the CAP’s capacity to change over the course of intervention. Many of the studies also observe clinically significant changes (i.e., reduction of elevated CAP scores to below the clinical cut-off; Acton & During, 1992; Ethier et al., 2000; Kolko, 1996; Meezan & O’Keefe, 1998; Vogel, 1987; Wolfe et al., 1988). Reductions in CAP scores have been proposed to indicate successful intervention. However, no studies could be located that examined how changes in CAP scores following intervention are related to subsequent occurrences or recurrences of child maltreatment.
After examining the static future predictive validity of the CAP with our sample, we will test the dynamic predictive validity of the CAP and the validity of three clinically significant change algorithms (pre-set change magnitude, crossing the clinical cut-off, and a combination). Given the substantial support for the static reliability and validity of the CAP and the CAP’s use as a treatment change measure, it was hypothesized that time-dependent models including both pre- and post-intervention CAP scores, compared to pre-intervention only, would be better predictors of child maltreatment incidents. It was also hypothesized that individuals classified as improved using clinically significant change algorithms would have lower future rates of maltreatment than other groups.
Participants
Participants in the study were 459 parents participating in one of 27 community-based family preservation and family support programs in a southwestern state between 1996 and 1999. A detailed description of program content and all program enrollees can be found elsewhere (Chaffin, Bonner, & Hill, 2001). Parents were selected for this study if the parent had completed a pre-intervention assessment, the intervention, and a post-intervention assessment. The median duration of intervention
Analysis of static future predictive validity
A total of 63 (14%) participants had a defined failure event across a median follow-up time of 726 days. The majority of failures were for neglect (61%) or physical abuse combined with neglect (21%). Ten percent (10%) were for physical abuse alone and 8% were for sexual abuse. In order to examine the static future predictive validity of the pre-intervention CAP Abuse Scale scores with this sample, a Cox proportional hazards survival analysis was performed for the time-to-event data. The CAP
Discussion
The results of this study support two main conclusions. First, the results offer additional support for the future predictive validity of the pre-intervention CAP as a single-point-in-time measure of child maltreatment risk. Not only did the CAP predict future risk for maltreatment reports with this mixed sample of CPS-involved and high-risk parents, CAP Abuse Scale scores predicted over and above the prediction made by the best set of demographic and historical predictors available in this
Acknowledgements
The authors wish to express their gratitude to Linda Smith, Kathy Sims, John Brown, and John Gelona of the Department of Human Services for their support of the project, to Anndrea Finley, Shaunna Machtolf and Jennifer Moslander at the Center on Child Abuse and Neglect for their invaluable assistance in completing this study, and to Joel Milner for his comments on an earlier version of this study.
References
- Evaluation of a task-centered child abuse prevention program. Children and Youth Services Review (1989).
- Childhood history of abuse and child abuse screening. Child Abuse & Neglect (1992).
- Family preservation and family support programs: Child maltreatment outcomes across client risk levels and program types. Child Abuse & Neglect (2001).
- Child Abuse Potential Inventory and parenting behavior: Relationships with high-risk correlates. Child Abuse & Neglect (1995).
- Assessing physical child abuse risk: The Child Abuse Potential Inventory. Clinical Psychology Review (1994).
- Preliminary results of aggression management training for aggressive parents. Journal of Interpersonal Violence (1992).
- Altepeter, T. S., & Walker, C. E. (1992). Prevention of physical abuse of children through parent training. In D. J....
- Ammerman, R. T. (1998). Methodological issues in child maltreatment research. In J. R. Lutzker (Ed.), Handbook of child...
- Parenting and early development among children of drug-abusing women: Effects of home intervention. Pediatrics (1994).
- Chalk, R., & King, P. A. (Eds.). (1998). Violence in families: Assessing prevention and treatment programs. Washington,...
- Impact of a multidimensional intervention programme applied to families at risk for child neglect. Child Abuse Review.
- Increasing adolescent mothers’ knowledge of child development: An intervention program. Adolescence.
- The meanings and measurement of clinical significance. Journal of Consulting and Clinical Psychology.
- Individual cognitive behavioral treatment and family therapy for physically abused children and their offending parents: A comparison of clinical outcomes. Child Maltreatment.
- Attitudes and the prediction of behavior: A meta-analysis of the empirical literature. Personality and Social Psychology Bulletin.
- Evaluating the effectiveness of multifamily group therapy in child abuse and neglect. Research on Social Work Practice.
☆ This study was supported by a grant from the Oklahoma Department of Human Services, Division of Children and Family Services.
1 Present address: Centers for Disease Control and Prevention, National Center for Injury Prevention and Control, Division of Violence Prevention, 4770 Buford Highway, Mailstop K60, Atlanta, GA 30341.