Introduction
Screening for personality pathology is of paramount importance, especially in clinical settings. Studies have shown that between 3 and 10% of the general population meet the diagnostic criteria for one or more personality disorders [1, 2]. Prevalence rates in psychiatric populations have been found to be substantially higher: 45–51% in US samples and 40–92% in European samples [3]. Personality disorders are characterized by considerable suffering and/or lasting impairment of social adaptiveness. Patients diagnosed with personality disorders have a higher risk of suicide, and often suffer from psychosocial impairment, experience decreased work capacity and have inadequate skills for establishing lasting interpersonal relationships [4].
Traditionally, personality traits (including maladaptive ones) have been regarded as stable. However, a growing body of research focuses on, and finds support for, changeable aspects of personality. In general psychology as well as in psychiatry, a distinction is made between personality characteristics that are regarded as relatively stable over time, i.e., personality traits or style, and personality characteristics that are more amenable to change, i.e., characteristic adaptations (e.g., [5, 6]). In the personality disorder field, characteristic adaptations are often referred to as personality functioning, and include, among others, values, goals, self-concepts and mental representations of others. For the development of an effective treatment plan, it is highly useful for a clinician to gain insight into a patient’s personality aspects that are both maladaptive and changeable.
The Severity Indices of Personality Problems 118 (SIPP-118) is a self-report questionnaire that was specifically designed to measure interpersonal differences in (mal)adaptive personality capacities [5]. The SIPP-118 encompasses 16 facets derived from consensus meetings involving 10 experts in the field of personality pathology. Furthermore, five higher-order factors were proposed based on exploratory factor analyses: social concordance, relational functioning, self-control, responsibility and identity integration. As reported by Pedersen et al. [8], a number of studies have supported the clinical relevance and utility of the SIPP-118, as well as the relationship between SIPP-118 scores and personality disorder (PD) severity levels [9–16]. However, no consensus has yet emerged as to which scores are best to report: the facet or the higher-order factor scores. Whereas the 16 facets were based on theory and expert opinion, and tested using confirmatory factor analyses, the higher-order structure suggested by the developers was based on exploratory analyses only. This has caused some authors to be more cautious in adopting the higher-order factors, which have moreover proven difficult to replicate [8].
The facets were developed using a content-driven approach: experts identified concepts and generated items, which were in turn evaluated by patients. The facets that were included in the instrument showed Cronbach’s alpha values of at least .70 and were found to fit single-factor models well [5]. In a subsequent study conducted by members of the same research group, the psychometric properties of the SIPP-118 were evaluated in two adolescent samples: a patient and a non-patient sample [13]. Cronbach’s alpha estimates ranged between .59 and .89, with the lowest values being found for the facet respect and the highest for self-respect. Known-groups validity was supported by the finding that a higher degree of pathology as measured by the SIPP-118 was found in the patient sample compared to the non-patient sample. Correlations among facets pertaining to the same higher-order factor varied between .24 and .73, and between .10 and .68 for facets not pertaining to the same higher-order factor. These findings do not provide clear support for the suggested higher-order factors. All facets except for enduring relationships and responsible industry were sensitive to change in the adolescent patients studied. The largest effect was found for stable self-image. In a recent study, using one community and two clinical samples, Cronbach’s alpha estimates ranged between .63 and .85 (lowest value for the facet respect, highest for aggression regulation and self-respect), with most values exceeding .70 [8]. The authors were not able to replicate the higher-order factors proposed by Andrea and colleagues. The focus of this study is on the facets, since they have a more solid foundation than the higher-order factors.
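For readers less familiar with the reliability coefficient cited above, the following is a minimal sketch of how Cronbach's alpha can be computed for a single facet from raw item responses. The function name `cronbach_alpha` and the toy data are hypothetical illustrations and are not taken from the SIPP-118 studies; the formula is the standard one based on the item variances and the variance of the sum score.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: list of respondents, each a list of k item scores
    (hypothetical input layout; no missing values assumed).
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose to get one column of scores per item.
    item_columns = list(zip(*responses))
    item_variances = [variance(col) for col in item_columns]
    # Variance of the unweighted sum score across respondents.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy example: four respondents, two items (made-up numbers).
toy = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha = cronbach_alpha(toy)
```

Under this formula, a facet whose items covary strongly relative to their individual variances yields a value close to 1, which is the sense in which values of at least .70 are described as acceptable above.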
Notably, the SIPP-118 was used in the early stages of developing the diagnostic content for the Levels of Personality Functioning Scale (Criterion A) of the Alternative Model for Personality Disorders [17], especially with respect to the fine-tuning of severity level descriptions. Furthermore, the SIPP-118 is sometimes used in research studies to obtain an estimate of personality dysfunction; for instance, Bastiaansen and colleagues [18] extracted a single higher-order factor using the SIPP-118, which they used in subsequent analyses to investigate the relationship between personality functioning and personality traits. From a research perspective, it may be useful to obtain one or multiple summary factors for the SIPP facets (in research, the SIPP is often used as an overall indicator of personality functioning). From a clinical viewpoint, however, using such factor solutions may be suboptimal, since they are mostly based on small samples and exploratory analyses, with the purpose of data reduction rather than obtaining clinically meaningful latent traits. Often, test developers suggest that both total and subscale scores be calculated for their instruments. This type of approach has been criticized by some; if the subscales do not explain substantial portions of variance, it may be more suitable to focus on a total score only [19]. Others have argued that ignoring subscales can lead to an impoverished measurement practice, where crucial characteristics of the patient are overlooked (e.g., [20]).
In this study, we will assess the incremental value of subscale (i.e., facet) scores over and above the total score. We will do so in two steps. First, we will evaluate to what degree the SIPP items tap into an overall general factor (also referred to as a g-PD factor) using bi-factor modeling. Second, we will study the distinctiveness of the facets using proportional reduction in mean squared error (PRMSE) based statistics. We choose to focus on the facets, and not the higher-order factors, since the former have a strong theoretical basis.
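To make the PRMSE idea concrete, the sketch below computes, for a single facet, how well the true facet score is predicted by the observed facet score versus the observed total score. It is an illustration under classical test theory assumptions (the facet is part of the total score and measurement errors are uncorrelated across facets), in the spirit of Haberman's subscore framework; it is not the estimation procedure used in this study. The function names and numeric inputs are hypothetical. The ratio of the two PRMSE values is what is commonly called a value-added ratio.

```python
def prmse_from_subscore(rel_s):
    """PRMSE when the observed facet score predicts the true facet score.

    Under classical test theory this equals the facet reliability.
    """
    return rel_s


def prmse_from_total(var_s, var_x, cov_sx, rel_s):
    """PRMSE when the observed total score predicts the true facet score.

    var_s  : variance of the observed facet score
    var_x  : variance of the observed total score
    cov_sx : covariance between observed facet and total scores
    rel_s  : reliability of the facet score
    Assumes the facet is included in the total, so the facet's error
    variance must be removed from the observed covariance.
    """
    error_var_s = var_s * (1 - rel_s)
    cov_true_s_total = cov_sx - error_var_s
    true_var_s = rel_s * var_s
    # Squared correlation between true facet score and observed total.
    return cov_true_s_total ** 2 / (true_var_s * var_x)


def value_added_ratio(var_s, var_x, cov_sx, rel_s):
    """Ratio of the two PRMSE values; above 1 favours the facet score."""
    return prmse_from_subscore(rel_s) / prmse_from_total(
        var_s, var_x, cov_sx, rel_s
    )


# Made-up inputs for illustration only.
ratio = value_added_ratio(var_s=4.0, var_x=100.0, cov_sx=14.0, rel_s=0.8)
```

A ratio clearly above 1 indicates that the facet score carries information about the facet that the total score does not; a ratio near or below 1 indicates that the total score predicts the true facet score about as well as the facet score itself.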
Discussion
In this study, we focused on evaluating the relative strength of the 16 facets of the SIPP-118. Having 16 facets at one’s disposal allows for a detailed picture of patients’ adaptive and maladaptive capacities, but results in a number of scores that might be overwhelming to interpret in daily clinical practice. The question arises, therefore, whether it is worth the trouble to both patient and clinician to obtain and interpret all 16 facet scores. Our results indicate that 14 out of 16 facets have a clear incremental value. Moreover, the general factor that we extracted in our bi-factor analyses was not strong enough to warrant using the SIPP-118 as a unidimensional measure. The outcomes were highly similar across the large clinical Dutch and Norwegian samples we used, supporting generalizability of our findings.
In recent decades, there has been a strong call for moving from a categorical to a dimensional approach to PD diagnoses (e.g., [38, 39]). As a result, there has been an increased interest in the so-called p-factor (general factor of psychopathology, e.g., [40]) or g-PD factor (general factor of PD, e.g., [41, 42]). A number of previous studies using interview-rated PD criteria have found a strong relationship between borderline PD traits and the g-PD [41, 43]. In our study, we did not find a strong general factor. This may be partly due to the content of the SIPP-118: it was designed to assess changeable aspects of maladaptive personality functioning, and its items do not necessarily directly reflect the different DSM-5 PDs. Furthermore, multidimensionality was explicitly introduced during the item generation phase.
Previous studies have yielded inconsistent findings with respect to the higher-order factor structure of the SIPP-118 (see [8]). It is unclear what caused these inconsistencies, but since this higher-order structure was informed by exploratory factor analysis only, it may not be surprising that the results differ across studies. Exploratory analyses may be particularly sensitive to sample characteristics and may not generalize well. In this study, we used an analytic approach with a specific focus on the facets. The results were comparable across the three samples. Although more research is needed to ascertain whether the generalizability holds for different subgroups and non-European countries, the results so far are reassuring. Overall, we found strong support for the facets. That being said, the facets stable self-image and frustration tolerance did not show sufficient distinctiveness (the VAR values for these facets did not differ significantly from 1.1 in any of the samples). As described by Feinberg and Jurich [35], the goal of reporting subscores is to allow for fine-grained inferences from the item responses. However, reporting subscores that do not have a demonstrated added value may result in decisions being made based on misinformation and incorrect representations of the trait being measured. We suggest that the facets stable self-image and frustration tolerance be used with caution or not at all.
The SIPP is a valuable instrument that is not tied to a particular model of PD. We would like to stress that we do not suggest that the SIPP be used as the sole basis for diagnosis. The patient perspective is important and should be central in certain situations, but it does not paint the whole picture. It has been repeatedly shown that self-report instruments cannot be used as a proxy for (or replacement of) clinical diagnosis (see [44]). This may be especially true for certain types of PDs, such as antisocial PD [45]. As to the question posed in the title of this article, our results suggest that yes: we really do need those facets! If it is not feasible in a given situation to administer the entire instrument, one possibility would be to make a selection of facets, depending on the goal for which the instrument is being used. For obtaining a general severity score, we would suggest using an instrument that shows a stronger g-PD factor.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.