Abstract
Objective. Changes in appearance are common in patients with systemic sclerosis (SSc) and can significantly affect well-being. The Satisfaction with Appearance Scale (SWAP) measures body image dissatisfaction in persons with visible disfigurement; the Brief-Satisfaction with Appearance Scale (Brief-SWAP) is its short form. The present study evaluated the reliability and validity of SWAP and Brief-SWAP scores in SSc.
Methods. A sample of 207 patients with SSc participating in the University of California, Los Angeles Scleroderma Quality of Life Study completed the SWAP. Brief-SWAP scores were derived from the SWAP. The structural validity of both measures was investigated using confirmatory factor analysis. Internal consistency reliability of total and subscale scores was assessed with Cronbach’s alpha coefficients. Convergent and divergent validity was evaluated using the Center for Epidemiologic Studies Depression Scale, the Health Assessment Questionnaire-Disability Index, and the Medical Outcomes Study Short Form-36 questionnaire.
Results. SWAP and Brief-SWAP total scores were highly correlated (r = 0.97). The 4-factor structure of the SWAP fit well descriptively; the 2-factor structure of the Brief-SWAP fit well descriptively and statistically. Internal consistencies for total and subscale scores were good, and results supported convergent and divergent validity.
Conclusion. Both versions are suitable for use in patients with SSc. The Brief-SWAP is most efficient; the full SWAP yields additional subscales that may be informative in understanding body image issues in patients with SSc.
Systemic sclerosis (SSc) is a chronic, multisystem, connective tissue disease that attacks healthy body tissue1. SSc may be divided into 2 subtypes: (1) limited cutaneous SSc, characterized by skin involvement limited to the fingers, hands, lower arms, lower legs, and face; and (2) diffuse cutaneous SSc, which includes more widespread skin and organ involvement1,2,3. Visible changes in appearance are common in patients with SSc. These changes can harm self-image and quality of life2,4,5, and may result in body image dissatisfaction (BID; also called appearance dissatisfaction) and psychological distress6,7. To date, there has been limited research on BID in SSc4,5.
More research on BID in SSc, and its relationship to quality of life, is needed. However, research efforts have been hampered by the lack of measures appropriate to and validated for patients with SSc. Although a variety of BID measures are available, most were intended for use in other settings or with other populations (e.g., eating disorders) and require modifications, and/or have not been evaluated for use in SSc. Identifying measures of BID that are reliable and valid for use in SSc is critical to understanding how disease-related physical changes affect quality of life in this population.
The Satisfaction with Appearance Scale (SWAP)
The SWAP8 is a 14-item measure of BID that was originally developed for use with individuals with physical disfigurements as a result of burn injuries, but has since been adapted and used in research on SSc9,10. The SWAP was designed to measure 2 central aspects of body image, subjective satisfaction with appearance and the social-behavioral effect of disfigurement; accordingly, a 2-factor structure was hypothesized11. Unexpectedly, in the original validation sample of patients with burn injuries, each of the SWAP’s 14 items instead loaded onto 1 of 4 factors (subscales) labeled Social Distress, Facial Features, Non-Facial Features, and Perceived Social Impact8.
The SWAP has since been adapted for use in SSc with the word “burn” replaced with the word “illness” or “scleroderma.” Few studies, however, have examined the psychometric properties of the measure in SSc. Benrud-Larson, et al12 used the SWAP to examine the relationship between BID and psychosocial functioning in 129 female, predominantly white patients with SSc. Internal consistency reliability was excellent (α = 0.90) and SWAP total scores significantly correlated with measures of depressive symptoms, disability, psychosocial functioning, and pain in the expected directions and magnitudes, providing evidence of convergent validity. The factor structure of the measure, however, was not evaluated.
Jewett, et al10 examined the psychometric properties of the SWAP in a sample of 217 women with SSc from the Johns Hopkins Scleroderma Center (JHSC) and 654 women with SSc from the Canadian Scleroderma Research Group (CSRG) registry. Patients were predominantly white and diagnosed with the limited disease subtype (70% and 72.2%, respectively). Internal consistency reliability was excellent in both samples (α = 0.90 and 0.91). Confirmatory factor analysis (CFA) was used to examine a 2-factor structure (Subjective Dissatisfaction and Perceived Social Impact). After 2 pairs of item error covariances were freed, the 2-factor structure fit well in both samples based on descriptive fit indices. Evidence of convergent validity for the total score of the measure was provided by significant correlations in expected directions with measures of depressive symptoms, pain, and quality of life.
Heinberg, et al9 analyzed the factor structure of the SWAP using a sample (n = 254) drawn from the same Johns Hopkins dataset as Jewett, et al10, but including a 15th item that had been administered (“My appearance makes others feel uncomfortable”). Patients completed the 15-item version of the SWAP at baseline and 18 months later. The sample was predominantly female, white, and diagnosed with limited disease. Principal components analysis at each timepoint resulted in the extraction of 2 factors, Subjective Dissatisfaction and Perceived Social Impact, with the new item added to the latter scale. Internal consistency reliability was good for both subscales (α ≥ 0.88). Based on this, the authors suggested that the 4-factor structure reported for patients with burn injuries was not suitable for persons with SSc9. However, the authors used exploratory methods and did not statistically compare a 4-factor model with a 2-factor model.
The Brief-SWAP
Jewett, et al10 derived a 6-item Brief-SWAP from the more commonly used 14-item SWAP, attempting to retain the 2 subscales (Subjective Dissatisfaction and Perceived Social Impact) previously identified for SSc. Jewett, et al argued that many of the items of the SWAP were superfluous, and chose 3 items to represent each of the 2 subscales based on theoretical and psychometric considerations. As described above, Jewett, et al analyzed data from samples of female patients with SSc from the JHSC and the CSRG. CFA showed support for the hypothesized 2-factor model of the Brief-SWAP. Two 3-item subscales were supported and named Subjective Dissatisfaction and Perceived Social Impact. Internal consistency reliability (Cronbach’s alpha coefficient) for the Brief-SWAP total score was 0.82 in both samples.
A second study by the same research team, also using a Canadian sample drawn from the CSRG registry, evaluated the psychometric properties of the Brief-SWAP in 489 women and men with SSc13. The 2-factor structure was replicated using CFA, and the same two 3-item subscales were derived, renamed as Dissatisfaction with Appearance (replacing Subjective Dissatisfaction) and Social Discomfort (replacing Perceived Social Impact). Internal consistency reliability was good for both subscales (α = 0.82 and 0.83, respectively).
To date, the structural validity of the SWAP and Brief-SWAP has only been examined in an all-female sample from the JHSC, and a female and male sample from the CSRG. An additional study examined the structural validity of the 15-item version of the SWAP in a predominantly female sample, also drawn from the JHSC. Further examination of the reliability and validity of the SWAP and Brief-SWAP is needed in distinct populations of patients with SSc to further establish the generalizability of these measures’ psychometric properties.
The present study contributes to the literature by attempting to replicate the factor structures and psychometric findings reported in previous studies8,9,10,13 in a sample that is diverse in sex, ethnicity, and disease subtype. The aims of this study were to (1) examine and compare the structural validities of the SWAP and Brief-SWAP, (2) examine and compare internal consistency reliability coefficients for the SWAP and Brief-SWAP, and (3) examine and compare convergent and divergent validity for the 2 measures.
MATERIALS AND METHODS
Patients
The sample consisted of 207 patients with SSc (confirmed by study rheumatologists) who were participating in a single-center, longitudinal study. Disease subtype classification was made according to American College of Rheumatology criteria14. The study was approved by the University of California, Los Angeles Institutional Review Board.
SWAP8
The SWAP is a 14-item measure of BID. Table 1 gives individual items and corresponding subscales. Respondents rate the extent to which each item reflects their feelings about their appearance on a scale ranging from 1 (strongly disagree) to 7 (strongly agree). Items 4 to 11 are reverse-scored. Total scores, as well as 4 subscale scores (Social Distress, Facial Features, Non-Facial Features, and Perceived Social Impact), can be calculated. To calculate SWAP scores, 1 is subtracted from each item to anchor all items at 0, and then item scores are summed. Scores for the Facial Features and Non-Facial Features subscales can range from 0 to 24, and scores for the Social Distress and Perceived Social Impact subscales can range from 0 to 18. Total scores can range from 0 to 84. Higher scores indicate greater BID. Completion time is estimated at 5 min.
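The total-score rule described above can be sketched in Python. This is an illustrative sketch, not an official scoring program: the function name `swap_total` and the input format (a list of 14 raw responses in item order) are assumptions. Subscale scores are omitted because the item-to-subscale assignments are given only in Table 1.

```python
# Items 4 to 11 are reverse-scored, as stated in the text.
REVERSED_ITEMS = set(range(4, 12))

def swap_total(responses):
    """Return the SWAP total score (0 to 84) from 14 raw responses (1 to 7).

    `responses` is a hypothetical input format: raw ratings in item order.
    """
    if len(responses) != 14:
        raise ValueError("the SWAP has 14 items")
    total = 0
    for item_number, raw in enumerate(responses, start=1):
        if not 1 <= raw <= 7:
            raise ValueError(f"item {item_number}: response must be 1 to 7")
        if item_number in REVERSED_ITEMS:
            raw = 8 - raw      # reverse on the 1-to-7 scale
        total += raw - 1       # anchor each item at 0
    return total
```

Under this sketch, a respondent who answers 7 on the 6 non-reversed items and 1 on the 8 reversed items receives the maximum score of 84.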
Brief-SWAP10
The Brief-SWAP is a 6-item short form derived from the SWAP8. Table 1 shows individual items and corresponding subscales. Total scores, as well as 2 subscale scores (Dissatisfaction with Appearance, Social Discomfort), can be calculated. Scores are calculated by subtracting 1 from each item to anchor items at 0. Items for the Dissatisfaction with Appearance subscale are reverse-scored, and then item scores are totaled. Subscale scores can range from 0 to 18, and total scores can range from 0 to 36. Higher scores indicate greater BID. The Brief-SWAP was not given in the present study; rather, Brief-SWAP scores were derived from the SWAP. Completion time is estimated at 2 min.
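As a minimal sketch of the Brief-SWAP scoring rule above, the following Python function takes the 3 raw responses for each subscale separately. The function name and argument layout are assumptions made for illustration; which SWAP items make up each subscale is specified in Table 1 and is not encoded here.

```python
def brief_swap_scores(dissatisfaction_items, social_discomfort_items):
    """Score the Brief-SWAP from raw 1-to-7 responses.

    Each argument is a list of the 3 raw responses for that subscale
    (a hypothetical input format). Returns subscale and total scores.
    """
    for items in (dissatisfaction_items, social_discomfort_items):
        if len(items) != 3 or any(not 1 <= raw <= 7 for raw in items):
            raise ValueError("each subscale needs 3 responses from 1 to 7")
    # Dissatisfaction with Appearance items are reverse-scored, then anchored at 0.
    dissatisfaction = sum((8 - raw) - 1 for raw in dissatisfaction_items)
    # Social Discomfort items are only anchored at 0.
    social_discomfort = sum(raw - 1 for raw in social_discomfort_items)
    return {
        "dissatisfaction_with_appearance": dissatisfaction,  # 0 to 18
        "social_discomfort": social_discomfort,              # 0 to 18
        "total": dissatisfaction + social_discomfort,        # 0 to 36
    }
```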
Center for Epidemiologic Studies Depression Scale-Short Form (CES-D Short Form)15
The CES-D Short Form is a 10-item version of the widely used CES-D16, a screening measure of depressive symptoms. Scores can range from 0 to 30, with higher scores indicating more frequent depressive symptoms. Internal consistency reliability was good in the present sample (α = 0.83).
Health Assessment Questionnaire-Disability Index (HAQ-DI)17
The HAQ-DI is a 20-item measure of functional ability that has been validated for SSc18,19. Responses are rated on a scale ranging from 0 (no disability) to 3 (completely disabled). A total score is calculated by averaging the 8 category scores (i.e., dressing, rising, walking, eating, hygiene, reach, grip, and usual activities). The HAQ-DI demonstrated strong internal consistency reliability in the present sample (α = 0.93).
Modified Rodnan Skin Score (mRSS)20
The mRSS is a physician-administered measure of skin disease severity validated for patients with SSc21,22. The mRSS total score is determined by assessing the extent and severity of skin thickening in 17 body areas by palpation on a scale ranging from 0 (uninvolved) to 3 (severe thickening). Scores can range from 0 to 51, with higher scores indicating greater severity.
Medical Outcomes Study Short Form-36 questionnaire (SF-36)23
The SF-36 measures quality of life in 8 domains. Physical component summary (PCS) and mental component summary (MCS) scores are derived from the domain scores, with higher scores indicating better quality of life. The SF-36 has previously demonstrated good reliability and validity in patients with SSc24. The standard 4-week recall version of the SF-36 version 2.0 was used.
Statistical analysis
Descriptive statistics for demographic and medical variables, and all measures, were calculated for the total sample. Pearson correlations were calculated to demonstrate overlapping variance between the SWAP and Brief-SWAP.
CFA was used to determine the best fitting factor structures of the SWAP and Brief-SWAP in patients with SSc. The goodness of fit of the previously established 4-factor structure (Social Distress, Facial Features, Non-Facial Features, and Perceived Social Impact) of the 14-item SWAP and of the 2-factor structure (Dissatisfaction with Appearance and Social Discomfort) of the 6-item Brief-SWAP was examined first. Interfactor correlations were specified among the latent variables. As recommended by Bentler, overall model fit was determined by consulting 3 fit indices25: (1) the root mean square error of approximation (RMSEA)26, an absolute index of overall model fit; (2) the standardized root mean square residual (SRMR)27; and (3) the robust comparative fit index (CFI)28. For RMSEA and SRMR indices, values less than 0.08 were considered acceptable fit and values less than 0.05 were considered good fit. For CFI, values greater than 0.90 were considered acceptable fit and values greater than 0.95 were considered good fit. Models were determined to fit well if values for at least 2 of the descriptive fit indices indicated at least acceptable model fit. The likelihood ratio chi-square was also reported for completeness; however, it was not used as the primary indicator of model fit because it is highly influenced by sample size and almost always statistically significant, and thus not a good index of degree of fit29.
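The descriptive-fit decision rule above can be written out as a short Python sketch. The function names `rate_fit` and `fits_well` are assumptions introduced for illustration; the cutoffs are those stated in the text, read as strict inequalities.

```python
def rate_fit(rmsea, srmr, cfi):
    """Rate each index as 'good', 'acceptable', or 'poor' using the stated cutoffs."""
    def lower_is_better(value):      # RMSEA and SRMR
        if value < 0.05:
            return "good"
        if value < 0.08:
            return "acceptable"
        return "poor"

    def higher_is_better(value):     # CFI
        if value > 0.95:
            return "good"
        if value > 0.90:
            return "acceptable"
        return "poor"

    return {
        "RMSEA": lower_is_better(rmsea),
        "SRMR": lower_is_better(srmr),
        "CFI": higher_is_better(cfi),
    }

def fits_well(rmsea, srmr, cfi):
    """A model fits well if at least 2 of the 3 indices are at least acceptable."""
    ratings = rate_fit(rmsea, srmr, cfi)
    return sum(rating != "poor" for rating in ratings.values()) >= 2
```

Because published indices are rounded to 2 decimals, a value that sits exactly on a cutoff cannot be classified reliably from the rounded figure alone, so the sketch is best applied to unrounded output.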
Next, the best fitting factor structures for the SWAP and Brief-SWAP were compared. Because likelihood-ratio tests cannot be used to compare non-nested models30, the Akaike information criterion (AIC)31 and the sample size-adjusted Bayesian information criterion (sBIC)32 were used to evaluate comparative model fit. For both criteria, smaller values indicate better model fit. Both criteria reward parsimony. Thus, model comparisons using the AIC and sBIC were considered in conjunction with other model fit and psychometric validation results.
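The article does not state the formulas used for these criteria. As a hedged illustration, the sketch below uses the standard AIC and the common sample size adjustment in which n is replaced by (n + 2) / 24 in the BIC penalty; treating this as the sBIC computed in the study is an assumption, as are the function names.

```python
import math

def aic(log_likelihood, n_parameters):
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_parameters

def sample_size_adjusted_bic(log_likelihood, n_parameters, n_observations):
    """BIC with the sample size n replaced by (n + 2) / 24 (assumed adjustment)."""
    adjusted_n = (n_observations + 2) / 24.0
    return -2.0 * log_likelihood + n_parameters * math.log(adjusted_n)
```

Both penalties grow with the number of free parameters, which is why a model with fewer indicators and parameters tends to obtain smaller values.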
Internal consistency reliability was examined for the SWAP, Brief-SWAP, and all subscales using Cronbach’s alpha coefficient. Convergent validity constructs were selected to replicate previous research8,9,10 using constructs known to be associated with BID in patients with SSc. Scores on the SWAP and Brief-SWAP and their subscales were expected to be moderately positively associated with measures of depressive symptoms (CES-D), physical disability (HAQ-DI), and disease severity (mRSS), and moderately negatively associated with quality of life (SF-36 PCS and MCS). For divergent validity, based on previous research8, the SWAP and Brief-SWAP were expected to have little to no correlation with bodily pain (SF-36 Bodily Pain Scale), after controlling for depression.
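For readers unfamiliar with the coefficient, Cronbach’s alpha can be computed from item-level data as in the following sketch. The function name and the input layout (one list of item scores per respondent) are assumptions; the formula is the standard one, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores).

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    `rows` holds one list of item scores per respondent (hypothetical layout).
    """
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance(column) for column in zip(*rows)]
    # Sample variance of the summed score.
    total_variance = variance([sum(row) for row in rows])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)
```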
RESULTS
Descriptive statistics
Table 2 gives sample characteristics and means and SD for all measures. The sample (n = 207) was predominantly female (83.1%), white (71.5%), married (57%), and had some college or higher education (81.6%). Mean age of the sample was 54.1 years (SD 15.4). About half of the sample had limited SSc (50.2%), followed by diffuse SSc (40.1%). Mean time since diagnosis of SSc was 7.57 years (SD 7.9) and the mean mRSS, a widely used measure of disease severity, was 8.70 (SD 8.5). The mean percent predicted forced vital capacity for the total sample was 78.98% (SD 21.76). Only 4.8% of patients reported renal crisis. The correlation between SWAP and Brief-SWAP total scores was significant and very strong (r = 0.97, p < 0.01).
SWAP
First, a 4-factor model for the 14-item SWAP was examined using CFA (Table 1). Interfactor correlations were specified among the 4 latent variables. This 4-factor model did not fit well statistically [chi-square (71) = 149.01, p < 0.01], but it did fit well descriptively (RMSEA = 0.07, SRMR = 0.04; CFI = 0.96). Correlations among the 4 factors were all statistically significant (Table 3). Next, a 2-factor model for the SWAP was examined. The Dissatisfaction with Appearance factor was identified by 8 variables (combining the Facial Features and Non-Facial Features subscales) while the Social Discomfort factor was identified by 6 variables (combining the Social Distress and Perceived Social Impact subscales). This 2-factor model did not fit well statistically [chi-square (76) = 274.23, p < 0.01], but it did fit well descriptively (RMSEA = 0.11, SRMR = 0.06, CFI = 0.90). The interfactor correlation was large and statistically significant (r = 0.71, p < 0.01). A chi-square difference test was used to statistically compare the 4-factor model to the 2-factor model. The 2 models were statistically significantly different [Δchi-square (5) = 125.22, p < 0.01], indicating that the 4-factor model fit the observed data better than the 2-factor model.
Brief-SWAP
A 2-factor model for the 6-item Brief-SWAP was tested using CFA (Table 1 gives all standardized factor loadings for this model). An interfactor correlation was specified between the 2 latent variables. This 2-factor model fit well statistically [chi-square (8) = 14.24, p = 0.08], and descriptively (RMSEA = 0.06, SRMR = 0.03; CFI = 0.99). The interfactor correlation was large and statistically significant (r = 0.79, p < 0.01). Given the high interfactor correlation, a 1-factor model was also tested with a single latent variable indicated by 6 observed variables. This 1-factor model did not fit well statistically [chi-square (9) = 52.08, p < 0.01], but it did fit well descriptively (RMSEA = 0.15, SRMR = 0.05; CFI = 0.92). The 2 models were then statistically compared to determine the superior fit to the data. A chi-square difference test demonstrated that the 2 models fit differently [Δchi-square (1) = 37.85, p < 0.01], indicating that the 2-factor model fit the observed data better than the 1-factor model.
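The arithmetic behind the 2 chi-square difference tests reported above can be reproduced with the sketch below. It assumes a simple (unscaled) difference between nested models, which matches the reported values; the function names are illustrative, and the upper-tail probability is computed with a closed-form recurrence for integer degrees of freedom so that only the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 1:
        prob, k = math.erfc(math.sqrt(half)), 1   # exact value for df = 1
    else:
        prob, k = math.exp(-half), 2              # exact value for df = 2
    while k < df:
        # Recurrence: Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1)
        prob += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return prob

def chi2_difference(chi2_restricted, df_restricted, chi2_full, df_full):
    """Return (delta chi-square, delta df, p value) for 2 nested models."""
    delta = chi2_restricted - chi2_full
    delta_df = df_restricted - df_full
    return delta, delta_df, chi2_sf(delta, delta_df)

# SWAP: 2-factor (restricted) versus 4-factor model, using the reported values.
swap = chi2_difference(274.23, 76, 149.01, 71)
# Brief-SWAP: 1-factor (restricted) versus 2-factor model.
brief = chi2_difference(52.08, 9, 14.24, 8)
```

From the rounded chi-square values, the Brief-SWAP difference works out to 37.84 rather than the reported 37.85, presumably because the reported figure was computed from unrounded values.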
The 2 best fitting models, the 4-factor model for the SWAP and the 2-factor model for the Brief-SWAP, were then compared. The AIC and sBIC values were lower for the 2-factor Brief-SWAP than for the 4-factor SWAP (AIC = 4790.32 vs 10,334.99; sBIC = 4798.44 vs 10,342.87), suggesting that the 2-factor Brief-SWAP provided better model fit to the observed data.
Internal consistency reliability
Internal consistency reliability was excellent for the SWAP (α = 0.93) and good for the Brief-SWAP (α = 0.87). All hypothesized subscales of the SWAP and Brief-SWAP also had good reliability (SWAP: Facial Features: α = 0.86, Non-Facial Features: α = 0.86, Social Distress: α = 0.89, Perceived Social Impact: α = 0.85; Brief-SWAP: Dissatisfaction with Appearance: α = 0.79, Social Discomfort: α = 0.83).
Convergent and divergent validity
As anticipated, significant positive moderate correlations with depression, level of physical functioning, and disease severity were found for both the SWAP and Brief-SWAP (Table 4 and Table 5). Also, as expected, better mental and physical health-related quality of life was associated with greater satisfaction with appearance. Providing evidence of divergent validity, after controlling for depression, the relationships of bodily pain to the SWAP and Brief-SWAP scores were nonsignificant. For the subscales of the SWAP and Brief-SWAP, all correlations were significant, of expected magnitudes, and in expected directions.
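The article does not specify how depression was controlled in the divergent validity analysis. One common approach, shown here only as an assumed illustration, is a first-order partial correlation computed from the 3 pairwise Pearson correlations; the function and argument names are hypothetical, and no study values are used.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z.

    For example, x could be a body image score, y bodily pain, and z depression.
    """
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator
```

When the zero-order correlation between x and y equals the product of their correlations with z, the partial correlation is 0, which is the pattern a divergent validity hypothesis of this kind anticipates.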
DISCUSSION
Our study examined the psychometric properties of the SWAP and Brief-SWAP in a sample of patients with SSc in the United States. Total scores on the SWAP and Brief-SWAP were similar to those reported for other SSc samples10,12,13. In addition, replicating previous studies10,12, mean SWAP total scores were higher than those from the original sample of hospitalized patients with burn injuries8.
A primary aim was to identify and compare the best-fitting factor structures for the SWAP and Brief-SWAP. In the present analysis, a 4-factor model best fit the data for the SWAP, supporting the use of the 4 subscales in SSc (Social Distress, Facial Features, Non-Facial Features, Perceived Social Impact). For the Brief-SWAP, the 2-factor model best fit the data, supporting the use of the 2 Brief-SWAP subscales, Dissatisfaction with Appearance and Social Discomfort. The 2-factor structure of the Brief-SWAP demonstrated better fit to the sample data than did the 4-factor structure of the longer SWAP. Alpha coefficients for all total scores and subscales demonstrated good reliability. Therefore, with 8 fewer items, the Brief-SWAP more parsimoniously measures BID. Jewett, et al10 suggested that the 2-factor Brief-SWAP provided better fit because the Brief-SWAP contains items that focus on body parts relevant in SSc, and because items endorsed by few patients with SSc were removed from the 14-item SWAP. In the current sample, it is also not surprising that the Brief-SWAP demonstrated better comparative model fit than the SWAP, given that the AIC and sBIC indicators reward parsimony33. However, both models had good overall fit and convergent validity, suggesting that decision making regarding which measure to use should not be based purely on this comparison. Rather, either measure may be useful, depending on the type of information a researcher or clinician is seeking.
The present sample differs from previous validation samples on several key demographic characteristics. First, both men and women are included, unlike the original Canadian study validating the Brief-SWAP, which had an all-female sample10. Additionally, the present sample had a higher percentage of patients with diffuse disease than the CSRG and JHSC samples. The proportion of patients with diffuse versus limited disease varies greatly depending on geographic region and ethnicity, with some epidemiological studies reporting diffuse disease in more than 70% of the disease population34. In addition, the present study sample had a lower percentage of white patients than previous samples. Data from multi-ethnic cohorts suggest that non-white patients are at increased risk for more severe SSc, in particular regarding diffuse skin involvement34. Also, patients with diffuse disease often report higher levels of BID.
There are limitations to the current study. Only the original SWAP was completed; Brief-SWAP scores were derived from the original measure, and item order and context have been shown to influence responses35. Because there are no other measures of BID that have been validated for use in SSc, convergent validity analyses focused on measures of constructs previously found to be associated with BID in patients with SSc.
The present findings support the use of the SWAP and Brief-SWAP in patients with SSc. Previous studies using the SWAP or Brief-SWAP have reported both total and subscale scores. In the present study, correlations among subscales for the SWAP and Brief-SWAP were large and statistically significant, suggesting that use of a total score to provide an overall measure of BID is appropriate for both measures. In addition, the factor analyses suggested that subscale scores can be used to assess particular aspects of BID. The SWAP may be preferred in research because it includes 4 subscales that measure specific aspects of BID. However, the Brief-SWAP’s 2 subscales yield information on both subjective dissatisfaction with appearance and appearance-related social concerns while reducing administration time. The Brief-SWAP may be a useful screening measure, aiding in the identification of individuals in need of additional assessment and support.
Footnotes
Supported by the grant Evaluation of Health-related Quality of Life in Systemic Sclerosis from the Scleroderma Foundation Inc. Dr. Khanna has been funded by the US National Institutes of Health (NIH)/National Institute of Arthritis and Musculoskeletal and Skin Diseases K24 AR063120 and K23 AR053858. Ms. Mills was supported by the UC San Diego Cota-Robles Fellowship. Dr. Furst has served as a consultant or on speaker bureaus for the NIH.
Accepted for publication April 21, 2015.