Evaluation of computerized text analysis in an Internet breast cancer support group

https://doi.org/10.1016/j.chb.2004.02.008

Abstract

Although support groups are widely available on the Internet, little is known about the conversations in them. We hypothesized that automatic text analysis may be a powerful tool for understanding what is communicated in these groups. In an exploratory study, the postings of nine women participating in a semi-structured breast cancer support group program were analyzed by a human rater and with Pennebaker and Francis' text analysis software (LIWC). The computer scores on most of the selected word categories and the human ratings were moderately correlated. This indicates concurrent validity of the LIWC. An indication of construct validity was found by comparing the LIWC scores of the on-line group with those of other texts. Automated text analysis should be further developed for on-line discussions, where it may serve as a useful tool for group moderators and researchers.

Introduction

Twelve percent of women in the United States are diagnosed with breast cancer at some time in their life (American Cancer Society, 1996). Being diagnosed with breast cancer is typically a life-changing event that often leads to a number of significant concerns (Spencer et al., 1999). Medical treatment for breast cancer is usually accompanied by social challenges, physical discomfort, and psychological distress. During their initial treatment, 80% of breast cancer patients experience significant distress (e.g. anxiety, anger, depression, and/or loss of social support) (Hughes, 1982; Irvine, Brown, Crooks, Roberts, & Browne, 1991).

A number of studies have found that support groups can improve patients' psychological well-being (Berglund, Bolund, Gustafsson, & Sjoden, 1994; Cain, Kohorn, Quinlan, Latimer, & Schwartz, 1986; Classen et al., 2001; Fawzy, Cousins, Fawzy, Kemeny, Elashoff, & Morton, 1990; Spiegel, Bloom, & Yalom, 1981; Telch & Telch, 1986). Group interventions may even prolong life expectancy (Fawzy et al., 1993; Spiegel, Bloom, Kraemer, & Gottheil, 1989), although the evidence for this remains mixed (Goodwin et al., 2001). Support groups are generally designed to enable women with breast cancer to share their difficult emotions. In cancer patients, expressing emotions is associated with less distress and a better health outlook than avoiding expression of their emotions (Classen, Koopman, Angell, & Spiegel, 1996; Stanton et al., 2000).

While most support groups meet face-to-face, it is apparent that such groups can also be formed on the Internet where participants can share their experiences in writing. Women with breast cancer frequently seek information and emotional support from on-line breast cancer groups (Lieberman & Russo, 2001). A number of studies have examined computer-mediated support groups for breast cancer (Gustafson et al., 1993; Sharf, 1997; Weinberg, Schmale, Uken, & Wessel, 1996; Weinberg, Uken, Schmale, & Adamek, 1995). However, little is known about the processes involved in these groups where participants write messages to one another.

An impressive body of research demonstrates that writing about stressful life events yields a wide range of benefits, from better coping with grief to improved immunological indices (Esterling, L'Abate, Murray, & Pennebaker, 1999; Klein & Boals, 2001; Pennebaker, Mayne, & Francis, 1997; Richards, Beal, Seagal, & Pennebaker, 2000). Writing about anticipated positive events may also be helpful for women confronted with a life-threatening disease (Mann, 2001). Moreover, certain characteristics of the writing may have a particularly beneficial effect. Greater health improvements are associated with using a higher proportion of positive emotion words relative to negative emotion words (Pennebaker, 1993) or with using a high number of positive emotion words and a moderate number of negative emotion words (Pennebaker et al., 1997; Pennebaker & Seagal, 1999). Independently, increases in the use of words referring to cognitive processes (words depicting causality, e.g., “because,” and indicating self-reflection, e.g., “understand”) are also linked to health improvement (Pennebaker, 1993; Pennebaker et al., 1997; Pennebaker & Seagal, 1999). This body of research suggests that women's use of words in an on-line support group may contain important indicators of the benefits of participation.

Because all messages posted to Internet groups can be saved and subjected to analysis, it is possible to examine linguistic patterns of the communication occurring in the group. In the present study we explored the usefulness of computer-assisted text analysis (see Popping, 2000; West, 2001) for describing the pattern of communication in such groups. The underlying idea was that if electronic text analysis yielded valid descriptions of the communication in groups, it could be used in future work to identify particular linguistic patterns that may predict positive outcomes. Text analysis programs have become widely used to analyze the content of written communication and to make predictions regarding psychological adaptation or other measures of health. Compared with face-to-face interactions, the analysis of communication in computer-based groups has the major advantage that the entire message is contained in the written text, so nonverbal cues such as body language or voice tone do not have to be analyzed as well. Thus, valid computerized text analysis procedures could increase the efficiency and accuracy of content analysis.

Various computerized methods of text analysis are available (see reviews by Alexa & Zuell, 1999; Bauer, 2000; Popping, 2000). Their main application has been the parsing of Web-site content (e.g., Bauer & Scharl, 2000) and keyword-indexing of texts for retrieval in databases (for a review of human versus machine indexing see Anderson & Pérez-Carballo, 2001a, 2001b). Pennebaker and Francis' software, the Linguistic Inquiry and Word Count (LIWC; Pennebaker & Francis, 1996, 1999; Pennebaker, Francis, & Booth, 2001), is the most widely used program for analyzing text in clinical psychology. It was designed to map several psychological and linguistic dimensions of written language. Although there are preliminary data supporting the validity of most of its categories (Pennebaker & Francis, 1996, 1999; Pennebaker et al., 1997; Pennebaker et al., 2001), the program has not been validated for the monitoring of discourse between individuals. Indeed, there are legitimate concerns about the ability of computer programs to analyze linguistic information in complex communications. For instance, the LIWC computer program does not take context into account, although context is important for interpreting the meaning of written communication. Instead, it simply counts words belonging to certain categories.
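The category-counting approach described above can be illustrated with a minimal sketch. This is not the LIWC program or its dictionary: the category names, the word lists, and the functions `tokenize` and `category_percentages` are hypothetical stand-ins chosen for illustration, and the real dictionary is far larger. The sketch also shows the context problem noted above, since a negated word is still counted in its category.

```python
import re
from collections import Counter

# Hypothetical toy categories for illustration only; these are NOT the
# actual LIWC dictionary, which is much larger.
TOY_CATEGORIES = {
    "positive_emotion": {"happy", "hope", "love", "grateful"},
    "negative_emotion": {"afraid", "sad", "angry", "worried"},
    "cognitive": {"because", "understand", "realize", "think"},
}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_percentages(text, categories=TOY_CATEGORIES):
    """Return, per category, the percentage of all words that fall in it."""
    words = tokenize(text)
    total = len(words)
    counts = Counter()
    for word in words:
        for name, vocabulary in categories.items():
            if word in vocabulary:
                counts[name] += 1
    return {
        name: (100.0 * counts[name] / total if total else 0.0)
        for name in categories
    }


message = "I am afraid, but I understand it now because I have hope."
scores = category_percentages(message)

# Context is ignored: "not happy" still counts as a positive emotion word.
negated = category_percentages("I am not happy")
```

Expressing each category as a percentage of all words, rather than as a raw count, keeps scores comparable across messages of different length.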

To evaluate whether the LIWC approach is useful for analyzing the content of the communication in an Internet support group, we examined three types of validity: content validity, construct validity and concurrent validity (Cronbach & Meehl, 1955). Content validity is the degree of congruence between a measure and the content that it is intended to cover. This was examined by determining the percentage of words in the on-line messages that were captured by the categories of the LIWC program. Construct validity is a measure of the influence a relevant construct has on the scores. We tried to find support for construct validity by comparing the LIWC scores of the on-line support group with LIWC scores of newspaper articles that focus to a greater or lesser degree on emotional aspects of breast cancer. An expert's scores were also compared with LIWC scores of Pennebaker's emotional writing tasks. We tested concurrent validity, the correlation with an independent measure of the same or a closely related construct, by comparing LIWC ratings of our on-line support group for breast cancer patients against ratings of the same group generated by a human rater.
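Two of these checks reduce to simple computations, sketched below under stated assumptions. `coverage_percent` computes the share of words captured by a dictionary (the content validity check), and `pearson_r` computes a Pearson correlation between two sets of scores (one common way to express concurrent validity; the source does not say which coefficient was used). Both function names and all numbers are invented for illustration and are not the study's data.

```python
import math


def coverage_percent(words, dictionary):
    """Percentage of words that appear in the dictionary."""
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in dictionary)
    return 100.0 * hits / len(words)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Invented example values (NOT data from the study).
toy_words = ["i", "feel", "hopeful", "zzz"]
toy_dictionary = {"i", "feel", "hopeful"}
coverage = coverage_percent(toy_words, toy_dictionary)

computer_scores = [1.0, 2.0, 3.0, 4.0]  # e.g. category percentages per text
human_ratings = [2.0, 1.0, 4.0, 3.0]    # e.g. rater scores for the same texts
r = pearson_r(computer_scores, human_ratings)
```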

On-line support group intervention

Bosom Buddies is a semi-structured moderated group for women with primary breast cancer. The present study is based on an 11-week pilot group we ran to establish the procedure for an ongoing randomized wait-list control trial. The overriding goal is to enable patients to obtain social support and to have a forum in which to express their emotions and to discuss issues related to coping with breast cancer. On a weekly schedule, members receive new content on a different topic (e.g., sexuality,

Descriptive statistics

The participants and the moderator posted a total of 521 messages during the 11 weeks of the intervention. Individual participants composed a mean of 4.6 messages per week (range 2.4–7.9). The messages varied in length (range 1–915 words), with an average of 126 words per message. Shorter messages during some weeks were compensated for by a larger number of posted messages, so that the total number of words written per week was generally between 5000 and 7000.
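As a rough consistency check on these figures, the reported message count and mean message length can be multiplied out and divided by the number of weeks. This is only an approximation, on the assumption that the 126-word average applies to all 521 messages; the variable names are illustrative.

```python
# Figures reported in the text above.
total_messages = 521
mean_words_per_message = 126
weeks = 11

# Approximate totals, assuming the mean applies to every posted message.
approx_total_words = total_messages * mean_words_per_message
approx_words_per_week = approx_total_words / weeks

print(f"approx. total words: {approx_total_words}")
print(f"approx. words per week: {approx_words_per_week:.0f}")
```

The weekly estimate falls inside the stated range of 5000 to 7000 words.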

All patients participated regularly

Discussion

We view the results of this study as preliminary but encouraging support for automated text analysis of Internet-delivered support groups. The results suggest that it is possible to use a computer program to help describe the communication that occurs in Internet support groups. Although the correlations between human rater and computer are low to moderate in magnitude, they are in the expected direction and thus demonstrate that there is a positive association between computer scores for some

Acknowledgements

This research was supported by the US Department of Defense (DAMD17-99-1-9387).

References (43)

  • C. Bauer et al. Quantitative evaluation of web site content and structure. Internet Research (2000)
  • E.N. Cain et al. Psychosocial benefits of a cancer support group. Cancer (1986)
  • C. Classen et al. Supportive-expressive group therapy and distress in patients with metastatic breast cancer: A randomized clinical intervention trial. Archives of General Psychiatry (2001)
  • C. Classen et al. Coping styles associated with psychological adjustment to advanced breast cancer. Health Psychology (1996)
  • L.J. Cronbach et al. Construct validity in psychological tests. Psychological Bulletin (1955)
  • F.I. Fawzy et al. A structured psychiatric intervention for cancer patients – I. Changes over time in methods of coping and affective disturbance. Archives of General Psychiatry (1990)
  • F.I. Fawzy et al. Malignant melanoma. Effects of an early structured psychiatric intervention, coping, and affective state on recurrence and survival 6 years later. Archives of General Psychiatry (1993)
  • Goode, E. (1999, October 19). `Fighting spirit' little help in cancer fight. New York...
  • P.J. Goodwin et al. The effect of group psychosocial support on survival in metastatic breast cancer. New England Journal of Medicine (2001)
  • Grady, D. (2000, May 2). Researchers stalk a breast cancer gene to see how it kills. New York...
  • D. Gustafson et al. Development and pilot evaluation of a computer-based support system for women with breast cancer. Journal of Psychosocial Oncology (1993)