Detecting depression and mental illness on social media: an integrative review
Introduction
The widespread use of social media may provide opportunities to help reduce undiagnosed mental illness. A growing number of studies examine mental health within social media contexts, linking social media use and behavioral patterns with stress, anxiety, depression, suicidality, and other mental illnesses. The greatest number of studies of this kind focus on depression. Depression continues to be under-diagnosed, with roughly half the cases detected by primary care physicians [1] and only 13–49% receiving minimally adequate treatment [2].
Automated analysis of social media potentially provides methods for early detection. If an automated process could detect elevated depression scores in a user, that individual could be targeted for a more thorough assessment, and provided with further resources, support, and treatment. Studies to date have either examined how the use of social media sites correlates with mental illness in users [3] or attempted to detect mental illness through analysis of the content created by users. This review focuses on the latter: studies aimed at predicting mental illness using social media. We first consider methods used to predict depression, and then consider four approaches that have been used in the literature. We compare the different approaches, provide direction for future studies, and consider ethical issues.
Prediction methods
Automated analysis of social media is accomplished by building predictive models, which use ‘features,’ or variables that have been extracted from social media data. Commonly used features include users’ language encoded as frequencies of each word, time of posts, and other variables (see Figure 2). Features are then treated as independent variables in an algorithm (e.g. Linear Regression [4] with built-in variable selection [5], or Support Vector Machines (SVM) [6]) to predict the
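As a rough illustration of the feature-extraction step described in this section, the following minimal sketch turns one user's posts into a vector of word frequencies plus one time-of-post feature. It is not taken from any of the reviewed studies: the function name, the three-word vocabulary, and the definition of "night" as 00:00 to 05:59 are all assumptions made for the example.

```python
from collections import Counter
import re


def extract_features(posts, vocabulary):
    """Turn one user's posts into a fixed-length feature vector.

    posts: list of (text, hour_of_day) tuples, with hour_of_day in 0-23.
    vocabulary: list of words whose relative frequencies become features.
    """
    counts = Counter()
    total_words = 0
    night_posts = 0
    for text, hour in posts:
        # Lowercase and split into simple word tokens.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(words)
        total_words += len(words)
        if hour < 6:  # assumption: "night" means 00:00-05:59
            night_posts += 1
    # Relative frequency of each vocabulary word (0.0 if the user wrote nothing).
    features = [counts[w] / total_words if total_words else 0.0 for w in vocabulary]
    # Share of posts made at night, as one example of a time-of-post feature.
    features.append(night_posts / len(posts) if posts else 0.0)
    return features


vocabulary = ["sad", "tired", "happy"]  # hypothetical word list
user_posts = [
    ("So tired and sad today", 3),
    ("Happy to see friends", 18),
]
feature_vector = extract_features(user_posts, vocabulary)
```

In a full pipeline, one such vector per user would form a row of the design matrix, with the user's survey score or diagnosis label as the outcome for a regression or SVM model.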
Assessment criteria
Several approaches have been studied for collecting social media data with associated information about the users’ mental health. Participants are either recruited to take a depression survey and share their Facebook or Twitter data (section A below), or data is collected from existing public online sources (sections B, C, and D below; see Figure 1). These sources include searching public Tweets for keywords to identify (and obtain all Tweets from) users who have shared their mental health
Comparison of studies across data sources
Our review has described four sources of data used to study and detect depression through social media. Here we compare these sources.
Recommendations for future studies
The greatest potential value of social media analysis may be the detection of otherwise undiagnosed cases. However, studies to date have not explicitly focused on successfully identifying people unaware of their mental health status.
In screening for depression, multi-stage screening strategies have been recommended [32, 35] as a means to alleviate the relatively low sensitivity (around 50%) and high false positive rate associated with assessments by non-psychiatric physicians [1, 32] or short
Ethical questions
The feasibility of social-media-based assessment of mental illness raises numerous ethical questions. Privacy is an ongoing concern. Employers and insurance companies, for example, may use such assessments against the interests of those suffering from mental illness. As mental illnesses carry social stigma and may engender discrimination, data protection and ownership frameworks are needed to ensure users are not harmed [36]. Few users realize the amount of mental-health-related information that can be
Conclusion
The studies reviewed here suggest that depression and other mental illnesses are detectable in several online environments, but the generalizability of these studies to broader samples and gold standard clinical criteria has not been established. Advances in natural language processing and machine learning are making the prospect of large-scale screening of social media for at-risk individuals a near-future possibility. Ethical and legal questions about data ownership and protection, as well as
Conflict of interest statement
Nothing declared.
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
Acknowledgements
The authors thank Courtney Hagan for her help with editing the manuscript. This work was supported by a grant from the Templeton Religion Trust (ID #TRT0048).
References (43)
- et al. Measuring depression outcome with a brief self-report instrument: sensitivity to change of the Patient Health Questionnaire (PHQ-9). J Affect Disord (2004)
- et al. The importance of assessing clinical phenomena in Mechanical Turk research. Psychol Assess (2016)
- A broad-bandwidth, public domain, personality inventory measuring the lower-level facets of several five-factor models. Pers Psychol Eur (1999)
- et al. Measuring post traumatic stress disorder in Twitter
- et al. Recognition of depression by non-psychiatric physicians: a systematic literature review and meta-analysis. J Gener Intern Med (2008)
- et al. Twelve-month use of mental health services in the United States: results from the National Comorbidity Survey Replication. Arch Gener Psychiatry (2005)
- et al. Social networking sites, depression, and anxiety: a systematic review. JMIR Mental Health (2016)
- et al. Applied Linear Statistical Models (1996)
- Regression shrinkage and selection via the lasso. J R Stat Soc (1996)
- et al. Support-vector networks. Mach Learn (1995)