Research report
Defining successful treatment outcome in depression using the PHQ-9: A comparison of methods
Introduction
The PHQ-9 (Kroenke et al., 2001) is a widely used self-report measure of depression that is brief, easy to administer and has well-established psychometric properties (Lee et al., 2007). For these reasons it has been recommended as an integral part of the management of depression in primary care, including tracking symptom change and defining successful treatment outcome to inform treatment decisions (Clark et al., 2009, Dejesus et al., 2007). However, little is known about the performance of the measure in quantifying clinically significant improvement.
A small number of studies have established that the measure is sensitive to change (Cameron et al., 2008, Lowe et al., 2004a) and one study has used minimal clinically important difference (MCID) criteria to define a change of 5 points or more as indicating an MCID (Lowe et al., 2004b). Apart from the study by Lowe et al. (2004b), the only other guidance on defining clinically significant change comes from the original validation study of the PHQ-9, which recommended a score of ≥ 10 to indicate the presence of probable depression (Kroenke et al., 2001) based on an analysis of sensitivity and specificity data. In that dataset, scores of ≥ 10 indicated an increased probability of receiving a diagnosis of major depressive disorder (MDD), whereas few people who scored ≤ 9 met diagnostic criteria for MDD. On this basis, Kroenke et al. (2001) recommended that a post-treatment score of ≤ 9, along with the commonly used criterion of a 50% reduction in scores, could be used to define clinically significant improvement. Kroenke et al. (2001), however, pointed out that their definition of improvement was provisional and that further work was needed to validate it.
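The two criteria described above can be written down as simple rules. The following is a minimal sketch rather than the authors' scoring code: the function names are hypothetical, and it assumes that "a 50% reduction" means a reduction of at least 50% from the pre-treatment score.

```python
def meets_standard_definition(pre_score: int, post_score: int) -> bool:
    """Kroenke et al. (2001) style rule: post-treatment PHQ-9 <= 9
    together with a reduction of at least 50% from baseline.

    Assumption: "50% reduction" is read as ">= 50%".
    """
    if pre_score <= 0:
        # With no baseline symptoms a percentage reduction is undefined.
        return False
    reduction = (pre_score - post_score) / pre_score
    return post_score <= 9 and reduction >= 0.5


def meets_mcid(pre_score: int, post_score: int, threshold: int = 5) -> bool:
    """MCID rule: a drop of `threshold` points or more
    (5 points in Lowe et al., 2004b)."""
    return (pre_score - post_score) >= threshold


if __name__ == "__main__":
    # Illustrative scores only; they are not data from the study.
    for pre, post in [(18, 8), (24, 11), (12, 9)]:
        print(pre, post, meets_standard_definition(pre, post), meets_mcid(pre, post))
```

Note that the two rules can disagree: a fall from 24 to 11 is a large change in points but leaves the person above the ≤ 9 threshold.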
There are a number of alternative methods of conceptualising improvement on measures of psychological functioning in general and depression in particular with clear theoretical underpinnings, and it is not clear how the standard definition recommended in the original validation study relates to these. These include recovery and remission criteria (Frank et al., 1991) and the concepts of reliable and clinically significant change (Jacobson and Truax, 1991).
Frank et al. (1991) provided several conceptual definitions of improvement in depression that have proved influential in current thinking about defining treatment outcome (Keller, 2003). The concept of remission requires a period, typically at least several weeks, in which a person remains in an asymptomatic range, defined as no or very few symptoms. Recovery requires that the person remains in this asymptomatic range, but for a longer duration. The concepts of an asymptomatic range, remission and recovery may be important in measuring treatment outcome for depression. Rates of relapse and recurrence following successful treatment for depression remain high. A consistent finding is that people who are classed as improved but continue to have some residual symptoms have a substantially higher rate of relapse and recurrence than those who meet criteria for remission or recovery (Paykel et al., 1995).
Reliable and clinically significant change criteria (Jacobson and Truax, 1991) are among the most commonly used methods of quantifying improvement in studies of psychological treatments (Ogles et al., 2001) and have been recommended as a standard reporting strategy for all published research involving these types of interventions (Evans et al., 1998). At a conceptual level, clinically significant change defines improvement as a move from a clinical to a non-clinical range. Jacobson and Truax (1991) provide several operational definitions of cut-off points to distinguish between these ranges based on the central location and distribution of scores for a clinical and non-clinical group. As an additional criterion, the change in scores must be greater than that which could be due to the inherent unreliability of the measure.
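As an illustration of how these two criteria are usually computed, the sketch below follows the general Jacobson and Truax (1991) formulas: a reliable change index based on the standard error of the difference, and their cut-off "c", which weights the clinical and non-clinical means by the two standard deviations. The function names and all numeric values are hypothetical and are not taken from this study; the excerpt does not state which cut-off or reliability estimate the authors used.

```python
import math


def reliable_change_index(pre: float, post: float,
                          sd_pre: float, reliability: float) -> float:
    """Change score divided by the standard error of the difference."""
    se = sd_pre * math.sqrt(1.0 - reliability)  # standard error of measurement
    s_diff = math.sqrt(2.0 * se ** 2)           # standard error of the difference
    return (post - pre) / s_diff


def cutoff_c(mean_clin: float, sd_clin: float,
             mean_non: float, sd_non: float) -> float:
    """Cut-off c: point between the two group means, weighted by the SDs."""
    return (sd_clin * mean_non + sd_non * mean_clin) / (sd_clin + sd_non)


def reliable_and_clinically_significant(pre: float, post: float, sd_pre: float,
                                        reliability: float, cutoff: float) -> bool:
    """True if the fall in score is reliable AND crosses the cut-off.

    Lower PHQ-9 scores mean fewer symptoms, so improvement is a
    negative change; |RCI| > 1.96 corresponds to p < .05.
    """
    rci = reliable_change_index(pre, post, sd_pre, reliability)
    reliable = rci <= -1.96
    crossed = pre > cutoff and post <= cutoff
    return reliable and crossed


if __name__ == "__main__":
    # Hypothetical group statistics, for illustration only.
    c = cutoff_c(mean_clin=16.0, sd_clin=5.0, mean_non=4.0, sd_non=3.0)
    print(c, reliable_and_clinically_significant(18, 8, 5.0, 0.84, c))
```

Requiring both conditions guards against two different errors: counting a small, unreliable drop that happens to cross the cut-off, and counting a large drop that still leaves the person in the clinical range.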
It is not clear how the standard definition of improvement for the PHQ-9 relates to other commonly used methods of defining improvement on psychological measures. The aim of this study is to examine the performance of this standard definition by comparing it to other commonly used definitions of improvement. As an additional index of corroboration, we compared the agreement between these definitions and a gold-standard diagnostic interview.
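Agreement between a binary improvement definition and a diagnostic interview can be summarised in several ways. The sketch below uses Cohen's kappa as one common chance-corrected option; this is an assumption made for illustration, since the text here does not say which agreement statistic was used, and the function name and example data are hypothetical.

```python
from typing import Sequence


def cohens_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Chance-corrected agreement between two binary classifications."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a = sum(bool(x) for x in a) / n  # proportion classed positive by a
    p_b = sum(bool(y) for y in b) / n  # proportion classed positive by b
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both classifications are constant, so kappa is undefined.
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Made-up classifications for four participants, not study data.
    improved_by_definition = [True, True, False, False]
    no_diagnosis_at_interview = [True, False, False, False]
    print(cohens_kappa(improved_by_definition, no_diagnosis_at_interview))
```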
Sample
The sample was taken from a randomised controlled trial of collaborative care for depression (Richards et al., 2007). Participants were recruited from primary care services, and were included if they were aged above 18 years, had received a diagnosis of depression by a GP, and scored ≥ 5 on the Structured Clinical Interview for DSM-IV defined major depressive disorder (MDD) (Spitzer et al., 1992). Exclusion criteria included active suicidal plans, primary drug or alcohol dependence and some types of
Results
The standard definition suggested a similar level of improvement (36.5%) to the other definitions, with the exception of the asymptomatic criterion (27.1%), which suggested lower rates of improvement than all other definitions (Table 2). Of those participants who scored in the clinical range pre-treatment for all definitions (n = 84), approximately half (53.8%) did not meet improvement criteria for any definition and 23.8% met criteria for all definitions; there were disagreements between the
Discussion
If the PHQ-9 is to be of use in clinical practice it will be necessary to define clinically significant improvement on the measure. Although Kroenke et al. (2001) offered such a definition, they were careful to point out that it was provisional and further work was needed to validate it. To this end, this study compared the definition of Kroenke et al. (2001) with other theoretically informed methods of quantifying improvement as well as a gold-standard diagnostic interview.
The standard
Role of funding source
The randomised trial was funded by MRC grant no. G03000677; ID: 68073, International Standard RCT no.: ISRCT63222059. The researchers worked independently of the research funder in the design of the study, the collection, analysis and interpretation of data, the writing of the report, and the decision to submit the paper for publication.
Conflict of interest
All authors declare that they have no conflict of interest.
Acknowledgements
Thank you to Dr Peter Bower for comments on an earlier version of this manuscript.
References (22)
- et al. Psychometric properties of the Beck Depression Inventory: twenty-five years of evaluation. Clinical Psychology Review (1988)
- et al. Improving access to psychological therapy: initial evaluation of two UK demonstration sites. Behaviour Research and Therapy (2009)
- et al. A system-based approach to depression management in primary care using the Patient Health Questionnaire-9. Mayo Clinic Proceedings (2007)
- et al. Concordance between the PHQ-9 and the HSCL-20 in depressed primary care patients. Journal of Affective Disorders (2007)
- et al. Measuring depression outcome with a brief self-report instrument: sensitivity to change of the Patient Health Questionnaire (PHQ-9). Journal of Affective Disorders (2004)
- et al. Clinical significance, history, application, and current practice. Clinical Psychology Review (2001)
- Practical Statistics for Medical Research (1991)
- et al. An inventory for measuring depression. Archives of General Psychiatry (1961)
- et al. Psychometric comparison of PHQ-9 and HADS for measuring depression severity in primary care. British Journal of General Practice (2008)
- et al. Depression
- The contribution of reliable and clinically significant change methods to evidence-based mental health. Evidence-based Mental Health