Research report
Defining successful treatment outcome in depression using the PHQ-9: A comparison of methods

https://doi.org/10.1016/j.jad.2010.04.030

Abstract

Background

Although the PHQ-9 is widely used in primary care, little is known about its performance in quantifying improvement. The original validation study of the PHQ-9 defined clinically significant change as a post-treatment score of ≤ 9 combined with improvement of 50%, but it is unclear how this relates to other theoretically informed methods of defining successful outcome. We compared a range of definitions of clinically significant change (original definition, asymptomatic criterion, reliable and clinically significant change criteria a, b and c) in a clinical trial of a community-level depression intervention.

Method

Randomised controlled trial of collaborative care for depression. Levels of agreement were calculated between the standard definition, the other definitions, and a gold-standard diagnostic interview.

Results

The standard definition showed good agreement (kappa > 0.60) with the other definitions and had moderate, though acceptable, agreement with the diagnostic interview (kappa = 0.58). The standard definition corresponded closely to reliable and clinically significant change criterion c, the recommended method of quantifying improvement when clinical and non-clinical distributions overlap.

Limitations

The absence of follow-up data meant that an asymptomatic criterion, rather than remission or recovery criteria, was used.

Conclusion

The close agreement between the standard definition and reliable and clinically significant change criterion c provides some support for the standard definition of improvement. However, it may be preferable to use a reliable change index rather than 50% improvement. Remission status, based on the asymptomatic range and a lower PHQ-9 score, may provide a useful additional category of clinical change.

Introduction

The PHQ-9 (Kroenke et al., 2001) is a widely used self-report measure of depression that is brief, easy to administer and has well-established psychometric properties (Lee et al., 2007). For these reasons it has been recommended as an integral part of the management of depression in primary care, including tracking symptom change and defining successful treatment outcome to inform treatment decisions (Clark et al., 2009, Dejesus et al., 2007). However, little is known about the performance of the measure in quantifying clinically significant improvement.

A small number of studies have established that the measure is sensitive to change (Cameron et al., 2008, Lowe et al., 2004a) and one study has used minimal clinically important difference (MCID) criteria to define a change of 5 points or more as indicating MCID (Lowe et al., 2004b). Apart from the study by Lowe et al. (2004b), the only other guidance on defining clinically significant change comes from the original validation study of the PHQ-9, which recommended a score of ≥ 10 to indicate the presence of probable depression (Kroenke et al., 2001) based on an analysis of sensitivity and specificity data. In that dataset, scores of ≥ 10 indicated an increased probability of receiving a diagnosis of major depressive disorder (MDD), whereas few people who scored ≤ 9 met diagnostic criteria for MDD. On this basis, Kroenke et al. (2001) recommended that a post-treatment score of ≤ 9, along with the commonly used criterion of a 50% reduction in scores, could be used to define clinically significant improvement. Kroenke et al. (2001), however, pointed out that their definition of improvement was provisional and that further work was needed to validate it.
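As an illustration only, the standard definition and the 5-point MCID criterion described above can be written as two short checks. This is a minimal sketch: the function names are hypothetical, the 50% reduction is assumed to be inclusive (a reduction of exactly 50% counts), and a baseline score of 0 is assumed to leave no room for improvement.

```python
def meets_standard_definition(pre: int, post: int) -> bool:
    """Post-treatment PHQ-9 score <= 9 combined with a reduction of at least 50%.

    Assumption: the 50% threshold is treated as inclusive.
    """
    for score in (pre, post):
        if not 0 <= score <= 27:  # PHQ-9 totals range from 0 to 27
            raise ValueError("PHQ-9 total scores must lie between 0 and 27")
    if pre == 0:
        return False  # assumption: no symptoms at baseline, so nothing to improve
    proportional_reduction = (pre - post) / pre
    return post <= 9 and proportional_reduction >= 0.5


def meets_mcid(pre: int, post: int, threshold: int = 5) -> bool:
    """Reduction of `threshold` points or more (5 points in Lowe et al., 2004b)."""
    return (pre - post) >= threshold


# Hypothetical scores, not data from the trial.
print(meets_standard_definition(18, 9))   # below cut-off and halved
print(meets_standard_definition(12, 8))   # below cut-off but under 50% reduction
print(meets_mcid(12, 8))                  # 4-point change
```

The second call shows why the two parts of the standard definition are not redundant: a person can fall below the cut-off of 10 without having halved their baseline score.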

There are a number of alternative methods of conceptualising improvement on measures of psychological functioning in general and depression in particular with clear theoretical underpinnings, and it is not clear how the standard definition recommended in the original validation study relates to these. These include recovery and remission criteria (Frank et al., 1991) and the concepts of reliable and clinically significant change (Jacobson and Truax, 1991).

Frank et al. (1991) provided several conceptual definitions of improvement in depression that have proved influential in current thinking about defining treatment outcome (Keller, 2003). The concept of remission requires a period, typically at least several weeks, in which a person remains in an asymptomatic range, defined as no or very few symptoms. Recovery requires that the person remains in this asymptomatic range, but for a longer duration. The concepts of an asymptomatic range, remission and recovery may be important in measuring treatment outcome for depression. Rates of relapse and recurrence following successful treatment for depression remain high. A consistent finding is that people who are classed as improved but continue to have some residual symptoms have a substantially higher rate of relapse and recurrence than those who meet criteria for remission or recovery (Paykel et al., 1995).

Reliable and clinically significant change criteria (Jacobson and Truax, 1991) are among the most commonly used methods of quantifying improvement in studies of psychological treatments (Ogles et al., 2001) and have been recommended as a standard reporting strategy for all published research involving these types of interventions (Evans et al., 1998). At a conceptual level, clinically significant change defines improvement as a move from a clinical to a non-clinical range. Jacobson and Truax (1991) provide several operational definitions of cut-off points to distinguish between these ranges based on the central location and distribution of scores for a clinical and non-clinical group. As an additional criterion, the change in scores must be greater than that which could be due to the inherent unreliability of the measure.
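The cut-off points and the reliability criterion can be made concrete with a short sketch. The formulas below follow the usual statement of Jacobson and Truax (1991) for a measure on which lower scores indicate fewer symptoms. The means, standard deviations and reliability coefficient are hypothetical values chosen for the arithmetic and are not taken from this study, and the treatment of scores falling exactly on a cut-off is an assumption.

```python
import math


def cutoff_a(mean_clin: float, sd_clin: float) -> float:
    """Criterion a: two standard deviations below the clinical mean."""
    return mean_clin - 2 * sd_clin


def cutoff_b(mean_non: float, sd_non: float) -> float:
    """Criterion b: two standard deviations above the non-clinical mean."""
    return mean_non + 2 * sd_non


def cutoff_c(mean_clin: float, sd_clin: float,
             mean_non: float, sd_non: float) -> float:
    """Criterion c: point between the two means, weighted by the SDs."""
    return (sd_clin * mean_non + sd_non * mean_clin) / (sd_clin + sd_non)


def reliable_change_index(pre: float, post: float,
                          sd_pre: float, reliability: float) -> float:
    """Change score divided by the standard error of the difference."""
    se_measurement = sd_pre * math.sqrt(1 - reliability)
    s_diff = math.sqrt(2 * se_measurement ** 2)
    return (post - pre) / s_diff


def reliable_and_clinically_significant(pre: float, post: float, cutoff: float,
                                        sd_pre: float, reliability: float) -> bool:
    """Score crosses the cut-off and the decrease exceeds measurement error.

    Assumption: a post-treatment score equal to the cut-off counts as non-clinical.
    """
    rci = reliable_change_index(pre, post, sd_pre, reliability)
    return pre > cutoff and post <= cutoff and rci < -1.96


# Hypothetical distribution parameters, for illustration only.
c = cutoff_c(mean_clin=17.0, sd_clin=5.0, mean_non=3.0, sd_non=3.0)
print(c)
print(reliable_and_clinically_significant(18, 8, c, sd_pre=5.0, reliability=0.84))
```

With these hypothetical values criterion c falls at 8.25, between criterion a (7) and criterion b (9). A drop from 18 to 8 both crosses that cut-off and exceeds the change expected from measurement error alone, whereas a drop from 10 to 8 crosses the cut-off without being reliable.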

It is not clear how the standard definition of improvement for the PHQ-9 relates to other commonly used methods of defining improvement on psychological measures. The aim of this study is to examine the performance of this standard definition by comparing it to other commonly used definitions of improvement. As an additional index of corroboration, we compared the agreement between these definitions and a gold-standard diagnostic interview.
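Agreement between two binary classifications of outcome (for example, improved or not improved under two definitions) can be summarised with Cohen's kappa, the statistic reported in the abstract. The following is a minimal sketch of the standard two-rater, two-category calculation; the function name and the example classifications are hypothetical and do not reproduce the trial data.

```python
from typing import Sequence


def cohens_kappa(first: Sequence[bool], second: Sequence[bool]) -> float:
    """Chance-corrected agreement between two binary classifications."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("both classifications must be non-empty and equal in length")
    n = len(first)
    # Proportion of cases on which the two definitions agree.
    observed = sum(x == y for x, y in zip(first, second)) / n
    # Agreement expected by chance from the marginal rates of "improved".
    rate_first = sum(first) / n
    rate_second = sum(second) / n
    expected = rate_first * rate_second + (1 - rate_first) * (1 - rate_second)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of four participants under two definitions.
definition_x = [True, True, False, False]
definition_y = [True, False, False, False]
print(cohens_kappa(definition_x, definition_y))
```

In this example the two definitions agree on three of four participants (observed agreement 0.75), chance agreement is 0.5, and kappa is therefore 0.5.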

Sample

The sample was taken from a randomised controlled trial of collaborative care for depression (Richards et al., 2007). Participants were recruited from primary care services, and were included if they were aged above 18 years, had received a diagnosis of depression by a GP, and scored ≥ 5 on the Structured Clinical Interview for DSM-IV defined major depressive disorder (MDD) (Spitzer et al., 1992). Exclusion criteria included active suicidal plans, primary drug or alcohol dependence and some types of

Results

The standard definition suggested a similar level of improvement (36.5%) to the other definitions, with the exception of the asymptomatic criterion (27.1%), which suggested lower rates of improvement than all other definitions (Table 2). Of those participants who scored in the clinical range pre-treatment for all definitions (n = 84), approximately half (53.8%) did not meet improvement criteria for any definition and 23.8% met criteria for all definitions; there were disagreements between the

Discussion

If the PHQ-9 is to be of use in clinical practice it will be necessary to define clinically significant improvement on the measure. Although Kroenke et al. (2001) offered such a definition, they were careful to point out that it was provisional and further work was needed to validate it. To this end, this study compared the definition of Kroenke et al. (2001) with other theoretically informed methods of quantifying improvement as well as a gold-standard diagnostic interview.

The standard

Role of funding source

The randomised trial was funded by MRC grant no. G03000677; ID: 68073, International Standard RCT no.: ISRCT63222059. The researchers worked independently of the research funder in the design of the study, in the collection, analysis and interpretation of data, in the writing of the report, and in the decision to submit the paper for publication.

Conflict of interest

All authors declare that they have no conflict of interest.

Acknowledgements

Thank you to Dr Peter Bower for comments on an earlier version of this manuscript.

References (22)

  • C. Evans et al. The contribution of reliable and clinically significant change methods to evidence-based mental health. Evidence-based Mental Health (1998)