Psychiatry Research

Volume 160, Issue 2, 15 August 2008, Pages 129-136

Review
Statistical procedures for analyzing mental health services data

https://doi.org/10.1016/j.psychres.2007.07.003

Abstract

In mental health services research, analyzing service utilization data often poses serious problems, given the presence of substantially skewed data distributions. This article presents a non-technical introduction to statistical methods specifically designed to handle the complexly distributed datasets that represent mental health service use, including Poisson, negative binomial, zero-inflated, and zero-truncated regression models. A flowchart is provided to assist the investigator in selecting the most appropriate method. Finally, a dataset of mental health service use reported by medical patients is described, and a comparison of results across several different statistical methods is presented. Implications of matching data analytic techniques appropriately with the often complexly distributed datasets of mental health services utilization variables are discussed.

Introduction

A large body of research has examined variables associated with the previous use of mental health services, using various conceptual frameworks (Bruce et al., 2002). In large-scale community surveys, recent results have demonstrated that mental health service use is significantly associated with a number of variables, including demographic characteristics, attitudes toward treatment, mental health diagnoses, and access variables (Bland et al., 1997; Kessler et al., 1998; Lin and Parikh, 1999; Parslow and Jorm, 2000; Lewis et al., 2005; Oliver et al., 2005; Wang et al., 2005; Elhai et al., 2006a; Elhai and Ford, 2007).

Several recent reviews have discussed a number of important methodological issues that have limited the literature examining the use of mental health services, including design-specific problems in querying about service use and in measuring utilization (Walker et al., 2004; Elhai et al., 2005). However, in addition to methodological and design issues, there are also important data analysis issues that warrant consideration. The current article briefly presents the problems inherent in analyzing data on mental health service use and costs, and discusses in non-technical terms several alternative statistical methods that represent the state of the art in handling such data, with an empirical comparison of the performance of these methods.

Section snippets

Complexities in mental health service use and cost data

Mental health services researchers often gather data on the intensity of services used by participants (typically in the form of visit counts), and sometimes the resulting costs incurred (in dollars). Such data are most often gathered over a recent time period (e.g., past 12 months), since research demonstrates that subjects' recall accuracy substantially decreases when estimating visit counts over longer time frames (Roberts et al., 1996). Medical chart reviews also tend to focus on short time

Data transformations

Perhaps as a result of these data problems, the actual analyses presented in mental health service use studies most often involve logistic regression, by reducing visit counts and costs to dichotomous categories (e.g., “use”/“non-use”; “0–9 visits”/“10 or more visits”; “above”/“below median costs”). While this approach may seem to solve the problems discussed above, because logistic regression does not have the same restrictive assumptions that linear regression has, new problems are

Count regression models

When analyzing predictors of such skewed service use and cost data, the best solution is to use a non-linear count regression model. Such models require that the dependent variable be a non-negative integer and, as in ordinary linear regression, that the predictor variables be continuously scaled, binary-coded, or a mixture of the two. Count models are estimated by maximum likelihood, and they apply a transformation (a link function, typically the logarithm) so that the non-linear count outcome can be modeled as a linear function of the predictors. Count models are specific
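To make the idea of a maximum likelihood count model concrete, the following is a minimal sketch of a Poisson regression with a log link and a single predictor, fitted by Newton-Raphson iterations using only the Python standard library. It is an illustration, not the procedure used in the article: the function name `fit_poisson`, the iteration count, and the small dataset are all hypothetical, and a real analysis would rely on an established statistical package.

```python
import math


def fit_poisson(x, y, iterations=25):
    """Fit log(mu_i) = b0 + b1 * x_i by Newton-Raphson maximum likelihood.

    Illustrative sketch only: one predictor, fixed iteration count,
    no convergence checks. Assumes x is not constant (otherwise the
    information matrix is singular) and that y has a positive mean.
    """
    # Start from the intercept-only fit: b0 = log(mean count), b1 = 0.
    b0 = math.log(sum(y) / len(y))
    b1 = 0.0
    for _ in range(iterations):
        # Expected counts under the current coefficients (inverse log link).
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: coefficients += inverse(information) * gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: x is a binary predictor, y is a visit count.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 1, 0, 3, 2, 5, 0, 9]
b0, b1 = fit_poisson(x, y)
rate_ratio = math.exp(b1)  # multiplicative change in expected visits
```

Because the model is linear on the log scale, exponentiating a coefficient gives a rate ratio. In this toy dataset the mean count is 1 in the group coded 0 and 4 in the group coded 1, so the fitted rate ratio is approximately 4.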

Decisions in analyzing count regression models

In Fig. 1, we present a flowchart to assist the reader in selecting the most appropriate regression model, given characteristics of the dependent variable.
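Fig. 1 is not reproduced here, so the following does not encode the authors' flowchart. It is only a rough screening sketch of the kind of dependent-variable characteristics such a decision might rest on: whether zeros occur at all, whether the variance greatly exceeds the mean (overdispersion), and whether there are more zeros than a Poisson distribution with the same mean would predict. The function names and the cutoffs `ratio_cutoff` and `zero_margin` are hypothetical and arbitrary; formal model comparison tests should guide a real analysis.

```python
import math
from statistics import mean, variance


def count_diagnostics(counts):
    """Summarize features of a count outcome relevant to model choice."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    observed_zero = sum(1 for c in counts if c == 0) / len(counts)
    return {
        "mean": m,
        "variance": v,
        "variance_to_mean": v / m,
        "observed_zero": observed_zero,
        # Proportion of zeros a Poisson with this mean would predict.
        "poisson_zero": math.exp(-m),
    }


def suggest_model(counts, ratio_cutoff=1.5, zero_margin=0.05):
    """Crude heuristic, not a substitute for formal tests or for Fig. 1.

    Both cutoffs are arbitrary assumptions made for illustration.
    """
    d = count_diagnostics(counts)
    base = "negative binomial" if d["variance_to_mean"] > ratio_cutoff else "Poisson"
    if d["observed_zero"] == 0:
        # Appropriate only if zeros cannot occur by design,
        # for example when every sampled person used services.
        return "zero-truncated " + base
    if d["observed_zero"] - d["poisson_zero"] > zero_margin:
        # Note: overdispersion alone can also produce extra zeros.
        return "zero-inflated " + base
    return base


# Hypothetical visit counts: many non-users and a few heavy users.
visits = [0] * 12 + [1, 2, 2, 3, 5, 8, 14, 30]
choice = suggest_model(visits)
```

For these hypothetical counts the variance is far larger than the mean and 60% of the values are zero, so the heuristic points toward a zero-inflated negative binomial model.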

At the time of this writing, two statistical packages include standard modules for Poisson, negative binomial, and the zero-inflated and zero-truncated methods: Stata (StataCorp, 2005) and LIMDEP (Econometric Software, 2002). Gauss (Aptech Systems Inc., 2005) offers (but does not include as standard) a Maximum Likelihood application which

Applying the models to a dataset of mental health visit counts

Recently, we examined mental health treatment use intensity among 186 Midwestern U.S. primary care patients (Elhai et al., 2006b). We assessed the relationship of gender, attitudes toward mental health treatment, violent-crime and non-crime trauma frequency (log-transformed due to substantial skewness), and a probable posttraumatic stress disorder (PTSD) diagnosis with self-reported mental health visit counts from the past 6 months. We now present a comparison of the above-mentioned statistical

Conclusions

This paper presented a review of the data analysis problems that are inherent when analyzing mental health service use data. Several solutions were presented, including Poisson and negative binomial, zero-inflated, and zero-truncated regression models. Quite different results were observed when alternative statistical solutions were used to handle a typical dataset with mental health service use as the outcome variable. The results demonstrate the potential danger of using analytic methods

References (39)

  • Cameron A.C. et al. Regression Analysis of Count Data (1998)
  • Econometric Software. LIMDEP (2002)
  • Elhai J.D. et al. Correlates of mental health service use intensity in the National Comorbidity Survey and National Comorbidity Survey Replication. Psychiatric Services (2007)
  • Elhai J.D. et al. Health service use predictors among trauma survivors: a critical review. Psychological Services (2005)
  • Elhai J.D. et al. Gender- and trauma-related predictors of use of mental health treatment services among primary care patients. Psychiatric Services (2006)
  • Gardner W. et al. Regression analyses of counts and rates: Poisson, overdispersed Poisson, and negative binomial models. Psychological Bulletin (1995)
  • Hall D.B. et al. Marginal models for zero inflated clustered data. Statistical Modelling (2004)
  • Heilbron D.C. Zero-altered and other regression models for count data with added zeros. Biometrical Journal (1994)
  • Insightful Corporation. S-Plus (2005)