Abstract
Objectives Treating the number of depression symptoms as count data can yield more complete and accurate findings in studies of depression. This study aims to compare the goodness of fit of four count outcome models in a large survey sample to identify the optimum model for a risk factor study of the number of depression symptoms.
Methods 15 820 subjects, aged 10 to 80 years old, who were not suffering from serious chronic diseases and had not run a high fever in the past 15 days, agreed to take part in this survey; 15 462 subjects completed all the survey scales. The number of depression symptoms was the sum of the ‘positive’ responses of seven depression questions. Four count outcomes models and a logistic model were constructed to identify the optimum model of the number of depression symptoms.
Results The mean number of depression symptoms was 1.37±1.55. The over-dispersion test statistic O was 308.011. The alpha dispersion parameter was 0.475 (95% CI 0.443 to 0.508), which was significantly larger than 0. The Vuong test statistic Z was 6.782 and the P value was <0.001, which showed that there were too many zero counts to be accounted for with traditional negative binomial distribution. The zero-inflated negative binomial (ZINB) model had the largest log likelihood and smallest AIC and BIC, suggesting best goodness of fit. In addition, predictive probabilities for many counts in the ZINB model fitted the observed counts best.
Conclusions All goodness-of-fit statistics and the predictive probability curve led to the same finding: the ZINB model was the best model for fitting the number of depression symptoms, assessing both the presence or absence of depression and its severity.
- depression
- count data
- over dispersion
- zero-inflated
- poisson regression
- negative binomial regression
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
- This study explored methods of constructing Poisson, NB, ZIP and ZINB models for the number of depression symptoms and compared the goodness of fit of the four count outcome regression models.
- The alpha dispersion parameter, O test and Vuong test showed over-dispersion and excessive zeroes for the number of depression symptoms.
- The likelihood-based statistics and predictive probability curve suggested that the ZINB model was the best model for fitting the number of depression symptoms.
- The ZINB model could provide more accurate information about the risk factors for depression status than the logistic model and the other count outcome models.
- Categorising the count data would lose some useful information; therefore, logistic regression was not an appropriate model for the study of count outcomes.
Introduction
In statistics, count data are a type of data in which the observations can take only the non-negative integer values {0, 1, 2, 3, … }, and where these integers arise from counting rather than ranking.1 Count data are commonly encountered in medical studies, such as the number of depression symptoms, dental caries, adverse events in clinical trials, physical activity days, etc. In statistical analysis, they are usually treated as continuous outcomes or converted to dichotomous data. However, when treated as continuous data, count data are often highly discrete and do not follow a normal distribution. The arithmetic mean and standard deviation are therefore poor summary statistics, and linear regression is not an appropriate analytical method because of the skewed distribution and over-dispersion. Count data also differ from dichotomous data, in which the observations can take only two values, usually represented by 0 and 1. Categorising count data for use in crude rates and logistic regression leads to loss of information. Furthermore, count data differ from ordinal data, which may also consist of integers, but where the individual values fall on an arbitrary scale and only the relative ranking is important. Hence, treating count data as a continuous variable in linear regression or as a dichotomous variable in logistic regression is likely to bias the results.2 3
In view of these limitations, Poisson regression or negative binomial (NB) regression is commonly used to model count outcomes, on the assumption that a Poisson or negative binomial distribution is applicable. However, the probability of a zero under either distribution cannot account for excess zero counts, and neglecting excess zeroes will bias the parameter estimates.4 Zero-inflated (ZI) regression models treat the raw dataset as a mixture of an all-zero subset and another subset following a Poisson or negative binomial distribution. ZI models have so far been the best approach for handling excess zeroes.5–15
Depression is usually assessed with scales that measure the number and severity of depression symptoms. In general, participants are categorised into two or more categories based on their positive depression symptom items. Prevalence rates and logistic regression are then used to study the occurrence and risk factors of depression.16–20 These traditional analysis methods lose information because subjects classified as depressive may have different numbers of depressive symptoms, so the severity of depression cannot be assessed. This study aims to compare the goodness of fit of several count outcome models (the Poisson model, NB model, zero-inflated Poisson (ZIP) model and zero-inflated negative binomial (ZINB) model) in a large-scale cross-sectional sample to identify the optimum model for studying depression symptoms.
Methods
Sample and participants
The sample was part of a large-scale population survey of Chinese subjects’ physiological and psychological constants, supported by the basic performance key project of the Ministry of Science and Technology of the People’s Republic of China. This survey was conducted in Yunnan Province, in southwest China, using a two-stage cluster sampling method. First, two cities were sampled; then several communities and villages were randomly selected within those cities. In the selected communities and villages, all eligible people were treated as survey subjects. Eligible people were aged 10 to 80 years, were not suffering from serious chronic diseases, and had not run a high fever in the past 15 days. All subjects signed informed consent forms. The study was approved by the review board of the Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences (ethics approval number 005–2008).
Depression assessment
Trained medical professionals carried out the survey and interviews. Before the survey, they were trained in the use of the depression assessment scale. The scale was designed based on the Composite International Diagnostic Interview Short Form for Major Depression (CIDI-SFMD).21 22 Subjects were asked whether there was ever a time during the past 12 months when they felt sad, blue or depressed for 2 weeks or more in a row. Seven questions asked whether they had lost interest in things, felt tired or low in energy, gained or lost weight, had more trouble falling asleep than usual, had more trouble concentrating than usual, thought a lot about death, or had a feeling of worthlessness. The number of depression symptoms, the main outcome measure, was the sum of the ‘positive’ responses to these seven questions (range 0–7).
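The construction of the outcome variable can be sketched in a few lines. This is only an illustration: the item names below are hypothetical labels for the seven questions described above, and the survey's actual variable names and coding are not given in the text.

```python
# Hypothetical item names: the text lists seven symptom questions
# but does not give the variable names used in the survey dataset.
SYMPTOM_ITEMS = (
    "lost_interest",
    "tired_or_low_energy",
    "weight_change",
    "trouble_sleeping",
    "trouble_concentrating",
    "thoughts_of_death",
    "worthlessness",
)


def symptom_count(responses):
    """Return the number of 'positive' (truthy) answers among the seven items."""
    missing = [item for item in SYMPTOM_ITEMS if item not in responses]
    if missing:
        raise ValueError(f"missing responses: {missing}")
    return sum(1 for item in SYMPTOM_ITEMS if responses[item])


# One made-up respondent with two positive answers.
example = {item: False for item in SYMPTOM_ITEMS}
example["tired_or_low_energy"] = True
example["trouble_sleeping"] = True
print(symptom_count(example))  # 2
```

The result is always an integer between 0 and 7, which is what makes it a bounded count outcome rather than a continuous score.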
Potential risk factors of depression symptoms in the models included age, sex, hypertension status, occupation, tobacco smoking, alcohol consumption, nationality, marital status, obesity, stress at work or home, and positive or negative life events. Negative events included loss of job, retirement, loss of crop/business failure, household break-in, marital separation/divorce, other major intra-family conflict, major personal injury or illness, violence, death of a spouse, death/major illness of another close family member or other major stress. Positive events were wedding of a family member, new job or birth in the family.
Analysis methods
Poisson regression, NB regression, ZIP and ZINB models were constructed and their goodness of fit was compared. All four models are suited to count outcomes. ZI models were first introduced by Lambert to account for excess zero counts.23 Cheung noted that ZI regression models can be interpreted as describing a two-step disease regression.4 At the beginning, subjects are not at risk, so they have zero counts. The influence of covariates may move them into the at-risk population, where the outcomes follow a Poisson or NB distribution. A covariate may or may not have the same impact on the outcome distribution in the two steps.4
ZI models are two-part models, consisting of a binary section and a count section, in order to account for excess zero counts.24 The ZIP model treats the raw dataset as a mixture of an all-zero subset and a subset following a Poisson distribution.23 The ZIP model supposes that:
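As a sketch of the standard zero-inflated Poisson formulation, the mixture can be written as follows. The notation is an assumption made for illustration: π_i denotes the probability that subject i belongs to the all-zero subset and μ_i denotes the Poisson mean; the symbols used in the original article are not recoverable from this text.

```latex
P(Y_i = y) =
\begin{cases}
\pi_i + (1 - \pi_i)\, e^{-\mu_i}, & y = 0,\\[4pt]
(1 - \pi_i)\, \dfrac{e^{-\mu_i}\, \mu_i^{y}}{y!}, & y = 1, 2, 3, \ldots
\end{cases}
```

A zero can therefore arise in two ways: from the all-zero subset, or from the Poisson subset by chance.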
Similarly, the ZINB model treats the raw dataset as a mixture of an all-zero subset and a subset following an NB distribution.24 The probability mass function of the ZINB model is:
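A sketch of the standard zero-inflated negative binomial form is given below. It assumes the common NB2 parameterisation with mean μ_i and dispersion α, and uses π_i (an assumed symbol) for the probability of belonging to the all-zero subset; the article's own notation may differ.

```latex
P(Y_i = y) =
\begin{cases}
\pi_i + (1 - \pi_i)\,(1 + \alpha\mu_i)^{-1/\alpha}, & y = 0,\\[4pt]
(1 - \pi_i)\, \dfrac{\Gamma(y + 1/\alpha)}{\Gamma(1/\alpha)\, y!}
\left(\dfrac{\alpha\mu_i}{1 + \alpha\mu_i}\right)^{y}
(1 + \alpha\mu_i)^{-1/\alpha}, & y = 1, 2, 3, \ldots
\end{cases}
```

Under this parameterisation the NB component approaches the Poisson distribution as α tends to 0, which is why a value of α significantly larger than 0 is read as evidence of over-dispersion.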
Ln and logit link functions were used for the count mean μ and the zero-inflation probability, respectively. In the logit section, the regression coefficients are interpreted as in logistic regression. In the Poisson or NB section, they are interpreted as in the traditional Poisson or NB regression models.
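The way the two link functions feed the mixture can be illustrated with a minimal Python sketch. It assumes the NB2 parameterisation (variance μ + αμ²); the function names, the symbol pi for the zero-inflation probability and the linear predictor values are all made up for illustration and are not taken from the study's SAS code.

```python
import math


def nb_pmf(y, mu, alpha):
    """NB2 probability of count y with mean mu and dispersion alpha (> 0)."""
    r = 1.0 / alpha
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, alpha, pi):
    """Mixture: all-zero subset with probability pi, NB count otherwise."""
    base = (1.0 - pi) * nb_pmf(y, mu, alpha)
    return pi + base if y == 0 else base


def inverse_links(eta_count, eta_zero):
    """Map linear predictors to (mu, pi) through the ln and logit links."""
    mu = math.exp(eta_count)                 # ln link: ln(mu) = eta_count
    pi = 1.0 / (1.0 + math.exp(-eta_zero))   # logit link: logit(pi) = eta_zero
    return mu, pi


# Made-up linear predictors; alpha = 0.475 is borrowed from the reported
# dispersion estimate purely as an illustrative value.
mu, pi = inverse_links(eta_count=0.5, eta_zero=-1.0)
probs = [zinb_pmf(y, mu, alpha=0.475, pi=pi) for y in range(200)]
print(round(sum(probs), 6))
```

Summing the probabilities over a long range of counts is a quick sanity check that the mixture is a proper distribution.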
In this study, SAS version 9.2 was used to construct the regression models. The alpha dispersion parameter and the O test were used to identify over-dispersion.25 The Vuong test was conducted to judge whether there were excessive zero counts.24 26 The goodness of fit of the regression models was assessed by the predictive probability curve for each count and by likelihood-based statistics: log-likelihood, AIC (Akaike information criterion) and BIC (Bayesian information criterion). A logistic regression model was also fitted.
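For readers unfamiliar with the information criteria, the sketch below shows the usual definitions, AIC = -2 log L + 2k and BIC = -2 log L + k ln(n). The log-likelihoods and parameter counts are placeholders, not the values in table 3, and the model labels are hypothetical; only the sample size of 15 462 comes from the study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 log L + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


N_OBS = 15462  # completed surveys reported in the study

# Placeholder (log-likelihood, parameter count) pairs, NOT the table 3 values.
candidates = {
    "model_a": (-24000.0, 20),
    "model_b": (-23500.0, 41),
}

for name, (ll, k) in candidates.items():
    print(name, round(aic(ll, k), 1), round(bic(ll, k, N_OBS), 1))

# The preferred model is the one with the smallest criterion value.
best = min(candidates, key=lambda m: bic(*candidates[m], N_OBS))
print(best)
```

Both criteria penalise extra parameters, so a zero-inflated model is preferred only if its gain in log-likelihood outweighs the cost of its additional logit coefficients.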
Results
A total of 15 820 subjects agreed to take part in this survey, of whom 15 462 completed all the survey scales. The response rate was 97.7%. The mean age was 26.7±16.7 years (range 10–80 years). Other characteristics of the sample are shown in table 1: 57.31% of respondents were female; about a quarter of respondents were of Yi nationality; 8.8% of respondents felt psychological stress at work or home; and positive and negative life events were reported by 4.17% and 21.17% of respondents, respectively.
The second column of table 2 presents the observed distribution of the number of depression symptoms. Among the total of 15 462 respondents, 39.28% reported no depression symptoms. The proportion of respondents fell as the number of symptoms rose. The mean number of depression symptoms was 1.37±1.55, so the variance (1.55² ≈ 2.40) was greater than the mean. The over-dispersion test statistic O was 308.011 and the P value was <0.001. Furthermore, the alpha dispersion parameter α was 0.475 (95% CI 0.443 to 0.508), which was significantly larger than 0. The number of depression symptoms was therefore over-dispersed. The Vuong test statistic Z was 6.782, and the P value was <0.001, which suggested that there were too many zero counts to be accounted for by the traditional negative binomial distribution. Table 3 shows the goodness of fit of the four regression models. The ZINB model had the largest log likelihood and the smallest AIC and BIC, suggesting the best goodness of fit. The Poisson regression model fitted the data worst.
The last four columns of table 2 present the predictive probabilities for each count in the four regression models. Figure 1 shows the predictive probability curves of the four regression models together with the observed proportions. From table 2 and figure 1, it was clear that the Poisson regression model fitted worst: its predictive probability for each count differed markedly from the observed proportion. The NB model was a little better than the Poisson model. The ZIP and ZINB models fitted the data better, and the predictive probability for the zero count in the two ZI models was very close to the observed proportion, especially in the ZINB model. Except that the predictive probability for a count of 2 was a little larger than the observed proportion, the probabilities for the other counts in the ZINB model fitted the observed proportions very well. However, the ZIP model predicted fewer 1s and more 2s and 3s.
Based on the alpha dispersion parameter, the over-dispersion O test, the Vuong test, the goodness-of-fit statistics and the predictive probabilities for counts, the ZINB model was the optimum model for fitting the number of depression symptoms.
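A rough back-of-the-envelope check of the over-dispersion and excess zeroes can be made from the reported summary statistics alone. The sketch below is not the O test or the Vuong test: it assumes an intercept-only Poisson model with no covariates, and simply compares what that model would imply with the observed figures.

```python
import math

mean = 1.37             # reported mean number of symptoms
sd = 1.55               # reported standard deviation
observed_zero = 0.3928  # reported share of respondents with no symptoms

variance = sd ** 2
# A Poisson distribution with this mean would have variance == mean and
# P(Y = 0) == exp(-mean); covariates are ignored in this crude check.
poisson_zero = math.exp(-mean)

print(f"variance = {variance:.4f}, variance/mean = {variance / mean:.2f}")
# variance = 2.4025, variance/mean = 1.75
print(f"Poisson P(0) = {poisson_zero:.3f} vs observed {observed_zero:.3f}")
# Poisson P(0) = 0.254 vs observed 0.393
```

The variance is about 1.75 times the mean, and a plain Poisson model would expect roughly 25% zeroes against the 39% observed, which is consistent with the direction of the formal test results.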
Regression coefficients of the ZINB model are shown in table 4. The logit section on the left side of the table relates only to the zero count. Sex, occupation, alcohol drinking, Yi nationality, single status, stress, and positive or negative events were risk factors for whether any depression symptoms were reported. Female respondents, mental labourers, alcohol drinkers, respondents of Yi nationality, single respondents, respondents suffering from stress, and respondents with positive or negative events were more at risk for depression. The negative binomial section on the right side showed that age, sex, occupation, alcohol drinking, stress and negative events had a significant effect on the severity of depression. Female respondents, mental labourers, alcohol drinkers, single respondents, and respondents suffering from stress or negative events reported more symptoms of depression. However, older individuals had a smaller number of depression symptoms.
Table 5 shows the logistic regression model coefficients for risk factors of depression. Female, younger age, mental labourers, alcohol drinkers, Yi nationality, single, widowed or divorced, obesity, stress, and positive or negative events were associated with increased odds of reporting one or more depression symptoms.
Discussion
This study explored methods of constructing Poisson, NB, ZIP and ZINB models for the number of depression symptoms and compared the goodness of fit of four count outcome regression models.
Over-dispersion and a severely skewed distribution reduce the utility of linear regression for count outcomes. Traditional Poisson regression and negative binomial regression are the common models for count outcomes. However, the strict requirement that the variance equal the mean makes it very difficult for over-dispersed count data to follow a Poisson distribution. By including a gamma-distributed error term, the NB distribution allows for over-dispersion. Even so, excessive zero counts adversely affect both Poisson and NB regression models. ZI models were introduced to deal with excessive zeroes. ZI models allow assessment of the risk factors for depression severity and not just the presence or absence of depression, because they can model depression on a continuum instead of as a dichotomous outcome. In particular, ZINB models can handle both over-dispersion and excessive zeroes at the same time.
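The variance argument can be made concrete with a short worked calculation. It assumes the common NB2 parameterisation, in which the dispersion parameter α enters the variance quadratically, and it plugs in the reported marginal mean and α only as a rough illustration, since α was estimated in a regression model with covariates.

```latex
\operatorname{Var}_{\text{Poisson}}(Y) = \mu = 1.37,
\qquad
\operatorname{Var}_{\text{NB}}(Y) = \mu + \alpha\mu^{2}
= 1.37 + 0.475 \times 1.37^{2} \approx 1.37 + 0.89 = 2.26
```

The observed variance was 1.55² ≈ 2.40, so the NB variance is far closer to the data than the Poisson variance, although neither accounts for the surplus of zero counts.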
In this study, the O test, Vuong test, AIC and BIC statistics and predictive probability curve indicated that the ZINB model was the best model for the number of depression symptoms, with about 40% zero counts. The ZIP model fitted the data worse than the ZINB model, perhaps because the over-dispersion of the number of depression symptoms restricted the utility of the ZIP model. Many studies have reported similarly that the ZINB model was the best model for count outcomes.8 12–15 However, in a physical activity study and another depressive symptoms study, ZIP was considered a better model than ZINB.27 28
In the ZINB model, the influence of risk factors on depression can be assessed in two respects: whether or not there is depression, and the severity of depression. The logistic regression model identified different risk factors for depression from the ZINB model, especially obesity and widowed or divorced status. In the ZINB model, obesity and widowed or divorced status were not found to have a significant effect on depression symptoms, although the P values approached 0.05 (P=0.106, P=0.068). In addition, age influenced only the severity of depression and not whether depression was present, whereas positive life events showed the opposite pattern. Categorising the count data would lose some useful information, so logistic regression was not an appropriate model for the study of count outcomes.
Several limitations are worth noting. First, the Poisson and NB distributional assumptions place no upper limit on the counts. However, in the medical field, outcome variables often have a specific upper limit. In this study, the number of depression symptoms ranged only from 0 to 7. This might be a reason for the poor goodness of fit of the traditional Poisson and NB regression models. Second, some potential risk factors were correlated with each other, such as stress and positive or negative life events, but the correlation was not strong enough to distort the results of this study.
Despite these limitations, we can conclude that all fitting test statistics and predictive probability curves produced the same finding that the ZINB model was the best model for fitting the number of depression symptoms, not only assessing the presence or absence of depression but also assessing the severity of depression.
References
Footnotes
Contributors TX is responsible for the design, statistical analysis and writing the manuscript. GZ is responsible for the project design and the field survey. SH is responsible for the design, data management and statistical analysis.
Funding The funding is provided by the basic performance key project by the Ministry of Science and Technology of the People’s Republic of China (No. 2006FY110300).
Competing interests None declared.
Ethics approval The review board of Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Technical appendix, statistical code, and dataset available from the Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College.