Likelihood ratio tests of the number of components in a normal mixture with unequal variances

https://doi.org/10.1016/j.spl.2004.11.007

Abstract

Determining the number of components in a mixture distribution is of interest to researchers in many areas. In this paper, we investigate the statistical properties of a likelihood ratio test proposed by Lo et al. (Biometrika 88 (2001) 767) for determining the number of components in a normal mixture with unequal variances. We discuss the dependence of the rate of convergence of the likelihood ratio statistic to its limiting distribution on the choice of restrictions imposed on the component variances to deal with the problem of unboundedness of the likelihood. We compare the test procedure to the parametric bootstrap method and posterior predictive checks, a Bayesian model checking procedure.

Introduction

Mixture distributions are useful tools for statistical modeling of data in which each individual observation can arise from any one of several component distributions. The component distributions may belong to the same or different parametric families. The most widely used mixture distributions are those with normal distributions as components.
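
To make the generative description above concrete, the following is a minimal sketch of drawing observations from a finite normal mixture. The function name `sample_normal_mixture` and the two-component parameter values are illustrative assumptions, not taken from the paper.

```python
import random

def sample_normal_mixture(n, weights, means, sds, seed=0):
    """Draw n observations from a finite normal mixture.

    Each draw first picks a component label with probability given by
    `weights`, then samples from that component's normal distribution.
    """
    rng = random.Random(seed)
    labels = rng.choices(range(len(weights)), weights=weights, k=n)
    return [rng.gauss(means[j], sds[j]) for j in labels]

# Hypothetical two-component example: 70% N(0, 1) and 30% N(4, 0.5^2).
sample = sample_normal_mixture(2000, [0.7, 0.3], [0.0, 4.0], [1.0, 0.5])
# The mixture mean is 0.7 * 0 + 0.3 * 4 = 1.2; the sample mean should be close.
sample_mean = sum(sample) / len(sample)
```

In practice the component labels are not observed, which is why the number of components has to be inferred from the pooled sample.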

In many practical situations, the number of components in a mixture distribution is unknown. For instance, given a random sample of Iris flowers, one would like to know whether data on sepal length, sepal width, petal length and petal width suggest the presence of more than one Iris species in the sample. It is well known that the likelihood ratio statistic for testing the number of components in a mixture model fails to have the classic chi-squared reference distribution since, under the null hypothesis, the mixing proportions lie on the boundary of the parameter space and the parameters are not identifiable under the null model. Several studies have investigated the limiting distribution of the likelihood ratio statistic; see Ghosh and Sen (1985), Hartigan (1985), Dacunha-Castelle and Gassiat (1997, 1999), Lemdani and Pons (1999) and Chen et al. (2001). Following Vuong (1989), Lo et al. (2001) showed that the likelihood ratio statistic is asymptotically distributed as a weighted sum of independent chi-squared random variables with one degree of freedom. They conducted simulation studies for two cases: a single normal versus a two-component location contaminated normal mixture, and a two-component versus a three-component location contaminated normal mixture. Their simulation results showed that the test works well for testing the number of components in a homoscedastic normal mixture when the suggested adjustment is applied to the likelihood ratio statistic.
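
As an illustration of what a reference distribution of this form looks like in practice, the sketch below estimates an upper-tail probability for a weighted sum of independent chi-squared(1) variables by Monte Carlo. The weights in the asymptotic result come from the theory in Lo et al. (2001) and are not reproduced here; the function name `weighted_chi2_tail` and the weights used below are hypothetical.

```python
import random

def weighted_chi2_tail(stat, weights, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(sum_i w_i * Z_i^2 >= stat), Z_i iid N(0, 1)."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_draws):
        draw = sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if draw >= stat:
            exceed += 1
    return exceed / n_draws

# Sanity check: a single unit weight reduces to an ordinary chi-squared(1),
# whose upper 5% point is about 3.841.
p_single = weighted_chi2_tail(3.841, [1.0])
# Hypothetical weights, for illustration only.
p_weighted = weighted_chi2_tail(3.841, [0.6, 0.3, 0.1])
```

Simulation is used here only because it is short; exact or series-based methods for such tail probabilities exist and would normally be preferred when accuracy matters.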

Kiefer and Wolfowitz (1956) noted that the global maximum likelihood estimate does not exist in the case of a mixture of normal distributions with unequal variances. For example, if we set $\hat{\mu}_i$ equal to any observation in the sample and let $\hat{\sigma}_i$ approach zero in the search for the maximum likelihood estimates of the parameters, then the likelihood function is unbounded above. Day (1969) argued that, like these singularities, spurious maximizers cause problems in the search for the maximum of the likelihood; they may occur when some component distribution has a very small variance relative to the others because the corresponding cluster contains a few data points lying sufficiently close together. Quandt and Ramsey (1978) suggested imposing the constraint $\sigma_1^2 = c\,\sigma_2^2$, where $c$ is a known constant, on the component variances so as to avoid the singularities in the likelihood function and spurious local maximizers in the case of a mixture of two normals. Hathaway (1985) suggested using the set of linear inequality constraints $\min_{i,j}(\sigma_i/\sigma_j) \ge c > 0$, $c \in (0,1]$, to rule out the spurious local maximizers and claimed that there exists a constrained global maximizer of the likelihood function on the constrained parameter space. Following Kiefer and Wolfowitz (1956), he showed that the constrained maximum likelihood estimator is consistent. To investigate the convergence behavior of the constrained EM algorithm, Hathaway (1986) performed a simulation study in the case of a mixture of two normal distributions with unequal variances. He considered values of $c$ ranging from 0.000001 to 0.25 and values of the lower bound for the mixing proportion ranging from 0.0001 to 0.2. He concluded that the convergence of the constrained maximum likelihood estimator to the consistent maximizer depends on the value of $c$, the lower bound of the mixing proportion, and the sample size. The consistent maximizer is defined as the limit of the EM sequence using the true parameters as starting values.
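
The unboundedness described above is easy to reproduce numerically. The sketch below centres one component exactly on an observation, shrinks its standard deviation, and records the mixture log-likelihood; it also includes a small check of a ratio constraint of the Hathaway (1985) type. The toy data, the helper names and the fixed values of the remaining parameters are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_loglik(data, pis, mus, sigmas):
    """Log-likelihood of a finite normal mixture at the given parameters."""
    total = 0.0
    for x in data:
        total += math.log(sum(p * normal_pdf(x, m, s)
                              for p, m, s in zip(pis, mus, sigmas)))
    return total

def satisfies_ratio_constraint(sigmas, c):
    """Check that min over i, j of sigma_i / sigma_j is at least c, c in (0, 1]."""
    return min(sigmas) / max(sigmas) >= c

data = [-1.3, -0.4, 0.1, 0.8, 1.9]  # hypothetical toy sample
logliks = []
for s1 in (1e-2, 1e-4, 1e-8):
    # Component 1 sits exactly on the first observation with a tiny spread;
    # component 2 is a fixed N(0, 1) that covers the remaining points.
    logliks.append(mixture_loglik(data, [0.5, 0.5], [data[0], 0.0], [s1, 1.0]))
```

The recorded log-likelihoods increase as `s1` shrinks, and they do so without limit. Each of these degenerate parameter settings fails `satisfies_ratio_constraint` for a bound such as c = 0.25, which is how the constraint removes them from the search.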

Feng and McCulloch (1994) performed a simulation study to demonstrate their claim that the simulated null distribution of the likelihood ratio statistic for testing a single normal against a mixture of two normal distributions depends on the choice of restrictions imposed on the component variances. They imposed the slightly different constraint $\min(\sigma_1^2, \sigma_2^2) \ge c > 0$ on the component variances under the alternative hypothesis and showed that the null distribution of the likelihood ratio statistic lies between $\chi_4^2$ and $\chi_5^2$ for $c = 10^{-6}$, between $\chi_5^2$ and $\chi_6^2$ for $c = 10^{-10}$, and is close to $\chi_6^2$ for $c = 10^{-20}$. McLachlan and Peel (2000) repeated the study by Feng and McCulloch (1994) and showed that the simulated null distribution of the likelihood ratio statistic is close to $\chi_{16}^2$ for $c = 10^{-10}$. Their simulation results were inconsistent with those of Feng and McCulloch (1994). They argued that the difference may be due in part to the extent of the search for the largest local maximum under the alternative hypothesis, irrespective of whether it was a spurious maximizer or not.
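
A simulation of this kind can be sketched as follows: fit a single normal and a two-component mixture to each sample drawn under the null, keeping both variances at or above a lower bound, and record twice the difference in log-likelihoods. This is not the procedure used in any of the cited studies. Flooring the variances inside the M-step, the single starting value and the fixed number of EM iterations are simplifying assumptions, and all function names are hypothetical.

```python
import math
import random

def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def null_loglik(data):
    """Maximized log-likelihood under a single normal (MLE mean and variance)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return sum(math.log(normal_pdf(x, mu, var)) for x in data)

def alt_loglik(data, var_floor, n_iter=100):
    """EM for a two-component normal mixture with each variance >= var_floor.

    Returns the log-likelihood at the final EM iterate, which is a local
    (possibly spurious) maximizer rather than a guaranteed global one.
    """
    n = len(data)
    xs = sorted(data)
    half = n // 2
    # Start from the lower and upper halves of the sorted sample.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    centre = sum(xs) / n
    pooled = max(sum((x - centre) ** 2 for x in xs) / n, var_floor)
    var = [pooled, pooled]
    pi = [0.5, 0.5]
    tiny = 1e-300  # guards against division by zero and log(0) after underflow
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from each component.
        resp = []
        for x in data:
            d0 = pi[0] * normal_pdf(x, mu[0], var[0])
            d1 = pi[1] * normal_pdf(x, mu[1], var[1])
            tot = max(d0 + d1, tiny)
            resp.append((d0 / tot, d1 / tot))
        # M-step: weighted updates, with the variance floored at var_floor.
        for j in range(2):
            nj = sum(r[j] for r in resp)
            if nj < tiny:
                continue  # component received no weight; leave it unchanged
            pi[j] = nj / n
            mu[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            v = sum(r[j] * (x - mu[j]) ** 2 for r, x in zip(resp, data)) / nj
            var[j] = max(v, var_floor)
    return sum(math.log(max(pi[0] * normal_pdf(x, mu[0], var[0])
                            + pi[1] * normal_pdf(x, mu[1], var[1]), tiny))
               for x in data)

def lrt_statistic(data, var_floor):
    """Twice the log-likelihood difference, mixture fit versus single normal."""
    return 2.0 * (alt_loglik(data, var_floor) - null_loglik(data))

def simulate_null_lrt(n, reps, var_floor, seed=0):
    """LRT statistics for `reps` samples of size n drawn from N(0, 1)."""
    rng = random.Random(seed)
    return [lrt_statistic([rng.gauss(0.0, 1.0) for _ in range(n)], var_floor)
            for _ in range(reps)]

# Hypothetical bimodal sample: the statistic should be large here.
rng = random.Random(1)
bimodal = ([rng.gauss(-3.0, 1.0) for _ in range(40)]
           + [rng.gauss(3.0, 0.5) for _ in range(40)])
lrt_bimodal = lrt_statistic(bimodal, var_floor=1e-6)
```

Because a single EM run stops at whichever local maximizer it reaches, the simulated null distribution from `simulate_null_lrt` depends on both `var_floor` and how extensively the alternative is searched, which is the sensitivity the two studies above disagree about.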

In this paper, we investigate the asymptotic behavior of a likelihood ratio test developed by Lo et al. (2001) for determining the number of components in a normal mixture with unequal variances. Section 2 reviews the likelihood ratio test. Numerical results are presented in Section 3. Empirical comparisons of the test with the bootstrap test and the posterior predictive check are given in Section 4. Discussions and concluding remarks are given in Section 5.

Likelihood ratio test

Suppose that a random sample of size $n$, $X_1, \ldots, X_n$, has been drawn from a normal mixture distribution with probability density function

$$h(x;\vartheta) = \sum_{i=1}^{k} \pi_i f_i(x;\mu_i,\sigma_i^2),$$

where $\vartheta = [\pi_1 \cdots \pi_k \;\; \mu_1 \cdots \mu_k \;\; \sigma_1^2 \cdots \sigma_k^2]$ is the vector of unknown parameters, $f_i(\cdot)$ is the normal density function with mean $\mu_i$ and variance $\sigma_i^2$, $\pi_i > 0$ is the mixing proportion for the $i$th component and $k$ is the number of components. We assume, without loss of generality, that $\mu_1 \le \mu_2 \le \cdots \le \mu_k$ and $\min_{i,j}(\sigma_i/\sigma_j) \ge c > 0$, $c \in (0,1]$. We assume further that the

Simulation results

We conducted simulation studies to examine the rate of convergence of the likelihood ratio statistic to the asymptotic distribution for the case of heteroscedastic normal mixtures. We began by determining the lower bound for the relative sizes of the component variances. There are relatively few studies in the literature discussing the choice of values for c so as to ensure that the constrained maximum likelihood estimator is consistent and possesses the asymptotic properties given in Kiefer

Comparisons with bootstrap tests and posterior predictive checks

We performed simulation studies to compare the performances of the proposed test, the parametric bootstrap test and a Bayesian model checking procedure called posterior predictive checks in terms of the observed significance level and power. Bootstrap sampling can be carried out parametrically or non-parametrically, depending on whether the probability distribution generating the data is known or unknown. For instance, if the distribution is unknown, the empirical distribution is used in place

Discussion

We have shown that the proposed likelihood ratio test provides a valid and practical solution to the problem of determining the number of components in a normal mixture with unequal variances. In an attempt to deal with the problem of unboundedness of the likelihood and exclude spurious local maxima from our search for all possible local maxima, we imposed the constraints modified by Hathaway (1983) with suggested c=0.25 on the relative sizes of the component variances. Note that there are

Acknowledgements

The author is grateful to Professor Donald Rubin for his suggestions on simulation and posterior predictive checks. This work was partially supported by NIH grants RO1 DA13564, RO1 DA14998 and P30 AI051519.

References (25)

  • R.F. Phillips. A constrained maximum-likelihood approach to estimating switching regressions. J. Econometrics (1991)
  • H. Chen et al. A modified likelihood ratio test for homogeneity in the finite mixture models. J. Roy. Statist. Soc. B (2001)
  • D. Dacunha-Castelle et al. Testing in locally conic models and application to mixture models. ESAIM: Probab. Statist. (1997)
  • D. Dacunha-Castelle et al. Testing the order of a model using locally conic parameterization: population mixtures and stationary ARMA processes. Ann. Statist. (1999)
  • N.E. Day. Estimating the components of a mixture of normal distributions. Biometrika (1969)
  • J. Diebolt et al. Estimation of finite mixture distributions through Bayesian sampling. J. Roy. Statist. Soc. B (1994)
  • Z.D. Feng et al. On the likelihood ratio test statistic for the number of components in a normal mixture with unequal variances. Biometrics (1994)
  • A. Gelman et al. Posterior predictive assessment of model fitness (with discussion). Statistica Sinica (1996)
  • J.K. Ghosh et al. On the asymptotic performance of the log likelihood ratio statistic for the mixture model and related results
  • J.A. Hartigan. A failure of likelihood asymptotics for normal mixtures
  • R.J. Hathaway. Constrained maximum-likelihood estimation for normal mixtures
  • R.J. Hathaway. A constrained formulation of maximum-likelihood estimation for normal mixture distributions. Ann. Statist. (1985)