Retention of latent segments in regression-based marketing models

https://doi.org/10.1016/j.ijresmar.2003.04.001

Abstract

Product design and marketing mix decisions for segmented markets depend crucially on the correct specification of marketing models used as input to these decisions. With real-world data, the true number of segments in a market is unknown. Current evidence from simulation studies suggests that the accuracy of commonly used criteria for determining the number of segments in a market depends on the usage context, including the type of distribution being used to describe the data, the model specification, and the characteristics of the market. This study investigates via simulation the performance of seven segment retention criteria used with finite mixture regression models for normal data. This is one of the most important analysis contexts in marketing research since regression models are used, for example, in conjoint analysis and market response analysis, yet no previous study in either the marketing or statistics literatures explores the segment retention problem for mixture regression models. The study shows that one criterion, Akaike's Information Criterion (AIC) with a per-parameter penalty factor of 3 (AIC3), is clearly the best criterion to use across a wide variety of model specifications and data configurations, having the highest success rate and producing very low parameter bias. Currently, this criterion is rarely, if ever, used in the marketing literature.

Section snippets

The criteria

The simulation experiment compares seven major segment retention criteria: AIC (Akaike, 1973); AIC with a penalty factor of 3, referred to as AIC3 (Bozdogan, 1992, 1994); BIC (Schwarz, 1978); CAIC (Bozdogan, 1987); ICOMP (Bozdogan, 1988, 1990); the validation sample log likelihood LOGLV (Andrews & Currim, 2003); and the Normed Entropy Criterion (NEC) (Celeux & Soromenho, 1996). The reader is referred to the references cited above for detailed discussions of the theoretical
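
As a rough illustration of how four of the criteria listed above differ, the sketch below computes AIC, AIC3, BIC, and CAIC from a fitted model's log likelihood and picks the number of segments that minimizes a chosen criterion. The formulas are the standard textbook forms, not taken from this excerpt. The function names, the parameter-count rule (segment-specific coefficients and error variances plus free mixing proportions), and the log-likelihood values are hypothetical and serve only as an example. ICOMP, LOGLV, and NEC are omitted because they need more than the log likelihood, the parameter count, and the sample size.

```python
import math


def mixture_regression_param_count(n_segments, n_coefs):
    # Assumption: each segment has its own coefficients and error variance,
    # plus (n_segments - 1) free mixing proportions.
    return n_segments * (n_coefs + 1) + (n_segments - 1)


def information_criteria(log_lik, n_params, n_obs):
    """Return AIC, AIC3, BIC and CAIC for one fitted model (smaller is better)."""
    deviance = -2.0 * log_lik
    return {
        "AIC": deviance + 2.0 * n_params,   # per-parameter penalty of 2
        "AIC3": deviance + 3.0 * n_params,  # per-parameter penalty of 3
        "BIC": deviance + n_params * math.log(n_obs),
        "CAIC": deviance + n_params * (math.log(n_obs) + 1.0),
    }


def select_segments(fits, n_obs, criterion="AIC3"):
    """fits maps number of segments -> (log_lik, n_params).

    Returns the number of segments with the smallest criterion value.
    """
    scores = {
        s: information_criteria(log_lik, k, n_obs)[criterion]
        for s, (log_lik, k) in fits.items()
    }
    return min(scores, key=scores.get)


# Hypothetical log likelihoods for 1-, 2- and 3-segment models with
# 3 regression coefficients per segment and 500 observations.
example_fits = {
    s: (log_lik, mixture_regression_param_count(s, 3))
    for s, log_lik in {1: -1520.0, 2: -1450.0, 3: -1444.0}.items()
}

for name in ("AIC", "AIC3", "BIC", "CAIC"):
    print(name, select_segments(example_fits, n_obs=500, criterion=name))
```

With these made-up numbers, AIC retains three segments while AIC3, BIC, and CAIC retain two, which shows how a heavier per-parameter penalty can change the retained solution.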

Summary of results

We measure the performance of the various segment retention criteria by (i) their success rates, or the percentage of datasets in which the criteria identify the true number of segments; and (ii) the Root Mean Square Error between the true and estimated β parameters, RMSE(β), of the models selected by the criteria. Given two criteria with similar success rates, we prefer underfitting to overfitting (Andrews & Currim, 2003; Cutler & Windham, 1994). Of course, we also prefer model selection
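
The two performance measures described above can be sketched as follows. This is a minimal illustration, not the study's procedure: the function names are hypothetical, and the RMSE helper assumes the true and estimated β values have already been aligned into flat lists of equal length. The excerpt does not say how estimated segments are matched to true segments, or how models with the wrong number of segments are scored. The third helper reports underfitting and overfitting rates separately, since the text treats the two kinds of miss differently.

```python
import math


def success_rate(true_counts, selected_counts):
    """Percentage of datasets where the selected number of segments is correct."""
    hits = sum(t == s for t, s in zip(true_counts, selected_counts))
    return 100.0 * hits / len(true_counts)


def misfit_rates(true_counts, selected_counts):
    """Percentages of datasets that are underfitted and overfitted."""
    n = len(true_counts)
    under = sum(s < t for t, s in zip(true_counts, selected_counts))
    over = sum(s > t for t, s in zip(true_counts, selected_counts))
    return 100.0 * under / n, 100.0 * over / n


def rmse(true_betas, estimated_betas):
    """Root mean square error between aligned true and estimated coefficients."""
    squared_errors = [(t - e) ** 2 for t, e in zip(true_betas, estimated_betas)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical results for four simulated datasets.
true_counts = [3, 3, 2, 2]
selected_counts = [3, 2, 2, 2]
print(success_rate(true_counts, selected_counts))  # 75.0
print(misfit_rates(true_counts, selected_counts))  # (25.0, 0.0)
print(rmse([1.0, -2.0], [1.5, -2.5]))              # 0.5
```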

Conclusion

It is generally clear from comparing the results of this study to those of Andrews and Currim (2003) and Cutler and Windham (1994) that the type of distribution being mixed, the model specification, and the characteristics of the market affect the performance of segment retention criteria. Consequently, the finding that AIC3 is the best criterion to use with regression (e.g., conjoint and market response) models for normally distributed data and with logit models for multinomial data (Andrews &

References (18)

  • H. Akaike

    Information theory and an extension of the maximum likelihood principle

  • R.L. Andrews et al.

    Hierarchical Bayes vs. finite mixture conjoint analysis models: A comparison of fit, prediction, and partworth recovery

    Journal of Marketing Research

    (2002, February)
  • R.L. Andrews et al.

    A comparison of segment retention criteria for finite mixture logit models

    Journal of Marketing Research

    (2003)
  • H. Bozdogan

    Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions

    Psychometrika

    (1987)
  • H. Bozdogan

    ICOMP: A new model selection criterion

  • H. Bozdogan

    On the information-based measure of covariance complexity and its application to the evaluation of multivariate linear models

    Communications in Statistics, Theory, and Methods

    (1990)
  • H. Bozdogan

    Choosing the number of component clusters in the mixture-model using a new informational complexity criterion of the inverse-Fisher information matrix

  • H. Bozdogan

    Mixture-model cluster analysis using model selection criteria and a new informational measure of complexity

  • G. Celeux et al.

    An entropy criterion for assessing the number of clusters in a mixture model

    Journal of Classification

    (1996)
There are more references available in the full text version of this article.

Cited by (74)

  • Bridging the gap between trade operators and consumers to better understand the U.S. wine market: A simultaneous application of discrete choice experiments

    2022, Industrial Marketing Management
    Citation Excerpt:

    Both RQ2 and RQ3 focus on whether comparing the whole trade operator sample with the whole consumer sample fits better than segmenting each group. We apply latent class analysis (LCA) to the responses to the DCEs, because this specific form of finite mixture model (Andrews & Currim, 2003; Boxall & Adamowicz, 2002) assumes that the overall preference distribution comprises unobservable, latent groups or classes that differ in their utility between the groups but are similar within them (Provencher & Moore, 2006). Using statistical criteria, researchers specify the optimal number of underlying groups.

  • Latent class analysis in PLS-SEM: A review and recommendations for future applications

    2022, Journal of Business Research
    Citation Excerpt:

    The key advantage of FIMIX-PLS over alternative PLS-SEM-based latent class analysis techniques is that the method offers guidance on how many segments to retain from the data. Specifically, the log-likelihood formulation in Eq. (4) facilitates the computation of information criteria, which are well known from the regression literature (Andrews & Currim, 2003; Becker, Ringle, Sarstedt, & Völckner, 2015; Sawa, 1978; Vrieze, 2012). Information criteria simultaneously take a model’s fit (i.e., the likelihood value) and the number of parameters used to achieve this fit into account by complementing the resulting (negative) likelihood value with a penalty term, which increases with the number of segments (Burnham & Anderson, 2002).

  • A generalized ordinal finite mixture regression model for market segmentation

    2021, International Journal of Research in Marketing