
Understanding predictive information criteria for Bayesian models


Abstract

We review the Akaike, deviance, and Watanabe-Akaike information criteria from a Bayesian perspective, where the goal is to estimate expected out-of-sample prediction error using a bias-corrected adjustment of within-sample error. We focus on the choices involved in setting up these measures, and we compare them in three simple examples, one theoretical and two applied. The contribution of this paper is to put all these information criteria into a Bayesian predictive context and to better understand, through small examples, how these methods can apply in practice.
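
As a concrete illustration of the bias-corrected approach described in the abstract, the sketch below computes WAIC from posterior simulations, following the standard definition reviewed in the paper: the log pointwise predictive density (lppd) minus a variance-based effective-number-of-parameters correction (p_WAIC). This is a minimal sketch, not code from the paper; the toy conjugate normal model, sample sizes, and variable names are illustrative assumptions.

```python
# Minimal sketch (illustrative, not from the paper): WAIC from posterior draws
# for a toy model y_i ~ N(theta, 1) with a flat prior, so the exact posterior
# is theta | y ~ N(ybar, 1/n).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, S = 20, 4000                                  # data points, posterior draws
y = rng.normal(loc=1.0, scale=1.0, size=n)       # simulated data

# Posterior draws of theta (closed-form conjugate posterior under a flat prior)
theta = rng.normal(loc=y.mean(), scale=1.0 / np.sqrt(n), size=S)

# Pointwise log-likelihood log p(y_i | theta_s), shape (S, n)
log_lik = norm.logpdf(y[None, :], loc=theta[:, None], scale=1.0)

# lppd: sum over i of the log of the posterior-mean likelihood
lppd = np.sum(np.log(np.mean(np.exp(log_lik), axis=0)))

# p_WAIC (variance form): sum over i of the posterior variance of log p(y_i | theta)
p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))

elppd_waic = lppd - p_waic                       # estimated expected lppd
waic = -2 * elppd_waic                           # deviance scale

print(f"lppd = {lppd:.2f}, p_WAIC = {p_waic:.2f}, WAIC = {waic:.2f}")
```

In practice one would evaluate the log of the mean likelihood with a log-sum-exp to avoid underflow for long log-likelihood vectors; the direct form is kept here for readability.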



Acknowledgements

We thank two reviewers for helpful comments and the National Science Foundation, Institute of Education Sciences, and Academy of Finland (grant 218248) for partial support of this research.

Author information


Correspondence to Andrew Gelman.



Cite this article

Gelman, A., Hwang, J. & Vehtari, A. Understanding predictive information criteria for Bayesian models. Stat Comput 24, 997–1016 (2014). https://doi.org/10.1007/s11222-013-9416-2
