Abstract
Nonlinear probability models, such as logits and probits for binary dependent variables, the ordered logit and ordered probit for ordinal dependent variables, and the multinomial logit, together with log-linear models for contingency tables, have become widely used by social scientists in the past 30 years. In this chapter, we show that the identification and estimation of causal effects using these models present severe challenges, over and above those usually encountered in identifying causal effects in a linear setting. These challenges stem from the lack of separate identification of the mean and variance in these models. We show their impact in experimental and observational studies, and we investigate the problems that arise in the use of standard approaches to the causal analysis of nonexperimental data, such as propensity scores, instrumental variables, and control functions. Naive use of these approaches with nonlinear probability models will yield biased estimates of causal effects, though the estimates will be a lower bound on the true causal effect and will have the correct sign. We show that the technique of Y-standardization puts the parameters of nonlinear probability models on a scale that we can meaningfully interpret but cannot measure. Other techniques, such as average partial effects, can yield causal effects on the probability scale, but, in this case, the linear probability model provides a simple and effective alternative.
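The attenuation described above can be illustrated with a minimal numerical sketch. It is not the chapter's own example and rests on stated assumptions: a binary randomized treatment X, a binary covariate Z that takes the values -1 and +1 with equal probability and is independent of X, and a true model P(Y = 1 | X, Z) = logistic(bX + gZ). The function name `marginal_effects` and the values b = 1 and g = 2 are hypothetical choices made for illustration.

```python
import math


def logistic(v):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-v))


def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))


def marginal_effects(b, g):
    """Effects of X on Y when the covariate Z is left out of the model.

    Assumed setup (illustrative only): Z is -1 or +1 with probability 1/2,
    independent of the binary treatment X, and
    P(Y=1 | X, Z) = logistic(b*X + g*Z).
    """
    # Average over Z to get the outcome probability in each treatment arm.
    p1 = 0.5 * (logistic(b + g) + logistic(b - g))  # X = 1
    p0 = 0.5 * (logistic(g) + logistic(-g))         # X = 0
    # Large-sample coefficient of a logit of Y on X alone (log odds ratio).
    log_odds_ratio = logit(p1) - logit(p0)
    # Difference in probabilities: the average partial effect of X, which is
    # also the large-sample slope of a linear probability model of Y on X.
    risk_difference = p1 - p0
    return log_odds_ratio, risk_difference


if __name__ == "__main__":
    for g in (0.0, 1.0, 2.0):
        log_or, rd = marginal_effects(b=1.0, g=g)
        print(f"g={g:.1f}  logit coefficient omitting Z={log_or:.3f}  "
              f"risk difference={rd:.3f}")
```

Under these assumptions, with b = 1 and g = 2 the logit coefficient obtained when Z is omitted is roughly 0.45. It keeps the sign of b but is smaller in magnitude, even though Z is independent of X. The risk difference, by contrast, is by construction the average over Z of the conditional effects on the probability scale, so omitting Z does not change it.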
Notes
- 1.
In general, we do not use a subscript to indicate individual observations except where its omission might lead to confusion.
- 2.
Indeed, when we apply these models, we also assume that the latent error has a given distribution (e.g., logistic), and we cannot know whether this is an accurate assumption either. But, in general, it seems that these models are more robust (at least when we are concerned about comparisons of parameter values across models or samples) to violations of the assumption about the distributional form of the error than they are to violations of the assumptions about the standard deviation of that distribution (Cramer 2007).
- 3.
We can write the standard deviation of the error in this way even though, given that we assumed e in Eq. (10.1) had a logistic distribution, \( \nu \) will almost certainly not have a logistic distribution.
- 4.
But Robinson and Jewell (1991: 239) point out that “to test the null hypothesis of no treatment effect in a randomized study, it is always as or more efficient to adjust for the covariate [Z in our example] … when logistic models are used” (brackets added by authors).
- 5.
Or, equally, (YX), if we collapse the three-way table over the Z margin.
- 6.
A third approach we do not discuss here is the use of average effects on the predicted probability. Wooldridge (2002) and Cramer (2007) show that average partial effects (APEs) are unaffected by the attenuation bias created by omitted covariates orthogonal to the treatment variable. See also the concluding section where we discuss the use of the linear probability model.
- 7.
Had we used the probit model for estimating \( c_1 \), then \( h=\sqrt{c_1^2\operatorname{var}(X)+1} \), reflecting the assumption placed on the latent error term which, for the probit, differs from that of the logit.
- 8.
As noted by Karlson et al. (2012), this can only hold under the assumption that the latent error distribution of both models, (10.6b) and (10.13), is logistic, and we know this cannot be true. However, as noted in footnote 2, violating this assumption appears not to be very consequential for the model’s ability to recover the parameters of interest (see also Cramer 2007).
- 9.
The probit is used here because the error terms, \( e_3 \) through \( e_8 \), and all the variables are normally distributed.
- 10.
To simplify exposition, in the following, we assume that the causal effect is constant across individuals in the population. Under this assumption, the IV identifies the average causal effect. Whenever that assumption does not hold, an additional assumption—monotonicity—is required in order for the IV to recover the average causal effect for a subset of the population that is affected or moved by the instrument (see Imbens and Angrist 1994; Blundell et al. 2005). However, the problem we sketch in the following also pertains to the situation in which we recover a local average treatment effect.
- 11.
References
Achen, C. H. (1977). Measuring representation: Perils of the correlation coefficient. American Journal of Political Science, 21, 805–821.
Allison, P. D. (1999). Comparing logit and probit coefficients across groups. Sociological Methods & Research, 28, 186–208.
Amemiya, T. (1975). Qualitative response models. Annals of Economic and Social Measurement, 4, 363–388.
Angrist, J. D., & Pischke, J.-S. (2008). Mostly harmless econometrics: An empiricist’s companion. Princeton: Princeton University Press.
Blalock, H. M. (1967a). Path coefficients versus regression coefficients. The American Journal of Sociology, 72, 675–676.
Blalock, H. M. (1967b). Causal inference, closed populations, and measures of association. American Political Science Review, 61, 130–136.
Blundell, R., Dearden, L., & Sianesi, B. (2005). Evaluating the effect of education on earnings: Models, methods and results from the National Child Development Survey. Journal of the Royal Statistical Society, Series A, 168, 473–512.
Breen, R., Karlson, K. B., & Holm, A. (2012). Correlations and non-linear probability models. Unpublished paper.
Cameron, S. V., & Heckman, J. J. (1998). Life cycle schooling and dynamic selection bias: Models and evidence for five cohorts of American males. Journal of Political Economy, 106, 262–333.
Cohen, J. (1969). Statistical power analysis for the behavioral sciences. New York: Academic.
Cox, D. R. (1958). Planning of experiments. New York: Wiley.
Cramer, J. S. (2007). Robustness of logit analysis: Unobserved heterogeneity and mis-specified disturbances. Oxford Bulletin of Economics and Statistics, 69, 545–555.
Fienberg, S. E. (1977). The analysis of cross-classified categorical data. Cambridge, MA: MIT Press.
Fisher, R. A. (1932). Statistical methods for research workers. Edinburgh: Oliver and Boyd.
Gail, M. H. (1986). Adjusting for covariates that have the same distribution in exposed and unexposed cohorts. In S. H. Moolgavkar & R. L. Prentice (Eds.), Modern statistical methods in chronic disease epidemiology (pp. 3–18). New York: Wiley.
Gail, M. H., Wieand, S., & Piantadosi, S. (1984). Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika, 71, 431–444.
Gangl, M. (2010). Causal inference in sociological research. Annual Review of Sociology, 36, 21–48.
Hauck, W. W., Neuhaus, J. M., Kalbfleisch, J. D., & Anderson, S. (1991). A consequence of omitted covariates when estimating odds ratios. Journal of Clinical Epidemiology, 44, 77–81.
Heckman, J. J. (1979). Sample selection bias as specification error. Econometrica, 47, 153–161.
Heckman, J. J., Ichimura, H., Smith, J., & Todd, P. (1998). Characterizing selection bias using experimental data. Econometrica, 66, 1017–1098.
Imbens, G. W., & Angrist, J. D. (1994). Identification and estimation of local average treatment effects. Econometrica, 62, 467–475.
Imbens, G. W., & Wooldridge, J. M. (2009). Recent developments in the econometrics of program evaluation. Journal of Economic Literature, 47, 5–86.
Karlson, K. B., Holm, A., & Breen, R. (2012). Comparing regression coefficients between same sample nested models using logit and probit: A new method. Sociological Methodology, 42(1), 286–313.
Kim, J.-O., & Mueller, C. W. (1976). Standardized and unstandardized coefficients in causal analysis: An expository note. Sociological Methods & Research, 4, 423–438.
Mare, R. D. (2006). Response: Statistical models of educational stratification – Hauser and Andrew’s models for school transitions. Sociological Methodology, 36, 27–37.
McFadden, D. (1974). Conditional logit analysis of qualitative choice behavior. In P. Zarembka (Ed.), Frontiers in econometrics (pp. 105–142). New York: Academic.
McKelvey, R. D., & Zavoina, W. (1975). A statistical model for the analysis of ordinal level dependent variables. Journal of Mathematical Sociology, 4, 103–120.
Mood, C. (2010). Logistic regression: Why we cannot do what we think we can do, and what we can do about it. European Sociological Review, 26, 67–82.
Morgan, S. L., & Winship, C. (2007). Counterfactuals and causal inference: Methods and principles for social research. New York: Cambridge University Press.
Olsen, R. J. (1982). Independence from irrelevant alternatives and attrition bias: Their relation to one another in the evaluation of experimental programs. Southern Economic Journal, 49, 521–535.
Pearl, J. (1995). Causal diagrams for empirical research. Biometrika, 82, 669–710.
Pearl, J. (2006). Causality: Models, reasoning and inference. Cambridge: Cambridge University Press.
Robins, J. M. (1999). Association, causation, and marginal structural models. Synthese, 121, 151–179.
Robins, J. M., Hernán, M. A., & Brumback, B. (2000). Marginal structural models and causal inference in epidemiology. Epidemiology, 11, 550–560.
Robinson, L. D., & Jewell, N. P. (1991). Some surprising results about covariate adjustment in logistic regression models. International Statistical Review, 58, 227–240.
Swait, J., & Louviere, J. (1993). The role of the scale parameter in the estimation and comparison of multinomial logit models. Journal of Marketing Research, 30, 305–314.
Train, K. (2009). Discrete choice methods with simulation. Cambridge: Cambridge University Press.
Vytlacil, E. (2002). Independence, monotonicity, and latent index models: An equivalence result. Econometrica, 70, 331–341.
Winship, C., & Mare, R. D. (1984). Regression models with ordinal variables. American Sociological Review, 49, 512–525.
Wooldridge, J. M. (2002). Econometric analysis of cross section and panel data. Cambridge, MA: MIT Press.
Xie, Y. (2011). Values and limitations of statistical models. Research in Social Stratification and Mobility, 29, 343–349.
Yatchew, A., & Griliches, Z. (1985). Specification error in probit models. The Review of Economics and Statistics, 67, 134–139.
Copyright information
© 2013 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Breen, R., Karlson, K.B. (2013). Counterfactual Causal Analysis and Nonlinear Probability Models. In: Morgan, S. (eds) Handbook of Causal Analysis for Social Research. Handbooks of Sociology and Social Research. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6094-3_10
DOI: https://doi.org/10.1007/978-94-007-6094-3_10
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-6093-6
Online ISBN: 978-94-007-6094-3
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)