Propensity Score Matching in Criminology and Criminal Justice

Apel, Robert J.; Sweeten, Gary

doi:10.1007/978-0-387-77650-7_26

Abstract

The propensity score methodology has become quite common in applied research in the last 10 years, and criminology is no exception to this growing trend. It offers a potentially powerful way to estimate the treatment effect of some intervention on behavior when the receipt of treatment arises in a nonrandom way – this is the selection problem. It does so by creating synthetic “experimental” and “control” groups that are equivalent on a large number of potential confounding variables. In this chapter, we first introduce the counterfactual framework on which the propensity score method is based and define the average treatment effect. We then outline technical issues that must be addressed when the propensity score method is used in practice, including estimation of the propensity score, demonstration of covariate balance, and estimation of the treatment effect of interest. To provide a step-by-step example of the method, we appeal to the relationship between employment and substance use in adolescence. Following a brief review of research in criminology and related disciplines that employs the propensity score methodology, we offer a number of guidelines for use of the technique.

Notes

1.
Readers desiring a more thorough survey of the counterfactual framework generally and the propensity score method specifically are referred to Cameron and Trivedi (2005: Chap. 25), Imbens (2004), Morgan and Harding (2006), and Wooldridge (2002: Chap. 18).
2.
Notice that the counterfactual definition of causality requires that the individual occupy two states at the same time, not two different states at two different times. If the latter condition held, panel data with a time-varying treatment condition would suffice to estimate a causal effect of treatment. In the marriage example, the period(s) in which the individual is not married would be the counterfactual for the period(s) in which the same individual is married.
3.
In this chapter, we will be mostly concerned with estimation of ATE rather than its constituents, ATT and ATU.
4.
Because it renders treatment ignorable, randomization is sufficient to identify the average treatment effect in the following manner:
$$\begin{array}{l} \mathrm{ATE} = \mathrm{E}\left (\left.{Y }_{i}^{1}\right \vert {T}_{i} = 1\right ) -\mathrm{E}\left (\left.{Y }_{i}^{0}\right \vert {T}_{i} = 0\right ) \\ \qquad \ = \mathrm{E}\left (\left.{Y }_{i}\right \vert {T}_{i} = 1\right ) -\mathrm{E}\left (\left.{Y }_{i}\right \vert {T}_{i} = 0\right ) \end{array}$$

Notice that this is simply the mean difference in the outcome for treated and untreated individuals in the target population, as the potential outcomes notation in the first equality can be removed. The second equality necessarily follows because treatment assignment that is independent of potential outcomes ensures that:

$$\mathrm{E}\left (\left.{Y }_{i}^{1}\right \vert {T}_{i} = 1\right ) = \mathrm{E}\left (\left.{Y }_{i}^{1}\right \vert {T}_{i} = 0\right ) = \mathrm{E}\left (\left.{Y }_{i}\right \vert {T}_{i} = 1\right )$$
and
$$\mathrm{E}\left (\left.{Y }_{i}^{0}\right \vert {T}_{i} = 1\right ) = \mathrm{E}\left (\left.{Y }_{i}^{0}\right \vert {T}_{i} = 0\right ) = \mathrm{E}\left (\left.{Y }_{i}\right \vert {T}_{i} = 0\right )$$
As an interesting aside, in the case of a randomized experiment, it is also the case that ATT and ATU are equivalent to ATE by virtue of these equalities.
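The identification argument in this note can be checked with a small simulation. The sketch below is illustrative only: the data-generating process, the constant effect of 2.0, and the function name `mean_difference` are assumptions and do not come from the chapter. It contrasts coin-flip assignment, where the simple mean difference recovers ATE, with assignment that depends on the untreated potential outcome, where it does not.

```python
import random
from statistics import mean


def mean_difference(randomized, n=20000, true_effect=2.0, seed=0):
    """Return the sample analogue of E(Y | T = 1) - E(Y | T = 0)."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        y0 = rng.gauss(0.0, 1.0)   # potential outcome without treatment
        y1 = y0 + true_effect      # potential outcome with treatment
        if randomized:
            p_treat = 0.5          # coin flip, independent of (y0, y1)
        else:
            p_treat = 0.8 if y0 > 0 else 0.2   # selection on y0 (assumed)
        if rng.random() < p_treat:
            treated.append(y1)     # only Y^1 is observed for the treated
        else:
            untreated.append(y0)   # only Y^0 is observed for the untreated
    return mean(treated) - mean(untreated)


experimental = mean_difference(randomized=True)
observational = mean_difference(randomized=False)
print(f"randomized assignment:    {experimental:.2f}")
print(f"self-selected assignment: {observational:.2f}")
```

Under randomized assignment the estimate should sit close to the assumed effect of 2.0, whereas under self-selection it is inflated because treated cases have systematically higher untreated outcomes.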
5.
To be perfectly accurate, randomization may in fact produce imbalance, but the imbalance is attributable entirely to chance. However, asymptotically (i.e., as the sample size tends toward infinity) the expected imbalance approaches zero.
6.
Aside from ethical and practical concerns, this experiment would be unable to assess the effect of marriage as we know it, as marriages entered into on the basis of a coin flip would likely have very different qualities than those freely chosen.
7.
Researchers differ in their preferences for how exhaustive the treatment status model should be. In a theoretically informed model, the researcher includes only a vector of variables that are specified a priori in the theory or theories of choice. In a kitchen sink model, the researcher includes as many variables as are available in the dataset. In our view, a theoretically informed model is appealing only to the extent that it achieves balance on confounders that are excluded from the treatment status model but would have been included in a kitchen sink model.
8.
Some researchers also include functions of the confounders in the treatment status model, for example, quadratic and interaction terms.
9.
A useful sensitivity exercise is to estimate treatment effects using a number of different bandwidths to determine stability of the estimates. With smaller bandwidths, common support shrinks and fewer cases are retained. This alters the nature of the estimated treatment effect, particularly if a large number of cases are excluded. This can be dealt with by simply acknowledging that the estimated effect excludes certain kinds of cases, and these can be clearly described since the dropped cases are observed.
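The bandwidth exercise described in this note might look like the following sketch. Several choices here are assumptions made for illustration: the Epanechnikov kernel, the hypothetical function names, the toy data, and the simplification of building counterfactuals only for treated cases (so the quantity is an effect among retained treated cases, not the full ATE). The function also reports how many treated cases remain on common support at each bandwidth.

```python
def epanechnikov(u):
    """Epanechnikov kernel weight; zero outside |u| < 1."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def kernel_match(treated, untreated, bandwidth):
    """treated/untreated are lists of (propensity score, outcome) pairs.

    Returns (mean treated-minus-counterfactual difference, cases retained).
    """
    effects = []
    for p_t, y_t in treated:
        weights = [epanechnikov((p_u - p_t) / bandwidth) for p_u, _ in untreated]
        total = sum(weights)
        if total == 0.0:
            continue  # no untreated case within the bandwidth: off support
        y_cf = sum(w * y_u for w, (_, y_u) in zip(weights, untreated)) / total
        effects.append(y_t - y_cf)
    retained = len(effects)
    estimate = sum(effects) / retained if retained else float("nan")
    return estimate, retained


# Hypothetical toy data: (propensity score, outcome)
treated = [(0.30, 1.0), (0.50, 1.2), (0.90, 2.0)]
untreated = [(0.28, 0.5), (0.45, 0.6), (0.55, 0.8)]

results = {bw: kernel_match(treated, untreated, bw) for bw in (0.5, 0.1, 0.03)}
for bw, (estimate, retained) in results.items():
    print(f"bandwidth={bw}: estimate={estimate:.3f}, treated cases retained={retained}")
```

Because the dropped cases are identifiable at each bandwidth, the analyst can describe exactly which kinds of cases the estimate no longer covers.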
10.
Where substantive significance is as important as statistical significance, the standardized bias formula can also be used to estimate an effect size for the treatment effect estimate (see Cohen, 1988).
11.
In practice, Wooldridge (2002) recommends augmenting the regression model in the following way:
$${Y }_{i} = {\alpha }^{{\prime}} + {\beta }^{{\prime}}{T}_{ i} + {\gamma }^{{\prime}}P({x}_{ i}) + {\delta }^{{\prime}}{T}_{ i}\left [P({x}_{i}) -\bar{ P}({x}_{i})\right ] + {e}_{i}^{{\prime}}$$
where $\bar{P}({x}_{i})$ represents the mean propensity score for the target population and ATE is estimated the same way, but by using ${\beta }^{{\prime}}$ in place of $\beta$.
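A minimal sketch of this augmented regression follows, assuming the propensity scores are already in hand. Here they are simulated and known, which is an assumption made purely for illustration; in practice they would be estimated in a first stage. The helper `ols`, the function name, and the simulated coefficients are hypothetical, and the mean propensity score is taken over the sample.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[a] * row[b] for row in X) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(X, y))]
         for a in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def propensity_score_regression(y, t, p):
    """Regress Y on T, P(x), and T * [P(x) - mean P(x)]; return the T coefficient."""
    p_bar = sum(p) / len(p)
    X = [[1.0, ti, pi, ti * (pi - p_bar)] for ti, pi in zip(t, p)]
    alpha, beta, gamma, delta = ols(X, y)
    return beta


# Simulated data with an assumed average effect of 0.5
rng = random.Random(0)
p = [rng.uniform(0.1, 0.9) for _ in range(5000)]
t = [1.0 if rng.random() < pi else 0.0 for pi in p]
y = [1.0 + 0.5 * ti + 2.0 * pi + 1.0 * ti * (pi - 0.5) + rng.gauss(0.0, 1.0)
     for ti, pi in zip(t, p)]

beta_prime = propensity_score_regression(y, t, p)
print(f"estimated ATE (coefficient on T): {beta_prime:.3f}")
```

Centering the interaction term at the mean propensity score is what lets the coefficient on the treatment indicator be read directly as the average effect.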
12.
Nearest neighbor matching can be done with or without replacement. Matching without replacement means that once an untreated case has been matched to a treated case, it is removed from the candidates for matching. This may lead to poor matches when the distribution of propensity scores is quite different for the treated and untreated groups. Matching without replacement also requires that cases be randomly sorted prior to matching, as sort order can affect matches when there are cases with equal propensity scores. Matching with replacement allows an untreated individual to serve as the counterfactual for multiple treated individuals. This allows for better matches, but reduces the number of untreated cases used to create the treatment effect estimate, which increases the variance of the estimate (Smith and Todd 2005). As with the choice of the number of neighbors, one has to balance concerns of bias and efficiency.
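The difference between the two variants can be seen in a short sketch of single-nearest-neighbor matching on the propensity score. The function name, the tie-breaking rule, and the toy scores are assumptions made for illustration; this is not the routine used for the chapter's estimates.

```python
import random


def nearest_neighbor_match(treated, untreated, with_replacement=True, seed=0):
    """Match each treated propensity score to its nearest untreated score.

    Returns a dict mapping treated index -> matched untreated index.
    """
    rng = random.Random(seed)
    order = list(range(len(treated)))
    rng.shuffle(order)                       # random sort order before matching
    available = set(range(len(untreated)))
    matches = {}
    for i in order:
        if not available:
            break                            # no untreated candidates remain
        j = min(available, key=lambda k: (abs(untreated[k] - treated[i]), k))
        matches[i] = j
        if not with_replacement:
            available.remove(j)              # a matched case cannot be reused
    return matches


def mean_distance(matches, treated, untreated):
    return sum(abs(treated[i] - untreated[j]) for i, j in matches.items()) / len(matches)


# Hypothetical scores: treated cases cluster where untreated cases are scarce
treated = [0.70, 0.72, 0.75]
untreated = [0.71, 0.30, 0.20]

reuse = nearest_neighbor_match(treated, untreated, with_replacement=True)
no_reuse = nearest_neighbor_match(treated, untreated, with_replacement=False)
print("with replacement:   ", reuse, round(mean_distance(reuse, treated, untreated), 3))
print("without replacement:", no_reuse, round(mean_distance(no_reuse, treated, untreated), 3))
```

With replacement, every treated case is paired with the same close untreated case, so matches are good but rest on a single untreated observation. Without replacement, two treated cases are forced onto distant matches, which illustrates the bias and efficiency tradeoff described above.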
13.
When there are many cases at the boundaries of the propensity score distribution, it may be useful to generalize kernel matching to include a linear term; this is called local linear matching. Its main advantage over kernel matching is that it yields more accurate estimates at boundary points in the distribution of propensity scores and it deals better with different data densities (Smith and Todd 2005).
14.
Apel et al. (2006, 2007, 2008); Bachman et al. (1981, 2003); Bachman and Schulenberg (1993); Gottfredson (1985); Greenberger et al. (1981); Johnson (2004); McMorris and Uggen (2000); Mihalic and Elliott (1997); Mortimer (2003); Mortimer et al. (1996); Paternoster et al. (2003); Ploeger (1997); Resnick et al. (1997); Safron et al. (2001); Staff and Uggen (2003); Steinberg and Dornbusch (1991); Steinberg et al. (1982, 1993); Tanner and Krahn (1991).
15.
If we select the sample treatment probability as the classification threshold, 71.8 percent of the sample is correctly classified from the model shown in Table 26.1.
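The classification rule in this note can be sketched as follows. The function name and the five-case toy data are hypothetical, and the choice to classify a case as treated only when its predicted probability strictly exceeds the threshold is an assumption.

```python
def percent_correctly_classified(treatment, propensity):
    """Classify cases as treated when the predicted probability exceeds the
    sample treatment probability, then report the percent classified correctly."""
    threshold = sum(treatment) / len(treatment)   # sample treatment probability
    predicted = [1 if p > threshold else 0 for p in propensity]
    hits = sum(1 for t, t_hat in zip(treatment, predicted) if t == t_hat)
    return 100.0 * hits / len(treatment)


# Hypothetical treatment indicators and predicted probabilities
treatment = [0, 1, 0, 1, 0]
propensity = [0.10, 0.50, 0.45, 0.80, 0.20]

pct = percent_correctly_classified(treatment, propensity)
print(f"{pct:.1f} percent correctly classified")
```

Using the sample treatment probability rather than 0.5 as the cutoff keeps the rule from classifying nearly everyone as untreated when treatment is uncommon.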
16.
The sign of the standardized bias is informative. If positive, it signifies that treated youth (i.e., youth who work intensively during the school year) exhibit more of the characteristic being measured than untreated youth. Conversely, if negative, it means that treated youth have less of the measured quality than untreated youth.
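To make the sign convention concrete, the sketch below computes a standardized bias for one covariate. The formula used, 100 times the mean difference divided by the square root of the average of the two group variances, is the common Rosenbaum and Rubin (1985) form and is assumed here to match the one in the chapter, which is not reproduced in these notes. The data values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_bias(treated, untreated):
    """100 * (treated mean - untreated mean) / sqrt of the average group variance."""
    pooled_sd = sqrt((variance(treated) + variance(untreated)) / 2.0)
    return 100.0 * (mean(treated) - mean(untreated)) / pooled_sd


# Hypothetical covariate values for treated and untreated youth
covariate_treated = [1, 2, 3]
covariate_untreated = [0, 1, 2]

bias = standardized_bias(covariate_treated, covariate_untreated)
reversed_bias = standardized_bias(covariate_untreated, covariate_treated)
print(f"treated minus untreated: {bias:.1f}")
print(f"groups swapped:          {reversed_bias:.1f}")
```

A positive value means the treated group has more of the characteristic; swapping the groups flips the sign but not the magnitude.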
17.
If a logistic regression model of substance use is estimated instead, the coefficient for intensive work with no control variables is 0.77 (odds ratio = 2.16), and with control variables is 0.28 (odds ratio = 1.33). Both coefficients are statistically significant at a five-percent level.
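The odds ratios in this note are the exponentiated logit coefficients. As a quick arithmetic check using only the rounded coefficients reported above:

$$\exp(0.77) \approx 2.16, \qquad \exp(0.28) \approx 1.32$$

The second value differs from the reported 1.33 in the last digit, which is presumably because the published odds ratio was computed from the unrounded coefficient.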
18.
Notice that the ATE from standard regression in panel A (b=0.051) is very similar to the ATE from propensity score regression with no trimming in panel B (b=0.054). The similarity is not coincidental. The discrepancy is only due to the fact that the propensity score was estimated from a logistic regression model at the first stage. Had a linear regression model been used instead, the two coefficients would be identical, although the standard errors would differ.
19.
We employ the user-written Stata module -psmatch2- to estimate average treatment effects from the matching models (see Leuven and Sianesi 2003). To obtain the standard error of the ATE, we perform a bootstrap procedure with 100 replications.
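The bootstrap idea can be sketched generically in Python. This is not the -psmatch2- procedure: the estimator below is a plain difference in means standing in for a matching estimator, and the function names and simulated data are assumptions. A fuller version would presumably repeat the propensity score estimation and the matching inside each replication.

```python
import random
from statistics import mean, stdev


def difference_in_means(sample):
    """Stand-in estimator: mean outcome of treated minus mean outcome of untreated."""
    treated = [y for t, y in sample if t == 1]
    untreated = [y for t, y in sample if t == 0]
    return mean(treated) - mean(untreated)


def bootstrap_se(sample, estimator, replications=100, seed=0):
    """Standard deviation of the estimator across resamples drawn with replacement."""
    rng = random.Random(seed)
    estimates = []
    while len(estimates) < replications:
        resample = rng.choices(sample, k=len(sample))
        if {t for t, _ in resample} != {0, 1}:
            continue  # redraw if a resample lacks treated or untreated cases
        estimates.append(estimator(resample))
    return stdev(estimates)


# Simulated (treatment, outcome) pairs with an assumed effect of 1.0
rng = random.Random(1)
sample = [(t, 1.0 * t + rng.gauss(0.0, 1.0)) for t in [0, 1] * 200]

se = bootstrap_se(sample, difference_in_means, replications=100)
print(f"estimate = {difference_in_means(sample):.3f}, bootstrap S.E. = {se:.3f}")
```

With only 100 replications the standard error itself carries some simulation noise, so rerunning with a different seed will move it slightly.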
20.
As a further test of sensitivity, we estimated the ATE of intensive employment on substance use for subsamples with different substance use histories. For this test, we employed single-nearest-neighbor matching with no caliper, although the findings were not sensitive to this choice. Among the 2,740 youth who, at the initial interview, reported never having used illicit substances, ATE=0.084 (S.E.=0.060). Among the 1,927 youth who reported having used at least one type of illicit substance prior to the initial interview, ATE=−0.019 (S.E.=0.046).

References

Apel R, Bushway SD, Brame R, Haviland AM, Nagin DS, Paternoster R (2007) Unpacking the relationship between adolescent employment and antisocial behavior: a matched samples comparison. Criminology 45:67–97
Apel R, Bushway SD, Paternoster R, Brame R, Sweeten G (2008) Using state child labor laws to identify the causal effect of youth employment on deviant behavior and academic achievement. J Quant Criminol 24:337–362
Apel R, Paternoster R, Bushway SD, Brame R (2006) A job isn’t just a job: the differential impact of formal versus informal work on adolescent problem behavior. Crime Delinq 52:333–369
Bachman JG, Johnston LD, O’Malley PM (1981) Smoking, drinking, and drug use among American high school students: correlates and trends, 1975–1979. Am J Public Health 71:59–69
Bachman JG, Safron DJ, Sy SR, Schulenberg JE (2003) Wishing to work: new perspectives on how adolescents’ part-time work intensity is linked to educational disengagement, substance use, and other problem behaviors. Int J Behav Dev 27:301–315
Bachman JG, Schulenberg JE (1993) How part-time work intensity relates to drug use, problem behavior, time use, and satisfaction among high school seniors: are these consequences or merely correlates? Dev Psychol 29:220–235
Banks D, Gottfredson DC (2003) The effects of drug treatment and supervision on time to rearrest among drug treatment court participants. J Drug Issues 33:385–412
Berk RA, Newton PJ (1985) Does arrest really deter wife battery? An effort to replicate the findings of the Minneapolis spouse abuse experiment. Am Sociol Rev 50:253–262
Bingenheimer JB, Brennan RT, Earls FJ (2005) Firearm violence exposure and serious violent behavior. Science 308:1323–1326
Blechman EA, Maurice A, Bueckner B, Helberg C (2000) Can mentoring or skill training reduce recidivism? Observational study with propensity analysis. Prev Sci 1:139–155
Brame R, Bushway SD, Paternoster R, Apel R (2004) Assessing the effect of adolescent employment on involvement in criminal activity. J Contemp Crim Justice 20:236–256
Caldwell M, Skeem J, Salekin R, Rybroek GV (2006) Treatment response of adolescent offenders with psychopathy features: a 2-year follow-up. Crim Justice Behav 33:571–596
Cameron AC, Trivedi PK (2005) Microeconometrics: methods and applications. Cambridge University Press, New York
Cochran WG (1968) The effectiveness of adjustment by subclassification in removing bias in observational studies. Biometrics 24:295–313
Cohen J (1988) Statistical power analysis for the behavioral sciences, 2nd edn. Lawrence Erlbaum, Hillsdale, NJ
Dehejia RH, Wahba S (1999) Causal effects in nonexperimental settings: reevaluating the evaluation of training programs. J Am Stat Assoc 94:1053–1062
Dehejia RH, Wahba S (2002) Propensity score-matching methods for nonexperimental causal studies. Rev Econ Stat 84:151–161
Glueck S, Glueck E (1950) Unraveling juvenile delinquency. The Commonwealth Fund, Cambridge, MA
Gottfredson DC (1985) Youth employment, crime, and schooling: a longitudinal study of a national sample. Dev Psychol 21:419–432
Greenberger E, Steinberg LD, Vaux A (1981) Adolescents who work: health and behavioral consequences of job stress. Dev Psychol 17:691–703
Haviland AM, Nagin DS (2005) Causal inferences with group based trajectory models. Psychometrika 70:1–22
Haviland AM, Nagin DS (2007) Using group-based trajectory modeling in conjunction with propensity scores to improve balance. J Exp Criminol 3:65–82
Haviland AM, Nagin DS, Rosenbaum PR (2007) Combining propensity score matching and group-based trajectory analysis in an observational study. Psychol Methods 12:247–267
Heckman JJ, Hotz VJ (1989) Choosing among alternative nonexperimental methods for estimating the impact of social programs: the case of manpower training. J Am Stat Assoc 84:862–874
Hirano K, Imbens GW, Ridder G (2003) Efficient estimation of average treatment effects using the estimated propensity score. Econometrica 71:1161–1189
Holland PW (1986) Statistics and causal inference. J Am Stat Assoc 81:945–960
Imbens GW (2004) Nonparametric estimation of average treatment effects under exogeneity: a review. Rev Econ Stat 86:4–29
Johnson MK (2004) Further evidence on adolescent employment and substance use: differences by race and ethnicity. J Health Soc Behav 45:187–197
King RD, Massoglia M, MacMillan R (2007) The context of marriage and crime: gender, the propensity to marry, and offending in early adulthood. Criminology 45:33–65
Krebs CP, Strom KJ, Koetse WH, Lattimore PK (2009) The impact of residential and nonresidential drug treatment on recidivism among drug-involved probationers. Crime Delinq 55:442–471
Leeb RT, Barker LE, Strine TW (2007) The effect of childhood physical and sexual abuse on adolescent weapon carrying. J Adolesc Health 40:551–558
Leuven E, Sianesi B (2003) PSMATCH2: Stata module to perform full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing. Available online: http://ideas.repec.org/c/boc/bocode/s432001.html
Li YP, Propert KJ, Rosenbaum PR (2001) Balanced risk set matching. J Am Stat Assoc 96:870–882
Lu B (2005) Propensity score matching with time-dependent covariates. Biometrics 61:721–728
McCaffrey DF, Ridgeway G, Morral AR (2004) Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychol Methods 9:403–425
McMorris BJ, Uggen C (2000) Alcohol and employment in the transition to adulthood. J Health Soc Behav 41:276–294
McNeil DE, Binder RL (2007) Effectiveness of a mental health court in reducing criminal recidivism and violence. Am J Psychiatry 164:1395–1403
Mihalic SW, Elliott DS (1997) Short- and long-term consequences of adolescent work. Youth Soc 28:464–498
Mocan NH, Tekin E (2006) Catholic schools and bad behavior: a propensity score matching analysis. J Econom Anal Policy 5:1–34
Molnar BE, Browne A, Cerda M, Buka SL (2005) Violent behavior by girls reporting violent victimization: a prospective study. Arch Pediatr Adolesc Med 159:731–739
Morgan SL, Harding DJ (2006) Matching estimators of causal effects: prospects and pitfalls in theory and practice. Sociol Methods Res 35:3–60
Mortimer JT (2003) Working and growing up in America. Harvard University Press, Cambridge, MA
Mortimer JT, Finch MD, Ryu S, Shanahan MJ, Call KT (1996) The effects of work intensity on adolescent mental health, achievement, and behavioral adjustment: new evidence from a prospective study. Child Dev 67:1243–1261
Nagin DS (2005) Group-based modeling of development. Harvard University Press, Cambridge, MA
Nieuwbeerta P, Nagin DS, Blokland AAJ (2009) Assessing the impact of first-time imprisonment on offenders’ subsequent criminal career development: A matched samples comparison. J Quant Criminol 25:227–257
Paternoster R, Bushway S, Brame R, Apel R (2003) The effect of teenage employment on delinquency and problem behaviors. Soc Forces 82:297–335
Ploeger M (1997) Youth employment and delinquency: reconsidering a problematic relationship. Criminology 35:659–675
Resnick MD, Bearman PS, Blum RW, Bauman KE, Harris KM, Jones J, Tabor J, Beuhring T, Sieving RE, Shew M, Ireland M, Bearinger LH, Udry JR (1997) Protecting adolescents from harm: findings from the national longitudinal study of adolescent health. J Am Med Assoc 278:823–832
Ridgeway G (2006) Assessing the effect of race bias in post-traffic stop outcomes using propensity scores. J Quant Criminol 22:1–29
Robins JM (1999) Association, causation, and marginal structural models. Synthese 121:151–179
Robins JM, Rotnitzky A (1995) Semiparametric efficiency in multivariate regression models with missing data. J Am Stat Assoc 90:122–129
Robins JM, Mark SD, Newey WK (1992) Estimating exposure effects by modeling the expectation of exposure conditional on confounders. Biometrics 48:479–495
Robins JM, Hernán MÁ, Brumback B (2000) Marginal structural models and causal inference in epidemiology. Epidemiology 11:550–560
Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika 70:41–55
Rosenbaum PR, Rubin DB (1984) Reducing bias in observational studies using subclassification on the propensity score. J Am Stat Assoc 79:516–524
Rosenbaum PR, Rubin DB (1985) Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. Am Stat 39:33–38
Rubin DB (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 66:688–701
Rubin DB (1977) Assignment of treatment group on the basis of a covariate. J Educ Stat 2:1–26
Safron DJ, Schulenberg JE, Bachman JG (2001) Part-time work and hurried adolescence: the links among work intensity, social activities, health behaviors, and substance use. J Health Soc Behav 42:425–449
Sampson RJ, Laub JH, Wimer C (2006) Does marriage reduce crime? A counterfactual approach to within-individual causal effects. Criminology 44:465–508
Smith JA, Todd PE (2005) Does matching overcome LaLonde’s critique of nonexperimental estimators? J Econom 125:305–353
Staff J, Uggen C (2003) The fruits of good work: early work experiences and adolescent deviance. J Res Crime Delinq 40:263–290
Steinberg L, Dornbusch S (1991) Negative correlates of part-time work in adolescence: replication and elaboration. Dev Psychol 27:304–313
Steinberg L, Fegley S, Dornbusch S (1993) Negative impact of part-time work on adolescent adjustment: evidence from a longitudinal study. Dev Psychol 29:171–180
Steinberg LD, Greenberger E, Garduque L, Ruggiero M, Vaux A (1982) Effects of working on adolescent development. Dev Psychol 18:385–395
Sweeten G, Apel R (2007) Incapacitation: revisiting an old question with a new method and new data. J Quant Criminol 23:303–326
Tanner J, Krahn H (1991) Part-time work and deviance among high-school seniors. Can J Sociol 16:281–302
Tita G, Ridgeway G (2007) The impact of gang formation on local patterns of crime. J Res Crime Delinq 44:208–237
Widom CS (1989) The cycle of violence. Science 244:160–166
Wooldridge JM (2002) Econometric analysis of cross section and panel data. MIT Press, Cambridge, MA

Author information

Authors and Affiliations

School of Criminal Justice, University at Albany, State University of New York, Albany, NY, USA
Robert J. Apel
School of Criminology and Criminal Justice, Arizona State University, Scottsdale, AZ, USA
Gary Sweeten

Editor information

Editors and Affiliations

College of Criminology, Florida State University, West Call Street 643, Tallahassee, 32306, U.S.A.
Alex R. Piquero
Inst. Criminology, Hebrew University of Jerusalem, Jerusalem, 91905, Israel
David Weisburd

About this chapter

Cite this chapter

Apel, R.J., Sweeten, G. (2010). Propensity Score Matching in Criminology and Criminal Justice. In: Piquero, A., Weisburd, D. (eds) Handbook of Quantitative Criminology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-77650-7_26

DOI: https://doi.org/10.1007/978-0-387-77650-7_26
Published: 03 December 2009
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-77649-1
Online ISBN: 978-0-387-77650-7
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)
