Abstract
We present a semi-parametric approach to estimating item response functions (IRFs) that is useful when the true IRF does not strictly follow commonly used functions. Our approach replaces the linear predictor of the generalized partial credit model with a monotonic polynomial. At the lowest polynomial order, the model reduces to the regular generalized partial credit model. Our approach extends Liang’s (2007) method for dichotomous item responses to the case of polytomous data. Furthermore, item parameter estimation is implemented with maximum marginal likelihood using the Bock–Aitkin EM algorithm, thereby facilitating multiple group analyses useful in operational settings. Our approach is demonstrated on both educational and psychological data. We present simulation results comparing our approach to more standard IRF estimation approaches and to other non-parametric and semi-parametric alternatives.
Notes
We assume that monotonicity is desirable in many testing situations, where a correct response to an item should always indicate higher (or equal) ability across all regions of the latent trait. However, we note that relaxing the monotonicity constraints may be useful for probing for severe departures from monotonicity or when non-monotonicity is actually predicted.
We implemented this approach using a conventional \(p<.05\) threshold for hypothesis testing, though we note that a Benjamini–Hochberg adjustment is sometimes used in practice to control the false discovery rate in differential item functioning analyses (e.g., see Thissen, Steinberg, & Kuang, 2002).
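For readers unfamiliar with the adjustment mentioned in this note, the following is a minimal sketch of the standard Benjamini–Hochberg step-up rule. It is an illustration only and not the procedure used in the article; the function name `benjamini_hochberg` and the argument `q` (the target false discovery rate) are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Illustrative sketch of the Benjamini-Hochberg step-up rule; names are
    hypothetical and not taken from the article.
    """
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * q.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_k = rank
    # Reject every hypothesis whose rank does not exceed largest_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_k:
            rejected[idx] = True
    return rejected
```

With eight tests, a p-value of .039 would pass an unadjusted \(p<.05\) threshold, but it would not be rejected here unless it also fell below its rank-specific cutoff.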
We thank an anonymous reviewer for suggesting this index for trait recovery. Results for latent trait recovery using \(\hbox {RIMSE}_\theta \) as in Liang (2007) are available from the authors upon request.
References
Abrahamowicz, M., & Ramsay, J. O. (1992). Multicategorical spline model for item response theory. Psychometrika, 57(1), 5–27.
Baker, F. B., & Kim, S.-H. (2004). Item response theory: Parameter estimation techniques (2nd ed.). New York: Marcel Dekker.
Bertsekas, D. P. (1996). Constrained optimization and Lagrange multiplier methods. Belmont, MA: Athena Scientific.
Birnbaum, A. (1968). Some latent trait models. In F. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 395–479). Reading, MA: Addison-Wesley.
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443–459.
Bock, R. D., & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179–197.
Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6, 431–444.
Cai, L. (2010). A two-tier full-information item factor analysis model with applications. Psychometrika, 75, 581–612.
Cai, L., Yang, J. S., & Hansen, M. (2011). Generalized item bifactor analysis. Psychological Methods, 16(3), 221–248.
Duncan, K. A., & MacEachern, S. N. (2008). Nonparametric Bayesian modelling for item response. Statistical Modelling, 8(1), 41–66.
Duncan, K. A., & MacEachern, S. N. (2013). Nonparametric Bayesian modeling of item response curves with a three-parameter logistic prior mean. In M. C. Edwards & R. C. MacCallum (Eds.), Current topics in the theory and application of latent variable models (pp. 108–125). New York, NY: Routledge.
Elphinstone, C. D. (1985). A method of distribution and density estimation. Unpublished doctoral dissertation, University of South Africa.
Hansen, M., Cai, L., Stucky, B. D., Tucker, J. S., Shadel, W. G., & Edelen, M. O. (2014). Methodology for developing and evaluating the PROMIS smoking item banks. Nicotine & Tobacco Research, 16, S175–S189.
Heinzmann, D. (2005). A filtered polynomial approach to density estimation. Unpublished master’s thesis, Institute of Mathematics, University of Zurich.
Heinzmann, D. (2008). A filtered polynomial approach to density estimation. Computational Statistics, 23, 343–360.
Liang, L. (2007). A semi-parametric approach to estimating item response functions. Unpublished doctoral dissertation, Department of Psychology, The Ohio State University.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Mazza, A., Punzo, A., & McGuire, B. (2013). KernSmoothIRT: Non-parametric item response theory. R Package Version 5.0. Retrieved from http://CRAN.R-project.org/package=KernSmoothIRT.
Mislevy, R. J. (1984). Estimating latent distributions. Psychometrika, 49(3), 359–381.
Miyazaki, K., & Hoshino, T. (2009). A Bayesian semiparametric item response model with Dirichlet process priors. Psychometrika, 74(3), 375–393.
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159–176.
Orlando, M., & Thissen, D. (2000). Likelihood-based item-fit indices for dichotomous item response theory models. Applied Psychological Measurement, 24, 50–64.
Qin, L. (1998). Nonparametric Bayesian models for item response data. Unpublished doctoral dissertation, The Ohio State University.
Ramsay, J. O. (1991). Kernel smoothing approaches to nonparametric item characteristic curve estimation. Psychometrika, 56(4), 611–630.
Ramsay, J. O. (2000). TestGraf: A program for the graphical analysis of multiple choice test and questionnaire data [Computer software].
Ramsay, J. O., & Abrahamowicz, M. (1989). Binomial regression with monotone splines: A psychometric application. Journal of the American Statistical Association, 84(408), 906–915.
Ramsay, J. O., & Winsberg, S. (1991). Maximum marginal likelihood estimation for semiparametric item analysis. Psychometrika, 56(3), 365–379.
R Core Team. (2012). R: A language and environment for statistical computing. Vienna, Austria. Retrieved from http://www.R-project.org. ISBN 3-900051-07-0.
Rossi, N., Wang, X., & Ramsay, J. O. (2002). Nonparametric item response estimates with the EM algorithm. Journal of Educational and Behavioral Statistics, 27(3), 291–317.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometric Monographs, 17, 1–100.
Samejima, F. (1977). A method of estimating item characteristic functions using the maximum likelihood estimate of ability. Psychometrika, 42, 163–191.
Samejima, F. (1979). A new family of models for the multiple choice item (Technical Report No. 79–4). Knoxville: University of Tennessee, Department of Psychology.
Samejima, F. (1984). A plausibility function of Iowa Vocabulary Test items estimated by the simple sum procedure of the conditional P.D.F. approach (Technical Report No. 84–1). Knoxville: University of Tennessee, Department of Psychology.
Santor, D. A., Ramsay, J. O., & Zuroff, D. C. (1994). Nonparametric item analyses of the Beck Depression Inventory: Evaluating gender item bias and response option weights. Psychological Assessment, 6(3), 255–270.
Santor, D. A., Zuroff, D. C., Ramsay, J. O., Cervantes, P., & Palacios, J. (1995). Examining scale discriminability in the BDI and CES-D as a function of depressive severity. Psychological Assessment, 7(2), 131–139.
Shadel, W. G., Edelen, M., & Tucker, J. S. (2011). A unified framework for smoking assessment: The PROMIS smoking initiative. Nicotine & Tobacco Research, 13(5), 399–400.
Sijtsma, K., Debets, P., & Molenaar, I. (1990). Mokken scale analysis for polychotomous items: Theory, a computer program and an empirical application. Quality and Quantity, 24, 173–188.
Thissen, D., Cai, L., & Bock, R. D. (2010). The nominal categories item response model. In M. Nering & R. Ostini (Eds.), Handbook of polytomous item response theory models: Developments and applications (pp. 43–75). New York, NY: Taylor & Francis.
Thissen, D., & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 51, 567–577.
Thissen, D., Steinberg, L., & Kuang, D. (2002). Quick and easy implementation of the Benjamini–Hochberg procedure for controlling the false positive rate in multiple comparisons. Journal of Educational and Behavioral Statistics, 27, 77–83.
van der Ark, L. A. (2007). Mokken scale analysis in R. Journal of Statistical Software, 20, 1–19.
Woods, C. M. (2006). Ramsay-curve item response theory (RC-IRT) to detect and correct for nonnormal latent variables. Psychological Methods, 11, 253–270.
Woods, C. M. (2007a). Empirical histograms in item response theory with ordinal data. Educational and Psychological Measurement, 67, 73–87.
Woods, C. M. (2007b). Ramsay curve IRT for Likert-type data. Applied Psychological Measurement, 31(3), 195–212.
Woods, C. M. (2008). Ramsay curve item response theory for the three-parameter item response theory model. Applied Psychological Measurement, 32(6), 447–465.
Woods, C. M., & Lin, N. (2008). Item response theory with estimation of the latent density using Davidian curves. Applied Psychological Measurement, 33(2), 102–117.
Woods, C. M., & Thissen, D. (2006). Item response theory with estimation of the latent population distribution using spline-based densities. Psychometrika, 71, 281–301.
Acknowledgments
This research is supported by a Social Sciences and Humanities Research Council of Canada Post-Doctoral Fellowship awarded to Carl F. Falk. Li Cai’s research is partially supported by Grants from the Institute of Education Sciences (R305B080016 and R305D140046) and Grants from the National Institute on Drug Abuse (R01DA026943 and R01DA030466).
Appendices
Appendix 1: Further Derivations of EM MML Estimation
Recall that the complete data likelihood is \(L(\varvec{\eta },\mu ,\sigma ^2|\mathbf{y}_{i},\theta _i) = f(\mathbf{y}_{i}|\theta _i, \varvec{\eta })\phi (\theta _i|\mu , \sigma ^2)\) for individual \(i\). Thus, an individual’s contribution to the marginal likelihood can be approximated to arbitrary precision as
Using quadrature, the height of the complete data likelihood at quadrature point \(X_q\) can be represented as
This suggests that the ordinate of the posterior \(f(\theta |\mathbf{y}_i,\varvec{\eta }, \mu , \sigma ^2)\) at quadrature point \(X_q\) can be approximated by
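The quadrature steps described above can be sketched numerically. This is a minimal illustration under stated assumptions, not the authors' implementation: the quadrature weights are taken to be normal density heights at the nodes rescaled to sum to one, the likelihood values \(f(\mathbf{y}_i|X_q,\varvec{\eta })\) are supplied by the caller, and all function names are hypothetical.

```python
import math

def normal_pdf(x, mu=0.0, sigma2=1.0):
    """Normal density phi(x | mu, sigma2)."""
    return math.exp(-0.5 * (x - mu) ** 2 / sigma2) / math.sqrt(2.0 * math.pi * sigma2)

def quadrature_weights(nodes, mu=0.0, sigma2=1.0):
    """Assumed weighting scheme: density heights at the nodes, normalized to sum to 1."""
    dens = [normal_pdf(x, mu, sigma2) for x in nodes]
    total = sum(dens)
    return [d / total for d in dens]

def posterior_ordinates(lik_at_nodes, weights):
    """Return (marginal likelihood, posterior ordinates) for one respondent.

    lik_at_nodes[q] holds f(y_i | X_q, eta); weights[q] is the quadrature weight.
    """
    # Height of the complete data likelihood at each quadrature point.
    heights = [l * w for l, w in zip(lik_at_nodes, weights)]
    # Quadrature approximation to the respondent's marginal likelihood.
    marginal = sum(heights)
    # Normalizing the heights gives the approximate posterior at each node.
    return marginal, [h / marginal for h in heights]
```

For example, with nodes placed symmetrically around zero and a single logistic item answered correctly, the approximate marginal probability is 0.5 and the posterior shifts toward positive trait values.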
For the item parameter part, the conditional expectation of \(\log L(\varvec{\eta }|\theta _i,\mathbf{y}_i)\) is
where \(\varvec{\eta }_*\), \(\mu _*\), and \(\sigma _*^2\) are the current/provisional parameter estimates. Using Equation (20), this conditional expectation may be approximated by quadrature as
Summing over all \(N\) individuals and rearranging terms in the summation, the conditional expectation of \(\log L(\varvec{\eta }|\varvec{\theta },\mathbf{Y})\) from Equation (14) is
where \(\bar{r}_{jqc} = \sum _{i=1}^N \chi _c (y_{ij}) \bar{P}_i(X_q)\) is the conditional expected frequency for item \(j\), category \(c\), at quadrature point \(q\). Item parameter estimates are updated in the M-step by treating the expected frequencies as weights and maximizing \(E(\varvec{\eta }|\varvec{\eta }_*,\mu _*,\sigma _*^2)\).
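The accumulation of expected frequencies and their use as weights can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes \(\chi _c(y_{ij})\) is an indicator equal to 1 when person \(i\) responds in category \(c\) on item \(j\), that categories are coded from 0, and that there are no missing responses. The function names are hypothetical.

```python
import math

def expected_frequencies(responses, posteriors, n_categories):
    """Accumulate r[j][q][c], the expected count for item j, node q, category c.

    responses[i][j]  : observed category (0-based) of person i on item j
    posteriors[i][q] : posterior ordinate of person i at quadrature point q
    n_categories[j]  : number of response categories for item j
    """
    n_items = len(responses[0])
    n_nodes = len(posteriors[0])
    r = [[[0.0] * n_categories[j] for _ in range(n_nodes)] for j in range(n_items)]
    for y_i, post_i in zip(responses, posteriors):
        for j, c in enumerate(y_i):
            for q, p in enumerate(post_i):
                # The indicator picks out the observed category only.
                r[j][q][c] += p
    return r

def expected_loglik(r_j, cat_probs_j):
    """Weighted log-likelihood for one item: sum over q and c of r[q][c] * log P(c | X_q)."""
    return sum(
        r_qc * math.log(p_qc)
        for r_q, p_q in zip(r_j, cat_probs_j)
        for r_qc, p_qc in zip(r_q, p_q)
        if r_qc > 0.0
    )
```

Because each person's posterior ordinates sum to one, the expected frequencies for any item sum to \(N\), which gives a quick sanity check on an E-step.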
The distributional parameters for the latent trait, \(\mu \) and \(\sigma ^2\), if not fixed in a multiple group analysis, can also be estimated by calculating the mean and variance of the expected counts (e.g., see Baker & Kim, 2004). This is possible because the conditional expectations of the linear sufficient statistics may also be approximated via quadrature and the M-step is closed-form. For example, the conditional expectation of \(\sum _{i=1}^N \theta _i\) is
where \(\bar{r}_q = \sum _{i=1}^N \bar{P}_i(X_q)\) is the conditional expected frequency at quadrature point \(q\). An updated latent variable mean estimate is therefore \(N^{-1} \sum _{q=1}^Q \bar{r}_q X_q \). For the variance parameter, the conditional expectation of \(\sum _{i=1}^N \theta _i^2\) is
An updated variance estimate is found by \(N^{-1} \sum _{q=1}^Q \bar{r}_q X_q^2 - \left( N^{-1} \sum _{q=1}^Q \bar{r}_q X_q \right) ^2\).
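The closed-form updates for the latent mean and variance stated above can be written out directly. The sketch below assumes the posterior ordinates \(\bar{P}_i(X_q)\) have already been computed in the E-step; the function name is hypothetical.

```python
def update_latent_moments(posteriors, nodes):
    """Closed-form M-step update of the latent mean and variance.

    posteriors[i][q] : posterior ordinate of person i at quadrature point q
    nodes[q]         : quadrature point X_q
    """
    n = len(posteriors)
    # Expected frequency at each quadrature point, summed over respondents.
    r = [sum(post_i[q] for post_i in posteriors) for q in range(len(nodes))]
    # Mean: N^{-1} * sum_q r_q X_q.
    mean = sum(r_q * x for r_q, x in zip(r, nodes)) / n
    # Second moment: N^{-1} * sum_q r_q X_q^2.
    second_moment = sum(r_q * x * x for r_q, x in zip(r, nodes)) / n
    # Variance: second moment minus the squared mean.
    return mean, second_moment - mean ** 2
```

In a multiple group analysis, one would presumably apply such an update only to groups whose distributional parameters are free, leaving the reference group fixed for identification.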
Appendix 2: Complete Data Derivatives for the GPC-MP Model
The complete data log-likelihood for a single GPC-MP item \(j\) is
where \(P(c | \theta _i, \varvec{\xi }_j, \omega _j, \varvec{\alpha }_j, \varvec{\tau }_j)\) is the response function for item \(j\) under the GPC-MP model. Differentiating with respect to a typical parameter, \(\eta _t\), leads to
where \(m_{ij}^* = m_j^*(\theta _i, \omega _j, \varvec{\alpha }_j, \varvec{\tau }_j)\) is short-hand for the monotonic polynomial without the intercept parameter, and \(P_{iju} = P(u|\theta _i, \varvec{\xi }_j, \omega _j, \varvec{\alpha }_j, \varvec{\tau }_j)\) is short-hand for person \(i\)’s probability of responding to category \(u\) on item \(j\) under the GPC-MP model. Of course, \(\frac{\partial m_{ij}^*}{\partial \xi _{jv}} = 0\), and \(\frac{\partial \xi _{jv}}{\partial \eta _t}\) is 1 when differentiating with respect to \(\xi _{jv}\) and 0 otherwise. The derivatives of \(m_{ij}^*\) for the parameters \(\omega _j\), \(\alpha _{js}\), and \(\tau _{js}\) are simply the following and can be substituted into the above equation:
in which \(s=1,2,\ldots , k\), the \(\mathbf{T}\) matrices are specific to item \(j\), and \(\frac{\partial m_{ij}^*}{\partial \mathbf{a}}\) is the vector:
The derivatives of the matrices \(\mathbf{T}\) with respect to \(\alpha _{js}\) and \(\tau _{js}\) have the form
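To make the role of \(\omega _j\), \(\alpha _{js}\), and \(\tau _{js}\) concrete, the sketch below builds a monotonic polynomial without an intercept. It assumes a parameterization in which the first derivative is \(\exp (\omega )\prod _{s=1}^k \left( 1 - 2\alpha _s\theta + (\alpha _s^2 + \exp (\tau _s))\theta ^2\right) \); this form is an assumption, not taken from this appendix. Each quadratic factor has a negative discriminant, so the derivative is strictly positive. The code multiplies the factors directly rather than using the \(\mathbf{T}\) matrices, and all function names are hypothetical.

```python
import math

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out

def mp_coefficients(omega, alphas, taus):
    """Coefficients of theta^1, ..., theta^(2k+1) for the polynomial without intercept."""
    # Start from the positive constant exp(omega), then multiply in k quadratics.
    deriv = [math.exp(omega)]
    for alpha, tau in zip(alphas, taus):
        deriv = poly_mul(deriv, [1.0, -2.0 * alpha, alpha ** 2 + math.exp(tau)])
    # Integrate the derivative term by term; the constant of integration is omitted.
    return [a / (power + 1) for power, a in enumerate(deriv)]

def mp_value(theta, coefs):
    """Evaluate the polynomial (no intercept) at theta."""
    return sum(b * theta ** (power + 1) for power, b in enumerate(coefs))
```

With \(k=0\) the polynomial is simply \(\exp (\omega )\theta \), a linear predictor with a positive slope, which matches the statement that the lowest order recovers the regular generalized partial credit model.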
For computing the Hessian, we used the following cross-product of gradients approximation, in which derivative vectors are computed for each individual and their outer-products are accumulated (e.g., Bock & Lieberman, 1970): \(\sum _{i=1}^N \left( \frac{\partial l_j(\varvec{\eta }_j|\theta _i, y_{ij})}{\partial \varvec{\eta }_j} \right) \left( \frac{\partial l_j(\varvec{\eta }_j|\theta _i, y_{ij})}{\partial \varvec{\eta }_j}\right) ^{\prime }\). The above derivatives can be easily adapted for the M-step computations in EM MML estimation by summing over the \(Q\) quadrature points instead of over the \(N\) individuals, and by treating the expected frequencies \(\bar{r}_{jqc}\) at each quadrature point as weights.
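The cross-product accumulation described above can be sketched in a few lines. This is an illustration only: the gradient vectors are assumed to be supplied by the caller, and the optional `weights` argument is a hypothetical way to represent the expected frequencies when summing over quadrature points and categories in the M-step. The accumulated matrix approximates the information matrix, that is, the negative of the Hessian of the log-likelihood.

```python
def cross_product_information(gradients, weights=None):
    """Accumulate the (optionally weighted) sum of outer products g g'.

    gradients : list of gradient vectors, one per individual, or one per
                quadrature-point/category combination in the M-step
    weights   : optional list of non-negative weights, e.g. expected frequencies
    """
    if weights is None:
        weights = [1.0] * len(gradients)
    n_par = len(gradients[0])
    info = [[0.0] * n_par for _ in range(n_par)]
    for g, w in zip(gradients, weights):
        for a in range(n_par):
            for b in range(n_par):
                # Outer product of the gradient with itself, scaled by the weight.
                info[a][b] += w * g[a] * g[b]
    return info
```

The result is symmetric and positive semi-definite by construction, which is convenient when it is used in place of an analytic second-derivative matrix in a Newton-type update.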
Falk, C.F., Cai, L. Maximum Marginal Likelihood Estimation of a Monotonic Polynomial Generalized Partial Credit Model with Applications to Multiple Group Analysis. Psychometrika 81, 434–460 (2016). https://doi.org/10.1007/s11336-014-9428-7
DOI: https://doi.org/10.1007/s11336-014-9428-7