Abstract
The \(\theta \) metric in item response theory is often not the most useful metric for score reporting or interpretation. In this paper, I demonstrate that the filtered monotonic polynomial (FMP) item response model, a recently proposed nonparametric item response model (Liang & Browne in J Educ Behav Stat 40:5–34, 2015), can be used to specify item response models on metrics other than the \(\theta \) metric. Specifically, I demonstrate that any item response function (IRF) defined within the FMP framework can be re-expressed as another FMP IRF by taking monotonic transformations of the latent trait. I derive the item parameter transformations that correspond to both linear and nonlinear transformations of the latent trait metric. These item parameter transformations can be used to define an item response model based on any monotonic transformation of the \(\theta \) metric, so long as the metric transformation is approximated by a monotonic polynomial. I demonstrate this result by defining an item response model directly on the approximate true score metric and discuss the implications of metric transformations for applied testing situations.
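The closure property stated in the abstract can be checked numerically in the simplest (linear) case. The sketch below is not taken from the paper or from the flexmet package. It assumes a logistic FMP IRF of the form \(P(\theta ) = 1/(1+\exp (-m(\theta )))\) with \(m\) a monotonic polynomial, and it assumes the metric transformation is written as \(\theta = h(\theta ^{*})\). The item coefficients, the transformation coefficients, and all function names are hypothetical and chosen only for illustration.

```python
import math

def poly_eval(coefs, x):
    """Evaluate a polynomial (ascending coefficients) with Horner's rule."""
    result = 0.0
    for c in reversed(coefs):
        result = result * x + c
    return result

def poly_mul(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_compose(outer, inner):
    """Return the ascending coefficients of outer(inner(x))."""
    result = [outer[-1]]
    for c in reversed(outer[:-1]):
        result = poly_mul(result, inner)
        result[0] += c
    return result

def irf(coefs, theta):
    """Logistic IRF whose exponent is the polynomial m(theta)."""
    return 1.0 / (1.0 + math.exp(-poly_eval(coefs, theta)))

# Hypothetical item: m(theta) = -0.5 + 1.2*theta + 0.3*theta^3.
# Its derivative, 1.2 + 0.9*theta^2, is positive everywhere, so m is monotonic.
m_theta = [-0.5, 1.2, 0.0, 0.3]

# Hypothetical linear metric transformation: theta = 0.5 + 2*theta_star.
h = [0.5, 2.0]

# Item parameters on the theta_star metric: coefficients of m(h(theta_star)).
m_star = poly_compose(m_theta, h)

# The two parameterizations give the same response probabilities.
for theta_star in (-2.0, -0.5, 0.0, 1.0, 2.5):
    p_new = irf(m_star, theta_star)
    p_old = irf(m_theta, poly_eval(h, theta_star))
    print(f"theta*={theta_star:5.2f}  P_new={p_new:.6f}  P_old={p_old:.6f}")
```

Because the composition of a polynomial with an increasing linear function is again a monotonic polynomial of the same degree, the transformed item is still an FMP item in this sketch. A nonlinear monotonic polynomial \(h\) can be passed to the same hypothetical `poly_compose` helper, in which case the degree of the resulting polynomial increases.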
Notes
Because \(h^{-1}\) is a strictly monotonic function, it is guaranteed to have an inverse, and thus the function \(h\) is also strictly monotonic and invertible. The inverse transformation, \(h^{-1}\), is used in the current definition for notational consistency.
When operating within the Rasch tradition of item response theory, the scaling of the latent variable is typically not considered arbitrary due to the existence of sufficient statistics and the model’s close connection to additive conjoint measurement (Perline, Wright, & Wainer, 1979). These features are not shared by other item response models. From these facts, it has been argued that the latent trait under Rasch modeling is on an interval-level metric. A discussion of the merits and limitations of this line of reasoning is beyond the scope of this paper, but the interested reader may consult Kyngdon (2008) and the replies to his article that were published in the same journal issue.
Ramsay and Wiberg (2017) approach the problem of specifying an item response model on a sum score-like metric using nonparametric item calibration. In the example given in this paper, I assume that an item response model has already been fitted. The merits of Ramsay and Wiberg’s method notwithstanding, my example is intended to illustrate a general method that can be used to transform already fitted models to any metric that is monotonically related to \(\theta \). Although beyond the scope of this paper, I believe that the methods described in this paper could be adapted to allow for linking among different calibrations specified under Ramsay and Wiberg’s method.
References
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csaki (Eds.), Second international symposium on information theory (pp. 267–281). Budapest: Akademiai Kiado.
Bock, R. D., Thissen, D., & Zimowski, M. F. (1997). IRT estimation of domain scores. Journal of Educational Measurement, 34, 197–211.
Butcher, J. N., Williams, C. L., Graham, J. R., Archer, R. P., Tellegen, A., Ben-Porath, Y. S., et al. (1992). MMPI-A (Minnesota Multiphasic Personality Inventory-Adolescent): Manual for administration, scoring, and interpretation. Minneapolis: University of Minnesota Press.
Elphinstone, C. D. (1983). A target distribution model for nonparametric density estimation. Communications in Statistics - Theory and Methods, 12, 161–198.
Elphinstone, C. D. (1985). A method of distribution and density estimation (Unpublished dissertation). Pretoria: University of South Africa.
Embretson, S. E. (1996). The new rules of measurement. Psychological Assessment, 8, 341–349.
Falk, C. F., & Cai, L. (2016a). Maximum marginal likelihood estimation of a monotonic polynomial generalized partial credit model with applications to multiple group analysis. Psychometrika, 81, 434–460.
Falk, C. F., & Cai, L. (2016b). Semiparametric item response functions in the context of guessing. Journal of Educational Measurement, 53, 229–247.
Feuerstahler, L. M. (2018). flexmet: Flexible latent trait metrics using the filtered monotonic polynomial item response model. R package version 1.0.0.0.
Haebara, T. (1980). Equating logistic ability scales by a weighted least squares method. Japanese Psychological Research, 22, 144–149.
Hawkins, D. M. (1994). Fitting monotonic polynomials to data. Computational Statistics, 9, 233–247.
Kolen, M. J. (1988). Defining score scales in relation to measurement error. Journal of Educational Measurement, 25, 97–110.
Kolen, M. J., & Brennan, R. L. (2014). Test equating, scaling, and linking (3rd ed.). New York: Springer.
Kyngdon, A. (2008). The Rasch model from the perspective of the representational theory of measurement. Theory & Psychology, 18, 89–109.
Liang, L. (2007). A semi-parametric approach to estimating item response functions (Unpublished doctoral dissertation). Columbus, OH: The Ohio State University.
Liang, L., & Browne, M. W. (2015). A quasi-parametric method for fitting flexible item response functions. Journal of Educational and Behavioral Statistics, 40, 5–34.
Lord, F. M. (1974). The relative efficiency of two tests as a function of ability level. Psychometrika, 39, 351–358.
Lord, F. M. (1975). The ‘ability’ scale in item characteristic curve theory. Psychometrika, 40, 205–217.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
Mokken, R. J. (1971). A theory and procedure of scale analysis with applications in political research. The Hague: Mouton.
Murray, K., Müller, S., & Turlach, B. A. (2013). Revisiting fitting monotone polynomials to data. Computational Statistics, 28, 1989–2005.
Murray, K., Müller, S., & Turlach, B. A. (2016). Fast and flexible methods for monotone polynomial fitting. Journal of Statistical Computation and Simulation, 86, 2946–2966.
Perline, R., Wright, B. D., & Wainer, H. (1979). The Rasch model as additive conjoint measurement. Applied Psychological Measurement, 3, 237–255.
Ramsay, J. O. (1991). Kernel smoothing approaches to nonparametric item characteristic curve estimation. Psychometrika, 56, 611–630.
Ramsay, J. O., & Wiberg, M. (2017). A strategy for replacing sum scoring. Journal of Educational and Behavioral Statistics, 42, 282–307.
Ramsay, J. O., & Winsberg, S. (1991). Maximum marginal likelihood estimation for semiparametric item analysis. Psychometrika, 56, 365–379.
R Core Team. (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Retrieved from https://www.R-project.org.
Reise, S. P., & Waller, N. G. (2003). How many IRT parameters does it take to model psychopathology items? Psychological Methods, 8, 164–184.
Schulz, E. M., & Nicewander, W. A. (1997). Grade equivalent and IRT representations of growth. Journal of Educational Measurement, 34, 315–331.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680.
Stocking, M. L. (1996). An alternative method for scoring adaptive tests. Journal of Educational and Behavioral Statistics, 21, 365–389.
Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7, 201–210.
Tadikamalla, P. R. (1980). On simulating non-normal distributions. Psychometrika, 45, 273–279.
Turlach, B., & Murray, K. (2016). MonoPoly: Functions to fit monotone polynomials. R package version 0.3-8.
van der Linden, W. J., & Barrett, M. D. (2016). Linking item response model parameters. Psychometrika, 81, 650–673.
Weiss, D. J. (1982). Improving measurement quality and efficiency with adaptive testing. Applied Psychological Measurement, 6, 473–492.
Yi, Q., Wang, T., & Ban, J.-C. (2001). Effects of scale transformation and test-termination rule on the precision of ability estimation in computerized adaptive testing. Journal of Educational Measurement, 38, 267–292.
Zwick, R. (1992). Statistical and psychometric issues in the measurement of educational achievement trends: Examples from the National Assessment of Educational Progress. Journal of Educational Statistics, 17, 205–218.
Cite this article
Feuerstahler, L.M. Metric Transformations and the Filtered Monotonic Polynomial Item Response Model. Psychometrika 84, 105–123 (2019). https://doi.org/10.1007/s11336-018-9642-9
DOI: https://doi.org/10.1007/s11336-018-9642-9