Metric Transformations and the Filtered Monotonic Polynomial Item Response Model

Feuerstahler, Leah M.

doi:10.1007/s11336-018-9642-9

Published: 09 November 2018

Psychometrika, Volume 84, pages 105–123 (2019)

Leah M. Feuerstahler (ORCID: orcid.org/0000-0002-7001-8519)

Abstract

The \(\theta \) metric in item response theory is often not the most useful metric for score reporting or interpretation. In this paper, I demonstrate that the filtered monotonic polynomial (FMP) item response model, a recently proposed nonparametric item response model (Liang & Browne in J Educ Behav Stat 40:5–34, 2015), can be used to specify item response models on metrics other than the \(\theta \) metric. Specifically, I demonstrate that any item response function (IRF) defined within the FMP framework can be re-expressed as another FMP IRF by taking monotonic transformations of the latent trait. I derive the item parameter transformations that correspond to both linear and nonlinear transformations of the latent trait metric. These item parameter transformations can be used to define an item response model based on any monotonic transformation of the \(\theta \) metric, so long as the metric transformation is approximated by a monotonic polynomial. I demonstrate this result by defining an item response model directly on the approximate true score metric and discuss the implications of metric transformations for applied testing situations.
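The closure property described in the abstract can be sketched numerically. The example below is a minimal illustration, not the author's implementation: it assumes the usual FMP form \(P(\theta ) = 1/(1+\exp (-m(\theta )))\) with \(m\) a monotonic polynomial, and it writes the metric change as \(\theta = t(\theta ^{*})\) for a monotonic polynomial \(t\). The item coefficients, the two maps, and the helper names `compose` and `irf` are all hypothetical. Because \(m(t(\theta ^{*}))\) is again a monotonic polynomial, the transformed item is still an FMP item.

```python
import numpy as np
from numpy.polynomial.polynomial import polyval


def compose(outer, inner):
    """Ascending-power coefficients of outer(inner(x)), built by Horner's rule."""
    result = np.array([outer[-1]], dtype=float)
    for c in outer[-2::-1]:
        # Multiply the running polynomial by inner, then add the next coefficient.
        result = np.convolve(result, inner)
        result[0] += c
    return result


def irf(coefs, theta):
    """Assumed FMP item response function: logistic of a polynomial in theta."""
    return 1.0 / (1.0 + np.exp(-polyval(theta, coefs)))


# Hypothetical item: m(theta) = 0.5 + 1.2*theta + 0.3*theta^3 (derivative > 0 everywhere).
item = [0.5, 1.2, 0.0, 0.3]

# Hypothetical linear metric change: theta = 1 + 2*theta_star.
linear_map = [1.0, 2.0]
new_linear = compose(item, linear_map)
# m(1 + 2x) = 2.0 + 4.2x + 3.6x^2 + 2.4x^3

# Hypothetical nonlinear monotonic change: theta = theta_star + 0.1*theta_star^3.
cubic_map = [0.0, 1.0, 0.0, 0.1]
new_cubic = compose(item, cubic_map)  # degree 3 * 3 = 9, so 10 coefficients
```

Evaluating `irf(new_cubic, x)` at any point `x` on the new metric gives the same probability as `irf(item, polyval(x, cubic_map))` on the original metric, so only the item parameters change, not the model's predictions.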

Notes

Because \(h^{-1}\) is a strictly monotonic function, it is guaranteed to have an inverse, and thus the function h is also strictly monotonic and invertible. The inverse transformation, \(h^{-1}\), is used in the current definition for notational consistency.
When operating within the Rasch tradition of item response theory, the scaling of the latent variable is typically not considered arbitrary due to the existence of sufficient statistics and the model’s close connection to additive conjoint measurement (Perline, Wright, & Wainer, 1979). These features are not shared by other item response models. From these facts, it has been argued that the latent trait under Rasch modeling is on an interval-level metric. A discussion of the merits and limitations of this line of reasoning is beyond the scope of this paper, but the interested reader may consult Kyngdon (2008) and the replies to his article that were published in the same journal issue.
Ramsay and Wiberg (2017) approach the problem of specifying an item response model on a sum score-like metric using nonparametric item calibration. In the example given in this paper, I assume that an item response model has already been fitted. The merits of Ramsay and Wiberg’s method notwithstanding, my example is intended to illustrate a general method that can be used to transform already fitted models to any metric that is monotonically related to \(\theta \). Although beyond the scope of this paper, I believe that the methods described in this paper could be adapted to allow for linking among different calibrations specified under Ramsay and Wiberg’s method.

References

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csáki (Eds.), Second international symposium on information theory (pp. 267–281). Budapest: Akadémiai Kiadó.
Bock, R. D., Thissen, D., & Zimowski, M. F. (1997). IRT estimation of domain scores. Journal of Educational Measurement, 34, 197–211.
Butcher, J. N., Williams, C. L., Graham, J. R., Archer, R. P., Tellegen, A., Ben-Porath, Y. S., et al. (1992). MMPI-A (Minnesota Multiphasic Personality Inventory-Adolescent): Manual for administration, scoring, and interpretation. Minneapolis: University of Minnesota Press.
Elphinstone, C. D. (1983). A target distribution model for nonparametric density estimation. Communications in Statistics - Theory and Methods, 12, 161–198.
Elphinstone, C. D. (1985). A method of distribution and density estimation (Unpublished dissertation). Pretoria: University of South Africa.
Embretson, S. E. (1996). The new rules of measurement. Psychological Assessment, 8, 341–349.
Falk, C. F., & Cai, L. (2016a). Maximum marginal likelihood estimation of a monotonic polynomial generalized partial credit model with applications to multiple group analysis. Psychometrika, 81, 434–460.
Falk, C. F., & Cai, L. (2016b). Semiparametric item response functions in the context of guessing. Journal of Educational Measurement, 53, 229–247.
Feuerstahler, L. M. (2018). flexmet: Flexible latent trait metrics using the filtered monotonic polynomial item response model. R package version 1.0.0.0.
Haebara, T. (1980). Equating logistic ability scales by a weighted least squares method. Japanese Psychological Research, 22, 144–149.
Hawkins, D. M. (1994). Fitting monotonic polynomials to data. Computational Statistics, 9, 233–247.
Kolen, M. J. (1988). Defining score scales in relation to measurement error. Journal of Educational Measurement, 25, 97–110.
Kolen, M. J., & Brennan, R. L. (2014). Test equating, scaling, and linking (3rd ed.). New York: Springer.
Kyngdon, A. (2008). The Rasch model from the perspective of the representational theory of measurement. Theory & Psychology, 18, 89–109.
Liang, L. (2007). A semi-parametric approach to estimating item response functions (Unpublished doctoral dissertation). Columbus, OH: The Ohio State University.
Liang, L., & Browne, M. W. (2015). A quasi-parametric method for fitting flexible item response functions. Journal of Educational and Behavioral Statistics, 40, 5–34.
Lord, F. M. (1974). The relative efficiency of two tests as a function of ability level. Psychometrika, 39, 351–358.
Lord, F. M. (1975). The ‘ability’ scale in item characteristic curve theory. Psychometrika, 40, 205–217.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
Mokken, R. J. (1971). A theory and procedure of scale analysis with applications in political research. The Hague: Mouton.
Murray, K., Müller, S., & Turlach, B. A. (2013). Revisiting fitting monotone polynomials to data. Computational Statistics, 28, 1989–2005.
Murray, K., Müller, S., & Turlach, B. A. (2016). Fast and flexible methods for monotone polynomial fitting. Journal of Statistical Computation and Simulation, 86, 2946–2966.
Perline, R., Wright, B. D., & Wainer, H. (1979). The Rasch model as additive conjoint measurement. Applied Psychological Measurement, 3, 237–255.
Ramsay, J. O. (1991). Kernel smoothing approaches to nonparametric item characteristic curve estimation. Psychometrika, 56, 611–630.
Ramsay, J. O., & Wiberg, M. (2017). A strategy for replacing sum scoring. Journal of Educational and Behavioral Statistics, 42, 282–307.
Ramsay, J. O., & Winsberg, S. (1991). Maximum marginal likelihood estimation for semiparametric item analysis. Psychometrika, 56, 365–379.
R Core Team. (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Retrieved from https://www.R-project.org.
Reise, S. P., & Waller, N. G. (2003). How many IRT parameters does it take to model psychopathology items? Psychological Methods, 8, 164–184.
Schulz, E. M., & Nicewander, W. A. (1997). Grade equivalent and IRT representations of growth. Journal of Educational Measurement, 34, 315–331.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680.
Stocking, M. L. (1996). An alternative method for scoring adaptive tests. Journal of Educational and Behavioral Statistics, 21, 365–389.
Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7, 201–210.
Tadikamalla, P. R. (1980). On simulating non-normal distributions. Psychometrika, 45, 273–279.
Turlach, B., & Murray, K. (2016). MonoPoly: Functions to fit monotone polynomials. R package version 0.3-8.
van der Linden, W. J., & Barrett, M. D. (2016). Linking item response model parameters. Psychometrika, 81, 650–673.
Weiss, D. J. (1982). Improving measurement quality and efficiency with adaptive testing. Applied Psychological Measurement, 6, 473–492.
Yi, Q., Wang, T., & Ban, J.-C. (2001). Effects of scale transformation and test-termination rule on the precision of ability estimation in computerized adaptive testing. Journal of Educational Measurement, 38, 267–292.
Zwick, R. (1992). Statistical and psychometric issues in the measurement of educational achievement trends: Examples from the National Assessment of Educational Progress. Journal of Educational Statistics, 17, 205–218.

Author information

Authors and Affiliations

Fordham University, New York City, USA
Leah M. Feuerstahler

Authors

Leah M. Feuerstahler
Corresponding author

Correspondence to Leah M. Feuerstahler.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 171 KB)

About this article

Cite this article

Feuerstahler, L.M. Metric Transformations and the Filtered Monotonic Polynomial Item Response Model. Psychometrika 84, 105–123 (2019). https://doi.org/10.1007/s11336-018-9642-9

Received: 23 June 2017
Published: 09 November 2018
Issue Date: 15 March 2019
DOI: https://doi.org/10.1007/s11336-018-9642-9
