Published: 7 April 2022

A comparison of methods to address item non-response when testing for differential item functioning in multidimensional patient-reported outcome measures

Authors: Olawale F. Ayilara, Tolulope T. Sajobi, Ruth Barclay, Eric Bohm, Mohammad Jafari Jozani, Lisa M. Lix

Published in: Quality of Life Research | Issue 9/2022

Abstract

Purpose

Item non-response (i.e., missing data) may mask the detection of differential item functioning (DIF) in patient-reported outcome measures or result in biased DIF estimates. Non-response can be challenging to address in ordinal data. We investigated an unsupervised machine-learning method for ordinal item-level imputation and compared it with commonly used item non-response methods when testing for DIF.

Methods

Computer simulation and real-world data were used to assess several item non-response methods using the item response theory likelihood ratio test for DIF. The methods included: (a) list-wise deletion (LD), (b) half-mean imputation (HMI), (c) full information maximum likelihood (FIML), and (d) non-negative matrix factorization (NNMF), which adopts a machine-learning approach to impute missing values. Control of Type I error rates was evaluated using a liberal robustness criterion for α = 0.05 (i.e., 0.025–0.075). Statistical power was assessed with and without adoption of an item non-response method; differences > 10% were considered substantial.
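
As a rough illustration of how matrix factorization can impute ordinal item responses, the following Python sketch fits a non-negative factorization to the observed entries only and fills the gaps from the low-rank reconstruction. It is not the authors' implementation: the function name, the rank, the multiplicative-update rule, the rounding to the nearest response category, the 1 to 5 response scale, and the demo data are all assumptions made for this example.

```python
import numpy as np


def nnmf_impute(X, rank=2, n_iter=500, min_cat=1, max_cat=5, seed=0, eps=1e-9):
    """Impute NaN entries of a respondent-by-item matrix with masked NMF.

    Hypothetical helper: minimizes the squared error between X and W @ H
    over observed cells only, using multiplicative updates that keep
    W and H non-negative.
    """
    X = np.asarray(X, dtype=float)
    mask = ~np.isnan(X)              # True where a response was observed
    X0 = np.where(mask, X, 0.0)      # zero-fill; these cells carry no weight
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W = rng.uniform(0.1, 1.0, size=(n, rank))
    H = rng.uniform(0.1, 1.0, size=(rank, p))

    for _ in range(n_iter):
        WH = mask * (W @ H)          # reconstruction restricted to observed cells
        W *= (X0 @ H.T) / (WH @ H.T + eps)
        WH = mask * (W @ H)
        H *= (W.T @ X0) / (W.T @ WH + eps)

    # Assumption: map the continuous reconstruction back to the ordinal
    # scale by rounding and clipping to the valid response categories.
    fitted = np.clip(np.rint(W @ H), min_cat, max_cat)
    return np.where(mask, X, fitted)  # observed responses are left untouched


# Made-up 5-respondent, 4-item example on a 1-5 scale.
X_demo = np.array([
    [1, 2, np.nan, 1],
    [2, 2, 1, np.nan],
    [4, np.nan, 5, 4],
    [5, 4, 4, 5],
    [np.nan, 5, 5, 4],
], dtype=float)
observed = ~np.isnan(X_demo)
imputed = nnmf_impute(X_demo)
print(imputed)
```

Only the missing cells change; the completed matrix can then be passed to whichever DIF procedure is being used.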

Results

Type I error rates for detecting DIF using LD, FIML and NNMF methods were controlled within the bounds of the robustness criterion for > 95% of simulation conditions, although the NNMF occasionally resulted in inflated rates. The HMI method always resulted in inflated error rates with 50% missing data. Differences in power to detect moderate DIF effects for LD, FIML and NNMF methods were substantial with 50% missing data and otherwise insubstantial.
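
The robustness criterion referred to above can be checked mechanically. The sketch below assumes the liberal bounds are 0.5α to 1.5α, which gives 0.025 to 0.075 for α = 0.05, and treats the bounds as inclusive; the function names and the p-values are hypothetical and do not come from the study.

```python
def empirical_type_i_error(p_values, alpha=0.05):
    """Share of replications under the null (no DIF) that reject at alpha."""
    rejections = sum(1 for p in p_values if p < alpha)
    return rejections / len(p_values)


def within_liberal_criterion(rate, alpha=0.05):
    """True if the empirical rate lies in [0.5 * alpha, 1.5 * alpha]."""
    return 0.5 * alpha <= rate <= 1.5 * alpha


# Made-up p-values from 20 null replications of a single simulation condition.
null_p_values = [
    0.01, 0.20, 0.50, 0.33, 0.80, 0.12, 0.64, 0.91, 0.45, 0.27,
    0.73, 0.58, 0.09, 0.36, 0.88, 0.15, 0.49, 0.67, 0.22, 0.95,
]
rate = empirical_type_i_error(null_p_values)
print(rate, within_liberal_criterion(rate))
```

A condition whose rate falls above the upper bound would be counted as inflated, and one below the lower bound as conservative.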

Conclusion

The NNMF method demonstrated comparable performance to commonly used non-response methods. This computationally efficient method represents a promising approach to address item-level non-response when testing for DIF.
Metadata
Title
A comparison of methods to address item non-response when testing for differential item functioning in multidimensional patient-reported outcome measures
Authors
Olawale F. Ayilara
Tolulope T. Sajobi
Ruth Barclay
Eric Bohm
Mohammad Jafari Jozani
Lisa M. Lix
Publication date
7 April 2022
Publisher
Springer International Publishing
Published in
Quality of Life Research / Issue 9/2022
Print ISSN: 0962-9343
Electronic ISSN: 1573-2649
DOI
https://doi.org/10.1007/s11136-022-03129-8