
11-12-2024

Investigating item response theory model performance in the context of evaluating clinical outcome assessments in clinical trials

Authors: Nicolai D. Ayasse, Cheryl D. Coon

Published in: Quality of Life Research


Abstract

Purpose

Item response theory (IRT) models are an increasingly popular method choice for evaluating clinical outcome assessments (COAs) for use in clinical trials. Given common constraints in clinical trial design, such as limits on sample size and assessment lengths, the current study aimed to examine the appropriateness of commonly used polytomous IRT models, specifically the graded response model (GRM) and partial credit model (PCM), in the context of how they are frequently used for psychometric evaluation of COAs in clinical trials.
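To make the contrast between the two models concrete, here is a minimal NumPy sketch of their category-probability functions: the GRM uses cumulative (graded) logits with an item-specific slope, while the PCM is a Rasch-family model with adjacent-category step difficulties and a common slope fixed at 1. The parameter names (`a`, `b`, `delta`) are illustrative conventions, not the authors' code.

```python
import numpy as np

def grm_probs(theta, a, b):
    """Graded response model: category probabilities for one item.

    theta : latent trait value (scalar)
    a     : item slope (discrimination)
    b     : array of K-1 ordered category thresholds
    Returns an array of K category probabilities.
    """
    # P(X >= k): cumulative probability of category k or higher
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b, dtype=float))))
    cum = np.concatenate(([1.0], p_star, [0.0]))
    # Category probability is the difference of adjacent cumulative probs
    return cum[:-1] - cum[1:]

def pcm_probs(theta, delta):
    """Partial credit model (Rasch family): adjacent-category formulation.

    delta : array of K-1 step difficulties; the slope is fixed at 1,
            so all items are assumed equally discriminating.
    Returns an array of K category probabilities.
    """
    steps = np.concatenate(([0.0], theta - np.asarray(delta, dtype=float)))
    num = np.exp(np.cumsum(steps))
    return num / num.sum()
```

The key structural difference visible here is that `grm_probs` carries a free slope per item while `pcm_probs` does not, which is why unequal true item slopes are a potential problem for the PCM but not the GRM.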

Methods

Data were simulated under varying sample sizes, measure lengths, response category numbers, and slope strengths, as well as under conditions that violated some model assumptions, namely, unidimensionality and equality of item slopes. Model fit, detection of item local dependence, and detection of item misfit were all examined to identify conditions where one model may be preferable or results may contain a degree of bias.
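A simulation of this kind can be sketched as follows: draw latent trait values from a standard normal, compute each item's cumulative category probabilities under the generating model (here the GRM), and invert a uniform draw to obtain an ordinal response. This is a generic illustration under assumed parameter shapes, not the authors' simulation code.

```python
import numpy as np

def simulate_grm(n_persons, thresholds, slopes, rng):
    """Simulate polytomous item responses under the graded response model.

    thresholds : (n_items, K-1) array of ordered category thresholds
    slopes     : (n_items,) array of item slopes
    rng        : numpy random Generator
    Returns an (n_persons, n_items) integer response matrix in 0..K-1.
    """
    theta = rng.standard_normal(n_persons)        # latent trait values
    n_items = thresholds.shape[0]
    resp = np.empty((n_persons, n_items), dtype=int)
    for j in range(n_items):
        # P(X >= k) for every person and every threshold of item j
        p_star = 1.0 / (1.0 + np.exp(-slopes[j] * (theta[:, None] - thresholds[j])))
        # Inverse-CDF draw: the response is the number of thresholds
        # whose cumulative probability exceeds a single uniform draw
        u = rng.uniform(size=(n_persons, 1))
        resp[:, j] = (u < p_star).sum(axis=1)
    return resp

# Example: 500 simulees, one 4-category item with slope 1.5
rng = np.random.default_rng(0)
responses = simulate_grm(500, np.array([[-1.0, 0.0, 1.0]]), np.array([1.5]), rng)
```

Varying `n_persons`, the number of rows in `thresholds` (measure length), the number of columns (response categories), and the spread of `slopes` corresponds to the design factors listed above; assumption violations would be introduced by generating `theta` from more than one dimension or by fitting a fixed-slope model to data with unequal slopes.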

Results

For unidimensional item sets with equal item slopes, the PCM and GRM performed similarly, and GRM performance remained consistent as slope variability increased. For item sets that were not unidimensional, the PCM was somewhat more sensitive to this violation of unidimensionality. Across conditions, the PCM did not demonstrate a clear advantage over the GRM for small sample sizes or shorter measure lengths.

Conclusion

Overall, the GRM and the PCM each demonstrated advantages and disadvantages depending on underlying data conditions and the model outcome investigated. We recommend careful consideration of the known, or expected, data characteristics when choosing a model and interpreting its results.
Appendices
Only accessible to authorized users
Footnotes
1
Trials extracted on 27 December 2023 from clinicaltrials.gov by filtering on "Interventional," either "Phase 2" or "Phase 3," and a status of completed and no longer recruiting participants. Trials marked as encompassing multiple phases were removed, as were trials with either an enrollment of zero or missing enrollment information. Outlier sample sizes were not removed.
 
2
An average slope of 3 was also tested originally, and a similar pattern of results was observed.
 
Metadata
Title
Investigating item response theory model performance in the context of evaluating clinical outcome assessments in clinical trials
Authors
Nicolai D. Ayasse
Cheryl D. Coon
Publication date
11-12-2024
Publisher
Springer International Publishing
Published in
Quality of Life Research
Print ISSN: 0962-9343
Elektronisch ISSN: 1573-2649
DOI
https://doi.org/10.1007/s11136-024-03873-z