Published in: Quality of Life Research, Supplement 1/2007

01-08-2007

Applying item response theory and computer adaptive testing: the challenges for health outcomes assessment

Author: Peter M. Fayers

Abstract

Objectives

We review the papers presented at the NCI/DIA conference, to identify areas of controversy and uncertainty, and to highlight those aspects of item response theory (IRT) and computer adaptive testing (CAT) that require theoretical or empirical research in order to justify their application to patient reported outcomes (PROs).

Background

IRT and CAT offer exciting potential for the development of a new generation of PRO instruments. However, most of the research into these techniques has been in non-healthcare settings, notably in education. Educational tests are very different from PRO instruments, and consequently problematic issues arise when adapting IRT and CAT to healthcare research.

Results

Clinical scales differ appreciably from educational tests, and symptoms have characteristics distinctly different from examination questions. This affects the transfer of IRT technology. Particular areas of concern when applying IRT to PROs include inadequate software, difficulties in selecting models and communicating results, insufficient testing of local independence and other assumptions, and a need for guidelines for estimating sample size requirements. Similar concerns apply to differential item functioning (DIF), which is an important application of IRT. Multidimensional IRT is likely to be advantageous only for closely related PRO dimensions.

Conclusions

Although IRT and CAT provide appreciable potential benefits, there is a need for circumspection. Not all PRO scales are necessarily appropriate targets for this methodology. Traditional psychometric methods, and especially qualitative methods, continue to have an important role alongside IRT. Research should be funded to address the specific concerns that have been identified.
Metadata
Title
Applying item response theory and computer adaptive testing: the challenges for health outcomes assessment
Author
Peter M. Fayers
Publication date
01-08-2007
Publisher
Springer Netherlands
Published in
Quality of Life Research / Supplement issue 1/2007
Print ISSN: 0962-9343
Electronic ISSN: 1573-2649
DOI
https://doi.org/10.1007/s11136-007-9197-1