Efficiency of static and computer adaptive short forms compared to full-length measures of depressive symptoms

Quality of Life Research

Abstract

Purpose

Short-form patient-reported outcome measures are popular because they minimize patient burden. We assessed the efficiency of static short forms and computer adaptive testing (CAT) using data from the Patient-Reported Outcomes Measurement Information System (PROMIS) project.

Methods

We evaluated the 28-item PROMIS depressive symptoms bank. We used post hoc simulations based on the PROMIS calibration sample to compare several short-form selection strategies and the PROMIS CAT to the total item bank score.
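A post hoc CAT simulation of this kind can be sketched as follows. This is a minimal illustration, not the authors' procedure: the item parameters and responses are randomly generated stand-ins for the calibration sample, and the maximum-information selection rule, EAP scoring with a standard normal prior, fixed 8-item length, and five response categories per item are all assumptions introduced here.

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Graded response model category probabilities, shape (n_theta, n_cat)."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    b = np.asarray(b, dtype=float)
    # Cumulative probabilities P(X >= k) for k = 1..n_cat-1
    cum = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    ones = np.ones((theta.size, 1))
    zeros = np.zeros((theta.size, 1))
    return np.hstack([ones, cum]) - np.hstack([cum, zeros])

def grm_item_information(theta, a, b):
    """Fisher information of one graded-response item at each theta."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    b = np.asarray(b, dtype=float)
    cum = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    w = cum * (1.0 - cum)
    zeros = np.zeros((theta.size, 1))
    # Derivative of each category probability with respect to theta
    dprob = a * (np.hstack([zeros, w]) - np.hstack([w, zeros]))
    probs = grm_category_probs(theta, a, b)
    return np.sum(dprob**2 / np.clip(probs, 1e-12, None), axis=1)

GRID = np.linspace(-4.0, 4.0, 81)   # quadrature points for EAP scoring
PRIOR = np.exp(-0.5 * GRID**2)      # unnormalized standard normal prior (assumed)

def eap(items, responses):
    """EAP estimate and posterior SD from (a, b) items and 0-based responses."""
    post = PRIOR.copy()
    for (a, b), x in zip(items, responses):
        post *= grm_category_probs(GRID, a, b)[:, x]
    post /= post.sum()
    mean = np.sum(GRID * post)
    sd = np.sqrt(np.sum((GRID - mean) ** 2 * post))
    return mean, sd

def simulate_cat(bank, response_row, max_items=8):
    """Replay one respondent's recorded answers through a fixed-length CAT."""
    administered = []
    theta, se = 0.0, 1.0
    for _ in range(max_items):
        remaining = [i for i in range(len(bank)) if i not in administered]
        info = [grm_item_information(theta, *bank[i])[0] for i in remaining]
        # Assumed rule: pick the unused item most informative at the current estimate
        administered.append(remaining[int(np.argmax(info))])
        theta, se = eap([bank[i] for i in administered],
                        [response_row[i] for i in administered])
    return theta, se, administered

# Hypothetical 28-item bank and simulated responses (not the PROMIS data)
rng = np.random.default_rng(0)
bank = [(rng.uniform(1.5, 3.5), np.sort(rng.normal(0.5, 1.0, size=4)))
        for _ in range(28)]
true_theta = rng.normal(size=100)
responses = np.array([[rng.choice(5, p=grm_category_probs(t, a, b)[0])
                       for a, b in bank] for t in true_theta])

full_scores = np.array([eap(bank, row)[0] for row in responses])
cat_scores = np.array([simulate_cat(bank, row)[0] for row in responses])
r = np.corrcoef(full_scores, cat_scores)[0, 1]
print(f"correlation between CAT and full-bank scores: {r:.3f}")
```

Because the responses to all 28 items are already recorded, the CAT only chooses which of them to "administer"; comparing `cat_scores` with `full_scores` then shows how much is lost by stopping early.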

Results

Compared with full-bank scores, all short forms and CAT produced highly correlated scores, but CAT outperformed each static short form on almost all criteria. However, the static short-form selection strategies performed only marginally worse than CAT. Using a two-stage branching test format reduced the performance gap observed for static forms.

Conclusions

Using several polytomous items in a calibrated unidimensional bank to measure depressive symptoms yielded a CAT that provided marginally superior efficiency compared to static short forms. The efficiency of a two-stage semi-adaptive testing strategy was so close to CAT that it warrants further consideration and study.
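One way a two-stage semi-adaptive form could be assembled is sketched below. The abstract does not describe the branching design, so everything specific here is an assumption for illustration: the block sizes, the three target trait levels, the sum-score routing rule, and the randomly generated item parameters.

```python
import numpy as np

def grm_information(theta, a, b):
    """Fisher information of one graded-response item at a scalar theta."""
    cum = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b, dtype=float))))
    star = np.concatenate(([1.0], cum, [0.0]))   # P(X >= k), padded with 1 and 0
    probs = star[:-1] - star[1:]                 # category probabilities
    slope = a * star * (1.0 - star)              # d/dtheta of each P(X >= k)
    dprobs = slope[:-1] - slope[1:]
    return float(np.sum(dprobs**2 / np.clip(probs, 1e-12, None)))

def most_informative(bank, theta, n, exclude=()):
    """Indices of the n items with the highest information at theta."""
    pool = [i for i in range(len(bank)) if i not in exclude]
    pool.sort(key=lambda i: grm_information(theta, *bank[i]), reverse=True)
    return pool[:n]

def build_two_stage(bank, n_route=4, n_second=4, targets=(-1.0, 0.5, 2.0)):
    """Assemble one routing block plus one second-stage block per target level."""
    routing = most_informative(bank, targets[1], n_route)
    second = [most_informative(bank, t, n_second, exclude=routing)
              for t in targets]
    return routing, second

def administer(routing, second, response_row, max_category=4):
    """Route on the routing-block sum score; return items given and the branch."""
    score = sum(response_row[i] for i in routing)
    top = max_category * len(routing)
    # Hypothetical rule: split the possible sum-score range into equal thirds
    branch = min(int(3 * score / (top + 1)), 2)
    return routing + second[branch], branch

# Hypothetical 28-item bank with five response categories per item
rng = np.random.default_rng(1)
bank = [(rng.uniform(1.5, 3.5), np.sort(rng.normal(0.5, 1.0, size=4)))
        for _ in range(28)]
routing, second = build_two_stage(bank)
items_low, branch_low = administer(routing, second, [0] * 28)
items_high, branch_high = administer(routing, second, [4] * 28)
print(branch_low, branch_high)
```

Every respondent answers the same routing block, and only the second block varies, so the form needs a single branching decision rather than re-estimating the trait level after each item as a CAT does.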

Acknowledgments

This study was supported in part by NIH grant PROMIS Network (U-01 AR 052177-04, PI: David Cella). The Patient-Reported Outcomes Measurement Information System (PROMIS) is a National Institutes of Health (NIH) Roadmap initiative to develop a computerized system measuring patient-reported outcomes in respondents with a wide range of chronic diseases and demographic characteristics. PROMIS was funded by cooperative agreements to a Statistical Coordinating Center (Northwestern University, PI: David Cella, Ph.D., U01AR52177) and six Primary Research Sites (Duke University, PI: Kevin Weinfurt, Ph.D., U01AR52186; University of North Carolina, PI: Darren DeWalt, MD, MPH, U01AR52181; University of Pittsburgh, PI: Paul A. Pilkonis, Ph.D., U01AR52155; Stanford University, PI: James Fries, MD, U01AR52158; Stony Brook University, PI: Arthur Stone, Ph.D., U01AR52170; and University of Washington, PI: Dagmar Amtmann, Ph.D., U01AR52171). NIH Science Officers on this project are Deborah Ader, Ph.D., Susan Czajkowski, Ph.D., Lawrence Fine, MD, DrPH, Louis Quatrano, Ph.D., Bryce Reeve, Ph.D., William Riley, Ph.D., and Susana Serrate-Sztein, Ph.D. This manuscript was reviewed by the PROMIS Publications Subcommittee prior to external peer review. See the web site at www.nihpromis.org for additional information on the PROMIS cooperative group.

Author information

Corresponding author

Correspondence to Seung W. Choi.

Appendix

See Table 3.

Table 3 PROMIS depression bank parameters (graded response model)
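For orientation, the graded response model named in the table title has the standard form below. The notation is an assumption, since the table body is not reproduced here: item $i$ has a slope $a_i$ and ordered thresholds $b_{i1} < \dots < b_{im_i}$, and $\theta$ is the respondent's level of depressive symptoms.

```latex
% Cumulative probability of responding in category k or higher
P^{*}_{ik}(\theta) = P(X_i \ge k \mid \theta)
  = \frac{1}{1 + \exp\left[-a_i\left(\theta - b_{ik}\right)\right]},
  \qquad k = 1, \dots, m_i

% Probability of responding in exactly category k,
% with P^{*}_{i0}(\theta) = 1 and P^{*}_{i,m_i+1}(\theta) = 0
P_{ik}(\theta) = P^{*}_{ik}(\theta) - P^{*}_{i,k+1}(\theta),
  \qquad k = 0, \dots, m_i
```

Under this parameterization, a larger slope means the item separates adjacent trait levels more sharply, and each threshold marks the trait level at which a response in that category or higher becomes more likely than not.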

About this article

Cite this article

Choi, S.W., Reise, S.P., Pilkonis, P.A. et al. Efficiency of static and computer adaptive short forms compared to full-length measures of depressive symptoms. Qual Life Res 19, 125–136 (2010). https://doi.org/10.1007/s11136-009-9560-5

  • DOI: https://doi.org/10.1007/s11136-009-9560-5
