Assessing quality of a diagnostic test evaluation

  • Original Articles
Journal of General Internal Medicine

Abstract

Objective: To develop a standardized scale for assessing the quality of a diagnostic test evaluation.

Design: Fourteen participants with formal and practical experience in evaluating diagnostic tests formed a consensus panel. Panel members identified and weighted questions that should be addressed when assessing the quality of a diagnostic test evaluation.

Setting: General internal medicine division at an academic medical center.

Results: A 19-item weighted scale was developed. It prioritizes and addresses issues such as description of the proposed purpose of the test; appropriate selection and description of the study population; appropriate performance and description of the diagnostic test; appropriate selection and performance of the reference standard; and adequate presentation of test characteristics.

Conclusions: The scale is proposed as a useful instrument for readers, investigators, reviewers, and editors, because it represents an updated synthesis of important criteria to consider when evaluating diagnostic tests. It can also be used to rate the quality of diagnostic test evaluations quantitatively.



Additional information

Received from the Division of General Internal Medicine, University of Texas Health Science Center and the Ambulatory Care Section, Audie L. Murphy Memorial Veterans’ Administration Hospital, San Antonio, Texas.

Supported by a Milbank Memorial Scholarship and an American College of Physicians’ Teaching and Research Scholar Award.

About this article

Cite this article

Mulrow, C.D., Linn, W.D., Gaul, M.K. et al. Assessing quality of a diagnostic test evaluation. J Gen Intern Med 4, 288–295 (1989). https://doi.org/10.1007/BF02597398


  • DOI: https://doi.org/10.1007/BF02597398
