Abstract
Objective: To develop a standardized scale for assessing the quality of a diagnostic test evaluation.
Design: Fourteen participants with formal and practical experience in evaluating diagnostic tests formed a consensus panel. Panel members identified and weighted questions that should be addressed when assessing the quality of a diagnostic test evaluation.
Setting: General internal medicine division at an academic medical center.
Results: A 19-item weighted scale was developed. It prioritizes and addresses issues such as description of the proposed purpose of the test; appropriate selection and description of the study population; appropriate performance and description of the diagnostic test; appropriate selection and performance of the reference standard; and adequate presentation of test characteristics.
Conclusions: The scale is proposed as a useful instrument for readers, investigators, reviewers, and editors, because it represents an updated synthesis of important criteria to consider when evaluating diagnostic tests. It can also be used to rate quantitatively the quality of diagnostic test evaluations.
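As a rough illustration of how a weighted scale of this kind might yield a quantitative rating, the sketch below scores a study as the percentage of the maximum weighted total it earns. The abstract gives neither the 19 items nor their weights, so the five items (paraphrased from the issue areas named in the Results), the weights, the yes/no scoring rule, and the names `Item` and `quality_score` are all assumptions made for illustration, not the published instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One question on a quality scale. Weights here are hypothetical."""
    question: str
    weight: float


def quality_score(items, answers):
    """Return the percentage of the maximum weighted score that was earned.

    `answers` holds one boolean per item: True if the evaluation under
    review satisfies that item. This yes/no rule is an assumption.
    """
    if len(items) != len(answers):
        raise ValueError("need exactly one answer per item")
    total = sum(item.weight for item in items)
    if total <= 0:
        raise ValueError("total weight must be positive")
    earned = sum(item.weight for item, met in zip(items, answers) if met)
    return 100.0 * earned / total


# Placeholder items and weights, NOT the 19 items of the published scale.
ITEMS = [
    Item("Proposed purpose of the test is described", 3.0),
    Item("Study population is appropriately selected and described", 2.0),
    Item("Diagnostic test is appropriately performed and described", 2.0),
    Item("Reference standard is appropriately selected and performed", 2.0),
    Item("Test characteristics are adequately presented", 1.0),
]

# A hypothetical study that meets the first, second, and fourth items.
score = quality_score(ITEMS, [True, True, False, True, False])
print(f"Quality score: {score:.1f}% of maximum")
```

Expressing the result as a percentage of the maximum keeps scores comparable if items are added or reweighted; a real application would substitute the published items and panel-assigned weights.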
Additional information
Received from the Division of General Internal Medicine, University of Texas Health Science Center and the Ambulatory Care Section, Audie L. Murphy Memorial Veterans’ Administration Hospital, San Antonio, Texas.
Supported by a Milbank Memorial Scholarship and an American College of Physicians’ Teaching and Research Scholar Award.
Cite this article
Mulrow, C.D., Linn, W.D., Gaul, M.K. et al. Assessing quality of a diagnostic test evaluation. J Gen Intern Med 4, 288–295 (1989). https://doi.org/10.1007/BF02597398
DOI: https://doi.org/10.1007/BF02597398