Original Article
Polytomous logistic regression analysis could be applied more often in diagnostic research
Introduction
Diagnostic practice starts with a patient presenting with particular signs and symptoms. The physician then defines the differential diagnoses and implicitly estimates the probability of presence of all possible conditions given the patient's clinical and nonclinical profile [1], [2], [3]. Usually, one of these differential diagnoses is defined as the working diagnosis or target condition, to which the diagnostic workup is primarily directed. Diagnostic studies commonly focus on the ability of tests to include or exclude this target condition by dichotomizing the diagnostic outcome; the alternative diagnoses are included in the category “target condition absent.” Accordingly, diagnostic studies that aim to develop diagnostic prediction rules commonly use dichotomous logistic regression analysis. A well-known example is the Wells rule to diagnose deep venous thrombosis [4]. However, diagnostic prediction rules that estimate the probability of presence vs. absence of one target condition may oversimplify clinical practice. Rules that estimate the probabilities of presence of each of the potential conditions may be preferable.
As early as the early 1980s, Begg and coworkers discussed the use of polytomous logistic regression to model more than two unordered outcome categories simultaneously [5], [6]. The method has received little attention since then [7], and we believe it is time to revisit polytomous logistic regression analysis for diagnostic questions.
We introduce the principles of polytomous logistic regression and show an application with empirical data from a study on the diagnosis of residual retroperitoneal mass histology in patients with nonseminomatous testicular germ cell tumor (NSTGCT) [8]. We explain the interpretation of the derived odds ratios (ORs), study several aspects of the performance of the polytomous model, and present a user-friendly format for applying it. Finally, we discuss the advantages and disadvantages of polytomous logistic regression.
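For orientation, the general form of a polytomous (multinomial) logistic model can be written as below. This is the standard textbook formulation, not a formula reproduced from the article; the symbols (K outcome categories, reference category K, intercepts \alpha_k, coefficient vectors \beta_k, predictor vector x) are notation assumed here for illustration.

```latex
% Log odds of outcome k versus the reference outcome K (k = 1, ..., K-1)
\log \frac{P(Y = k \mid x)}{P(Y = K \mid x)} = \alpha_k + \beta_k^{\top} x

% Probabilities implied by the K-1 linear predictors
P(Y = k \mid x) = \frac{\exp(\alpha_k + \beta_k^{\top} x)}
                       {1 + \sum_{j=1}^{K-1} \exp(\alpha_j + \beta_j^{\top} x)},
\qquad
P(Y = K \mid x) = \frac{1}{1 + \sum_{j=1}^{K-1} \exp(\alpha_j + \beta_j^{\top} x)}

% Odds ratio for predictor m, outcome k versus the reference outcome
\mathrm{OR}_{k,m} = \exp(\beta_{k,m})
```

With three outcome categories, such a model has two sets of coefficients, so each predictor yields two ORs, one for each non-reference outcome relative to the reference outcome.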
Patients
We used data from previous studies on residual retroperitoneal mass histology in patients (n = 1,094) treated with chemotherapy for metastatic NSTGCT [8], [9], [10], [11]. These studies were primarily performed to develop and validate a dichotomous diagnostic prediction model to discriminate benign tissue from other histologies. Patients with elevated levels of the serum tumor markers alpha-fetoprotein (AFP) and human chorionic gonadotropin (HCG) at the time of surgery, extragonadal primaries,
Results
In 425 (39%) patients, the final diagnosis was benign tissue; 535 (49%) had mature teratoma, and 134 (12%) had viable cancer (Table 1). Overall, 46% of the patients had teratoma-negative tumor histology. Tumor marker levels of AFP and HCG were normal in approximately one third of all patients (31% and 35%, respectively). Mature teratoma was more often absent from the primary tumor in patients with benign masses (Table 1).
Discussion
In this article, we examined polytomous logistic regression in diagnostic studies with multiple diagnoses. We explained the interpretation of the ORs derived from the polytomous regression model and showed several model performance measures and a user-friendly format (score chart) to facilitate the use of a polytomous regression model in practice.
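As a minimal sketch of how a fitted polytomous model turns a patient profile into one probability per diagnosis, the Python example below uses benign tissue as the reference outcome. The coefficient values, the predictor names (`teratoma_in_primary`, `elevated_afp`), and the function names are hypothetical and chosen only for illustration; they are not the estimates or the score chart reported in the article.

```python
import math

# Hypothetical coefficients (NOT taken from the article): each entry gives the
# log odds of a non-reference outcome versus the reference outcome "benign".
HYPOTHETICAL_MODEL = {
    "teratoma": {"intercept": 0.2, "teratoma_in_primary": 0.9, "elevated_afp": 0.5},
    "cancer": {"intercept": -1.0, "teratoma_in_primary": 0.3, "elevated_afp": 0.8},
}


def linear_predictor(coefs, patient):
    """Intercept plus the sum of coefficient * predictor value."""
    return coefs["intercept"] + sum(
        coefs[name] * value for name, value in patient.items()
    )


def polytomous_probabilities(model, patient, reference="benign"):
    """Return one probability per outcome, including the reference outcome."""
    exp_terms = {
        outcome: math.exp(linear_predictor(coefs, patient))
        for outcome, coefs in model.items()
    }
    denominator = 1.0 + sum(exp_terms.values())
    probabilities = {
        outcome: term / denominator for outcome, term in exp_terms.items()
    }
    probabilities[reference] = 1.0 / denominator
    return probabilities


# Hypothetical patient: teratoma elements in the primary tumor, normal AFP.
patient = {"teratoma_in_primary": 1, "elevated_afp": 0}
probs = polytomous_probabilities(HYPOTHETICAL_MODEL, patient)
for outcome, p in probs.items():
    print(f"{outcome}: {p:.3f}")
```

The three probabilities always sum to 1, and the ratio of any non-reference probability to the reference probability equals the exponentiated linear predictor for that outcome, which is what makes the ORs interpretable relative to the reference category.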
Acknowledgments
This research project received financial support from the Netherlands Organization for Scientific Research (grant numbers ZONMW 904-66-112 and 917-46-360).
References (33)
- et al. Using binary logistic regression models for ordinal data with non-proportional odds. J Clin Epidemiol (1998)
- et al. Validation of a prediction model and its predictors for the histology of residual masses in nonseminomatous testicular cancer. J Urol (2001)
- et al. Predictors of residual mass histology following chemotherapy for metastatic non-seminomatous testicular cancer: a quantitative overview of 996 resections. Eur J Cancer (1994)
- et al. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol (1996)
- et al. Stepwise selection in small data sets: a simulation study of bias in logistic regression analysis. J Clin Epidemiol (1999)
- Clinical epidemiology: the architecture of clinical research (1985)
- et al. Diagnostic studies as multivariable, prediction research. J Epidemiol Community Health (2002)
- et al. Clinical epidemiology; a basic science for clinical medicine (1985)
- et al. A simple clinical model for the diagnosis of deep-vein thrombosis combined with impedance plethysmography: potential for an improvement in the diagnostic process. J Intern Med (1998)
- et al. Calculation of polychotomous logistic regression parameters using individualized regressions. Biometrika (1984)
- Methodology for the differential diagnosis of a complex data set. A case study using data from routine CT scan examinations. Med Decis Making
- Prediction of residual retroperitoneal mass histology after chemotherapy for metastatic nonseminomatous germ cell tumor: multivariate analysis of individual patient data from six study groups. J Clin Oncol
- Validity of predictions of residual retroperitoneal mass histology in nonseminomatous testicular cancer. J Clin Oncol
- External validity of a prediction rule for residual mass histology in testicular cancer: an evaluation for good prognosis patients. Br J Cancer
- Residual mass histology in testicular cancer: development and validation of a clinical prediction rule. Stat Med
- An introduction to categorical data analysis