Polytomous logistic regression analysis could be applied more often in diagnostic research

doi:10.1016/j.jclinepi.2007.03.002

Journal of Clinical Epidemiology

Volume 61, Issue 2, February 2008, Pages 125-134

Abstract

Objective

Physicians commonly consider the presence of all differential diagnoses simultaneously. Polytomous logistic regression modeling allows for simultaneous estimation of the probability of multiple diagnoses. We discuss and (empirically) illustrate the value of this method for diagnostic research.

Study Design and Setting

We used data from a study on the diagnosis of residual retroperitoneal mass histology in patients presenting with nonseminomatous testicular germ cell tumor. The differential diagnoses include benign tissue, mature teratoma, and viable cancer. Probabilities of each diagnosis were estimated with a polytomous logistic regression model and compared with the probabilities estimated from two consecutive dichotomous logistic regression models.
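The two modeling strategies compared here can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names and any linear predictor values are hypothetical, benign tissue is assumed to be the reference category of the polytomous model, and the consecutive dichotomous approach is assumed to model benign tissue vs. tumor first and then viable cancer vs. mature teratoma among tumors.

```python
import math


def polytomous_probs(lp_teratoma, lp_cancer):
    """Probabilities from a polytomous model, benign tissue as reference (assumed).

    lp_teratoma and lp_cancer are linear predictors, i.e. the log odds of
    mature teratoma and of viable cancer, each relative to benign tissue.
    """
    exp_t = math.exp(lp_teratoma)
    exp_c = math.exp(lp_cancer)
    denom = 1.0 + exp_t + exp_c  # the reference category contributes exp(0) = 1
    return {
        "benign": 1.0 / denom,
        "teratoma": exp_t / denom,
        "cancer": exp_c / denom,
    }


def logistic(lp):
    """Inverse logit: converts a linear predictor into a probability."""
    return 1.0 / (1.0 + math.exp(-lp))


def consecutive_probs(lp_tumor, lp_cancer_given_tumor):
    """Probabilities from two consecutive dichotomous models (assumed ordering).

    Model 1: tumor (teratoma or cancer) vs. benign tissue.
    Model 2: viable cancer vs. mature teratoma, given that tumor is present.
    """
    p_tumor = logistic(lp_tumor)
    p_cancer_given_tumor = logistic(lp_cancer_given_tumor)
    return {
        "benign": 1.0 - p_tumor,
        "teratoma": p_tumor * (1.0 - p_cancer_given_tumor),
        "cancer": p_tumor * p_cancer_given_tumor,
    }
```

In both cases the three predicted probabilities sum to 1 for each patient, which is what makes the per-category comparison between the two approaches possible.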

Results

We provide interpretations of the odds ratios derived from the polytomous regression model and present a simple score chart to facilitate calculation of predicted probabilities from the polytomous model. For both modeling methods, we show calibration plots and areas under the receiver operating characteristic (ROC) curve comparing each diagnostic outcome category with the other two. The ROC areas were similar for the polytomous and dichotomous regression models, respectively: 0.83 (95% confidence interval [CI] = 0.80–0.85) vs. 0.83 (95% CI = 0.80–0.85) for benign tissue, 0.78 (95% CI = 0.75–0.81) vs. 0.78 (95% CI = 0.75–0.81) for mature teratoma, and 0.66 (95% CI = 0.61–0.71) vs. 0.64 (95% CI = 0.59–0.69) for viable cancer.
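An ROC area that compares one outcome category with the other two can be computed by treating that category as "present" and the remaining categories as "absent". The sketch below is one possible way to do this, using the rank-based (Mann-Whitney) definition of the ROC area; the function names and data layout are assumptions for illustration, and confidence intervals are not computed.

```python
def roc_area(scores, is_case):
    """ROC area as P(case score > non-case score) + 0.5 * P(tie)."""
    cases = [s for s, c in zip(scores, is_case) if c]
    non_cases = [s for s, c in zip(scores, is_case) if not c]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for a in cases:
        for b in non_cases:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(cases) * len(non_cases))


def one_vs_rest_roc_areas(predicted, observed, categories):
    """ROC area of each category against all other categories combined.

    predicted: list of dicts mapping category -> predicted probability.
    observed:  list of observed category labels, same order as predicted.
    """
    return {
        cat: roc_area([p[cat] for p in predicted], [o == cat for o in observed])
        for cat in categories
    }
```

The same function can be applied to the predicted probabilities of either modeling approach, so that both are judged on identical one-vs-rest comparisons.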

Conclusion

Polytomous logistic regression is a useful technique to simultaneously model predicted probabilities of multiple diagnostic outcome categories. The performance of a polytomous prediction model can be assessed similarly to a dichotomous logistic regression model, and predictions by a polytomous model can be made with a user-friendly method. Because the simultaneous consideration of the presence of multiple (differential) conditions serves clinical practice better than consideration of the presence of only one target condition, polytomous logistic regression could be applied more often in diagnostic research.

Introduction

Diagnostic practice starts with a patient presenting with particular signs and symptoms. The physician then defines the differential diagnoses and implicitly estimates the probability of presence of all possible conditions given the patient's clinical and nonclinical profile [1], [2], [3]. Usually, one of these differential diagnoses is defined as the working diagnosis or target condition, to which the diagnostic workup is primarily directed. Diagnostic studies commonly focus on the ability of tests to include or exclude this target condition by dichotomizing the diagnostic outcome; the alternative diagnoses are included in the category “target condition absent.” Accordingly, diagnostic studies that aim to develop diagnostic prediction rules commonly use dichotomous logistic regression analysis. A well-known example is the Wells rule to diagnose deep venous thrombosis [4]. However, diagnostic prediction rules that estimate the probability of presence vs. absence of one target condition may oversimplify clinical practice. Rules that estimate the probabilities of presence of each of the potential conditions may be preferable.

As early as the early 1980s, Begg and coworkers discussed the use of polytomous logistic regression to accommodate simultaneous modeling of more than two unordered outcome categories [5], [6]. This method has received little attention since then [7], and we believe it is time to revisit polytomous logistic regression analysis to address diagnostic questions.
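For readers unfamiliar with the method, the standard form of the polytomous (multinomial) logistic model for three unordered outcome categories can be written as below. The notation is ours and is not taken from the article: category 1 is an arbitrarily chosen reference category, x is the vector of predictors, and α_k and β_k are the intercept and coefficient vector for category k.

```latex
% Log odds of category k relative to the reference category 1
\log \frac{P(Y = k \mid x)}{P(Y = 1 \mid x)} = \alpha_k + \beta_k^{\top} x,
\qquad k = 2, 3

% Implied probabilities, which sum to 1 over the three categories
P(Y = k \mid x) = \frac{\exp(\alpha_k + \beta_k^{\top} x)}
                       {1 + \sum_{j=2}^{3} \exp(\alpha_j + \beta_j^{\top} x)},
\qquad
P(Y = 1 \mid x) = \frac{1}{1 + \sum_{j=2}^{3} \exp(\alpha_j + \beta_j^{\top} x)}
```

Under this formulation, the exponentiated coefficient of a predictor in β_k is the odds ratio for category k vs. the reference category per unit increase in that predictor, with the other predictors held fixed.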

We provide an introduction to the principles of polytomous logistic regression and show an application with empirical data from a study on diagnosis of residual retroperitoneal mass histology in patients with nonseminomatous testicular germ cell tumor (NSTGCT) [8]. We explain the interpretation of the derived odds ratios (ORs), study several aspects of the polytomous model performance, and present a user-friendly format for application of the polytomous regression model. Finally, we discuss the advantages and disadvantages of polytomous logistic regression.

Patients

We used data from previous studies on residual retroperitoneal mass histology in patients (n = 1,094) treated with chemotherapy for metastatic NSTGCT [8], [9], [10], [11]. These studies were primarily performed to develop and validate a dichotomous diagnostic prediction model to discriminate benign tissue from other histologies. Patients with elevated levels of the serum tumor markers alpha-fetoprotein (AFP) and human chorionic gonadotropin (HCG) at the time of surgery, extragonadal primaries,

Results

In 425 (39%) patients, the final diagnosis was benign tissue; 535 (49%) had mature teratoma, and 134 (12%) had viable cancer (Table 1). Overall, 46% of the patients had teratoma-negative tumor histology. Tumor marker levels of AFP and HCG were normal in approximately one third of all patients (31% and 35%, respectively). Absence of mature teratoma in the primary tumor was more frequent in patients with benign masses (Table 1).

Discussion

In this article, we examined polytomous logistic regression in diagnostic studies with multiple diagnoses. We explained the interpretation of the ORs derived from the polytomous regression model and showed several model performance measures and a user-friendly format (score chart) to facilitate the use of a polytomous regression model in practice.

Acknowledgments

For this research project, we received financial support from the Netherlands Organization for Scientific Research grant numbers ZONMW 904-66-112 and 917-46-360.

References (33)

R. Bender et al. Using binary logistic regression models for ordinal data with non-proportional odds. J Clin Epidemiol (1998).
Y. Vergouwe et al. Validation of a prediction model and its predictors for the histology of residual masses in nonseminomatous testicular cancer. J Urol (2001).
E.W. Steyerberg et al. Predictors of residual mass histology following chemotherapy for metastatic non-seminomatous testicular cancer: a quantitative overview of 996 resections. Eur J Cancer (1994).
P. Peduzzi et al. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol (1996).
E.W. Steyerberg et al. Stepwise selection in small data sets: a simulation study of bias in logistic regression analysis. J Clin Epidemiol (1999).
A.R. Feinstein. Clinical epidemiology: the architecture of clinical research (1985).
K.G. Moons et al. Diagnostic studies as multivariable, prediction research. J Epidemiol Community Health (2002).
D.L. Sackett et al. Clinical epidemiology; a basic science for clinical medicine (1985).
P.S. Wells et al. A simple clinical model for the diagnosis of deep-vein thrombosis combined with impedance plethysmography: potential for an improvement in the diagnostic process. J Intern Med (1998).
C.B. Begg et al. Calculation of polychotomous logistic regression parameters using individualized regressions. Biometrika (1984).
A. Wijesinha et al. Methodology for the differential diagnosis of a complex data set. A case study using data from routine CT scan examinations. Med Decis Making (1983).
E.W. Steyerberg et al. Prediction of residual retroperitoneal mass histology after chemotherapy for metastatic nonseminomatous germ cell tumor: multivariate analysis of individual patient data from six study groups. J Clin Oncol (1995).
E.W. Steyerberg et al. Validity of predictions of residual retroperitoneal mass histology in nonseminomatous testicular cancer. J Clin Oncol (1998).
Y. Vergouwe et al. External validity of a prediction rule for residual mass histology in testicular cancer: an evaluation for good prognosis patients. Br J Cancer (2003).
E.W. Steyerberg et al. Residual mass histology in testicular cancer: development and validation of a clinical prediction rule. Stat Med (2001).
A. Agresti. An introduction to categorical data analysis (1996).
