Berkson's bias revisited

doi:10.1016/0021-9681(84)90067-5

Journal of Chronic Diseases

Volume 37, Issue 12, 1984, Pages 909-916

https://doi.org/10.1016/0021-9681(84)90067-5

Abstract

When used in a loose manner to indicate distortive effects on associations among hospitalized patients, the term Berkson's bias denotes a special case of Simpson's paradox. If, however, Berkson's independence assumption is introduced, Berkson's bias affects only the “selected” subjects and not those “left behind” and tends to decrease the odds-ratio. Generally speaking, the model is valid not only for case-control studies but also for prospective and other investigations. By introducing a time dimension the model allows for the study of changes over time in Berkson's bias.
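The two claims about the independence assumption can be checked numerically. The sketch below is an illustration only, not the article's own model: it assumes a hypothetical parametrization in which a baseline cause, the disease, and the exposure each lead to hospitalization independently with rates `h_0`, `h_d`, and `h_e`, and all numbers are made up. Under that assumption the probability of staying out of hospital is a product of factors, so the odds ratio among those "left behind" equals the population odds ratio, while the odds ratio among the hospitalized is pulled downward.

```python
from itertools import product


def odds_ratio(cells):
    """Odds ratio of a 2x2 table keyed by (disease, exposure), each 0 or 1."""
    return (cells[1, 1] * cells[0, 0]) / (cells[1, 0] * cells[0, 1])


def split_by_hospitalization(population, h_0, h_d, h_e):
    """Split population cell probabilities into hospitalized and left-behind parts.

    Assumption (hypothetical parametrization): baseline, disease, and exposure
    act as independent causes of hospitalization with rates h_0, h_d, h_e.
    """
    hospitalized, left_behind = {}, {}
    for d, e in product((0, 1), repeat=2):
        # A person stays out of hospital only if every applicable cause "misses".
        p_stay_out = (1 - h_0) * (1 - h_d) ** d * (1 - h_e) ** e
        left_behind[d, e] = population[d, e] * p_stay_out
        hospitalized[d, e] = population[d, e] * (1 - p_stay_out)
    return hospitalized, left_behind


# Illustrative population: disease (10%) and exposure (20%) are independent,
# so the population odds ratio is 1.
population = {(1, 1): 0.02, (1, 0): 0.08, (0, 1): 0.18, (0, 0): 0.72}

hospitalized, left_behind = split_by_hospitalization(
    population, h_0=0.05, h_d=0.30, h_e=0.10
)

or_population = odds_ratio(population)
or_hospitalized = odds_ratio(hospitalized)
or_left_behind = odds_ratio(left_behind)

print("population odds ratio:  ", or_population)
print("hospitalized odds ratio:", or_hospitalized)
print("left-behind odds ratio: ", or_left_behind)
```

With these assumed rates the hospitalized odds ratio falls below the population value, and the left-behind odds ratio matches it up to floating-point rounding.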

References (9)

AV Boyd. Testing for association of diseases. J Chron Dis (1979)
J Berkson. Limitations to the application of fourfold table analysis to hospital data. Biometrics Bull (1946)
AM Lilienfeld et al. Foundations of Epidemiology (1980)
DC Kleinbaum et al. Epidemiologic Research (1983)

Cited by (15)

Statistical association and Berkson's paradox
2021, Journal of the American Academy of Dermatology
Subtypes of alcohol dependence in a nationally representative sample
2007, Drug and Alcohol Dependence
The authors sought to empirically derive alcohol dependence (AD) subtypes based on clinical characteristics using data from a nationally representative epidemiological survey.
A sample of 1484 respondents to the National Epidemiological Survey on Alcohol and Related Conditions (NESARC) with past year AD was subjected to latent class analysis in order to identify homogeneous subtypes.
The best-fitting model was a five-cluster solution. The largest cluster (Cluster 1: ∼31%) was comprised of young adults, who rarely sought help for drinking, had moderately high levels of periodic heavy drinking, relatively low rates of comorbidity, and the lowest rate of multigenerational AD (∼22%). In contrast, Clusters 4 and 5 (∼21% and 9%, respectively) had substantial rates of multigenerational AD (53% and 77%, respectively), had the most severe AD criteria profile, were associated with both comorbid psychiatric and other drug use disorders, lower levels of psychosocial functioning, and had engaged in significant help-seeking. Clusters 2 and 3 (∼19% each) had the latest onset, the lowest rates of periodic heavy drinking, medium/low levels of comorbidity, moderate levels of help-seeking, and higher psychosocial functioning.
Five distinct subtypes of AD were derived, distinguishable on the basis of family history, age of AD onset, endorsement of DSM-IV AUD criteria, and the presence of comorbid psychiatric and substance use disorders. These clinically relevant subtypes, derived from the general population, may enhance our understanding of the etiology, treatment, natural history, and prevention of AD and inform the DSM-V research agenda.
Effect of different sampling techniques on odds ratio estimates using hospital-based cases and controls
1997, Preventive Veterinary Medicine
Potential biases introduced by the use of hospital admission records have rarely been discussed in the veterinary literature. Veterinary Medical Teaching Hospital (VMTH) patient records kept at the University of California, Davis (UCD) School of Veterinary Medicine provide a unique opportunity to perform in-depth analyses on the effect of different control selection (sampling) techniques on odds ratio (OR) estimates for disease risk factors in a retrospective case-control study. Horses with Corynebacterium pseudotuberculosis abscesses (134) and the (secondary) study base population (source for controls) were identified, and a ‘gold standard’ OR for each category of the factors admission type, age, breed and sex was derived. Example data were used to calculate sampling ratios (SRs), defined as the ratio between any sample proportion (of an arbitrary risk factor) and the study base proportion for this risk factor. Sampling ratios different from 1.0 introduced biases into the observed OR estimates, when compared with the ‘gold standard’ OR. Three randomized samples (simple random, stratified random, systematic sampling), one matched (on date of admission) and three different diagnosis samples (‘colic’, ‘cuts and lacerations’, ‘fractures’) were selected from the study base, and the SRs for all categories of the four factors were derived. The matched and two different disease samples (‘colic’ and ‘fractures’) had especially wide ranges of observed SRs (and large errors in the OR estimates), whereas simple random and systematic sampling had comparably narrow ranges (less biased OR estimates). For the three randomized sampling techniques under study, repeated sampling was used to derive SR distributions. The SRs were approximately normally distributed. Analysis of variance and covariance showed that simple random and systematic sampling provided SR distributions with means closest to 1.0 (expected value) and small standard deviations. 
The OR estimates obtained from records selected by these two sampling techniques therefore were least biased. The findings demonstrate the importance of selecting appropriate sampling techniques in addition to properly defining the study (base) population. Sampling design introduces uncertainty into the OR estimates. The direction of the bias, however, depends on the OR between factor and disease in the source population (the ‘gold standard’), and on the direction and magnitude of the SR. When combining the results from single and repeated sampling we conclude that sampling design is most influential on the range of the observed SRs (single samples), on the absolute deviation of the SR from 1.0 (expressed as SR ΔMean) and on the SR standard deviation (SD) (repeated sampling).
Bias associated with differential hospitalization rates in incident case-control studies
1989, Journal of Clinical Epidemiology
Berkson's bias reflects a statistical phenomenon in which differential hospitalization rates create an exposure distribution among hospitalized cases that differs from that among other cases. Importantly, previous work on Berkson's bias has not explicitly addressed the possibility of excluding prevalent or previously diagnosed cases, exclusions that are key features of many study designs. We indicate that the classically described bias differs from the corresponding bias in studies, such as incidence density studies, in which cases are restricted to those with recent diagnoses. We present methods that may be used to assess the magnitude of Berkson's bias in incidence-density studies. In many, though not all, situations the bias should be small and of little practical concern.
An analysis of Berkson's bias in case-control studies
1986, Journal of Chronic Diseases
The bias described by Berkson arises as a mathematical phenomenon, caused by the probabilistic union of different rates of hospitalization for people with different medical phenomena. When the concept is extended to case-control studies, these rates will occur as h_d for people with the target disease, h_c for people with the control condition, and h_e for the separate effect of exposure to the suspected etiologic agent.
An algebraic analysis of patterns of hospitalization and case-control selection demonstrates that Berkson's bias will be avoided if both cases and controls are chosen from the community or if h_e = 0. When the cases are chosen from hospitalized patients, the odds ratio will be biased if, as in the usual clinical situation, h_e ≠ 0. The odds ratio will be falsely elevated if the control groups are chosen from a community population rather than from hospitalized patients, and falsely lowered if the controls are hospitalized patients who do not have the target disease. If the control groups are chosen from patients hospitalized with specific comparison conditions, the odds ratio will be falsely elevated or lowered, depending on the relative magnitudes of h_d and h_c.
In Berkson's mathematical model, the probabilistic calculations depend on the assumption that each of the exposed or diseased clinical conditions has an independent additive effect on hospitalization rates. In reality, however, the concurrence of two or more conditions of disease and exposure may synergistically affect the examining physician's nosocomial decisions and may thereby substantially change the hospitalization rates from what is expected mathematically. In creating hospitalization bias in case-control studies, these selective clinical decisions about referral to hospital may be more cogent than the probabilistic distinctions described by Berkson.
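The direction of the bias described in this abstract can be illustrated with a small calculation. This is a hedged sketch, not the cited paper's algebra: it assumes that a person with two conditions is hospitalized at the probabilistic union of the two rates, that community controls are sampled without regard to exposure, and that cases do not also carry the comparison condition. The function names are hypothetical, and the case of hospitalized controls defined only as "not having the target disease" is not covered.

```python
def union_rate(h_a, h_b):
    """Hospitalization rate when two independent causes act on one person."""
    return h_a + h_b - h_a * h_b


def bias_factor_community_controls(h_d, h_e):
    """Observed OR divided by true OR: hospitalized cases, community controls.

    Exposed cases are hospitalized at union_rate(h_d, h_e), unexposed cases
    at h_d, and controls are assumed to be sampled regardless of exposure.
    """
    return union_rate(h_d, h_e) / h_d


def bias_factor_condition_controls(h_d, h_c, h_e):
    """Observed OR divided by true OR: controls hospitalized with a comparison condition.

    The same union logic is applied to the control condition with rate h_c.
    """
    case_distortion = union_rate(h_d, h_e) / h_d
    control_distortion = union_rate(h_c, h_e) / h_c
    return case_distortion / control_distortion


# No bias when exposure has no separate effect on hospitalization (h_e = 0).
print(bias_factor_community_controls(h_d=0.3, h_e=0.0))

# Community controls with h_e > 0: the factor exceeds 1 (falsely elevated).
print(bias_factor_community_controls(h_d=0.3, h_e=0.1))

# Comparison-condition controls: direction depends on h_d versus h_c.
print(bias_factor_condition_controls(h_d=0.1, h_c=0.5, h_e=0.1))
print(bias_factor_condition_controls(h_d=0.5, h_c=0.1, h_e=0.1))
```

Under these assumptions the factor for comparison-condition controls is above 1 when h_d is smaller than h_c and below 1 when h_d is larger, which is consistent with the abstract's statement that the direction depends on the relative magnitudes of the two rates.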
Is there a rise of prevalence for Molar Incisor Hypomineralization? A meta-analysis of published data
2024, BMC Oral Health
