When the entire population is the sample: strengths and limitations in register-based epidemiology

  • METHODS
European Journal of Epidemiology

Abstract

Studies based on databases, medical records and registers are used extensively in epidemiological research today. Despite this increasing use, no well-developed methodological literature on the use and evaluation of population-based registers is available, even though register-based studies differ from studies with researcher-collected data in several ways: the data are collected differently, all persons in a population are available, and traditional statistical analyses that focus on sampling error as the main source of uncertainty may not be relevant. We present the main strengths and limitations of register-based studies, the biases that are especially important in such studies, and methods for evaluating the completeness and validity of registers. The main strengths are that the data already exist and valuable time has already passed, that complete study populations minimize selection bias, and that the data are independently collected. The main limitations are that necessary information may be unavailable, that data collection is not done by the researcher, that confounder information is lacking, that information on data quality is missing, that truncation at the start of follow-up makes it difficult to differentiate between prevalent and incident cases, and that there is a risk of data dredging. We conclude that epidemiological studies that include all persons in a population, followed for decades and available relatively fast, are important data sources for modern epidemiology, but it is important to acknowledge the limitations of the data.

Author information

Corresponding author

Correspondence to Lau Caspar Thygesen.

About this article

Cite this article

Thygesen, L.C., Ersbøll, A.K. When the entire population is the sample: strengths and limitations in register-based epidemiology. Eur J Epidemiol 29, 551–558 (2014). https://doi.org/10.1007/s10654-013-9873-0

  • DOI: https://doi.org/10.1007/s10654-013-9873-0
