Yearb Med Inform 2012; 21(01): 130-134
DOI: 10.1055/s-0038-1639443
Survey
Georg Thieme Verlag KG Stuttgart

Translational Bioinformatics Embraces Big Data

N. H. Shah
1 Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, California, United States of America
N.H.S. is funded by the US National Institutes of Health Roadmap (U54 HG004028 and U54 LM008748). The ideas around mass phenotyping benefited from discussion with Lawrence Hunter and participants at the Discovery Informatics Workshop 2012, supported by the National Science Foundation.

Publication History

Publication Date:
10 March 2018 (online)

Summary

We review the latest trends and major developments in translational bioinformatics during 2011-2012. Our emphasis is on highlighting the key events in the field and pointing to promising research areas for the future. The key take-home points are:

• Translational informatics is ready to revolutionize human health and healthcare using large-scale measurements on individuals.

• Data-centric approaches that compute on massive amounts of data (often called “Big Data”) to discover patterns and to make clinically relevant predictions will gain adoption.

• Research that bridges the latest multimodal measurement technologies with large amounts of electronic healthcare data is increasing, and it is where new breakthroughs will occur.

 