Text Mining for Adverse Drug Events: the Promise, Challenges, and State of the Art

  • Leading Article
  • Drug Safety

Abstract

Text mining is the computational process of extracting meaningful information from large amounts of unstructured text. It is emerging as a tool to leverage underutilized data sources that can improve pharmacovigilance, including the objective of adverse drug event (ADE) detection and assessment. This article provides an overview of recent advances in pharmacovigilance driven by the application of text mining, and discusses several data sources—such as biomedical literature, clinical narratives, product labeling, social media, and Web search logs—that are amenable to text mining for pharmacovigilance. Given the state of the art, it appears text mining can be applied to extract useful ADE-related information from multiple textual sources. Nonetheless, further research is required to address remaining technical challenges associated with the text mining methodologies, and to conclusively determine the relative contribution of each textual source to improving pharmacovigilance.

Acknowledgments

The writing of this manuscript was supported by National Institutes of Health (NIH) Grant U54-HG004028 for the National Center for Biomedical Ontology, and by National Institute of General Medical Sciences (NIGMS) grant GM101430-01A1.

Conflicts of interest

Nigam H. Shah is a Science Advisor to Apixio Inc. (www.apixio.com), and Kyron Inc. (www.kyron.com). Rave Harpaz is an employee of Oracle Health Sciences. Rave Harpaz, Alison Callahan, Suzanne Tamang, Yen Low, David Odgers, Sam Finlayson, Kenneth Jung, Paea LePendu, and Nigam H. Shah have no other conflicts of interest that are directly relevant to the content of this article.

Corresponding author

Correspondence to Rave Harpaz.

Cite this article

Harpaz, R., Callahan, A., Tamang, S. et al. Text Mining for Adverse Drug Events: the Promise, Challenges, and State of the Art. Drug Saf 37, 777–790 (2014). https://doi.org/10.1007/s40264-014-0218-z

  • DOI: https://doi.org/10.1007/s40264-014-0218-z
