Abstract
Drugs are an important part of today’s medicine, designed to treat, control, and prevent diseases; however, besides their therapeutic effects, drugs may also cause adverse effects that range from cosmetic problems to severe morbidity and mortality. To identify these potential drug safety issues early, surveillance must be conducted for each drug throughout its life cycle, from drug development through the different phases of clinical trials, and must continue after market approval. A major aim of pharmacovigilance is to identify potential drug–event associations that may be novel in nature, severity, and/or frequency. Currently, the state-of-the-art approach to signal detection relies on automated procedures that analyze vast quantities of data to extract clinical knowledge. A variety of resources exist for the task, and many of them are textual data that require text analytics and natural language processing to derive high-quality information. This chapter focuses on the use of text mining techniques to identify potential drug safety issues from textual sources such as biomedical literature, consumer posts in social media, and narrative electronic medical records.
Acknowledgment
Mei Liu is supported by funds from the New Jersey Institute of Technology. Yong Hu is supported by the National Science Foundation of China (71271061, 70801020); Science and Technology Planning Project of Guangdong Province, China (2010B010600034, 2012B091100192); and Business Intelligence Key Team of Guangdong University of Foreign Studies (TD1202). Buzhou Tang is supported by the China Postdoctoral Science Foundation (2011M500669).
© 2014 Springer Science+Business Media New York
About this protocol
Cite this protocol
Liu, M., Hu, Y., Tang, B. (2014). Role of Text Mining in Early Identification of Potential Drug Safety Issues. In: Kumar, V., Tipney, H. (eds) Biomedical Literature Mining. Methods in Molecular Biology, vol 1159. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0709-0_13
DOI: https://doi.org/10.1007/978-1-4939-0709-0_13
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-0708-3
Online ISBN: 978-1-4939-0709-0
eBook Packages: Springer Protocols