A Hybrid Approach for the Extraction of Semantic Relations from MEDLINE Abstracts

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6609)

Abstract

With the continuous digitisation of medical knowledge, information extraction tools are becoming increasingly important for medical practitioners. In this paper we tackle the extraction of semantic relationships from medical texts, focusing on the relations that may occur between diseases and treatments. We propose an approach that relies on two different techniques to extract the target relations: (i) relation patterns based on human expertise and (ii) machine learning based on SVM classification. The approach takes advantage of both techniques, relying more on manual patterns when few relation examples are available and more on feature values when a sufficient number of examples is available. It obtains an overall F-measure of 94.07% for the extraction of cure, prevent and side effect relations.
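
The abstract does not say how the two techniques are combined, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It assumes a linear interpolation whose weight grows with the number of training examples per relation. The patterns, the `saturation` threshold, and the function names are all hypothetical, and the classifier scores are passed in as plain numbers standing in for the output of an SVM.

```python
import re
from typing import Dict, Optional

# Hypothetical hand-written patterns, one list per relation type.
# These are illustrative only; they are not the patterns used in the paper.
PATTERNS = {
    "cure": [re.compile(r"\b(treats?|cures?|effective (?:for|against))\b", re.I)],
    "prevent": [re.compile(r"\b(prevents?|prophylaxis)\b", re.I)],
    "side_effect": [re.compile(r"\b(induces?|causes?|side effects?)\b", re.I)],
}


def pattern_scores(sentence: str) -> Dict[str, float]:
    """Return 1.0 for each relation with a matching pattern, else 0.0."""
    return {
        rel: 1.0 if any(p.search(sentence) for p in pats) else 0.0
        for rel, pats in PATTERNS.items()
    }


def classifier_weight(n_examples: int, saturation: int = 100) -> float:
    """Trust the classifier more as training examples accumulate.

    The linear ramp and the saturation value are assumptions of this sketch.
    """
    return min(1.0, n_examples / saturation)


def hybrid_predict(
    sentence: str,
    classifier_scores: Dict[str, float],
    train_counts: Dict[str, int],
) -> Optional[str]:
    """Interpolate pattern and classifier scores per relation and pick the best."""
    p = pattern_scores(sentence)
    combined = {}
    for rel in PATTERNS:
        w = classifier_weight(train_counts.get(rel, 0))
        combined[rel] = w * classifier_scores.get(rel, 0.0) + (1.0 - w) * p[rel]
    best = max(combined, key=combined.get)
    # Report no relation when neither technique produced any evidence.
    return best if combined[best] > 0 else None


# Stand-in classifier output that favours "cure"; "prevent" has few training
# examples, so the matching pattern dominates the combined score.
scores = {"cure": 0.6, "prevent": 0.3, "side_effect": 0.1}
counts = {"cure": 500, "prevent": 10, "side_effect": 5}
print(hybrid_predict("Aspirin prevents stroke.", scores, counts))
```

With these inputs the sparsely represented relation is decided by its pattern, while a relation with many examples is decided by the classifier score alone. In practice the stand-in scores could come from any SVM implementation that exposes per-class confidences.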

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ben Abacha, A., Zweigenbaum, P. (2011). A Hybrid Approach for the Extraction of Semantic Relations from MEDLINE Abstracts. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6609. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19437-5_11

  • DOI: https://doi.org/10.1007/978-3-642-19437-5_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19436-8

  • Online ISBN: 978-3-642-19437-5

  • eBook Packages: Computer Science, Computer Science (R0)
