DOI: 10.3115/992730.992782

Automatic acquisition of domain knowledge for Information Extraction

Published: 31 July 2000

ABSTRACT

In developing an Information Extraction (IE) system for a new class of events or relations, one of the major tasks is identifying the many ways in which these events or relations may be expressed in text. This has generally involved the manual analysis and, in some cases, the annotation of large quantities of text involving these events. This paper presents an alternative approach, based on an automatic discovery procedure, EXDISCO, which identifies a set of relevant documents and a set of event patterns from un-annotated text, starting from a small set of "seed patterns." We evaluate EXDISCO by comparing the performance of discovered patterns against that of manually constructed systems on actual extraction tasks.
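
The abstract gives only the outline of the procedure: start from seed patterns, then find relevant documents and further patterns in un-annotated text. The following is a minimal sketch of a generic seed-based bootstrapping loop in that spirit. It is not the EXDISCO algorithm. The function name `bootstrap_patterns`, the representation of a document as a set of pattern labels, the precision-based score, and the `min_precision` threshold are all assumptions made for illustration, and the pattern labels in the usage note are hypothetical.

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple


def bootstrap_patterns(
    docs: Dict[str, FrozenSet[str]],
    seeds: Set[str],
    rounds: int = 3,
    min_precision: float = 0.5,
) -> Tuple[Set[str], Set[str]]:
    """Toy bootstrapping loop (illustrative only, not the paper's scoring).

    docs maps a document id to the set of pattern labels found in it.
    Returns (accepted patterns, ids of documents judged relevant).
    """
    accepted: Set[str] = set(seeds)
    relevant: Set[str] = set()
    for _ in range(rounds):
        # A document counts as relevant if it contains any accepted pattern.
        relevant = {d for d, pats in docs.items() if pats & accepted}

        # Rank every unaccepted pattern by (precision, support), where
        # precision is the share of its documents that are relevant.
        best: Optional[str] = None
        best_key = (0.0, 0)
        candidates = set().union(*docs.values()) - accepted
        for pattern in sorted(candidates):
            containing = {d for d, pats in docs.items() if pattern in pats}
            support = len(containing & relevant)
            key = (support / len(containing), support)
            if key > best_key:
                best, best_key = pattern, key

        # Stop when no candidate is well enough correlated with relevance.
        if best is None or best_key[0] < min_precision:
            break
        accepted.add(best)

    # Recompute relevance so it reflects the final pattern set.
    relevant = {d for d, pats in docs.items() if pats & accepted}
    return accepted, relevant
```

Called with a toy corpus such as `{"d1": frozenset({"company-appoint-person", "person-resign"}), "d2": frozenset({"person-resign", "person-succeed-person"}), "d3": frozenset({"company-report-profit"})}` and the single seed `"company-appoint-person"`, the loop accepts one new pattern per round and stops once no remaining candidate co-occurs with the relevant documents. Accepting only the top candidate each round is a conservative design choice in this sketch, intended to limit drift away from the seed topic.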

Published in

COLING '00: Proceedings of the 18th conference on Computational linguistics - Volume 2
July 2000
549 pages

Publisher

Association for Computational Linguistics, United States

Qualifiers

Article

Acceptance Rates

Overall Acceptance Rate: 1,537 of 1,537 submissions, 100%
