tutorial

Web-based open-domain information extraction

Author:
Marius Pasca

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge managementOctober 2011Pages 2605–2606https://doi.org/10.1145/2063576.2064034

Published:24 October 2011Publication History

CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management

Pages 2605–2606

ABSTRACT

This tutorial provides an overview of extraction methods developed in the area of Web-based open-domain information extraction, whose purpose is the acquisition of open-domain classes, instances and relations from Web text. The extraction methods operate over unstructured or semi-structured text. They take advantage of weak supervision provided in the form of seed examples or small amounts of annotated data, or draw upon knowledge already encoded within resources created strictly by experts or collaboratively by users. The tutorial teaches the audience about existing resources that include instances and relations; details of methods for extracting such data from structured and semi-structured text available on the Web; and strengths and limitations of resources extracted from text as part of recent literature, with applications in knowledge discovery and information retrieval.

Index Terms

Web-based open-domain information extraction
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Information systems
  1. Information retrieval
    1. Document representation

Recommendations

Web-based open-domain information extraction
WWW '11: Proceedings of the 20th international conference companion on World wide web

This tutorial provides an overview of extraction methods developed in the area of Web-based open-domain information extraction, whose purpose is the acquisition of open-domain classes, instances and relations from Web text. The extraction methods ...
Read More
Extraction of open-domain class attributes from text: building blocks for faceted search
SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

Knowledge automatically extracted from text captures instances, classes of instances and relations among them. In particular, the acquisition of class attributes (e.g., "top speed", "body style" and "number of cylinders" for the class of "sports cars") ...
Read More
Information Extraction from the Web: System and Techniques

Information Extraction (IE) systems that can exploit the vast source of textual information that is the internet would provide a revolutionary step forward in terms of delivering large volumes of content cheaply and precisely, thus enabling a wide range ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
October 2011
2712 pages
ISBN:9781450307178
DOI:10.1145/2063576
Editors:
Bettina Berendt,
Arjen de Vries,
Wenfei Fan,
Craig Macdonald
University of Glasgow, UK
,
Iadh Ounis
University of Glasgow, UK
,
Ian Ruthven
University of Strathclyde, UK
Copyright © 2011 Author
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 24 October 2011
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
information extraction
knowledge acquisition
web corpora
Qualifiers
- tutorial
Conference

Acceptance Rates
Overall Acceptance Rate1,861of8,427submissions,22%
Upcoming Conference
CIKM '24

Sponsor:

sigir

sigir

The 33rd ACM International Conference on Information and Knowledge Management

October 21 - 25, 2024

Boise , ID , USA
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 2
  Total Citations
  View Citations
- 228
  Total Downloads
- Downloads (Last 12 months)1
- Downloads (Last 6 weeks)1
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Web-based open-domain information extraction

CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management

ABSTRACT

Cited By

Index Terms

Recommendations

Web-based open-domain information extraction

Extraction of open-domain class attributes from text: building blocks for faceted search

Information Extraction from the Web: System and Techniques