research-article

Mechanical turk as an ontology engineer?: using microtasks as a component of an ontology-engineering workflow

Authors:
Natalya F. Noy, Stanford University, Stanford, CA
Jonathan Mortensen, Stanford University, Stanford, CA
Mark A. Musen, Stanford University, Stanford, CA
Paul R. Alexander, Stanford University, Stanford, CA

WebSci '13: Proceedings of the 5th Annual ACM Web Science Conference, May 2013, Pages 262–271, https://doi.org/10.1145/2464464.2464482

Published: 02 May 2013


ABSTRACT

Ontology evaluation has proven to be one of the more difficult problems in ontology engineering. Researchers have proposed numerous methods to evaluate the logical correctness of an ontology, its structure, or its coverage of a domain represented by a corpus. However, evaluating whether or not ontology assertions correspond to the real world remains a manual and time-consuming task. In this paper, we explore the feasibility of using microtask crowdsourcing through Amazon Mechanical Turk to evaluate ontologies. Specifically, we look at the task of verifying the subclass-superclass hierarchy in ontologies. We demonstrate that the performance of Amazon Mechanical Turk workers (turkers) on this task is comparable to the performance of undergraduate students in a formal study. We explore the effects of the type of ontology on the performance of turkers and demonstrate that turkers can achieve accuracy as high as 90% on verifying hierarchy statements from common-sense ontologies such as WordNet. Finally, we compare the performance of turkers to the performance of domain experts on verifying statements from an ontology in the biomedical domain. We report on lessons learned about designing ontology-evaluation experiments on Amazon Mechanical Turk. Our results demonstrate that microtask crowdsourcing can become a scalable and efficient component in ontology-engineering workflows.


Index Terms

1. Human-centered computing
  1. Human computer interaction (HCI)

Published in
WebSci '13: Proceedings of the 5th Annual ACM Web Science Conference
May 2013
481 pages
ISBN: 9781450318891
DOI: 10.1145/2464464
Conference Chairs: Hugh Davis (University of Southampton), Harry Halpin (World Wide Web Consortium), Alex Pentland
Program Chairs: Mark Bernstein, Lada Adamic, Harith Alani, Alexandre Monnin, Richard Rogers
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Amazon Mechanical Turk
crowdsourcing
human computation
ontology
semantic web
Acceptance Rates
Overall Acceptance Rate: 218 of 875 submissions, 25%