ABSTRACT
A central challenge in human computation lies in understanding how to design task environments that effectively attract participants and coordinate the problem-solving process. In this paper, we consider a common problem that requesters face on Amazon Mechanical Turk: how should a task be designed so as to induce good output from workers? In posting a task, a requester decides how to break the task down into unit tasks, how much to pay for each unit task, and how many workers to assign to each unit task. These design decisions affect the rate at which workers complete unit tasks, as well as the quality of the resulting work. Using image labeling as an example task, we consider the problem of designing the task to maximize the number of quality tags received within given time and budget constraints. We consider two different measures of work quality, and construct models for predicting the rate and quality of work based on observations of output under various designs. Preliminary results show that simple models can accurately predict the quality of output per unit task, but are less accurate in predicting the rate at which unit tasks complete. At a fixed rate of pay, our models generate different designs depending on the quality metric, and optimized designs obtain significantly more quality tags than baseline comparisons.
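As an illustration (not the authors' implementation), the design problem described above can be framed as a small search over candidate designs: each design fixes the pay per unit task, the number of workers per unit task, and how the larger labeling job is broken into unit tasks, and learned models predict completion rate and per-task quality. The sketch below assumes hypothetical predictor functions and parameter ranges purely for exposition.

```python
# Minimal sketch of the design-optimization framing, assuming hypothetical
# rate and quality predictors in place of the paper's learned models.

from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Design:
    pay_per_unit: float    # reward per unit task, in dollars (assumed range)
    workers_per_unit: int  # workers assigned to each unit task
    images_per_unit: int   # how the larger task is split into unit tasks

def predicted_rate(d: Design) -> float:
    """Hypothetical model: unit tasks completed per hour under design d."""
    return 40.0 * d.pay_per_unit / d.images_per_unit

def predicted_quality(d: Design) -> float:
    """Hypothetical model: expected quality tags per completed unit task."""
    return 2.0 * d.images_per_unit * min(1.0, 0.5 + 0.1 * d.workers_per_unit)

def best_design(budget: float, hours: float, candidates) -> Design:
    """Pick the candidate maximizing expected quality tags within budget/time."""
    def expected_tags(d: Design) -> float:
        # Completed units are limited by both the deadline and the budget.
        units_by_time = predicted_rate(d) * hours
        units_by_budget = budget / (d.pay_per_unit * d.workers_per_unit)
        return min(units_by_time, units_by_budget) * predicted_quality(d)
    return max(candidates, key=expected_tags)

if __name__ == "__main__":
    candidates = [
        Design(pay, k, n)
        for pay, k, n in product([0.01, 0.02, 0.05], [1, 3, 5], [1, 5, 10])
    ]
    print(best_design(budget=20.0, hours=24.0, candidates=candidates))
```

In the paper, the predictors are fit from observed output under various designs rather than specified by hand; the enumeration-and-evaluation structure shown here is only one simple way to realize the optimization.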