ABSTRACT
We present hybrid crowd-machine learning classifiers: classification models that start with a written description of a learning goal, use the crowd to suggest predictive features and label data, and then weigh these features using machine learning to produce models that are accurate and use human-understandable features. These hybrid classifiers enable fast prototyping of machine learning models that can improve on both algorithm performance and human judgment, and accomplish tasks where automated feature extraction is not yet feasible. Flock, an interactive machine learning platform, instantiates this approach. To generate informative features, Flock asks the crowd to compare paired examples, an approach inspired by analogical encoding. The crowd's efforts can be focused on specific subsets of the input space where machine-extracted features are not predictive, or instead used to partition the input space and improve algorithm performance in subregions of the space. An evaluation on six prediction tasks, ranging from detecting deception to differentiating impressionist artists, demonstrated that aggregating crowd features improves upon both asking the crowd for a direct prediction and off-the-shelf machine learning features by over 10%. Further, hybrid systems that use both crowd-nominated and machine-extracted features can outperform those that use either in isolation.
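The pipeline the abstract describes — crowd-nominated features concatenated with machine-extracted features, then weighted by a learner — can be sketched in miniature. This is an illustrative sketch only, not the paper's implementation: the feature names, the toy data, and the choice of plain logistic regression trained by stochastic gradient descent are all assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Weight the combined feature vector with logistic regression."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            g = p - yi  # gradient of log loss w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Each example pairs crowd-answered binary features (e.g. hypothetical
# questions like "does the review sound overly enthusiastic?") with
# machine-extracted features (e.g. a scaled word-count statistic).
crowd_feats   = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 0], [0, 1]]
machine_feats = [[0.9], [0.8], [0.2], [0.1], [0.7], [0.3]]
X = [c + m for c, m in zip(crowd_feats, machine_feats)]
y = [1, 1, 0, 0, 1, 0]  # gold labels, here also crowd-collectable

w, b = train_logreg(X, y)
preds = [predict(w, b, x) for x in X]
print(preds)
```

In a real deployment of this idea, the learned weights also serve as interpretable feedback: because every crowd feature is a human-readable question, inspecting the largest weights shows which crowd hypotheses the model found predictive.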
Index Terms
- Flock: Hybrid Crowd-Machine Learning Classifiers