ABSTRACT
Medical researchers looking for evidence pertinent to a specific clinical question must navigate an increasingly voluminous corpus of published literature. This data deluge has motivated the development of machine learning and data mining technologies to facilitate efficient biomedical research. Despite the obvious labor-saving potential of these technologies and the concomitant academic interest in them, adoption of machine learning techniques by medical researchers has been relatively sluggish. One explanation is that while many machine learning methods have been proposed and retrospectively evaluated, they are rarely (if ever) made accessible to the practitioners whom they would benefit. In this work, we describe the ongoing development of an end-to-end interactive machine learning system at the Tufts Evidence-based Practice Center. Specifically, we have developed abstrackr, an online tool for citation screening in systematic reviews, which provides an interface to our machine learning methods. The main aim of this work is to provide a case study in deploying cutting-edge machine learning methods that will actually be used by experts in a clinical research setting.
Index Terms
- Deploying an interactive machine learning system in an evidence-based practice center: abstrackr