Abstract
Machine-learning algorithms hold promise for revolutionizing how educators and clinicians make decisions. However, researchers in behavior analysis have been slow to adopt this methodology to further develop their understanding of human behavior and improve the application of the science to problems of applied significance. One potential explanation for the scarcity of research is that machine learning is not typically taught as part of training programs in behavior analysis. This tutorial aims to address this barrier by promoting increased research using machine learning in behavior analysis. We present how to apply the random forest, support vector machine, stochastic gradient descent, and k-nearest neighbors algorithms on a small dataset to better identify parents of children with autism who would benefit from a behavior analytic interactive web training. These step-by-step applications should allow researchers to implement machine-learning algorithms with novel research questions and datasets.
Notes
These data are available at https://osf.io/yhk2p/.
There was no significant linear association between the features.
The last line of your Anaconda Prompt or Terminal screen should begin with (myenv). If it begins with (base), you have not activated your environment correctly.
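As an additional check, you can ask Python itself which environment it is running from; a minimal sketch (the environment name shown will depend on your own setup):

```python
import sys

# sys.prefix points to the root folder of the active Python environment.
# When a conda environment is active, its name appears at the end of this path.
print(sys.prefix)

# A rough way to extract the environment name from that path:
env_name = sys.prefix.replace("\\", "/").rstrip("/").split("/")[-1]
print(env_name)
```

If the printed name is not the environment you created, activate it before launching Python.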
Do not copy the line numbers (on the left). These numbers are meant to guide the reader through each code block. A line with no number indicates that the line is a continuation of the line above. It should also be noted that Python code is case sensitive.
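To illustrate the case sensitivity mentioned above, a minimal example (the variable names are arbitrary):

```python
# Python treats names that differ only in casing as distinct objects.
score = 10
Score = 20

print(score)   # the lowercase variable
print(Score)   # a separate, capitalized variable

# Referring to a casing that was never defined raises a NameError.
try:
    print(SCORE)
except NameError as error:
    print(error)
```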
For example: C:/Users/Bob/Documents/. If you copy the file location from the property menu of Windows Explorer, you need to replace the backslashes with forward slashes.
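If preferred, this conversion can also be done in Python itself; a minimal sketch using the example path from above:

```python
# A Windows path as copied from the Explorer property menu (backslashes).
# A double backslash is how a single backslash is written inside a
# normal Python string.
windows_path = "C:\\Users\\Bob\\Documents\\"

# Replace the backslashes with forward slashes before using the path.
python_path = windows_path.replace("\\", "/")
print(python_path)  # C:/Users/Bob/Documents/
```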
For those unfamiliar with matrices, we can call and manipulate specific locations in the matrix using a bracket [i, j], where i is the row number and j the column number. Python begins indexing (numbering of rows and columns) at 0 and the last value is excluded from ranges. Therefore, data_matrix[0, 1] refers to the first row (index = 0) and second column (i.e., index = 1). In the current example, data_matrix[:, 2:4] refers to all rows for the third and fourth columns of the .csv file (indices = 2 and 3).
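These indexing conventions can be verified with a small NumPy matrix; a minimal sketch (the values are arbitrary and chosen so that each one encodes its own row and column):

```python
import numpy as np

# A 3 x 4 matrix; the value at row i, column j is (i + 1) * 10 + j.
data_matrix = np.array([[10, 11, 12, 13],
                        [20, 21, 22, 23],
                        [30, 31, 32, 33]])

# First row (index 0), second column (index 1).
print(data_matrix[0, 1])      # 11

# All rows, third and fourth columns (indices 2 and 3);
# the end of the range (4) is excluded.
print(data_matrix[:, 2:4])
```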
Lines that are part of a loop (i.e., indented lines of code) must be indented, typically with a tab or four spaces. In our code block, the spaces at the beginning of the lines (i.e., following the numbers) represent this indentation. If you struggle with indentation or with running the code, we recommend consulting and using our ML_step-by-step.py file, freely available in the online repository.
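As a brief illustration of how indentation delimits a loop body (the variable names are arbitrary):

```python
# Every indented line below the for statement belongs to the loop body.
total = 0
for value in [2, 4, 6]:
    total = total + value  # indented: runs once per value
    print(total)           # indented: also inside the loop

print("Final total:", total)  # not indented: runs once, after the loop
```

Running this prints the running totals 2, 6, and 12, then the final total once the loop has finished.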
We did not include artificial neural networks because they require larger datasets than our current sample size.
Ethics declarations
Funding
This study was funded in part by a Graduate Scholarship from the Social Sciences and Humanities Research Council of Canada (SSHRC) to the first author and a salary award from the Fonds de recherche du Québec - Santé (#269462) to the second author.
Ethical Approval
All procedures performed in this study were in accordance with the ethical standards of the Canadian Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans and with the 1964 Helsinki declaration and its later amendments.
Informed Consent
Parents provided informed consent for them and their child.
Conflict of Interest
The authors declare that they have no conflict of interest.
Availability of Code and Data
The code and data are freely available at https://osf.io/yhk2p/.
Additional information
This article was written in partial fulfillment of the requirements for the PhD degree in Psychoeducation at the Université de Montréal by Stéphanie Turgeon.
Appendix
Free Online Resources
Learn More About Python
Learn Python—https://www.learnpython.org/
Google's Python Class—https://developers.google.com/edu/python
Python for Beginners—https://www.python.org/about/gettingstarted/
Learn More About Machine Learning
An Introduction to Machine Learning—https://www.digitalocean.com/community/tutorials/an-introduction-to-machine-learning
Google’s Introduction to Machine Learning—https://developers.google.com/machine-learning/crash-course/ml-intro
Introduction to Machine Learning for Beginners—https://towardsdatascience.com/introduction-to-machine-learning-for-beginners-eed6024fdb08
Learn More About Machine Learning in Python
Cross Validation in Python: Everything You Need to Know About—https://www.upgrad.com/blog/cross-validation-in-python/
An Implementation and Explanation of the Random Forest in Python—https://towardsdatascience.com/an-implementation-and-explanation-of-the-random-forest-in-python-77bf308a9b76
Implementing SVM and Kernel SVM with Python's Scikit-Learn—https://stackabuse.com/implementing-svm-and-kernel-svm-with-pythons-scikit-learn/
How To Implement Logistic Regression From Scratch in Python—https://machinelearningmastery.com/implement-logistic-regression-stochastic-gradient-descent-scratch-python/
Develop k-Nearest Neighbors in Python From Scratch—https://machinelearningmastery.com/tutorial-to-implement-k-nearest-neighbors-in-python-from-scratch/
Hyperparameter Tuning—https://towardsdatascience.com/hyperparameter-tuning-c5619e7e6624
scikit-learn: 3.2. Tuning the Hyper-Parameters of an Estimator—https://scikit-learn.org/stable/modules/grid_search.html
About this article
Cite this article
Turgeon, S., Lanovaz, M.J. Tutorial: Applying Machine Learning in Behavioral Research. Perspect Behav Sci 43, 697–723 (2020). https://doi.org/10.1007/s40614-020-00270-y