Clustering Methods with Qualitative Data: a Mixed-Methods Approach for Prevention Research with Small Samples

Henry, David; Dymnicki, Allison B.; Mohatt, Nathaniel; Allen, James; Kelly, James G.

doi:10.1007/s11121-015-0561-z

Published: 07 May 2015

Volume 16, pages 1007–1016, (2015)

Prevention Science

Abstract

Qualitative methods potentially add depth to prevention research but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data but may lack sufficient sample sizes for sophisticated quantitative analysis. Mixed-methods research currently lacks methods that more fully integrate qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by helping to reveal such things as participants' motives for their actions and the reasons behind counterintuitive findings. By grouping participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed-methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data of the kind produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-means clustering, and latent class analysis produced similar levels of accuracy with binary data and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a “real-world” example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities.
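To make the idea of the first study concrete, the following is a minimal sketch of one simulation of this kind, not the authors' actual design. It generates binary "code present/absent" profiles for two latent groups, clusters them with a plain K-means (Lloyd's algorithm with random restarts), and scores how often participants are assigned to their true group. The number of codes, the endorsement probabilities (0.9 and 0.1), the number of restarts, and all function names are assumptions chosen for illustration; hierarchical clustering and latent class analysis are not shown.

```python
import random
from itertools import permutations


def simulate_binary_codes(n_per_cluster, profiles, rng):
    """Draw binary code vectors; profiles[k][j] = P(code j present | cluster k)."""
    data, labels = [], []
    for k, probs in enumerate(profiles):
        for _ in range(n_per_cluster):
            data.append([1 if rng.random() < p else 0 for p in probs])
            labels.append(k)
    return data, labels


def sq_dist(a, b):
    """Squared Euclidean distance (equals Hamming distance for two 0/1 vectors)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_once(data, k, rng, n_iter=50):
    """One run of Lloyd's algorithm started from k randomly chosen rows."""
    centers = [[float(x) for x in row] for row in rng.sample(data, k)]
    assign = None
    for _ in range(n_iter):
        new_assign = [min(range(k), key=lambda c: sq_dist(row, centers[c]))
                      for row in data]
        if new_assign == assign:  # converged: assignments stopped changing
            break
        assign = new_assign
        for c in range(k):
            members = [row for row, a in zip(data, assign) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    wss = sum(sq_dist(row, centers[a]) for row, a in zip(data, assign))
    return wss, assign


def kmeans(data, k, rng, restarts=10):
    """Keep the restart with the smallest within-cluster sum of squares."""
    runs = (kmeans_once(data, k, rng) for _ in range(restarts))
    return min(runs, key=lambda result: result[0])[1]


def accuracy(truth, pred, k):
    """Share of correct assignments under the best relabeling of clusters."""
    best = max(sum(1 for t, p in zip(truth, pred) if perm[p] == t)
               for perm in permutations(range(k)))
    return best / len(truth)


def run_simulation(n_per_cluster, seed=0, n_codes=10):
    """Simulate two well-separated groups (assumed profiles) and score K-means."""
    rng = random.Random(seed)
    profiles = [[0.9] * n_codes, [0.1] * n_codes]  # hypothetical code rates
    data, truth = simulate_binary_codes(n_per_cluster, profiles, rng)
    return accuracy(truth, kmeans(data, 2, rng), 2)


if __name__ == "__main__":
    for n in (25, 50, 100):
        print(2 * n, "participants:", run_simulation(n))
```

Because cluster labels are arbitrary, accuracy is computed over all relabelings of the recovered clusters. Making the two profiles more similar, or reducing the number of codes, is one way to explore how quickly recovery degrades in small samples under these assumptions.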

Acknowledgments

The authors gratefully acknowledge the contributions of Mary Murray, M.A., who performed the cluster analyses for the original DCP study; Debra Strickland, former Executive Director of the Developing Communities Project; and the community leaders who participated in the interviews for Study 2. Partial support for this study was provided by grant number R13 DA030834 (PI, Ching Fok, Ph.D.) for the conference, “Advancing Science with Culturally Distinct Communities.”

Conflict of Interest

The authors declare that they have no conflicts of interest.

Author information

Authors and Affiliations

University of Illinois at Chicago, Chicago, IL, USA
David Henry & James G. Kelly
American Institutes for Research, Washington, D.C., USA
Allison B. Dymnicki
University of Colorado, Nederland, CO, USA
Nathaniel Mohatt
University of Minnesota, Minneapolis, MN, USA
James Allen

Corresponding author

Correspondence to David Henry.

About this article

Cite this article

Henry, D., Dymnicki, A.B., Mohatt, N. et al. Clustering Methods with Qualitative Data: a Mixed-Methods Approach for Prevention Research with Small Samples. Prev Sci 16, 1007–1016 (2015). https://doi.org/10.1007/s11121-015-0561-z

Issue Date: October 2015
DOI: https://doi.org/10.1007/s11121-015-0561-z
