Clustering Methods with Qualitative Data: a Mixed-Methods Approach for Prevention Research with Small Samples

Prevention Science

Abstract

Qualitative methods can add depth to prevention research but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data but may lack sufficient sample sizes for sophisticated quantitative analysis. Mixed-methods research currently lacks methods that more fully integrate qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by helping to reveal such things as participants' motives for their actions and the reasons behind counterintuitive findings. By grouping participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed-methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data of the kind produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-means clustering, and latent class analysis produced similar levels of accuracy with binary data and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a “real-world” example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities.
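
To make the idea of the first study concrete, the following is a minimal sketch of one simulation run, not the authors' code. It generates binary code profiles for 50 hypothetical participants from two known clusters, applies a plain K-means procedure, and reports the share of participants assigned to the correct cluster. The number of codes (10), the endorsement probabilities (0.85 and 0.15), the number of restarts, and all function names are assumptions chosen for illustration; the article's actual simulation design is not described in the abstract.

```python
import itertools
import random


def simulate_binary_codes(n_per_cluster, profiles, rng):
    """Draw 0/1 code vectors; profiles[k][j] = P(code j present | cluster k)."""
    data, labels = [], []
    for k, probs in enumerate(profiles):
        for _ in range(n_per_cluster):
            data.append([1 if rng.random() < p else 0 for p in probs])
            labels.append(k)
    return data, labels


def sq_dist(row, center):
    """Squared Euclidean distance between a case and a cluster center."""
    return sum((x - m) ** 2 for x, m in zip(row, center))


def kmeans_once(data, k, rng, max_iter=100):
    """One run of Lloyd's algorithm started from k randomly chosen rows."""
    centers = [[float(x) for x in row] for row in rng.sample(data, k)]
    assign = None
    for _ in range(max_iter):
        new_assign = [min(range(k), key=lambda c: sq_dist(row, centers[c]))
                      for row in data]
        if new_assign == assign:
            break
        assign = new_assign
        for c in range(k):
            members = [row for row, a in zip(data, assign) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    wss = sum(sq_dist(row, centers[a]) for row, a in zip(data, assign))
    return assign, wss


def kmeans(data, k, rng, n_starts=10):
    """Keep the restart with the smallest within-cluster sum of squares."""
    runs = (kmeans_once(data, k, rng) for _ in range(n_starts))
    return min(runs, key=lambda result: result[1])[0]


def assignment_accuracy(true_labels, found_labels, k):
    """Share of cases assigned correctly under the best relabeling of clusters."""
    best = 0
    for perm in itertools.permutations(range(k)):
        hits = sum(perm[f] == t for t, f in zip(true_labels, found_labels))
        best = max(best, hits)
    return best / len(true_labels)


def run_demo(seed=1, n_per_cluster=25):
    rng = random.Random(seed)
    # Hypothetical profiles: 10 codes, two clusters with opposite endorsement rates.
    profiles = [[0.85] * 5 + [0.15] * 5, [0.15] * 5 + [0.85] * 5]
    data, labels = simulate_binary_codes(n_per_cluster, profiles, rng)
    found = kmeans(data, k=2, rng=rng)
    return assignment_accuracy(labels, found, k=2)


print(f"accuracy with n = 50: {run_demo():.2f}")
```

Accuracy is computed over all relabelings of the recovered clusters because cluster numbers are arbitrary. Repeating `run_demo` over many seeds and sample sizes would give a rough picture of how assignment accuracy behaves as samples shrink, under these assumed profiles only.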

Acknowledgments

The authors gratefully acknowledge the contributions of Mary Murray, M.A., who performed the cluster analyses for the original DCP study; Debra Strickland, former Executive Director of the Developing Communities Project; and the community leaders who participated in the interviews for Study 2. Partial support for this study was provided by grant number R13 DA030834 (PI, Ching Fok, Ph.D.) for the conference, “Advancing Science with Culturally Distinct Communities.”

Conflict of Interest

The authors declare that they have no conflicts of interest.

Author information

Corresponding author

Correspondence to David Henry.

About this article

Cite this article

Henry, D., Dymnicki, A.B., Mohatt, N. et al. Clustering Methods with Qualitative Data: a Mixed-Methods Approach for Prevention Research with Small Samples. Prev Sci 16, 1007–1016 (2015). https://doi.org/10.1007/s11121-015-0561-z

  • DOI: https://doi.org/10.1007/s11121-015-0561-z
