Published: 21-05-2024

Extracting big data from the internet to support the development of a new patient-reported outcome measure for breast implant illness: a proof of concept study

Authors: Sophia Hu, Jinjie Liu, Sylvie D. Cornacchi, Anne F. Klassen, Andrea L. Pusic, Manraj N. Kaur

Published in: Quality of Life Research | Issue 7/2024

Abstract

Purpose

Individuals with health conditions often use online patient forums to share their experiences. These patient data are freely available but have rarely been used in patient-reported outcomes (PRO) research. Web scraping, the automated identification and coding of webpage data, can be employed to collect patient experiences for PRO research. The objective of this study was to assess the feasibility of using web scraping to support the development of a new PRO measure for breast implant illness (BII).

Methods

Nine publicly available BII-specific web forums were chosen after consultation with two prominent BII advocacy leaders. The Python packages Selenium and Pandas were used to automate the extraction of de-identified text from individual posts and comments into a spreadsheet. Data were coded line by line, and constant comparison was used to create top-level domains and sub-domains.
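
The extraction step described above might look roughly like the following. This is a minimal sketch, not the authors' script: the function names, the CSS selector argument, the masking patterns, and the two-column layout are all assumptions made for illustration. Selenium is imported only inside the scraping function so that the text-cleaning helpers can be run without a browser.

```python
import re

import pandas as pd

# Hypothetical patterns for content that could identify a poster.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
HANDLE_RE = re.compile(r"@\w+")


def deidentify(text: str) -> str:
    """Mask e-mail addresses and @handles, then collapse whitespace."""
    text = EMAIL_RE.sub("[email]", text)
    text = HANDLE_RE.sub("[user]", text)
    return " ".join(text.split())


def posts_to_frame(forum: str, texts: list[str]) -> pd.DataFrame:
    """Build one row per non-empty post, keeping text only (no usernames)."""
    cleaned = [deidentify(t) for t in texts if t and t.strip()]
    frame = pd.DataFrame({"forum": [forum] * len(cleaned), "text": cleaned})
    # Drop verbatim repeats (e.g. quoted replies) so each text appears once.
    return frame.drop_duplicates(subset="text").reset_index(drop=True)


def scrape_forum(url: str, post_selector: str) -> list[str]:
    """Return the visible text of every element matching post_selector."""
    # Imported lazily so the helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        elements = driver.find_elements(By.CSS_SELECTOR, post_selector)
        return [element.text for element in elements]
    finally:
        driver.quit()
```

In use, the output of scrape_forum(forum_url, "div.post") (both arguments are placeholders) would be passed to posts_to_frame, and the resulting frame written out with to_csv or to_excel for line-by-line coding. Whether a given forum permits automated collection depends on its terms of use and on the applicable ethics review.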

Results

In total, 6362 unique codes were identified and organized into four top-level domains: information needs, symptom experiences, life impact of BII, and care experiences. Women’s information needs included seeking or sharing information before breast implant surgery, after breast implant surgery, while contemplating explant surgery, and after explant surgery. Commonly described symptoms included fatigue, brain fog, and musculoskeletal symptoms. Many comments described BII’s impact on daily activities and psychosocial wellbeing. Lastly, some comments described negative care experiences and women’s experiences of advocating for themselves with providers.

Conclusion

This proof-of-concept study demonstrated the feasibility of employing web scraping as a cost-effective, efficient method to understand the experiences of women with BII. These data will be used to inform the development of a BII-specific PROM.
Metadata
Title
Extracting big data from the internet to support the development of a new patient-reported outcome measure for breast implant illness: a proof of concept study
Authors
Sophia Hu
Jinjie Liu
Sylvie D. Cornacchi
Anne F. Klassen
Andrea L. Pusic
Manraj N. Kaur
Publication date
21-05-2024
Publisher
Springer International Publishing
Published in
Quality of Life Research / Issue 7/2024
Print ISSN: 0962-9343
Electronic ISSN: 1573-2649
DOI
https://doi.org/10.1007/s11136-024-03672-6