The Meaning of forma in Thomas Aquinas: Hierarchical Clustering from the Index Thomisticus Treebank

  • Conference paper
Analysis and Modeling of Complex Data in Behavioral and Social Sciences

Abstract

We apply hierarchical clustering to the occurrences of the lemma forma in the works of Thomas Aquinas, so that occurrences showing similar contextual behaviour fall into the same or closely related groups. Our results will support the lexicographers of a new data-driven lexicon of Thomas Aquinas in writing the lexical entry for forma. We use two datasets: the Index Thomisticus (IT), a corpus containing the opera omnia of Thomas Aquinas, and the Index Thomisticus Treebank (IT-TB), a syntactically annotated subset of the IT.
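As a rough illustration of the idea, the sketch below groups occurrences by agglomerative clustering over context-count vectors. Everything in it is an assumption made for illustration: the abstract does not state the linkage criterion, the distance measure, or the features used, so average linkage, cosine distance, and the toy counts are hypothetical choices and not the authors' setup.

```python
from itertools import combinations
from math import sqrt


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def average_linkage(vectors, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy data: one row per occurrence of "forma"; each column counts
# an invented context lemma. These numbers do not come from the IT or IT-TB.
occurrences = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 2, 2],
]
groups = average_linkage(occurrences, n_clusters=2)
print(groups)
```

Occurrences that share context lemmas end up in the same group; stopping at a fixed number of clusters is only one way to cut the hierarchy.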

Results are evaluated against a manually labeled subset of the occurrences of forma.
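The abstract does not say which evaluation measure is used. One common way to compare a clustering with manual labels is pairwise precision, recall and F-score, sketched below as an assumption; the function name and the toy labels are hypothetical.

```python
from itertools import combinations


def pairwise_scores(gold, predicted):
    """Pairwise precision, recall and F1 of a clustering against gold labels."""
    same_gold = same_pred = same_both = 0
    for i, j in combinations(range(len(gold)), 2):
        in_gold = gold[i] == gold[j]
        in_pred = predicted[i] == predicted[j]
        same_gold += in_gold
        same_pred += in_pred
        same_both += in_gold and in_pred
    precision = same_both / same_pred if same_pred else 0.0
    recall = same_both / same_gold if same_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented example: five occurrences, two manual senses, two clusters.
gold_senses = ["A", "A", "B", "B", "B"]
cluster_ids = [0, 0, 0, 1, 1]
precision, recall, f1 = pairwise_scores(gold_senses, cluster_ids)
print(precision, recall, f1)
```

A pair counts as correct when the two occurrences share both a manual sense and a cluster, so the score does not depend on how clusters are named.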

Notes

  1. The IT was lemmatised manually. Participles were always reduced to verbs unless they have a separate lexical entry in the Latin dictionary by Forcellini (1771; extended in 1896 by R. Klotz, G. Freund & L. Doderlein); for instance, the word falsus is always lemmatised as a form of the adjective falsus and not of the verb fallo. Homograph disambiguation is only partly available in the IT; it is completed in the Index Thomisticus Treebank.

  2. Sentences in the IT-TB were split automatically at strong punctuation marks (period, colon, semicolon, question mark, exclamation mark). In some cases, annotators corrected the automatic sentence splitting manually.
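The splitting rule in note 2 can be sketched as follows. This is a minimal illustration and not the IT-TB tooling: the regular expression, the function name, and the sample string are assumptions, and the manual corrections mentioned in the note are not modelled.

```python
import re

# Strong punctuation marks named in the note: . : ; ? !
# Split on whitespace that directly follows one of them.
STRONG_PUNCTUATION = re.compile(r"(?<=[.:;?!])\s+")


def split_sentences(text):
    """Return the segments of text that end at a strong punctuation mark."""
    return [part.strip() for part in STRONG_PUNCTUATION.split(text) if part.strip()]


# Illustrative Latin string made up for this sketch, not quoted from the corpus.
sample = "Forma dat esse rei: materia autem est in potentia. Quid est forma?"
sentences = split_sentences(sample)
for sentence in sentences:
    print(sentence)
```

Because the colon and semicolon count as boundaries, the resulting units are often clauses rather than full sentences in the modern sense.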

References

  • Busa, R. (1974–1980). Index Thomisticus. Stuttgart-Bad Cannstatt: Frommann-Holzboog.

  • Deferrari, R. J., & Barry, M. I. (1948–1949). A Lexicon of St. Thomas Aquinas: based on the Summa Theologica and selected passages of his other works. Washington, DC: Catholic University of America Press.

  • Firth, J. R. (1957). Papers in linguistics 1934–1951. London: London University Press.

  • Forcellini, A. (1771). Totius Latinitatis lexicon, consilio et cura Jacobi Facciolati opera et studio Aegidii Forcellini, lucubratum, typis Seminarii, Patavii.

  • Kaufman, L., & Rousseeuw, P. J. (1990). Finding groups in data: an introduction to cluster analysis. New York: Wiley.

  • Maechler, M., Rousseeuw, P. J., Struyf, A., Hubert, M., & Hornik, K. (2012). Cluster: Cluster analysis basics and extensions. R package version 1.14.3. http://CRAN.R-project.org/package=cluster.

  • McGillivray, B., Passarotti, M., & Ruffolo, P. (2009). The Index Thomisticus treebank project: Annotation, parsing and valency lexicon. Traitement Automatique des Langues, 50(2), 103–127.

  • Minozzi, S. (2008). La costruzione di una base di conoscenza lessicale per la lingua latina: Latinwordnet. In G. Sandrini (Ed.), Studi in onore di Gilberto Lonardi (pp. 243–258). Verona: Fiorini.

  • Pedersen, T. (2006). Unsupervised corpus-based methods for WSD. In E. Agirre & P. Edmonds (Eds.), Word sense disambiguation: algorithms and applications (pp. 133–166). New York: Springer.

  • R Core Team (2012). R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. ISBN: 3-900051-07-0. http://www.R-project.org/.

  • Sokal, R. R., & Michener, C. D. (1958). A statistical method for evaluating systematic relationships. University of Kansas Science Bulletin, 38, 1409–1438.

  • Van Rijsbergen, C. J. (1979). Information retrieval. London: Butterworths.

Author information

Corresponding author

Correspondence to Gabriele Cantaluppi.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Cantaluppi, G., Passarotti, M. (2014). The Meaning of forma in Thomas Aquinas: Hierarchical Clustering from the Index Thomisticus Treebank. In: Vicari, D., Okada, A., Ragozini, G., Weihs, C. (eds) Analysis and Modeling of Complex Data in Behavioral and Social Sciences. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-06692-9_10
