Extract lexical frequencies

jtrace_get_frequency(word, language = "English", scale = "frequency_abs")

Arguments

word

Character vector of orthographic word forms.

language

Character vector naming the language(s) in which to look up the word frequencies. Must be one or more of "English", "Spanish", and/or "Catalan".

scale

Character vector indicating the scale(s) of the frequency scores. Must be one or more of "frequency_abs" (absolute frequency, default), "frequency_rel" (relative frequency, counts/1e6), or "frequency_zipf" (log10(counts*1e6)+3).

Value

A data frame with one column for the words and one column of SUBTLEX frequencies per requested language, so that each row holds the frequencies of the same word across languages.

References

English

Van Heuven, W. J., Mandera, P., Keuleers, E., & Brysbaert, M. (2014). SUBTLEX-UK: A new and improved word frequency database for British English. Quarterly Journal of Experimental Psychology, 67(6), 1176-1190.

Spanish

Cuetos, F., Glez-Nosti, M., Barbón, A., & Brysbaert, M. (2011). SUBTLEX-ESP: frecuencias de las palabras españolas basadas en los subtítulos de las películas. Psicológica, 32(2), 133-144.

Catalan

Boada, R., Guasch, M., Haro, J., Demestre, J., & Ferré, P. (2020). SUBTLEX-CAT: Subtitle word frequencies and contextual diversity for Catalan. Behavior Research Methods, 52(1), 360-375.

Author

Gonzalo Garcia-Castro gonzalo.garciadecastro@upf.edu

Examples

if (FALSE) {
  my_words <- c("plane", "cake", "tiger", "ham", "seat")
  jtrace_get_frequency(word = my_words, language = "English", scale = "frequency_abs")
  jtrace_get_frequency(word = my_words, language = c("Spanish", "Catalan"), scale = "frequency_zipf")
  jtrace_get_frequency(word = my_words, language = c("Spanish"), scale = "frequency_rel")
}
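
To illustrate how the three values of the scale argument relate to one another, the sketch below converts raw counts to a per-million frequency and then to the Zipf scale in base R. It does not call jtrace_get_frequency() and may not match the package's internal computation. It uses the simplified form of the Zipf definition in Van Heuven et al. (2014), log10 of the frequency per million words plus 3, without the smoothing applied in that paper. The counts and corpus_size values are hypothetical and are not taken from any SUBTLEX database.

```r
# Hypothetical raw counts and corpus size (assumptions, not SUBTLEX data).
counts <- c(plane = 5000, cake = 500, tiger = 50)
corpus_size <- 50e6  # assumed total number of tokens in the corpus

# Relative frequency: occurrences per million tokens.
frequency_rel <- counts / (corpus_size / 1e6)

# Zipf scale (simplified): log10 of the per-million frequency, plus 3.
frequency_zipf <- log10(frequency_rel) + 3

# Collect the three scales side by side, one row per word.
scales <- data.frame(
  word = names(counts),
  frequency_abs = unname(counts),
  frequency_rel = unname(frequency_rel),
  frequency_zipf = unname(frequency_zipf)
)
print(scales)
```

Under these assumed numbers, a word occurring once per million tokens has a Zipf value of 3, and each tenfold increase in frequency adds one Zipf unit.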