SUBTLEX-ESP: Spanish word frequencies based on film subtitles

Fernando Cuetos; Maria Glez-Nosti; Analía Barbón; Marc Brysbaert

Authors

Fernando Cuetos, University of Oviedo, Spain
Maria Glez-Nosti, University of Oviedo, Spain
Analía Barbón, University of Oviedo, Spain
Marc Brysbaert, Ghent University, Belgium

Abstract

Recent studies have shown that word frequency estimates obtained from film and television subtitles predict performance in word recognition experiments better than traditional word frequency estimates based on books and newspapers. In this study, we present a subtitle-based word frequency list for Spanish, one of the most widely spoken languages. The subtitle frequencies are based on a corpus of 41 million words taken from contemporary movies and TV series (screened between 1990 and 2009). The frequencies were validated by correlating them with the reaction times (RTs) from two megastudies involving 2,764 words each (one lexical decision task and one word naming task). The subtitle frequencies explained 6% more of the variance than the existing written frequencies in lexical decision, and 2% more in word naming.
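
The validation described above compares how much RT variance each frequency measure explains. A minimal sketch of that kind of comparison follows, and it is not necessarily the analysis used in the study. It assumes a simple squared Pearson correlation between log-transformed frequency and RT; the log10(count + 1) transform, the function names, and the sample values are illustrative assumptions, not taken from the paper.

```python
import math


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def log_freq(counts):
    """Log-transform raw counts; the +1 offset is an assumption to handle zeros."""
    return [math.log10(c + 1) for c in counts]


def variance_gain(subtitle_counts, written_counts, rts):
    """Return (R^2 for subtitle, R^2 for written, difference) for the same words."""
    r2_subtitle = r_squared(log_freq(subtitle_counts), rts)
    r2_written = r_squared(log_freq(written_counts), rts)
    return r2_subtitle, r2_written, r2_subtitle - r2_written


# Made-up values for five hypothetical words, for illustration only.
subtitle_counts = [12000, 850, 40, 3, 560]
written_counts = [9000, 30, 700, 10, 95]
mean_rts_ms = [540.0, 610.0, 690.0, 760.0, 625.0]

r2_sub, r2_wri, gain = variance_gain(subtitle_counts, written_counts, mean_rts_ms)
print(f"subtitle R^2={r2_sub:.3f}, written R^2={r2_wri:.3f}, gain={gain:.3f}")
```

With real data, the three lists would hold one entry per word in the megastudy, aligned by position; a positive difference means the subtitle frequencies explain more of the RT variance than the written ones.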
