Towards a distant and deep reading: a pilot corpus of Golden-Age Spanish poetry
DOI:
https://doi.org/10.37536/RPM.2019.33.0.69109
Keywords:
Distant reading, Poetry, Golden-Age, Meter, Natural Language Processing, Corpus annotation
Abstract
This paper argues for the need to combine the distant reading of literary texts (panoramic analysis of a large number of texts) with «deep» reading (close, detailed analysis of implicit linguistic or literary aspects of texts). To this end, it proposes the development of large annotated corpora of literary texts. Taking advantage of recent developments in Natural Language Processing, this implicit linguistic and literary information can be annotated semi-automatically. To show the viability of the proposal, a pilot corpus of Golden-Age Spanish poetry is presented. The corpus comprises different types of poems (sonnets, romances, eclogues, etc.) by several poets. It currently contains more than 52,000 lines annotated at the metrical and morphological levels: the metrical pattern of each line, and the lemma, part of speech and morphological information of each word. The annotation was produced automatically; 5,069 lines have been manually revised and, where necessary, corrected. This Gold Standard is the first step both towards a distant and deep literary analysis of Golden-Age Spanish poetry and towards the development of poetry-specific Natural Language Processing models.
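The two annotation levels described in the abstract (a metrical pattern per line; lemma, part of speech and morphological information per word) can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration: the class and field names, the `+`/`-` notation for stressed and unstressed syllables, and the tag labels are hypothetical and are not taken from the corpus itself. The sample line is a well-known hendecasyllable by Garcilaso de la Vega, annotated by hand for this example, and only two of its words are tagged to keep it short.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """One word with its morphological annotation (hypothetical schema)."""
    form: str
    lemma: str
    pos: str
    morph: dict = field(default_factory=dict)


@dataclass
class AnnotatedLine:
    """One line of verse with metrical and morphological annotation."""
    text: str
    meter: str            # assumed notation: '+' stressed, '-' unstressed syllable
    tokens: list
    revised: bool = False  # True if a human checked the automatic annotation


def stressed_positions(meter: str) -> list:
    """Return the 1-based positions of the stressed syllables in a pattern."""
    return [i for i, mark in enumerate(meter, start=1) if mark == "+"]


# Hand-made illustrative record, not an entry copied from the corpus.
line = AnnotatedLine(
    text="En tanto que de rosa y azucena",
    meter="-+---+---+-",
    tokens=[
        Token("rosa", "rosa", "NOUN", {"Gender": "Fem", "Number": "Sing"}),
        Token("azucena", "azucena", "NOUN", {"Gender": "Fem", "Number": "Sing"}),
    ],
    revised=True,
)

print(stressed_positions(line.meter))
```

A structure of this kind would let a distant-reading query (for instance, counting lines by stress pattern) and a close-reading query (inspecting the words of a single line) run over the same records.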
License
The opinions and facts stated in each article are the exclusive responsibility of the authors. The University of Alcalá is not responsible in any case for the credibility and authenticity of the studies.
Authors retain the rights to their work, while granting the journal a non-exclusive right of use to reproduce, edit, distribute, publicly communicate and show it. Authors are therefore free to enter into additional, independent contracts for non-exclusive distribution of the works published in this journal (such as uploading them to an institutional repository or publishing them in a book), as long as the fact that the manuscripts were first published in this journal is acknowledged.
Works are published under the terms stipulated in the Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0) that allows third parties to share the work under the following conditions:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.