Abstract
We propose an algorithm for automatically determining the genre type and semantic characteristics of Russian-language poetic texts. We formulate approaches to constructing a joint (“two-dimensional”) classifier of the genre type and stylistic colouring of poetic texts, based on determining the interdependence between a text's genre type and its stylistic colouring. On the basis of these approaches, we analyse the principles for forming the training samples used by the algorithms that determine style and genre type. Computational experiments were conducted on the corpus of A. S. Pushkin's Lyceum lyrics to select the most accurate algorithm for classifying poetic texts, including the best-known techniques for combining base algorithms into an ensemble: weighted voting, boosting and stacking. Single words, bigrams and trigrams were used as the characteristic features of the poems.
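As a rough illustration of two ingredients named in the abstract, the sketch below extracts single words, bigrams and trigrams from a line of text and combines base-classifier outputs by weighted voting. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `word_ngrams` and `weighted_vote`, the whitespace tokenisation, the sample line, the genre labels and the weights are all hypothetical and chosen only for the example.

```python
from collections import Counter, defaultdict


def word_ngrams(text, n_max=3):
    """Count word n-grams for n = 1..n_max (single words, bigrams, trigrams).

    Assumption: tokens are obtained by lowercasing and splitting on whitespace.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def weighted_vote(predictions, weights):
    """Return the label whose votes carry the largest total weight."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per base-classifier prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical sample line, invented for the example (not taken from the corpus).
features = word_ngrams("тихо светит луна над рекой")

# Hypothetical outputs of three base classifiers, with hypothetical weights.
votes = ["elegy", "ode", "elegy"]
winner = weighted_vote(votes, [0.5, 0.9, 0.3])
```

With these toy weights the single heavily weighted vote outweighs the two lighter ones, which is the difference between weighted voting and a simple majority vote. Boosting and stacking, also mentioned in the abstract, are not shown here.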
Recommended Citation
Kozhemyakina, Olga and Barakhnin, Vladimir B., "Models and algorithms for complex analysis of large corpuses of Russian poetic texts" (2020). International Conference on Information Systems 2020 Special Interest Group on Big Data Proceedings. 2.
https://aisel.aisnet.org/sigbd2020/2