Proceedings of the International Scientific Conference “Digitalization of Education: History, Trends and Prospects” (DETP 2020)

Text Age Rating Methods for Digital Libraries

Authors
A.V. Glazkova
Corresponding Author
A.V. Glazkova
Available Online 13 May 2020.
DOI
https://doi.org/10.2991/assehr.k.200509.066
Keywords
content rating, age restrictions, Russian Age Rating System, text classification, text addressee, textual target audience, machine learning
Abstract
The addressee plays a major role in communication. Creating a text involves taking into account the features of the target audience to which the author addresses it. In this article, text addressee detection is considered from the point of view of natural language processing. The task of age classification deserves special attention: its relevance is associated with the development of e-learning systems and digital libraries. Moreover, all information products in Russia must now be marked with an age rating. This article describes the first attempt to solve the automatic age rating prediction task, using Russian texts as an example. In this work, we analyze the main factors affecting the text age rating and propose a first-approximation classifier for determining the age of the textual target audience. Our approach is based on a range of features designed to capture readability, lexical, and topic modeling characteristics. We use these features to train a Linear Support Vector Classifier. We trained and tested our classifier on a dataset of 1200 previews of fiction books in Russian, annotated for age rating by the books’ publishers. Our performance evaluation suggests that the proposed features are a good indicator of text age rating. In future work, we plan to add and evaluate other types of models and linguistic features.
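As a rough illustration of the kind of readability and lexical features mentioned in the abstract, the sketch below computes three simple surface statistics for a text. The function name `surface_features` and the choice of these three statistics are assumptions made for this example; the abstract does not list the paper's actual feature set. Vectors like these could then be passed to a linear SVM implementation such as scikit-learn's `LinearSVC`.

```python
import re


def surface_features(text: str) -> dict[str, float]:
    """Compute three illustrative surface features (hypothetical feature set)."""
    # Split into sentences on runs of terminal punctuation; drop empty pieces.
    sentences = [s for s in re.split(r"[.!?…]+", text) if s.strip()]
    # Unicode-aware word matching (letters only), so Cyrillic text works too.
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not sentences or not words:
        return {
            "avg_sentence_len": 0.0,
            "avg_word_len": 0.0,
            "type_token_ratio": 0.0,
        }
    return {
        # Readability proxy: average number of words per sentence.
        "avg_sentence_len": len(words) / len(sentences),
        # Readability proxy: average number of letters per word.
        "avg_word_len": sum(len(w) for w in words) / len(words),
        # Lexical diversity: distinct words divided by total words.
        "type_token_ratio": len(set(words)) / len(words),
    }


if __name__ == "__main__":
    print(surface_features("Кот спит. Кот ест рыбу!"))
```

Longer sentences, longer words, and a richer vocabulary are common proxies for text difficulty, which is why features of this type are a plausible starting point for estimating the age of the intended reader.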
Open Access
This is an open access article distributed under the CC BY-NC license.

Cite this article

TY  - CONF
AU  - A.V. Glazkova
PY  - 2020
DA  - 2020/05/13
TI  - Text Age Rating Methods for Digital Libraries
BT  - International Scientific Conference “Digitalization of Education: History, Trends and Prospects” (DETP 2020)
PB  - Atlantis Press
SP  - 364
EP  - 368
SN  - 2352-5398
UR  - https://doi.org/10.2991/assehr.k.200509.066
DO  - https://doi.org/10.2991/assehr.k.200509.066
ID  - Glazkova2020
ER  -