Proceedings of the First International Volga Region Conference on Economics, Humanities and Sports (FICEHS 2019)

Statistics of Numerals in the Text: Development of a New Method of Stylometry

Authors
Andrei V. Zenkov
Corresponding Author
Andrei V. Zenkov
Available Online 18 January 2020.
DOI
10.2991/aebmr.k.200114.106
Keywords
stylometry, attribution of texts, text processing, numerals, first significant digit
Abstract

Two approaches to the statistical analysis of texts are proposed, both based on the occurrence of numerals in coherent literary texts. The first approach analyzes the frequency distribution of the first significant digits of the numerals occurring in a text. The frequency of the digit 1 and, to a lesser extent, of the digits 2 and 3 is usually a characteristic feature of an author’s style, consistently manifested in all sufficiently long literary texts by that author. This approach is convenient for quickly testing whether a group of texts has common authorship: common authorship is doubtful if the frequency distributions differ sufficiently. The second approach extends the first and studies the frequency distribution of the numerals themselves rather than of their first significant digits. It yields non-trivial information about the peculiarities of an author’s style and is suited to the advanced study of authorial texts. The proposed approaches are illustrated by computer analysis of literary works by L. Dobychin and A. Platonov.
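The two frequency distributions named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: it assumes numerals are written with digits (numerals spelled out in words, which are common in literary prose, would need a separate conversion step), and the function names and the regular expression are hypothetical choices made for this illustration.

```python
import re
from collections import Counter

# Assumption: numerals appear as ASCII digit strings such as "12", "1905" or "0.5".
# Numerals written in words ("seventeen") are NOT handled by this sketch.
NUMERAL_RE = re.compile(r"[0-9]+(?:[.,][0-9]+)*")


def extract_numerals(text):
    """Return every digit-string numeral found in the text, in order."""
    return NUMERAL_RE.findall(text)


def first_significant_digit(numeral):
    """Return the first non-zero digit of a numeral string, or None if there is none."""
    for ch in numeral:
        if ch in "123456789":
            return int(ch)
    return None


def first_digit_frequencies(text):
    """First approach: relative frequency of each first significant digit 1..9."""
    digits = [first_significant_digit(n) for n in extract_numerals(text)]
    digits = [d for d in digits if d is not None]
    counts = Counter(digits)
    total = len(digits)
    return {d: (counts[d] / total if total else 0.0) for d in range(1, 10)}


def numeral_frequencies(text):
    """Second approach: relative frequency of each numeral itself."""
    numerals = extract_numerals(text)
    counts = Counter(numerals)
    total = len(numerals)
    return {n: c / total for n, c in counts.items()}


if __name__ == "__main__":
    sample = "In 1905 he bought 12 apples, 3 pears and 17 plums for 0.5 roubles."
    print(first_digit_frequencies(sample))
    print(numeral_frequencies(sample))
```

Comparing the output of first_digit_frequencies for two texts gives the kind of quick check the abstract describes. How large a difference counts as "sufficiently different" is not stated here, so the sketch deliberately leaves that threshold out.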

Copyright
© 2020, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the First International Volga Region Conference on Economics, Humanities and Sports (FICEHS 2019)
Series
Advances in Economics, Business and Management Research
Publication Date
18 January 2020
ISSN
2352-5428

Cite this article

TY  - CONF
AU  - Andrei V. Zenkov
PY  - 2020
DA  - 2020/01/18
TI  - Statistics of Numerals in the Text: Development of a New Method of Stylometry
BT  - Proceedings of the First International Volga Region Conference on Economics, Humanities and Sports (FICEHS 2019)
PB  - Atlantis Press
SP  - 448
EP  - 451
SN  - 2352-5428
UR  - https://doi.org/10.2991/aebmr.k.200114.106
DO  - 10.2991/aebmr.k.200114.106
ID  - Zenkov2020
ER  -