Proceedings of the 45th International Philological Conference (IPC 2016)

On the Differences between Traditional and Web-Corpora based on the Analysis of High-Frequency Nouns

Authors
Maria Khokhlova
Corresponding Author
Maria Khokhlova
Available Online June 2017.
DOI
10.2991/ipc-16.2017.76
Keywords
text corpus, web corpus, frequency dictionary, nouns.
Abstract

The paper surveys text corpora and analyzes a number of Russian nouns in two of them: ruTenTen (18.3 billion tokens) and Araneum Russicum Maximum (13.7 billion tokens). It discusses and compares these corpora and examines the frequency properties of high-frequency Russian nouns, comparing them with the data published in the Frequency Dictionary.

Copyright
© 2017, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 45th International Philological Conference (IPC 2016)
Series
Advances in Social Science, Education and Humanities Research
Publication Date
June 2017
ISSN
2352-5398

Cite this article

TY  - CONF
AU  - Maria Khokhlova
PY  - 2017/06
DA  - 2017/06
TI  - On the Differences between Traditional and Web-Corpora based on the Analysis of High-Frequency Nouns
BT  - Proceedings of the 45th International Philological Conference (IPC 2016)
PB  - Atlantis Press
SP  - 301
EP  - 304
SN  - 2352-5398
UR  - https://doi.org/10.2991/ipc-16.2017.76
DO  - 10.2991/ipc-16.2017.76
ID  - Khokhlova2017/06
ER  -