Proceedings of the 2016 International Conference on Communications, Information Management and Network Security

Research on Text Classification Based on TextRank

Authors
Guangming Lu, Yule Xia, Jiamei Wang, Zhenling Yang
Corresponding Author
Guangming Lu
Available Online September 2016.
DOI
10.2991/cimns-16.2016.79
Keywords
component; Hadoop; TextRank; naive Bayes; text classification
Abstract

Keywords are extracted from the word segmentation results with an improved TextRank algorithm. The relative position of each word in the article is used to calculate a position influence, the coverage of that position is extended from the word itself to the sentence containing it, and the extracted keywords serve as the features of the text. Text classification is then implemented on Hadoop with the naive Bayes algorithm. Experiments show that the improved TextRank substantially improves classification performance: with 40 keywords, the naive Bayes classifier reaches 93% accuracy, about 10% higher than with the traditional method.
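
The abstract does not give the paper's formulas, so the following is only a minimal sketch of what a position-biased TextRank keyword extractor might look like, not the authors' method. The function name `textrank_keywords`, the linear position weight (earlier first occurrence means larger weight), the co-occurrence window, and the damping factor of 0.85 are all assumptions made for illustration.

```python
from collections import defaultdict


def textrank_keywords(words, top_k=5, window=2, damping=0.85, iterations=50):
    """Rank words with a position-biased TextRank over a co-occurrence graph.

    Hypothetical sketch: the position weighting is an assumption, not the
    scheme used in the paper.
    """
    if not words:
        return []
    n = len(words)

    # Undirected co-occurrence graph: distinct words within `window`
    # tokens of each other are linked.
    neighbors = defaultdict(set)
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, n)):
            if words[j] != w:
                neighbors[w].add(words[j])
                neighbors[words[j]].add(w)

    # Assumed position influence: a word that first appears earlier in the
    # text receives a larger share of the teleport probability.
    first_pos = {}
    for i, w in enumerate(words):
        first_pos.setdefault(w, i)
    raw = {w: 1.0 - pos / n for w, pos in first_pos.items()}
    total = sum(raw.values())
    bias = {w: v / total for w, v in raw.items()}

    # Power iteration of PageRank with the position bias as teleport vector.
    scores = dict(bias)
    for _ in range(iterations):
        new_scores = {}
        for w in scores:
            rank = sum(scores[u] / len(neighbors[u]) for u in neighbors[w])
            new_scores[w] = (1 - damping) * bias[w] + damping * rank
        scores = new_scores

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
```

Under these assumptions, calling `textrank_keywords(tokens, top_k=40)` on a segmented document would yield the 40 keywords that a downstream naive Bayes classifier could use as features.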

Copyright
© 2016, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2016 International Conference on Communications, Information Management and Network Security
Series
Advances in Computer Science Research
Publication Date
September 2016
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Guangming Lu
AU  - Yule Xia
AU  - Jiamei Wang
AU  - Zhenling Yang
PY  - 2016/09
DA  - 2016/09
TI  - Research on Text Classification Based on TextRank
BT  - Proceedings of the 2016 International Conference on Communications, Information Management and Network Security
PB  - Atlantis Press
SP  - 319
EP  - 322
SN  - 2352-538X
UR  - https://doi.org/10.2991/cimns-16.2016.79
DO  - 10.2991/cimns-16.2016.79
ID  - Lu2016/09
ER  -