Proceedings of the 2016 2nd International Conference on Materials Engineering and Information Technology Applications (MEITA 2016)

A New Text Classifier Based on Random Forests

Authors
Xin Luo
Corresponding Author
Xin Luo
Available Online February 2017.
DOI
10.2991/meita-16.2017.60
Keywords
Classifiers; Text processing; Machine learning; Learning algorithms
Abstract

Various ensemble classification methods have been proposed in recent years, and they have been shown to improve classification accuracy considerably. One of the most widely used is Random Forests, an ensemble of CART trees built with bagging (bootstrap aggregating). This paper explores the use of the Random Forests classifier for text classification. We compare its accuracy with that of other existing, freely available methods on Reuters-21578, the standard text test collection. The results show that the model can be applied to text classification. Compared with text classification models based on CART, REPTree and J48, the model based on Random Forests performed best, reaching an F1-measure of 0.777. The Random Forests model is convenient, intuitive and effective, and its evaluation results are reliable. It offers a new approach for research on text classification.
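The two ideas the abstract relies on, bagging and the F1-measure, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names and the toy labels ("earn", "acq") are assumptions chosen for illustration, tree learning itself is omitted, and this page does not state how the reported F1-measure was averaged across categories, so the sketch computes F1 for a single positive class only.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement (the 'bootstrap' in bagging)."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Combine the labels predicted by individual trees into one label
    (the 'aggregating' in bagging)."""
    return Counter(predictions).most_common(1)[0][0]


def f1_measure(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    rng = random.Random(0)
    docs = ["doc_a", "doc_b", "doc_c", "doc_d"]  # hypothetical document ids
    print(bootstrap_sample(docs, rng))           # one tree's training sample
    print(majority_vote(["earn", "acq", "earn"]))
    print(f1_measure(["earn", "earn", "earn", "acq"],
                     ["earn", "earn", "acq", "acq"],
                     positive="earn"))
```

In a full Random Forest, each tree would be trained on its own bootstrap sample and would typically also consider only a random subset of features at each split; the forest's prediction for a document is then the majority vote over the trees.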

Copyright
© 2017, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2016 2nd International Conference on Materials Engineering and Information Technology Applications (MEITA 2016)
Series
Advances in Engineering Research
Publication Date
February 2017
ISBN
10.2991/meita-16.2017.60
ISSN
2352-5401

Cite this article

TY  - CONF
AU  - Xin Luo
PY  - 2017/02
DA  - 2017/02
TI  - A New Text Classifier Based on Random Forests
BT  - Proceedings of the 2016 2nd International Conference on Materials Engineering and Information Technology Applications (MEITA 2016)
PB  - Atlantis Press
SP  - 290
EP  - 293
SN  - 2352-5401
UR  - https://doi.org/10.2991/meita-16.2017.60
DO  - 10.2991/meita-16.2017.60
ID  - Luo2017/02
ER  -