Proceedings of the 2012 2nd International Conference on Computer and Information Application (ICCIA 2012)

A Feature Weight Algorithm for Text Classification Based on Class Information

Authors
Yongfei Li
Corresponding Author
Yongfei Li
Available Online May 2014.
DOI
10.2991/iccia.2012.226
Keywords
text classification, feature weight, inverse class frequency, term frequency in class, document frequency in class
Abstract

The TFIDF algorithm is commonly used for feature weighting in text classification, but its classification results are unsatisfactory because the weighting ignores class information. This work uses the known class information in the training set to improve the traditional TFIDF feature weight algorithm. Two notions are introduced: class distinction ability, expressed by inverse class frequency, and class description ability, expressed by term frequency in class and document frequency in class. On this basis, a new feature weight algorithm based on class information, TF_IDT, is proposed. A Naïve Bayes classifier was used to test the algorithm. Precision, recall and the F1 measure all increased significantly, and the macro F1 measure rose by 6.46%. The results show that using class information in feature weighting improves text classification. In addition, the computational complexity of the proposed algorithm is lower, which makes it more suitable for settings with limited computing capability.
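
The abstract names three class-level statistics (inverse class frequency, term frequency in class, document frequency in class) but does not state the TF_IDT formula itself. The following is a minimal sketch of how such statistics could be computed from a labeled training set and combined into a per-class term weight. The function name `class_informed_weights`, the smoothed form log(1 + C / cf) for inverse class frequency, and the plain product used to combine the three factors are all assumptions made for illustration; they are not the paper's definition.

```python
import math
from collections import Counter, defaultdict


def class_informed_weights(docs, labels):
    """Return {(term, class): weight} for tokenized, labeled documents.

    Illustrative combination only (NOT the paper's TF_IDT formula):
        weight = tf_in_class * df_in_class * icf
    """
    classes = sorted(set(labels))
    n_classes = len(classes)

    term_count = defaultdict(Counter)  # class -> term -> occurrences
    doc_count = defaultdict(Counter)   # class -> term -> docs containing term
    total_terms = Counter()            # class -> total number of tokens
    total_docs = Counter(labels)       # class -> number of documents

    for tokens, label in zip(docs, labels):
        term_count[label].update(tokens)
        doc_count[label].update(set(tokens))
        total_terms[label] += len(tokens)

    vocab = {t for c in classes for t in term_count[c]}
    weights = {}
    for term in vocab:
        # Class distinction ability: terms seen in fewer classes score higher.
        class_freq = sum(1 for c in classes if term_count[c][term] > 0)
        icf = math.log(1 + n_classes / class_freq)  # assumed smoothing
        for c in classes:
            # Class description ability: how well the term covers class c.
            tf_c = term_count[c][term] / total_terms[c] if total_terms[c] else 0.0
            df_c = doc_count[c][term] / total_docs[c]
            weights[(term, c)] = tf_c * df_c * icf
    return weights


# Toy training set (made-up data, for illustration only).
docs = [
    ["goal", "match", "team"],
    ["team", "win", "goal"],
    ["stock", "market", "team"],
    ["market", "price", "stock"],
]
labels = ["sport", "sport", "finance", "finance"]
weights = class_informed_weights(docs, labels)
```

In this toy set, "goal" occurs only in the "sport" class while "team" occurs in both classes, so under the assumed formula "goal" receives the higher weight for "sport". Each statistic is obtained in a single counting pass over the training set.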

Copyright
© 2013, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2012 2nd International Conference on Computer and Information Application (ICCIA 2012)
Series
Advances in Intelligent Systems Research
Publication Date
May 2014
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Yongfei Li
PY  - 2014/05
DA  - 2014/05
TI  - A Feature Weight Algorithm for Text Classification Based on Class Information
BT  - Proceedings of the 2012 2nd International Conference on Computer and Information Application (ICCIA 2012)
PB  - Atlantis Press
SP  - 930
EP  - 932
SN  - 1951-6851
UR  - https://doi.org/10.2991/iccia.2012.226
DO  - 10.2991/iccia.2012.226
ID  - Li2014/05
ER  -