Proceedings of the 2015 International Industrial Informatics and Computer Engineering Conference

Exploring Representations for Semantic-Rich Part of Speech Tagging

Authors
Weidong Qu, Sicong Yue
Corresponding Author
Weidong Qu
Available Online March 2015.
DOI
10.2991/iiicec-15.2015.223
Keywords
Part-of-Speech (POS); Treebank; Maximum Entropy; N-gram Model
Abstract

Part-of-speech (POS) tagging is a basic, early analysis step in many natural language processing (NLP) applications. For English, it is often considered a solved problem: there are well-established approaches, and accuracy is around 97% given sufficient domain-specific training data. However, NLP applications differ widely in their requirements, and each POS tagset has its own characteristics. These factors can greatly affect the quality of POS tagging. To address these issues and achieve high tagging accuracy, we investigate representations that can improve performance on the POS tagging task. Our experiments show that tagging accuracy degrades significantly when the tagger is tested with a large semantic and syntactic tagset. In addition, our analysis suggests that tokens, rather than POS tags, have the greater effect on tagging accuracy. Our best results were obtained with the representations best suited to the POS tagging task.
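
The abstract contrasts token-based and tag-based context as inputs to the tagger but does not list the actual features used. As a rough illustration only, the sketch below shows one common way the two kinds of representation might be built for a maximum entropy tagger. The function names, feature keys, window size, and padding symbols are all assumptions made for this example and are not taken from the paper.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int, window: int = 1) -> Dict[str, str]:
    """Hypothetical token-based representation: the current word form
    plus `window` neighbouring word forms on each side."""
    feats = {"w0": tokens[i].lower()}
    for k in range(1, window + 1):
        # Pad with sentence-boundary symbols when the window runs off the sentence.
        feats[f"w-{k}"] = tokens[i - k].lower() if i - k >= 0 else "<s>"
        feats[f"w+{k}"] = tokens[i + k].lower() if i + k < len(tokens) else "</s>"
    return feats


def tag_features(prev_tags: List[str], order: int = 2) -> Dict[str, str]:
    """Hypothetical tag-based representation: the `order` most recently
    assigned tags (an n-gram style tag history)."""
    padded = ["<s>"] * order + prev_tags
    return {f"t-{k}": padded[-k] for k in range(1, order + 1)}


tokens = ["The", "dog", "barks"]
print(token_features(tokens, 1))  # features for "dog" from surrounding words
print(tag_features(["DT"]))       # features for "dog" from the tag history
```

Either dictionary, or their union, could be passed to any classifier that accepts sparse named features, which makes it straightforward to compare how much each source of context contributes to accuracy.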

Copyright
© 2015, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2015 International Industrial Informatics and Computer Engineering Conference
Series
Advances in Computer Science Research
Publication Date
March 2015
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Weidong Qu
AU  - Sicong Yue
PY  - 2015/03
DA  - 2015/03
TI  - Exploring Representations for Semantic-Rich Part of Speech Tagging
BT  - Proceedings of the 2015 International Industrial Informatics and Computer Engineering Conference
PB  - Atlantis Press
SP  - 999
EP  - 1002
SN  - 2352-538X
UR  - https://doi.org/10.2991/iiicec-15.2015.223
DO  - 10.2991/iiicec-15.2015.223
ID  - Qu2015/03
ER  -