Proceedings of the 2016 International Conference on Advanced Electronic Science and Technology (AEST 2016)

Character information extraction based on CRFsuite

Authors
Jingzhong Wang, Zhongren Li, Wei Huang, Ke Xiao
Corresponding Author
Jingzhong Wang
Available Online November 2016.
DOI
https://doi.org/10.2991/aest-16.2016.19
Keywords
CRFsuite; information extraction; machine learning.
Abstract
This paper applies Conditional Random Fields (CRFs), a discriminative undirected graphical model, to character information extraction and proposes an automatic extraction method based on CRFsuite. By learning from a known domain, the method extracts features such as leading words, position, and meaning from character information on the Internet to build a character parameter. Using CRFsuite as the model, the method is applied to data from the Internet, matches character information, and builds a structured character information database. The proposed method demonstrates the feasibility of automatically extracting character information from mass Internet data and provides an effective way to support tracking and lookup of character information.
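As a rough illustration of the kind of feature extraction the abstract describes, the sketch below turns a labeled token sequence into lines in CRFsuite's tab-separated training format (label first, then attributes). The attribute names (`w=`, `pos=`, `prev=`, `next=`), the label `B-BIRTHPLACE`, and the example sentence are assumptions made for this sketch and are not taken from the paper.

```python
from typing import List, Tuple


def token_features(tokens: List[str], i: int) -> List[str]:
    """Build illustrative attributes for the token at index i (hypothetical feature set)."""
    feats = [f"w={tokens[i].lower()}", f"pos={i}"]
    # The preceding word serves as a "leading word" cue for the current token.
    feats.append(f"prev={tokens[i - 1].lower()}" if i > 0 else "BOS")
    feats.append(f"next={tokens[i + 1].lower()}" if i < len(tokens) - 1 else "EOS")
    return feats


def to_crfsuite_lines(sentence: List[Tuple[str, str]]) -> List[str]:
    """Format one (token, label) sequence as CRFsuite training lines."""
    tokens = [tok for tok, _ in sentence]
    return [
        "\t".join([label] + token_features(tokens, i))
        for i, (_, label) in enumerate(sentence)
    ]


# Hypothetical labeled sentence; the labels are assumptions for illustration only.
example = [("Born", "O"), ("in", "O"), ("Beijing", "B-BIRTHPLACE")]
lines = to_crfsuite_lines(example)
print("\n".join(lines))
```

Attribute names use `=` rather than `:` because CRFsuite reads a colon as the separator between an attribute and its weight. When writing several sequences to one training file, separate them with an empty line.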
Open Access
This is an open access article distributed under the CC BY-NC license.

Proceedings
2016 International Conference on Advanced Electronic Science and Technology (AEST 2016)
Part of series
Advances in Intelligent Systems Research
Publication Date
November 2016
ISBN
978-94-6252-257-2
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Jingzhong Wang
AU  - Zhongren Li
AU  - Wei Huang
AU  - Ke Xiao
PY  - 2016/11
DA  - 2016/11
TI  - Character information extraction based on CRFsuite
BT  - 2016 International Conference on Advanced Electronic Science and Technology (AEST 2016)
PB  - Atlantis Press
SP  - 147
EP  - 154
SN  - 1951-6851
UR  - https://doi.org/10.2991/aest-16.2016.19
DO  - https://doi.org/10.2991/aest-16.2016.19
ID  - Wang2016/11
ER  -