Character information extraction based on CRFsuite
Jingzhong Wang, Zhongren Li, Wei Huang, Ke Xiao
Available Online November 2016.
- https://doi.org/10.2991/aest-16.2016.19
- CRFsuite; information extraction; machine learning.
- This paper applies Conditional Random Fields, a discriminative undirected graphical model, to character information extraction and proposes an automated extraction method based on CRFsuite. By learning from a known domain, the method extracts features such as leading words, position, and meaning from character information on the Internet to build up character parameters. With CRFsuite as the model, the method is then applied to Internet data, where it matches character information and builds a structured character information database. The proposed method demonstrates the feasibility of automatically extracting character information from massive Internet data and provides an effective way to support tracking and lookup of character information.
- Open Access
- This is an open access article distributed under the CC BY-NC license.
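The feature step described in the abstract (leading words and position) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the leading-word list, the attribute names, the labels, and the sample sentence are all hypothetical. The only thing taken from CRFsuite is its plain-text training format, in which each line holds a label followed by tab-separated attributes.

```python
from typing import List

# Hypothetical leading words that tend to precede a piece of character
# information; the paper's actual list is not given in the abstract.
LEADING_WORDS = {"born", "graduated", "works"}


def safe(token: str) -> str:
    """Lowercase a token and replace ':' (CRFsuite uses it to attach a
    weight to an attribute); the replacement is a simplification."""
    return token.lower().replace(":", "_")


def token_features(tokens: List[str], i: int) -> List[str]:
    """Build attribute strings for the token at index i."""
    feats = [f"w={safe(tokens[i])}", f"pos={i}"]
    if i == 0:
        feats.append("BOS")  # beginning of sequence
    else:
        prev = safe(tokens[i - 1])
        feats.append(f"w-1={prev}")
        if prev in LEADING_WORDS:
            feats.append("after_lead")  # token follows a leading word
    if i == len(tokens) - 1:
        feats.append("EOS")  # end of sequence
    return feats


def to_crfsuite_lines(tokens: List[str], labels: List[str]) -> List[str]:
    """Format one labeled sequence as label<TAB>attr<TAB>attr... lines."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [
        "\t".join([label] + token_features(tokens, i))
        for i, label in enumerate(labels)
    ]


if __name__ == "__main__":
    # Hypothetical sample sentence and label set.
    for line in to_crfsuite_lines(["Li", "born", "1980"], ["NAME", "O", "DATE"]):
        print(line)
```

Lines produced this way, with a blank line between sequences, could be written to a file and passed to CRFsuite for training; a real system would add richer attributes than the three shown here.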
Cite this article
TY - CONF
AU - Jingzhong Wang
AU - Zhongren Li
AU - Wei Huang
AU - Ke Xiao
PY - 2016/11
DA - 2016/11
TI - Character information extraction based on CRFsuite
BT - 2016 International Conference on Advanced Electronic Science and Technology (AEST 2016)
PB - Atlantis Press
SP - 147
EP - 154
SN - 1951-6851
UR - https://doi.org/10.2991/aest-16.2016.19
DO - https://doi.org/10.2991/aest-16.2016.19
ID - Wang2016/11
ER -