Proceedings of the 2016 4th International Conference on Advanced Materials and Information Technology Processing (AMITP 2016)

Chinese Named Entity Extraction System Based On Word2vec Under Spark Platform

Authors
Jialu Yuan, Yongping Xiong
Corresponding Author
Jialu Yuan
Available Online September 2016.
DOI
https://doi.org/10.2991/amitp-16.2016.74
Keywords
Spark, word2vec, NER, neural network, machine learning
Abstract
This paper proposes a real-time system that supports Chinese named entity extraction. The system trains a language model with the word2vec algorithm to obtain word vectors, extracts Chinese named entities by calculating the Euclidean distance between word vectors, and ports the algorithm to the Spark platform, using Spark's distributed computing capability to improve training efficiency. First, the system segments the corpus into words with an existing word segmentation tool to obtain a rough corpus. It then trains word2vec on the rough corpus to obtain word vectors and extracts a first layer of named entities with a clustering algorithm. Finally, the system applies the Named Entity Extraction (NEE) algorithm to extract the named entities, and the algorithm is implemented on the Spark platform.
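The distance step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's NEE algorithm, which the abstract does not specify. It only assumes that words whose vectors lie within a small Euclidean distance of a known entity are plausible entity candidates. The function names, the seed-based selection, the threshold, and the toy two-dimensional vectors are all hypothetical. In the described system the vectors would come from word2vec training on the segmented corpus.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_to_seeds(vectors, seeds, threshold):
    """Return non-seed words within `threshold` of any seed, closest first.

    Hypothetical helper: the seed/threshold scheme is an assumption made
    for illustration, not a procedure taken from the paper.
    """
    hits = {}
    for word, vec in vectors.items():
        if word in seeds:
            continue
        # Distance from this word to its closest seed entity.
        d = min(euclidean(vec, vectors[s]) for s in seeds)
        if d <= threshold:
            hits[word] = d
    return sorted(hits, key=hits.get)


# Toy 2-D vectors invented for illustration; real word2vec vectors
# typically have on the order of a hundred dimensions or more.
toy_vectors = {
    "北京": (1.0, 1.0),
    "上海": (1.1, 0.9),
    "广州": (0.9, 1.2),
    "吃饭": (5.0, -3.0),
    "跑步": (4.5, -2.5),
}

candidates = nearest_to_seeds(toy_vectors, seeds={"北京"}, threshold=0.5)
print(candidates)
```

In this toy data the place names cluster near the seed and the two verbs lie far away, so only the place names pass the threshold. Choosing the threshold and the seed set on real data would need tuning that the abstract does not describe.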
Open Access
This is an open access article distributed under the CC BY-NC license.

Proceedings
2016 4th International Conference on Advanced Materials and Information Technology Processing (AMITP 2016)
Part of series
Advances in Computer Science Research
Publication Date
September 2016
ISBN
978-94-6252-245-9
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Jialu Yuan
AU  - Yongping Xiong
PY  - 2016/09
DA  - 2016/09
TI  - Chinese Named Entity Extraction System Based On Word2vec Under Spark Platform
BT  - 2016 4th International Conference on Advanced Materials and Information Technology Processing (AMITP 2016)
PB  - Atlantis Press
SN  - 2352-538X
UR  - https://doi.org/10.2991/amitp-16.2016.74
DO  - https://doi.org/10.2991/amitp-16.2016.74
ID  - Yuan2016/09
ER  -