Proceedings of the 2016 4th International Conference on Electrical & Electronics Engineering and Computer Science (ICEEECS 2016)

Normalizing Chinese Address for Internet Applications

Authors
Xiaolin Li, Shuang Huang, Tao Lu, Deng Chen
Corresponding Author
Xiaolin Li
Available Online December 2016.
DOI
https://doi.org/10.2991/iceeecs-16.2016.2
Keywords
Set operation; Administrative division; Chinese address; Moving window; Matching degree; Analytical rules
Abstract
Many Internet applications take addresses as input. However, addresses found on the Internet are often not normalized and cannot be used directly. In this paper, we propose an Administrative Divisions Extracting Algorithm to normalize Chinese addresses on the Internet. Our approach proceeds as follows: 1) It begins by processing "Road" feature words and extracts the set of all possible administrative divisions from a Chinese address, using an administrative divisions dictionary and the Moving Window Algorithm. 2) Because Chinese administrative divisions are hierarchically related, the algorithm establishes conditional set operation rules for administrative divisions and applies these set operations to the extracted data set. 3) The algorithm obtains the most complete administrative divisions of the Chinese address. To investigate the feasibility and effectiveness of our approach, we performed experiments on about 250 thousand Chinese addresses extracted from the Internet, testing whether adopting the "Road" feature word processing is worthwhile. We also compared the algorithm with current address matching techniques. Experimental results show that the accuracy reached 93.51%.
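The extraction and set-operation steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy dictionary `DIVISIONS`, the function names, and the scoring rule (prefer the candidate whose ancestor chain overlaps most with the other extracted candidates) are all assumptions made for illustration. The "Road" feature word processing is omitted because the abstract does not specify how it works.

```python
# Hypothetical toy dictionary (not from the paper): name -> (level, parent name).
# Level 1 = province, 2 = city, 3 = district.
DIVISIONS = {
    "湖北省": (1, None),
    "武汉市": (2, "湖北省"),
    "洪山区": (3, "武汉市"),
    "江西省": (1, None),
    "南昌市": (2, "江西省"),
}

# Widest window needed: the longest name in the dictionary.
MAX_LEN = max(len(name) for name in DIVISIONS)


def extract_candidates(address):
    """Slide a shrinking window over the address and collect dictionary hits."""
    found = []
    for start in range(len(address)):
        longest = min(MAX_LEN, len(address) - start)
        for width in range(longest, 0, -1):
            piece = address[start:start + width]
            if piece in DIVISIONS and piece not in found:
                found.append(piece)
    return found


def ancestors(name):
    """Return the parents of a division, nearest first."""
    chain = []
    parent = DIVISIONS[name][1]
    while parent is not None:
        chain.append(parent)
        parent = DIVISIONS[parent][1]
    return chain


def most_complete_chain(candidates):
    """Pick the hierarchy chain that best agrees with the extracted candidates.

    Assumed rule: score each candidate by the size of the intersection between
    its ancestor chain and the candidate set, breaking ties by depth.
    """
    found = set(candidates)
    best_chain, best_score = [], (0, 0)
    for name in candidates:
        chain = [name] + ancestors(name)
        score = (len(found & set(chain)), DIVISIONS[name][0])
        if score > best_score:
            best_chain, best_score = chain, score
    # Order from the top level (province) down to the lowest level.
    return sorted(best_chain, key=lambda n: DIVISIONS[n][0])


if __name__ == "__main__":
    hits = extract_candidates("武汉市洪山区珞喻路1037号")
    print(most_complete_chain(hits))  # ['湖北省', '武汉市', '洪山区']
```

In this sketch the missing province is filled in from the hierarchy, which is one plausible reading of obtaining the "most complete" administrative divisions; the paper's actual rules may differ.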
Open Access
This is an open access article distributed under the CC BY-NC license.

Proceedings
2016 4th International Conference on Electrical & Electronics Engineering and Computer Science (ICEEECS 2016)
Part of series
Advances in Computer Science Research
Publication Date
December 2016
ISBN
978-94-6252-265-7
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Xiaolin Li
AU  - Shuang Huang
AU  - Tao Lu
AU  - Deng Chen
PY  - 2016/12
DA  - 2016/12
TI  - Normalizing Chinese Address for Internet Applications
BT  - 2016 4th International Conference on Electrical & Electronics Engineering and Computer Science (ICEEECS 2016)
PB  - Atlantis Press
SN  - 2352-538X
UR  - https://doi.org/10.2991/iceeecs-16.2016.2
DO  - https://doi.org/10.2991/iceeecs-16.2016.2
ID  - Li2016/12
ER  -