Proceedings of the International Conference on Computer, Networks and Communication Engineering (ICCNCE 2013)

Domain Thesaurus Construction from Wikipedia

Authors
WenKe Yin, Ming Zhu, TianHao Chen
Corresponding Author
WenKe Yin
Available Online July 2013.
DOI
10.2991/iccnce.2013.22
Keywords
Domain Thesaurus, Wiki, CPMw, LSI
Abstract

Domain thesauri play an important role in information retrieval, natural language processing, question answering systems, and related areas. Because natural language is complex, NLP-based thesaurus construction methods struggle to achieve the desired results. In recent years, Wiki has been widely used as a knowledge base. Exploiting two characteristics of hyperlinks, anchor description and topic locality, this paper proposes a domain thesaurus construction method based on clustering a hyperlink structure graph. The method first constructs a domain-specific hyperlink structure graph from Wiki and then uses the LSI algorithm to calculate the weight of each hyperlink. It then applies the CPMw algorithm to cluster the weighted undirected hyperlink structure graph. After this step, the domain thesaurus is obtained. Experiments show that our method achieves better results.
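
The clustering step can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the usual definition of weighted clique percolation (CPMw): a k-clique is kept when its intensity, the geometric mean of its edge weights, reaches a threshold, and kept cliques that share k-1 nodes are merged into one community. The terms, the edge weights (stand-ins for LSI-derived hyperlink weights), the function name `cpmw_communities`, and the parameter values are all hypothetical.

```python
from itertools import combinations
from math import prod

def cpmw_communities(weights, k=3, intensity_threshold=0.5):
    """Cluster a weighted undirected graph by weighted clique percolation.

    weights maps frozenset({u, v}) to an edge weight in (0, 1].
    """
    nodes = sorted({n for edge in weights for n in edge})

    # Keep every k-clique whose intensity (geometric mean of its
    # edge weights) reaches the threshold.
    cliques = []
    for combo in combinations(nodes, k):
        edges = [frozenset(pair) for pair in combinations(combo, 2)]
        if all(e in weights for e in edges):
            intensity = prod(weights[e] for e in edges) ** (1.0 / len(edges))
            if intensity >= intensity_threshold:
                cliques.append(frozenset(combo))

    # Union-find over cliques: two cliques are adjacent if they share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    # A community is the union of the nodes of all cliques in one component.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical hyperlink graph; weights stand in for LSI similarities.
raw_edges = [
    ("thesaurus", "synonym", 0.9), ("thesaurus", "ontology", 0.8),
    ("synonym", "ontology", 0.7), ("ontology", "taxonomy", 0.8),
    ("thesaurus", "taxonomy", 0.6), ("router", "switch", 0.9),
    ("router", "gateway", 0.8), ("switch", "gateway", 0.7),
    ("ontology", "router", 0.1), ("taxonomy", "router", 0.1),
]
weights = {frozenset((u, v)): w for u, v, w in raw_edges}

communities = cpmw_communities(weights)
for community in communities:
    print(community)
```

With these made-up weights, the weak links to "router" fall below the intensity threshold, so the two topic groups stay separate. Lowering `intensity_threshold` to 0.1 admits the weak triangle, and "router" then appears in both communities, which shows the overlapping communities that clique percolation allows.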

Copyright
© 2013, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).


Series
Advances in Intelligent Systems Research
Publication Date
July 2013
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - WenKe Yin
AU  - Ming Zhu
AU  - TianHao Chen
PY  - 2013/07
DA  - 2013/07
TI  - Domain Thesaurus Construction from Wikipedia
BT  - Proceedings of the International Conference on Computer, Networks and Communication Engineering (ICCNCE 2013)
PB  - Atlantis Press
SP  - 87
EP  - 92
SN  - 1951-6851
UR  - https://doi.org/10.2991/iccnce.2013.22
DO  - 10.2991/iccnce.2013.22
ID  - Yin2013/07
ER  -