Domain Thesaurus Construction from Wikipedia
- DOI
- 10.2991/iccnce.2013.22
- Keywords
- Domain Thesaurus, Wiki, CPMw, LSI
- Abstract
Domain thesauri play an important role in information retrieval, natural language processing, question answering systems, and related tasks. Because natural language is complex, NLP-based thesaurus construction methods struggle to achieve the desired results. In recent years, Wiki has been widely used as a knowledge base. Exploiting two characteristics of hyperlinks, anchor description and topic locality, this paper proposes a domain thesaurus construction method based on clustering a hyperlink structure graph. The method first constructs a domain-specific hyperlink structure graph from Wiki and then uses the LSI algorithm to calculate the weight of each hyperlink. Next, it uses the CPMw algorithm to cluster the weighted, undirected hyperlink structure graph. After this step, the domain thesaurus is obtained. Experiments show that the method achieves better results.
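The clustering step can be illustrated with a minimal sketch. It assumes that CPMw refers to the weighted clique percolation method, in which a k-clique is kept only when its intensity (the geometric mean of its edge weights) reaches a threshold, and kept cliques that share k-1 nodes are merged into one cluster. The function name, the toy terms, and the edge weights are all hypothetical; the weights merely stand in for LSI-derived hyperlink weights, which are not computed here.

```python
from itertools import combinations
from math import prod


def weighted_clique_percolation(weights, k=3, intensity_threshold=0.5):
    """Cluster a weighted undirected graph by (assumed) weighted clique percolation.

    weights maps frozenset({u, v}) to a positive edge weight.
    Brute-force clique enumeration: only suitable for small graphs.
    """
    nodes = sorted({node for edge in weights for node in edge})

    # Keep k-cliques whose intensity (geometric mean of edge weights)
    # reaches the threshold.
    cliques = []
    for combo in combinations(nodes, k):
        edges = [frozenset(pair) for pair in combinations(combo, 2)]
        if all(edge in weights for edge in edges):
            intensity = prod(weights[e] for e in edges) ** (1.0 / len(edges))
            if intensity >= intensity_threshold:
                cliques.append(frozenset(combo))

    # Union-find over cliques: merge any two that share at least k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    # Collect the node sets of each merged group.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical weights standing in for LSI-based hyperlink weights.
weights = {
    frozenset({"Router", "Switch"}): 0.9,
    frozenset({"Router", "Gateway"}): 0.8,
    frozenset({"Switch", "Gateway"}): 0.85,
    frozenset({"Switch", "Firewall"}): 0.7,
    frozenset({"Gateway", "Firewall"}): 0.75,
    frozenset({"Firewall", "Apple"}): 0.1,
    frozenset({"Apple", "Pear"}): 0.2,
    frozenset({"Pear", "Firewall"}): 0.1,
}

communities = weighted_clique_percolation(weights, k=3, intensity_threshold=0.5)
print(communities)
```

In this toy graph the weakly linked triangle (Firewall, Apple, Pear) falls below the intensity threshold and is dropped, while the two strongly linked triangles share two nodes and merge into a single cluster of related terms.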
- Copyright
- © 2013, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - WenKe Yin
AU - Ming Zhu
AU - TianHao Chen
PY - 2013/07
DA - 2013/07
TI - Domain Thesaurus Construction from Wikipedia
BT - Proceedings of the International Conference on Computer, Networks and Communication Engineering (ICCNCE 2013)
PB - Atlantis Press
SP - 87
EP - 92
SN - 1951-6851
UR - https://doi.org/10.2991/iccnce.2013.22
DO - 10.2991/iccnce.2013.22
ID - Yin2013/07
ER -