Proceedings of the 2nd International Conference on Advances in Computer Science and Engineering (CSE 2013)

A Chinese Word Clustering Method Using Latent Dirichlet Allocation and K-means

Authors
Lin Qiu, Jungang Xu
Corresponding Author
Lin Qiu
Available Online July 2013.
DOI
10.2991/cse.2013.60
Keywords
word clustering; latent dirichlet allocation; k-means; word similarity
Abstract

Word clustering is an active research topic in natural language processing. In this paper, the Latent Dirichlet Allocation (LDA) algorithm is used to extract topics from the nouns in a text, and the highest-probability noun of each topic is selected as an initial centroid for the k-means algorithm. Experimental results show that this method achieves better results than graph-based word clustering algorithms that use a web search engine.
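
The seeding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the topic-word probabilities, the toy English vocabulary, the two-dimensional word vectors, and the function names `seed_words` and `kmeans` are all assumptions made for illustration, and the LDA step itself is taken as already done. The sketch only shows how the highest-probability noun of each topic might serve as a starting centroid for k-means.

```python
def seed_words(topic_word, vocab):
    """Return the highest-probability word of each topic (one seed per topic)."""
    seeds = []
    for row in topic_word:
        best = max(range(len(vocab)), key=lambda j: row[j])
        seeds.append(vocab[best])
    return seeds


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, seeds, iterations=20):
    """Plain k-means over word vectors, started from the vectors of the seed words."""
    centroids = [list(vectors[w]) for w in seeds]
    assignment = {}
    for _ in range(iterations):
        # Assign every word to its nearest centroid.
        new_assignment = {
            w: min(range(len(centroids)),
                   key=lambda c: squared_distance(v, centroids[c]))
            for w, v in vectors.items()
        }
        if new_assignment == assignment:
            break  # converged: no word changed cluster
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [vectors[w] for w, a in assignment.items() if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Hypothetical toy inputs: two topics over four nouns (assumed LDA output),
# and an assumed feature vector per noun.
vocab = ["apple", "banana", "car", "truck"]
topic_word = [
    [0.50, 0.40, 0.05, 0.05],  # topic 0
    [0.05, 0.05, 0.60, 0.30],  # topic 1
]
vectors = {
    "apple": [1.0, 0.1],
    "banana": [0.9, 0.2],
    "car": [0.1, 1.0],
    "truck": [0.2, 0.9],
}

seeds = seed_words(topic_word, vocab)
clusters = kmeans(vectors, seeds)
print(seeds)
print(clusters)
```

Seeding from topic words replaces the random initialization that k-means normally uses, so the number of clusters equals the number of topics. If two topics happened to share the same top noun, the seeds would coincide, and a real implementation would need to handle that case.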

Copyright
© 2013, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2nd International Conference on Advances in Computer Science and Engineering (CSE 2013)
Series
Advances in Intelligent Systems Research
Publication Date
July 2013
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Lin Qiu
AU  - Jungang Xu
PY  - 2013/07
DA  - 2013/07
TI  - A Chinese Word Clustering Method Using Latent Dirichlet Allocation and K-means
BT  - Proceedings of the 2nd International Conference on Advances in Computer Science and Engineering (CSE 2013)
PB  - Atlantis Press
SP  - 269
EP  - 272
SN  - 1951-6851
UR  - https://doi.org/10.2991/cse.2013.60
DO  - 10.2991/cse.2013.60
ID  - Qiu2013/07
ER  -