Proceedings of the 2018 International Conference on Information Technology and Management Engineering (ICITME 2018)

Extracting Latent Topics from User Reviews Using Online LDA

Authors
Zilin Wang
Corresponding Author
Zilin Wang
Available Online August 2018.
DOI
10.2991/icitme-18.2018.41
Keywords
natural language processing; topic model; latent Dirichlet allocation; Yelp reviews
Abstract

As local business directory sites such as Dianping.com and Yelp.com grow in popularity, user reviews play an increasingly important role in informing customers about product and service quality. Reviews can also give business owners meaningful insights. However, online reviews arrive in large volumes as free text, are high-dimensional, and span many latent topics, so manually pinpointing customer demands from a large, continually growing body of reviews is impractical. The goal of this paper is to help businesses discover user demands in large volumes of high-dimensional reviews, which in turn can help them improve their business. To this end, we propose using online Latent Dirichlet Allocation (LDA) as the topic model for discovering latent topics in user reviews. We used the open dataset from the Yelp Dataset Challenge, then cleaned and filtered it to focus on user reviews of restaurants in Phoenix, Arizona, US. Running online LDA over the cleaned dataset, we discovered 50 latent topics. In this paper, we present the breakdown of latent topics over all reviews and the word distribution of each topic. The method could also prove useful to individual business owners in discovering user demands and points of interest.
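
The online LDA step described in the abstract can be illustrated with a minimal sketch of online variational Bayes for LDA on a toy corpus. This page does not give the paper's implementation details, so everything below is an assumption made for illustration: the function names, the hyperparameters (alpha, eta, tau0, kappa), the mini-batch size, the six made-up reviews, and the use of 2 topics rather than the 50 reported in the abstract. A real pipeline would more likely rely on an established library than on hand-written code like this.

```python
import math
import random


def digamma(x):
    """Digamma for x > 0: recurrence up to x >= 6, then an asymptotic series."""
    shift = 0.0
    while x < 6.0:
        shift -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return shift + math.log(x) - 0.5 / x - f * (1.0 / 12 - f * (1.0 / 120 - f / 252))


def exp_elog(row):
    """exp(E[log p_i]) for p ~ Dirichlet(row)."""
    total = digamma(sum(row))
    return [math.exp(digamma(v) - total) for v in row]


def responsibilities(w, exp_theta, exp_beta):
    """Normalized topic responsibilities (phi) for word id w in one document."""
    phi = [th * b[w] for th, b in zip(exp_theta, exp_beta)]
    norm = sum(phi)
    return [p / norm for p in phi]


def online_lda(batches, vocab_size, n_topics, n_docs_total,
               alpha=0.1, eta=0.01, tau0=1.0, kappa=0.7, e_steps=50, seed=0):
    """One pass of online variational Bayes over mini-batches.

    Each document is a dict {word_id: count}; batches must be non-empty.
    Returns lambda, the n_topics x vocab_size variational topic-word parameters.
    All default hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    lam = [[rng.gammavariate(100.0, 0.01) for _ in range(vocab_size)]
           for _ in range(n_topics)]
    for t, batch in enumerate(batches):
        rho = (tau0 + t) ** (-kappa)  # step size, decays as more batches are seen
        exp_beta = [exp_elog(row) for row in lam]
        stats = [[0.0] * vocab_size for _ in range(n_topics)]
        for doc in batch:
            # E-step: fit this document's topic proportions (gamma)
            gamma = [1.0] * n_topics
            for _ in range(e_steps):
                exp_theta = exp_elog(gamma)
                new_gamma = [alpha] * n_topics
                for w, n in doc.items():
                    phi = responsibilities(w, exp_theta, exp_beta)
                    for k in range(n_topics):
                        new_gamma[k] += n * phi[k]
                gamma = new_gamma
            # Accumulate expected word-topic counts for the M-step
            exp_theta = exp_elog(gamma)
            for w, n in doc.items():
                phi = responsibilities(w, exp_theta, exp_beta)
                for k in range(n_topics):
                    stats[k][w] += n * phi[k]
        # M-step: blend the old lambda with the estimate from this batch alone,
        # scaled as if the whole corpus looked like the batch
        scale = n_docs_total / len(batch)
        for k in range(n_topics):
            for w in range(vocab_size):
                lam[k][w] = (1 - rho) * lam[k][w] + rho * (eta + scale * stats[k][w])
    return lam


# Toy stand-in for cleaned restaurant reviews (made up, not Yelp data)
reviews = [
    "pizza pasta cheese pizza",
    "waiter service friendly service",
    "pasta cheese pizza",
    "friendly waiter service",
    "cheese pizza pasta pasta",
    "service friendly waiter waiter",
]
vocab = sorted({w for r in reviews for w in r.split()})
word_id = {w: i for i, w in enumerate(vocab)}
docs = []
for r in reviews:
    counts = {}
    for w in r.split():
        counts[word_id[w]] = counts.get(word_id[w], 0) + 1
    docs.append(counts)

batches = [docs[i:i + 2] for i in range(0, len(docs), 2)]  # mini-batches of 2 reviews
lam = online_lda(batches, len(vocab), n_topics=2, n_docs_total=len(docs))

# Normalize each row of lambda into a word distribution and show its top words
topics = [[v / sum(row) for v in row] for row in lam]
for k, row in enumerate(topics):
    top = sorted(range(len(vocab)), key=lambda i: -row[i])[:3]
    print(k, [vocab[i] for i in top])
```

Because each mini-batch updates the topic parameters as it arrives, the model never needs the full review collection in memory at once, which is what makes the online variant a natural fit for a review stream that keeps growing.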

Copyright
© 2018, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2018 International Conference on Information Technology and Management Engineering (ICITME 2018)
Series
Advances in Intelligent Systems Research
Publication Date
August 2018
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Zilin Wang
PY  - 2018/08
DA  - 2018/08
TI  - Extracting Latent Topics from User Reviews Using Online LDA
BT  - Proceedings of the 2018 International Conference on Information Technology and Management Engineering (ICITME 2018)
PB  - Atlantis Press
SP  - 204
EP  - 208
SN  - 1951-6851
UR  - https://doi.org/10.2991/icitme-18.2018.41
DO  - 10.2991/icitme-18.2018.41
ID  - Wang2018/08
ER  -