Proceedings of the 21st International Workshop on Computer Science and Information Technologies (CSIT 2019)

Use of Topic Modelling for Improvement of Quality in the Task of Semantic Search of Educational Courses

Authors
Ivan Nikolaev, Dmitry Botov, Yuri Dmitrin, Julius Klenin, Andrei Melnikov
Corresponding Author
Ivan Nikolaev
Available Online December 2019.
DOI
10.2991/csit-19.2019.18
Keywords
topic modeling, topic filtering
Abstract

This paper proposes an approach that improves the quality of an existing semantic search algorithm for educational course programmes, which is based on vector representations produced by distributional semantics. The approach provides an expert with interpretable topic filtering of the courses in the search results. Probabilistic topic modeling based on additive regularization ensures that the components of the text vector representations are interpretable, which allows the expert, during exploratory search, to narrow down the set of relevant documents previously found by the vector model. In our experiments, we study the applied task of searching for educational courses that match current labor market requirements; requirements described in professional standards serve as search queries. Topic filtering is implemented with the open-source library BigARTM. We investigate how the hyperparameters and the choice of regularizers used to build the topic model affect the quality of semantic search of educational courses for several vector models: word2vec, fastText, and TF-IDF.
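The two-stage idea in the abstract (retrieve candidates with a vector model, then let an expert narrow them by topic) can be illustrated with a minimal sketch. This is not the authors' implementation and does not use BigARTM. The function names `cosine_rank` and `topic_filter`, the threshold `min_prob`, and the toy matrices are all assumptions made for illustration. The document-topic matrix `theta` stands in for the output of any topic model.

```python
import numpy as np

def cosine_rank(query_vec, doc_vecs, top_k):
    """Return indices and scores of the top_k documents by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]

def topic_filter(candidates, theta, topic_ids, min_prob):
    """Keep candidates whose probability for any selected topic is >= min_prob.

    theta is a (documents x topics) matrix of topic probabilities
    (hypothetical here; in practice it would come from a topic model).
    """
    keep = theta[candidates][:, topic_ids].max(axis=1) >= min_prob
    return candidates[keep]

# Toy data (assumed): 4 course vectors and their 2-topic distributions.
doc_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]])
theta = np.array([[0.7, 0.3], [0.1, 0.9], [0.5, 0.5], [0.6, 0.4]])
query_vec = np.array([1.0, 0.0])  # stands in for an embedded requirement

# Stage 1: vector-model retrieval. Stage 2: expert-selected topic filter.
candidates, scores = cosine_rank(query_vec, doc_vecs, top_k=3)
filtered = topic_filter(candidates, theta, topic_ids=[0], min_prob=0.5)
print(candidates.tolist(), filtered.tolist())
```

In this sketch the filter only removes documents from the candidate set and preserves the similarity order of the remaining ones, which mirrors the "narrowing down" role the abstract assigns to topic filtering.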

Copyright
© 2019, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 21st International Workshop on Computer Science and Information Technologies (CSIT 2019)
Series
Atlantis Highlights in Computer Sciences
Publication Date
December 2019
ISSN
2589-4900

Cite this article

TY  - CONF
AU  - Ivan Nikolaev
AU  - Dmitry Botov
AU  - Yuri Dmitrin
AU  - Julius Klenin
AU  - Andrei Melnikov
PY  - 2019/12
DA  - 2019/12
TI  - Use of Topic Modelling for Improvement of Quality in the Task of Semantic Search of Educational Courses
BT  - Proceedings of the 21st International Workshop on Computer Science and Information Technologies (CSIT 2019)
PB  - Atlantis Press
SP  - 104
EP  - 111
SN  - 2589-4900
UR  - https://doi.org/10.2991/csit-19.2019.18
DO  - 10.2991/csit-19.2019.18
ID  - Nikolaev2019/12
ER  -