Proceedings of the International Conference on Mathematics, Geometry, Statistics, and Computation (IC-MaGeStiC 2021)

Information Retrieval Using Matrix Methods

Case Study: Three Popular Online News Sites in Indonesia

Authors
Ferry Wiranto*, I Made Tirta
Department of Mathematics, FMIPA, University of Jember
*Corresponding author. Email: ferrywr25@gmail.com
Available Online 8 February 2022.
DOI
10.2991/acsr.k.220202.032
Keywords
Data mining; Text mining; Matrix methods; Cosine size; Sparse matrix
Abstract

This research belongs to data mining, specifically to the subfields of information retrieval and text mining. It focuses on finding an approach for retrieving relevant online news documents using a specific threshold value, and on improving computational performance when retrieving relevant documents from large collections. The authors use news from three popular Indonesian news sites that rank in the top 10 of the 2021 Alexa Traffic Rank (ATR): tribunnews.com, detik.com, and liputan6.com. To search for relevant news documents, the authors first determine the threshold value by calculating the average similarity value of the documents used as the experimental sample. The resulting threshold value then serves as the cutoff for the similarity value of each document to be used. The research relies on several techniques: text mining with the Tala method, representation of news documents using matrix methods, and the cosine measure (cosine similarity) to determine the similarity between documents and matrix-based search data. The results indicate that the approach using the matrix method together with matrix compression gives good computational performance, so it should be useful for implementation on a large number of documents.
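
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`to_sparse_vector`, `cosine_similarity`, `retrieve`), the whitespace tokenizer, and the use of the mean query-to-document similarity as the threshold are all assumptions made for illustration, and the Tala preprocessing step is omitted. Each document is stored as a sparse term-count vector (only nonzero entries are kept), which stands in for the matrix compression the abstract mentions.

```python
import math
from collections import Counter


def to_sparse_vector(text):
    """Sparse term-frequency vector: only terms that occur are stored.

    Assumption: plain lowercase whitespace tokenization, no stemming.
    """
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors (term -> count)."""
    # Iterate over the shorter vector so the dot product only
    # touches entries that can be nonzero in both.
    if len(u) > len(v):
        u, v = v, u
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query, documents):
    """Return (threshold, hits) for a query against a document sample.

    Assumption: the threshold is the mean similarity over the sample.
    hits is a list of (document index, score), best match first,
    keeping only documents with a positive score at or above the threshold.
    """
    q = to_sparse_vector(query)
    scores = [cosine_similarity(q, to_sparse_vector(d)) for d in documents]
    threshold = sum(scores) / len(scores) if scores else 0.0
    hits = [(i, s) for i, s in enumerate(scores) if s > 0 and s >= threshold]
    return threshold, sorted(hits, key=lambda h: h[1], reverse=True)
```

Calling `retrieve("harga beras", docs)` on a small list of headline strings returns the computed threshold and the documents whose similarity to the query meets it. For a real corpus, a sparse matrix library would typically replace the dictionaries used here.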

Copyright
© 2022 The Authors. Published by Atlantis Press International B.V.
Open Access
This is an open access article under the CC BY-NC license.


Volume Title
Proceedings of the International Conference on Mathematics, Geometry, Statistics, and Computation (IC-MaGeStiC 2021)
Series
Advances in Computer Science Research
Publication Date
8 February 2022
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Ferry Wiranto
AU  - I Made Tirta
PY  - 2022
DA  - 2022/02/08
TI  - Information Retrieval Using Matrix Methods
BT  - Proceedings of the International Conference on Mathematics, Geometry, Statistics, and Computation (IC-MaGeStiC 2021)
PB  - Atlantis Press
SP  - 167
EP  - 172
SN  - 2352-538X
UR  - https://doi.org/10.2991/acsr.k.220202.032
DO  - 10.2991/acsr.k.220202.032
ID  - Wiranto2022
ER  -