Proceedings of the 2019 International Conference on Big Data, Electronics and Communication Engineering (BDECE 2019)

Research of Improved Microblogs Crawling Strategy Based on Time Feature

Authors
Hongyan Zhao, Jifeng Tian, Xin Ye, Peiyu Liu
Corresponding Author
Hongyan Zhao
Available Online 24 December 2019.
DOI
10.2991/acsr.k.191223.005
Keywords
time feature, microblogs, crawling strategy, real-time
Abstract

To address the poor real-time performance of existing microblog page crawling strategies in capturing the latest news, this paper proposes an improved microblog crawling strategy based on time features. Because microblog pages are updated quickly, the strategy attaches a time feature tag to each fetched URL. When the URL is fetched again, the strategy compares the URL's time feature with the content of the microblog page and performs a correlation analysis of the content retrieved from the same URL under different time tags, which improves the real-time performance of microblog information retrieval. Experimental results show that the improved strategy retrieves information in a more timely manner than the existing microblog page crawling strategy and can therefore reflect the latest changes in public opinion trends.
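The abstract does not give implementation details, so the following is only a minimal sketch of how a time-tagged revisit step of this kind might look. Every name here (`UrlRecord`, `revisit`, `jaccard`), the word-set Jaccard similarity used as the correlation measure, the 0.8 threshold, and the halve-or-double interval rule are illustrative assumptions, not the authors' method.

```python
from dataclasses import dataclass


def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard similarity, an assumed stand-in for the paper's correlation analysis."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 1.0  # two empty pages are treated as identical
    return len(sa & sb) / len(sa | sb)


@dataclass
class UrlRecord:
    url: str
    time_tag: float   # time of the last fetch (the URL's time feature), in seconds
    content: str      # page text observed at time_tag
    interval: float   # seconds to wait before the next revisit


def revisit(record: UrlRecord, new_content: str, now: float,
            threshold: float = 0.8,
            min_interval: float = 60.0,
            max_interval: float = 3600.0) -> UrlRecord:
    """Compare two snapshots of the same URL taken under different time tags.

    Hypothetical policy: if the page changed a lot (low similarity), revisit
    sooner; if it barely changed, back off. The returned record carries the
    new time tag and content.
    """
    similarity = jaccard(record.content, new_content)
    if similarity < threshold:
        interval = max(min_interval, record.interval / 2)
    else:
        interval = min(max_interval, record.interval * 2)
    return UrlRecord(record.url, now, new_content, interval)
```

Under these assumptions, a crawler would keep one `UrlRecord` per URL and call `revisit` after each fetch, so that fast-changing microblog pages are scheduled more often than static ones.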

Copyright
© 2019, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).


Volume Title
Proceedings of the 2019 International Conference on Big Data, Electronics and Communication Engineering (BDECE 2019)
Series
Advances in Computer Science Research
Publication Date
24 December 2019
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Hongyan Zhao
AU  - Jifeng Tian
AU  - Xin Ye
AU  - Peiyu Liu
PY  - 2019
DA  - 2019/12/24
TI  - Research of Improved Microblogs Crawling Strategy Based on Time Feature
BT  - Proceedings of the 2019 International Conference on Big Data, Electronics and Communication Engineering (BDECE 2019)
PB  - Atlantis Press
SP  - 21
EP  - 24
SN  - 2352-538X
UR  - https://doi.org/10.2991/acsr.k.191223.005
DO  - 10.2991/acsr.k.191223.005
ID  - Zhao2019
ER  -