A Survey of Web Page Preprocessing Research
Qi Qi, Gui-Xian Xu
Available Online December 2016.
- https://doi.org/10.2991/icwcsn-16.2017.118
- Web page cleaning; data mining; Web mining; information retrieval.
- Web pages collected by a crawler contain not only the required information but also many advertisements and navigation bars. This noise content, which is independent of the page's topic, should be removed with basic methods, so it is necessary to summarize Web page denoising and study it further. First, we explain why page denoising is necessary, define page denoising, and summarize the methods of Web page denoising. Second, we improve the algorithm for Web page denoising. Finally, we discuss the open problems of Web page denoising and future research directions.
- Open Access
- This is an open access article distributed under the CC BY-NC license.
Cite this article
TY - CONF
AU - Qi Qi
AU - Gui-Xian Xu
PY - 2016/12
DA - 2016/12
TI - A Survey of Web Page Preprocessing Research
BT - 3rd International Conference on Wireless Communication and Sensor Networks (WCSN 2016)
PB - Atlantis Press
SN - 2352-538X
UR - https://doi.org/10.2991/icwcsn-16.2017.118
DO - https://doi.org/10.2991/icwcsn-16.2017.118
ID - Qi2016/12
ER -