A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream
Weiqi Fu, Husheng Liao, Xueyun Jin
Available Online April 2017.
- https://doi.org/10.2991/fmsmt-17.2017.260
- frequent pattern mining, semi-structured data stream, schema feature.
- Data mining extracts useful information from massive data, and frequent pattern mining is one of its important tasks. Research on frequent pattern mining for semi-structured data has made progress in recent years, and mining over data streams has also received considerable attention. However, only a few studies address semi-structured data and data streams together. This paper proposes an algorithm named SPrefixTreeISpan. We first segment the semi-structured data stream and then use the pattern-growth method to mine each segment. Finally, we maintain all the results in a structure called patternTree. The mining algorithm is further optimized using the inevitable parent-child and inevitable child-parent relationships extracted from the XML schema. Experiments show that SPrefixTreeISpan has better performance.
- Open Access
- This is an open access article distributed under the CC BY-NC license.
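The pipeline outlined in the abstract (segment the stream, mine each segment, accumulate the results) can be illustrated with a deliberately simplified sketch. This is not SPrefixTreeISpan itself: it counts only single parent-child label pairs instead of growing full tree patterns, it uses a plain `Counter` as a stand-in for patternTree, and it omits the schema-based optimization. The tree encoding `(label, [children])` and the names `segments`, `edges`, and `mine_stream` are assumptions made for illustration.

```python
from collections import Counter
from itertools import islice


def segments(stream, size):
    """Yield consecutive lists of at most `size` items from an iterable stream."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def edges(tree):
    """Return the set of (parent_label, child_label) pairs found in one tree.

    A tree is a (label, [children]) tuple; this encoding is an assumption.
    """
    label, children = tree
    found = set()
    for child in children:
        found.add((label, child[0]))
        found |= edges(child)
    return found


def mine_stream(stream, segment_size, min_support):
    """Process the stream segment by segment and keep frequent edge patterns.

    `min_support` is a fraction of all trees seen so far (hypothetical parameter).
    """
    totals = Counter()  # stand-in for the paper's patternTree
    seen = 0
    for batch in segments(stream, segment_size):
        for tree in batch:
            # Each tree contributes at most once per pattern.
            totals.update(edges(tree))
        seen += len(batch)
    threshold = min_support * seen
    return {pattern: count for pattern, count in totals.items() if count >= threshold}
```

Calling `mine_stream(docs, segment_size=2, min_support=0.5)` on an iterable of such trees returns a dictionary mapping each frequent `(parent, child)` pair to the number of trees that contain it. Because results are accumulated per segment, only one segment needs to be held in memory at a time.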
Cite this article
TY - CONF
AU - Weiqi Fu
AU - Husheng Liao
AU - Xueyun Jin
PY - 2017/04
DA - 2017/04
TI - A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream
BT - Proceedings of the 2017 5th International Conference on Frontiers of Manufacturing Science and Measuring Technology (FMSMT 2017)
PB - Atlantis Press
SP - 1329
EP - 1336
SN - 2352-5401
UR - https://doi.org/10.2991/fmsmt-17.2017.260
DO - https://doi.org/10.2991/fmsmt-17.2017.260
ID - Fu2017/04
ER -