International Journal of Computational Intelligence Systems

Volume 3, Issue 6, December 2010, Pages 733 - 744

A New Fast Vertical Method for Mining Frequent Patterns

Authors
Zhihong Deng, Zhonghui Wang
Corresponding Author
Zhihong Deng
Received 21 December 2009, Accepted 8 November 2010, Available Online 1 December 2010.
DOI
10.2991/ijcis.2010.3.6.4How to use a DOI?
Keywords
data mining; frequent pattern mining; data structure; algorithm
Abstract

Vertical mining methods are very effective for mining frequent patterns and usually outperform horizontal mining methods. However, the vertical methods become ineffective since the intersection time starts to be costly when the cardinality of tidset (tid-list or diffset) is very large or there are a very large number of transactions. In this paper, we propose a novel vertical algorithm called PPV for fast frequent pattern discovery. PPV works based on a data structure called Node-lists, which is obtained from a coding prefix-tree called PPC-tree. The efficiency of PPV is achieved with three techniques. First, the Node-list is much more compact compared with previous proposed vertical structure (such as tid-lists or diffsets) since transactions with common prefixes share the same nodes of the PPC-tree. Second, the counting of support is transformed into the intersection of Node-lists and the complexity of intersecting two Node-lists can be reduced to O(m+n) by an efficient strategy, where m and n are the cardinalities of the two Node-lists respectively. Third, the ancestor-descendant relationship of two nodes, which is the basic step of intersecting Node-lists, can be very efficiently verified by Pre-Post codes of nodes. We experimentally compare our algorithm with FP-growth, and two prominent vertical algorithms (Eclat and dEclat) on a number of databases. The experimental results show that PPV is an efficient algorithm that outperforms FP-growth, Eclat, and dEclat.

Copyright
© 2010, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Download article (PDF)

Journal
International Journal of Computational Intelligence Systems
Volume-Issue
3 - 6
Pages
733 - 744
Publication Date
2010/12/01
ISSN (Online)
1875-6883
ISSN (Print)
1875-6891
DOI
10.2991/ijcis.2010.3.6.4How to use a DOI?
Copyright
© 2010, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Cite this article

TY  - JOUR
AU  - Zhihong Deng
AU  - Zhonghui Wang
PY  - 2010
DA  - 2010/12/01
TI  - A New Fast Vertical Method for Mining Frequent Patterns
JO  - International Journal of Computational Intelligence Systems
SP  - 733
EP  - 744
VL  - 3
IS  - 6
SN  - 1875-6883
UR  - https://doi.org/10.2991/ijcis.2010.3.6.4
DO  - 10.2991/ijcis.2010.3.6.4
ID  - Deng2010
ER  -