Vertical mining for high utility itemsets

Publication: Article, Conference object
Date: 01 Aug 2012
Publisher: IEEE
Journal: 2012 IEEE International Conference on Granular Computing

Authors: Wei Song 0004; Yu Liu; Jinhong Li

doi: 10.1109/grc.2012.6468563


Abstract

High utility itemset mining has recently become one of the most important research issues in data mining because it can account for a different profit value for each item. Most existing algorithms generate high utility itemsets from a set of transactions stored in horizontal data format. Experience from frequent itemset mining suggests that vertical mining may be a promising approach that outperforms horizontal mining. This paper proposes a high utility itemset mining algorithm based on a vertical database layout. Candidate high utility itemsets are first discovered by intersecting covers. The candidates are then checked in a single scan of the database to identify the actual high utility itemsets. The algorithm thereby exploits the advantages of the vertical database layout, such as low storage cost and high efficiency. Experimental results show that the proposed algorithm is both efficient and scalable.
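The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how candidates are pruned, so the sketch assumes the transaction-weighted utility (TWU) upper bound that is common in high utility itemset mining. The toy database, the function names (`vertical_layout`, `mine_high_utility`), and the threshold are all hypothetical.

```python
# Toy database (made-up values): each transaction maps an item to the
# utility of that item in the transaction (quantity times unit profit).
DB = [
    {"a": 5, "b": 2},          # T0
    {"a": 10, "c": 1},         # T1
    {"b": 4, "c": 3, "d": 6},  # T2
    {"a": 5, "b": 2, "c": 1},  # T3
]


def vertical_layout(db):
    """Map each item to its cover: the set of ids of transactions containing it."""
    covers = {}
    for tid, transaction in enumerate(db):
        for item in transaction:
            covers.setdefault(item, set()).add(tid)
    return covers


def mine_high_utility(db, min_util):
    """Return {itemset: utility} for itemsets whose utility is >= min_util."""
    # Transaction utility: total utility of all items in a transaction.
    tu = [sum(t.values()) for t in db]
    covers = vertical_layout(db)

    def twu(cover):
        # Assumed pruning bound: sum of the utilities of the covering
        # transactions. It never increases when an itemset is extended.
        return sum(tu[tid] for tid in cover)

    # Phase 1: depth-first candidate generation by intersecting covers.
    candidates = {}

    def extend(prefix, prefix_cover, items):
        for i, item in enumerate(items):
            cover = prefix_cover & covers[item] if prefix else covers[item]
            if twu(cover) >= min_util:
                itemset = prefix + (item,)
                candidates[itemset] = cover
                extend(itemset, cover, items[i + 1:])

    extend((), set(), sorted(covers))

    # Phase 2: a single database scan computes the exact utility of each candidate.
    utility = dict.fromkeys(candidates, 0)
    for tid, transaction in enumerate(db):
        for itemset, cover in candidates.items():
            if tid in cover:
                utility[itemset] += sum(transaction[i] for i in itemset)

    return {s: u for s, u in utility.items() if u >= min_util}


result = mine_high_utility(DB, min_util=15)
print(result)  # {('a',): 20, ('a', 'c'): 17}
```

In this toy run, six itemsets survive the cover-intersection phase, but only two of them reach the threshold once the scan computes their exact utilities. That is why the verification pass is needed: the bound used in phase 1 overestimates utility.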

Related Organizations

North China University of Technology
China (People's Republic of)

Impact by BIP!

- Selected citations: 7. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.