An efficient algorithm for mining closed itemsets

Publication: Article, 01 Jan 2004, English. Publisher: Zhejiang University Press. Journal: Journal of Zhejiang University-SCIENCE A, volume 5, pages 8-15 (issn: 1673-565X, eissn: 1862-1775)

Authors: Jun-qiang, Liu; Yun-he, Pan

doi: 10.1631/bf02839306, 10.1631/jzus.2004.0008

pmid: 14663846

Abstract

This paper presents a new efficient algorithm for mining frequent closed itemsets. It enumerates the closed set of frequent itemsets by using a novel compound frequent itemset tree that facilitates fast growth and efficient pruning of search space. It also employs a hybrid approach that adapts search strategies, representations of projected transaction subsets, and projecting methods to the characteristics of the dataset. Efficient local pruning, global subsumption checking, and fast hashing methods are detailed in this paper. The principle that balances the overhead of search space growth and pruning is also discussed. Extensive experimental evaluations on real world and artificial datasets showed that our algorithm outperforms CHARM by a factor of five and is one to three orders of magnitude more efficient than CLOSET and MAFIA.
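To make the term "frequent closed itemset" concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it does not use the compound frequent itemset tree, the hybrid projection, or the subsumption checks that the abstract mentions. It is a brute-force illustration of the definition: an itemset is closed when it equals the set of items shared by every transaction that contains it. The function name `closed_frequent_itemsets` and the toy data are hypothetical, and `min_support` is assumed to be an absolute count of at least 1.

```python
def closed_frequent_itemsets(transactions, min_support):
    """Return {closed itemset: support} for itemsets with support >= min_support.

    Illustrative brute force only; assumes min_support >= 1.
    """
    rows = [frozenset(t) for t in transactions]

    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, row in enumerate(rows):
        for item in row:
            tidsets.setdefault(item, set()).add(tid)
    items = sorted(tidsets)
    closed = {}

    def closure(tids):
        # Items shared by every transaction in tids (tids is non-empty).
        return frozenset.intersection(*(rows[t] for t in tids))

    def grow(prefix_tids, start):
        # Depth-first enumeration of frequent itemsets in item order.
        for i in range(start, len(items)):
            tids = prefix_tids & tidsets[items[i]]
            if len(tids) < min_support:
                continue  # no superset of an infrequent itemset is frequent
            # The closure of a frequent itemset is a closed frequent itemset.
            closed[closure(tids)] = len(tids)
            grow(tids, i + 1)

    grow(set(range(len(rows))), 0)
    return closed


toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
result = closed_frequent_itemsets(toy, min_support=2)
```

On the toy data, `{b}` is frequent but not closed, because every transaction containing `b` also contains `a`; its closure `{a, b}` is reported instead. The sketch visits every frequent itemset, which is the cost that dedicated closed-itemset miners such as the one described above aim to avoid through pruning.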

Related Organizations

Zhejiang Ocean University
China (People's Republic of)
Zhejiang Gongshang University Hangzhou College of Commerce
China (People's Republic of)

Keywords

Databases, Factual, Artificial Intelligence, Computational Biology, Database Management Systems, Information Storage and Retrieval, Algorithms

Impact by BIP!

- selected citations: 2. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.