
handle: 1959.8/156378
Data inconsistency is a key source of data quality problems. Rule-based methods are a major means of inconsistency checking, and association rules have been used for this purpose. Time efficiency is very important for online checking. In this paper we use a prefix tree (Trie) to store and retrieve rules efficiently when making predictions on a dirty dataset, which reduces complexity and improves efficiency. Inconsistent values are identified in large, high-dimensional data sets using a large ruleset, with reduced complexity in comparison to existing methods. A number of experiments are conducted on various real-world data sets to show the efficiency of our model. Refereed/Peer-reviewed
association rules, FP-tree, data mining, data cleaning, prefix tree
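
The abstract does not give implementation details, so the following is only a minimal sketch of how association rules might be stored in a prefix tree and used to flag inconsistent values. It assumes each rule has the form "set of (attribute, value) items implies one (attribute, value) item"; the names `RuleTrie`, `insert`, `matching_consequents`, and `inconsistencies`, as well as the sample rules and record, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    # Maps an (attribute, value) item to the next node on the path.
    children: dict = field(default_factory=dict)
    # Consequents of rules whose antecedent ends at this node.
    consequents: list = field(default_factory=list)


class RuleTrie:
    """Prefix tree over rule antecedents (illustrative, not the paper's code)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, antecedent, consequent):
        # Sort the items so that rules sharing a prefix share a path.
        node = self.root
        for item in sorted(antecedent):
            node = node.children.setdefault(item, TrieNode())
        node.consequents.append(consequent)

    def matching_consequents(self, record):
        # Collect consequents of every rule whose antecedent is a subset
        # of the record's items, following only paths that exist.
        items = sorted(record.items())
        found = []

        def descend(node, start):
            found.extend(node.consequents)
            for i in range(start, len(items)):
                child = node.children.get(items[i])
                if child is not None:
                    descend(child, i + 1)

        descend(self.root, 0)
        return found

    def inconsistencies(self, record):
        # A value is flagged when a matching rule predicts a different one.
        flagged = []
        for attr, expected in self.matching_consequents(record):
            actual = record.get(attr)
            if actual is not None and actual != expected:
                flagged.append((attr, actual, expected))
        return flagged


# Hypothetical rules and record, for illustration only.
trie = RuleTrie()
trie.insert({("city", "Paris")}, ("country", "France"))
trie.insert({("city", "Paris"), ("lang", "fr")}, ("currency", "EUR"))

record = {"city": "Paris", "country": "Germany", "lang": "fr", "currency": "EUR"}
print(trie.inconsistencies(record))
```

Because antecedents are inserted in sorted order, a lookup only descends into branches whose items occur in the record, so rules with unrelated antecedents are never visited. This is one plausible reading of the reduced retrieval cost the abstract refers to.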
