
handle: 11583/2573939
In the last ten years, with the explosive growth of Internet usage, network traffic analytics and data mining have taken on primary importance. Generalized itemset mining is an established data mining technique for discovering multiple-level correlations among data equipped with analyst-provided taxonomies. In this work, we address the discovery of a specific type of generalized itemset, named misleading generalized itemsets (MGIs), which can be used to highlight anomalous situations in potentially large datasets. More specifically, MGIs are high-level patterns whose correlation type contrasts with that of many of their descendant patterns according to the input taxonomy. This work proposes a new framework, named MGI-Cloud, which efficiently extracts misleading generalized itemsets. The framework has a distributed architecture and is composed of a set of MapReduce jobs. As a reference case study, MGI-Cloud has been applied to real network datasets, captured in different stages from a backbone link of an Italian ISP. The experiments demonstrate the effectiveness of our approach in a real-life scenario.
Generalized itemset mining; cloud-based service; network traffic analysis
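
The notion of a misleading generalized itemset described in the abstract can be illustrated with a small single-machine sketch. It is not the MGI-Cloud implementation and does not use MapReduce. The abstract does not state which correlation measure or threshold is used, so lift, the tolerance `tol`, and the fraction `min_contrast` are assumptions made for illustration, as are the toy taxonomy, the item names, and every function name.

```python
from itertools import product

# Hypothetical toy taxonomy: generalized item -> leaf items it aggregates.
TAXONOMY = {
    "Web": {"HTTP", "HTTPS"},
    "Europe": {"Italy", "France"},
}


def covers(transaction, item):
    """True if the transaction contains the item or any of its leaf descendants."""
    leaves = TAXONOMY.get(item, {item})
    return bool(leaves & transaction)


def support(transactions, itemset):
    """Fraction of transactions that cover every item of the itemset."""
    hits = sum(all(covers(t, i) for i in itemset) for t in transactions)
    return hits / len(transactions)


def correlation_type(transactions, itemset, tol=0.1):
    """Classify an itemset as 'positive', 'negative' or 'none' using lift.

    Lift is an assumed stand-in for whatever measure the framework uses.
    """
    joint = support(transactions, itemset)
    independent = 1.0
    for item in itemset:
        independent *= support(transactions, (item,))
    if independent == 0:
        return "none"  # some item never occurs, so lift is undefined
    lift = joint / independent
    if lift > 1 + tol:
        return "positive"
    if lift < 1 - tol:
        return "negative"
    return "none"


def descendants(itemset):
    """All leaf-level itemsets obtained by specializing each generalized item."""
    return list(product(*(sorted(TAXONOMY.get(i, {i})) for i in itemset)))


def is_misleading(transactions, itemset, min_contrast=0.5):
    """True if at least min_contrast of the descendants have the opposite
    correlation type to the high-level itemset (assumed criterion)."""
    high = correlation_type(transactions, itemset)
    if high == "none":
        return False
    opposite = "negative" if high == "positive" else "positive"
    desc = descendants(itemset)
    contrasting = sum(correlation_type(transactions, d) == opposite for d in desc)
    return contrasting / len(desc) >= min_contrast


# Made-up traffic records: each transaction is a set of leaf items.
transactions = (
    [{"HTTP", "France"}] * 4
    + [{"HTTPS", "Italy"}] * 4
    + [{"DNS", "Germany"}] * 4
)
print(is_misleading(transactions, ("Web", "Europe")))
```

In this made-up data, {Web, Europe} is positively correlated at the high level, while the descendants {HTTP, Italy} and {HTTPS, France} never co-occur and are therefore classified as negatively correlated, so the high-level pattern is flagged under the assumed criterion.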
