Powered by OpenAIRE graph
UNSWorks
Doctoral thesis . 2008
License: CC BY NC ND
https://dx.doi.org/10.26190/un...
Data sources: Datacite
DBLP
Doctoral thesis
Data sources: DBLP

Mining unusual patterns

Authors: Gebski, Matthew;

Abstract

With continual increases in storage and bandwidth capacity, there has been a corresponding increase in the need for effective data analysis. Applications range from marketing and customer relations to fraud and risk management to epidemiology. Many techniques focus on detecting useful patterns in data: common trends that can be exploited and applied to most cases. However, it is often the unusual cases that are interesting. Cases of fraud or network intrusion are not the norm, and as such, specific tools are needed to identify these abnormal scenarios. This thesis analyses several problems related to the identification of unusual patterns in large data sets. We focus on the development of efficient and accurate techniques for detecting such patterns. These patterns are identified in a number of domains, including network analysis (determining the protocol for encrypted data) and census records (looking for patterns of unusual deaths in mortality data). To be useful across these domains, we analyse a number of different data types: detection of outliers and density estimation for spatial data, identification of unusual sequences for network data, and groups of unusual points for categorical data. Our approaches have many real-world applications, and many of the data sets we use to evaluate our methods are real-world extracts. This demonstrates that our techniques can be used on data from different domains while still maintaining high levels of performance and accuracy. Furthermore, our techniques are novel and provide new tools for mining unusual patterns, facilitating improved analysis compared to existing methods. We provide increased speed for identification of local outliers in spatial data; this is complemented by a novel technique for density estimation for high-dimensional spatial data. Additionally, we present improved techniques for identification of protocols and users for network data.
Finally, we develop an approach for grouping anomalies and demonstrate this approach on behavioural risk factor and mortality data. Unlike existing techniques such as clustering, our approach is able to group instances based on why they are considered anomalous.

Country
Australia
Keywords

data, methodologies, analysis, patterns, 004

  • Impact by BIP!
    selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
Green