Profiling instances in noise reduction

Publication: Article, 01 Jul 2012, English

Publisher: Elsevier BV

Journal: Knowledge-Based Systems, volume 31, pages 28-40 (ISSN: 0950-7051)

Publicly funded. Funded by: SFI | An Investigation Into Cas...

Authors: S. Jane Delany; Nicola Segata; B. Mac Namee

doi: 10.1016/j.knosys.2012.01.015

handle: 11572/34844


Abstract

The dependency of instance-based learning algorithms on the quality of their training data has led to significant work in noise reduction. This paper presents an empirical evaluation of current noise reduction techniques, not just from the perspective of their comparative performance, but also from the perspective of the types of instances that they focus on for removal. A novel instance profiling technique known as RDCL profiling allows the structure of a training set to be analysed at the instance level, categorising each instance by modelling its local competence properties. This profiling approach makes it possible to investigate the types of instances removed by the noise reduction techniques currently in use in instance-based learning. The paper also considers the effect of removing instances with specific profiles from a dataset and shows that a very simple approach is an effective noise reduction technique: removing instances that are both misclassified by the training set and cause other instances in the dataset to be misclassified.
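The simple removal rule described in the abstract can be illustrated with a minimal sketch. This is not the paper's RDCL implementation. It assumes that "misclassified by the training set" means a wrong leave-one-out k-nearest-neighbour prediction, and that an instance "causes" a misclassification when it is a differently labelled neighbour of a misclassified instance. The function names, the tie-breaking rule and the toy data are all hypothetical.

```python
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance (preserves nearest-neighbour ordering)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(X, i, k):
    """Indices of the k nearest neighbours of X[i], excluding i itself."""
    others = [j for j in range(len(X)) if j != i]
    # Assumption: distance ties are broken by the lower index.
    others.sort(key=lambda j: (sq_dist(X[i], X[j]), j))
    return others[:k]


def noisy_candidates(X, y, k=3):
    """Instances that are misclassified AND help misclassify another instance."""
    misclassified, harmful = set(), set()
    for i in range(len(X)):
        nbrs = nearest(X, i, k)
        votes = Counter(y[j] for j in nbrs)
        # Majority vote; vote ties go to the label that sorts first.
        predicted = max(sorted(votes), key=votes.get)
        if predicted != y[i]:
            misclassified.add(i)
            # Assumed notion of "cause": neighbours with a different label.
            harmful.update(j for j in nbrs if y[j] != y[i])
    return sorted(misclassified & harmful)


# Toy 1-D data: index 3 is a "b" sitting among "a" instances.
X = [(0,), (1,), (2,), (4,), (10,), (15,), (17,), (18,), (19,)]
y = ["a", "a", "a", "b", "a", "b", "b", "b", "b"]
removed = noisy_candidates(X, y, k=3)
print(removed)  # -> [3]
```

In this toy set the instance at index 4 is also misclassified, but it does not contribute to any other misclassification, so it is kept. Only the instance at index 3, which is both misclassified and harmful to a neighbour, is flagged.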


Keywords

noise reduction, Artificial Intelligence and Robotics, 330, instance based learning, Computer Sciences, case-based editing, profiling, 004

Impact by BIP!

	selected citations: 29 (citations derived from selected sources; an alternative to the "influence" indicator)
	popularity: Top 10% (the "current" impact or attention of the article in the research community, based on the underlying citation network)
	influence: Top 10% (the overall impact of the article in the research community over time, based on the underlying citation network)
	impulse: Average (the initial momentum of the article directly after its publication, based on the underlying citation network)