
handle: 10214/21884
Training a classifier requires a supply of example problems and the correct classification (label) for each. In some practical situations examples are plentiful, but obtaining labels for them is costly. Several algorithms exist for learning a classifier when only a small number of the examples are "labelled" at the outset and the remainder are "unlabelled." This thesis presents continued work on the Guelph Cluster Class algorithm developed by Dara, Stacey and Kremer. Specifically, it investigates how the algorithm performs on ten real-world data sets over a range of parameter settings, and whether cluster validity indices can guide the setting of those parameters. An examination of a simple clustering problem points to explanations for the algorithm's behaviour, and tests of a variant algorithm that capitalizes on these observations are presented. Finally, this thesis explores whether clustering information can guide the selection of examples which, if labelled, would be especially informative for classifier training.
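The general "cluster, then label" idea underlying this line of work can be sketched as follows: cluster all examples, labelled and unlabelled together, then assign each cluster the majority label among its few labelled members. The sketch below uses an invented toy data set and a minimal 1-D k-means; it illustrates the broad technique only and is not the Guelph Cluster Class algorithm itself. (A cluster validity index such as the silhouette could, as the thesis investigates, guide the choice of the number of clusters; it is omitted here for brevity.)

```python
# Hypothetical sketch of cluster-then-label semi-supervised
# classification.  Data, labels, and helper names are invented for
# illustration; this is NOT the Guelph Cluster Class algorithm.
from collections import Counter
import random

def kmeans_1d(points, k, iters=50, seed=0):
    """Tiny 1-D k-means, just enough to demonstrate the idea."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Recompute each centre as its cluster mean (keep old centre
        # if a cluster happens to be empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

def assign(p, centers):
    return min(range(len(centers)), key=lambda i: abs(p - centers[i]))

# Two well-separated groups; only two of the eight points carry labels.
data = [0.1, 0.2, 0.3, 0.4, 5.1, 5.2, 5.3, 5.4]
labelled = {0.1: "A", 5.4: "B"}          # scarce labels

centers = kmeans_1d(data, k=2)

# Each cluster takes the majority label of its labelled members.
cluster_label = {}
for i in range(2):
    votes = Counter(lab for p, lab in labelled.items()
                    if assign(p, centers) == i)
    cluster_label[i] = votes.most_common(1)[0][0]

# Every unlabelled point now inherits its cluster's label.
predictions = {p: cluster_label[assign(p, centers)] for p in data}
```

On this toy data the two labelled points fall in different clusters, so the six unlabelled points inherit the label of their group, which is exactly the leverage that clustering offers when labels are scarce.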
classification, classifier training, Guelph Cluster Class algorithm, clustering
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources | 0 |
| Popularity | "Current" impact/attention of the article in the research community, based on the underlying citation network | Average |
| Influence | Overall/total impact of the article in the research community, based on the underlying citation network (diachronically) | Average |
| Impulse | Initial momentum of the article directly after its publication, based on the underlying citation network | Average |
