Decision trees for mining data streams

Publication: Article, 06 Mar 2006

Publisher: SAGE Publications

Journal: Intelligent Data Analysis, volume 10, pages 23-45 (ISSN: 1088-467X, eISSN: 1571-4128)

Authors: João Gama; Ricardo Fernandes; Ricardo Rocha

doi: 10.3233/ida-2006-10103

Abstract

In this paper we study the problem of constructing accurate decision tree models from data streams. Mining data streams is an incremental task that requires incremental, online, and any-time learning algorithms. One of the most successful algorithms for mining data streams is VFDT. We have extended VFDT in three directions: the ability to deal with continuous data; the use of more powerful classification techniques at tree leaves; and the ability to detect and react to concept drift. The resulting system, VFDTc, can incorporate and classify new information online, with a single scan of the data, in constant time per example. The most relevant property of our system is its ability to obtain performance similar to that of a standard decision tree algorithm even for medium-size datasets. This is relevant due to the any-time property. We also extend VFDTc with the ability to deal with concept drift, by continuously monitoring differences between two class distributions of the examples: the distribution when a node was built and the distribution in a time window of the most recent examples. We study the sensitivity of VFDTc with respect to drift, noise, the order of examples, and the initial parameters in different problems, and demonstrate its utility on large and medium-size datasets.
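
The drift check described in the abstract (comparing the class distribution seen when a node was built with the distribution in a window of recent examples) can be illustrated with a minimal sketch. This is not the VFDTc implementation: the class name `NodeDriftMonitor`, the use of total variation distance as the difference measure, the window size, and the threshold are all assumptions chosen for illustration.

```python
from collections import Counter, deque


def normalize(counts):
    """Turn a mapping of class counts into relative frequencies."""
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    Assumed difference measure; the abstract does not say which one is used.
    """
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in labels)


class NodeDriftMonitor:
    """Hypothetical helper that compares a node's build-time class
    distribution with the distribution in a window of recent examples."""

    def __init__(self, build_counts, window_size=200, threshold=0.3):
        self.reference = normalize(build_counts)  # distribution at build time
        self.window = deque(maxlen=window_size)   # most recent class labels
        self.threshold = threshold                # assumed cut-off value

    def update(self, label):
        """Record one label; return True if the window looks drifted."""
        self.window.append(label)
        if len(self.window) < self.window.maxlen:
            return False  # wait until the window is full
        recent = normalize(Counter(self.window))
        return total_variation(self.reference, recent) > self.threshold
```

Each call to `update` costs time bounded by the window size, independent of the stream length. A node built on 80% class `a` would, under these assumptions, be flagged once the recent window is dominated by class `b`; what the learner then does in response is outside the scope of this sketch.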

Related research (3 research products, listed among the top similar documents)

- A New Fuzzy Decision Tree Classification Method for Mining High-Speed Data Streams Based on Binary Search Trees (2007)
- A New Decision Tree Classification Method for Mining High-Speed Data Streams Based on Threaded Binary Search Trees (2007)
- An Incremental Fuzzy Decision Tree Classification Method for Mining Data Streams (2007)

Impact byBIP!

- Selected citations: 87. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Top 1%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
