https://doi.org/10.1007/978-3-...
Part of book or chapter of book . 2007 . Peer-reviewed
Data sources: Crossref
DBLP
Conference object
Data sources: DBLP

A New Fuzzy Decision Tree Classification Method for Mining High-Speed Data Streams Based on Binary Search Trees

Authors: Zhoujun Li 0001; Tao Wang; Ruoxue Wang; Yuejin Yan; Huowang Chen

Abstract

Decision tree construction is a well-studied problem in data mining, and mining data streams has recently attracted much interest. Domingos and Hulten presented VFDT, a one-pass algorithm for decision tree construction; their system uses the Hoeffding inequality to achieve a probabilistic bound on the accuracy of the constructed tree. Gama et al. extended VFDT in two directions: their system, VFDTc, can handle continuous data and uses more powerful classification techniques at the tree leaves. Peng et al. presented a soft discretization method for handling continuous attributes in data mining. In this paper, we revisit these problems and implement a system, sVFDT, for data stream mining. We make the following contributions: 1) We present a binary search tree (BST) approach for efficiently handling continuous attributes; its processing time for inserting values is O(n log n), while VFDT's processing time is O(n^2). 2) We improve the method for finding the best split-test point of a given continuous attribute; compared with the method used in VFDTc, the processing time decreases from O(n log n) to O(n). 3) Compared with VFDTc, sVFDT's number of candidate split tests decreases from O(n) to O(log n). 4) We improve the soft discretization method to increase classification accuracy in data stream mining.
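
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea behind contributions 1) and 2): keep the observed values of one continuous attribute in a binary search tree with per-value class counts, then evaluate candidate split points in a single in-order pass. The names `AttributeBST`, `insert`, `in_order`, `entropy`, and `best_split` are hypothetical, the tree is unbalanced, and the split criterion (weighted class entropy of a `value <= v` test) is an assumption; this is not sVFDT's actual data structure or its O(log n) candidate selection.

```python
from collections import Counter
from math import log2


class Node:
    """One distinct attribute value and the class counts observed at it."""

    def __init__(self, value, label):
        self.value = value
        self.counts = Counter({label: 1})
        self.left = None
        self.right = None


class AttributeBST:
    """Unbalanced BST over the values of a single continuous attribute."""

    def __init__(self):
        self.root = None
        self.total = Counter()  # class counts over all inserted examples

    def insert(self, value, label):
        self.total[label] += 1
        if self.root is None:
            self.root = Node(value, label)
            return
        node = self.root
        while True:
            if value == node.value:
                node.counts[label] += 1
                return
            branch = "left" if value < node.value else "right"
            child = getattr(node, branch)
            if child is None:
                setattr(node, branch, Node(value, label))
                return
            node = child

    def in_order(self):
        """Yield (value, class counts) in ascending order of value."""
        stack, node = [], self.root
        while stack or node:
            while node:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node.value, node.counts
            node = node.right


def entropy(counts):
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return -sum(c / n * log2(c / n) for c in counts.values() if c > 0)


def best_split(tree):
    """Scan candidates 'value <= v' in one in-order pass.

    Returns (threshold, weighted entropy) for the lowest-entropy split,
    or (None, inf) if fewer than two distinct values were seen.
    """
    n = sum(tree.total.values())
    left = Counter()  # running class counts for examples with value <= v
    best = (None, float("inf"))
    for value, counts in tree.in_order():
        left.update(counts)
        n_left = sum(left.values())
        if n_left == n:  # largest value: nothing would fall on the right
            break
        right = tree.total - left
        cost = (n_left / n) * entropy(left) + ((n - n_left) / n) * entropy(right)
        if cost < best[1]:
            best = (value, cost)
    return best
```

Because the running counts are carried along the in-order traversal, each distinct value is visited once when searching for a split point. Note that without rebalancing, insertion into this sketch degrades to linear time per value when the stream arrives in sorted order.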

Impact indicators (provided by BIP!)
  • Selected citations: 3
  • Popularity: Average
  • Influence: Average
  • Impulse: Average