ZENODO
Article . 2025
License: CC BY
Data sources: ZENODO
ZENODO
Article . 2025
License: CC BY
Data sources: Datacite

Enhancing Classification Efficiency Using the J48 Decision Tree Algorithm

Authors: Prasad, Shaifali;

Abstract

The J48 decision tree algorithm, derived from the C4.5 methodology, is a powerful and widely used tool for classification tasks due to its efficiency and interpretability. This algorithm employs a systematic approach to analyze datasets, beginning with preprocessing steps to address missing values and discretize continuous attributes when necessary. By leveraging Entropy to measure data uncertainty and Information Gain to evaluate attribute significance, J48 recursively splits datasets into subsets, creating decision nodes and leaf nodes for effective classification. The algorithm continues this process until all data is classified or specified stopping criteria are met, such as a minimum number of instances per leaf. To enhance model simplicity and prevent overfitting, J48 incorporates pruning techniques that replace less informative branches with leaf nodes, improving generalization. Its ability to handle mixed data types, work efficiently with large datasets, and generate interpretable decision trees makes J48 a versatile and robust tool for diverse classification applications. This paper discusses the methodology, advantages, and practical applications of the J48 algorithm in enhancing classification efficiency across various domains.

Introduction

Classification is a critical task in data analysis, enabling the categorization of data into predefined classes based on patterns and relationships within a dataset. Decision tree algorithms are widely utilized for their simplicity, interpretability, and efficiency in handling complex classification problems. Among these, the J48 algorithm, an open-source implementation of the C4.5 algorithm, has emerged as a robust tool for constructing decision trees that offer high accuracy and comprehensibility.

The J48 algorithm operates by recursively partitioning the dataset based on attributes that maximize Information Gain, a measure derived from Information Theory. This process begins with preprocessing the dataset to handle missing values and discretize continuous attributes
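
The attribute-selection step described above can be illustrated with a minimal Python sketch. It is not the J48 implementation: it covers only nominal attributes, ignores missing values and pruning, and ranks attributes by plain Information Gain as the text describes (C4.5 itself normalizes this quantity into a gain ratio). The function names and the toy dataset are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one nominal attribute."""
    # Group the class labels by the value each row takes for the attribute.
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    total = len(labels)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Attribute with the highest Information Gain (first one wins ties)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy dataset: "outlook" separates the classes perfectly,
# while "windy" carries no information about the class.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

chosen = best_attribute(rows, labels, ["outlook", "windy"])
```

On this toy data the class labels start with an entropy of 1 bit; splitting on "outlook" removes all of it, while splitting on "windy" removes none, so "outlook" would be chosen as the root decision node. A full tree builder would apply the same selection recursively to each subset until a stopping criterion, such as a minimum number of instances per leaf, is reached.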
