publication . Conference object . 2015

Entropy evaluation based on confidence intervals of frequency estimates: Application to the learning of decision trees

Serrurier, Mathieu; Prade, Henri
Open Access English
  • Published: 06 Jul 2015
  • Publisher: HAL CCSD
  • Country: France
Abstract
Entropy gain is widely used for learning decision trees. However, deeper in the tree the examples become rarer and the entropy estimate becomes less faithful. Misleading choices and over-fitting may therefore occur, and the tree has to be adjusted with an early-stop criterion or a post-pruning algorithm. These methods still depend on the choices made earlier, which may be unsatisfactory. We propose a new cumulative entropy function based on confidence intervals on frequency estimates that jointly considers the entropy of the probability distribution and the uncertainty around the estimation of its parameters. This f...
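The problem described in the abstract can be illustrated with a small sketch. It is not the cumulative entropy function proposed in the paper, whose definition is cut off in this record. It assumes a two-class node, an Agresti-Coull confidence interval on the class frequency, and a "pessimistic" entropy taken as the largest entropy compatible with that interval; the function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a two-class distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def agresti_coull_interval(k, n, z=1.96):
    """Agresti-Coull interval for a proportion with k successes out of n.

    z = 1.96 corresponds to a confidence level of about 95% (an assumption
    of this sketch, not a value taken from the paper).
    """
    n_adj = n + z * z
    p_adj = (k + z * z / 2) / n_adj
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half), min(1.0, p_adj + half)


def pessimistic_binary_entropy(k, n, z=1.96):
    """Largest binary entropy over the confidence interval of k / n.

    Binary entropy peaks at p = 0.5, so the maximum over an interval is
    reached at the point of the interval closest to 0.5.
    """
    lo, hi = agresti_coull_interval(k, n, z)
    p = min(max(0.5, lo), hi)
    return binary_entropy(p)


if __name__ == "__main__":
    # Same observed frequency (0.9), very different sample sizes.
    for k, n in [(9, 10), (900, 1000)]:
        print(n, binary_entropy(k / n), pessimistic_binary_entropy(k, n))
```

Both nodes have the same plug-in entropy, since the observed frequency is 0.9 in each. The interval-based value is noticeably higher for the node with 10 examples than for the node with 1000, which is the kind of uncertainty that plain entropy gain ignores deep in a tree.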
Subjects
free text keywords: Artificial intelligence, Learning, Logic in computer science, Computation and language, Machine learning, Decision trees, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], [INFO.INFO-LO]Computer Science [cs]/Logic in Computer Science [cs.LO], [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]