Publication . Conference object . 2009

The Automatic Categorization of Arabic Documents by Boosting Decision Trees

Saeed Raheel; Joseph Dichy; Mohamed Hassoun
Published: 29 Nov 2009
Publisher: HAL CCSD
Country: France

Automatic document classification has been a subject of research since the early 1960s. However, further research is still needed because the results obtained so far leave room for improvement. Although a large body of literature exists on the subject, very little research has been reported on the automatic classification of Arabic documents, and none of it has applied the technique of Boosting. In addition, Arabic is a highly inflected language and is morphologically much more complex than languages written in Latin characters. One cannot, therefore, take for granted that Boosting is as effective for classifying Arabic documents as it is for documents written in Latin characters. This paper explores the technique of Boosting and its effectiveness in the automatic classification of Arabic documents, and compares its performance with results obtained with Support Vector Machines and Naive Bayesian Networks, respectively.
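
As an illustration of what boosting decision trees means for text classification, the following is a minimal sketch of AdaBoost over one-level decision trees (stumps) on binary term-presence features. It is not the authors' implementation: the toy corpus, the English placeholder tokens standing in for Arabic stems, the vocabulary, and all function names are assumptions made for this example, and the abstract does not state which boosting variant or features the paper uses.

```python
import math

# Hypothetical toy corpus: (tokens, label), +1 = "sports", -1 = "economy".
# English placeholders stand in for Arabic stems; the paper's real features
# and preprocessing are not described in the abstract.
VOCAB = ["match", "goal", "market", "bank"]
DOCS = [
    (["match", "goal"], 1),
    (["goal"], 1),
    (["match", "market"], 1),
    (["market", "bank"], -1),
    (["bank"], -1),
    (["market"], -1),
]


def vectorize(tokens):
    """Map a token list to a binary term-presence vector over VOCAB."""
    present = set(tokens)
    return [1 if term in present else 0 for term in VOCAB]


def stump_predict(x, feature, polarity):
    """One-level tree: vote `polarity` if the term is present, else the opposite."""
    return polarity if x[feature] == 1 else -polarity


def train_adaboost(X, y, rounds):
    """Discrete AdaBoost; returns a list of (alpha, feature, polarity) stumps."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = None
        for feature in range(len(X[0])):
            for polarity in (1, -1):
                err = sum(w for w, x, t in zip(weights, X, y)
                          if stump_predict(x, feature, polarity) != t)
                if best is None or err < best[0]:
                    best = (err, feature, polarity)
        err, feature, polarity = best
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        if err >= 0.5:
            break  # no stump is better than chance; stop early
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, feature, polarity))
        # Increase the weight of misclassified documents, then renormalize.
        weights = [w * math.exp(-alpha * t * stump_predict(x, feature, polarity))
                   for w, x, t in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps; ties go to the positive class."""
    score = sum(alpha * stump_predict(x, feature, polarity)
                for alpha, feature, polarity in ensemble)
    return 1 if score >= 0 else -1


X = [vectorize(tokens) for tokens, _ in DOCS]
y = [label for _, label in DOCS]
ensemble = train_adaboost(X, y, rounds=3)
predictions = [predict(ensemble, x) for x in X]
```

On this toy set no single stump separates the two classes, but the weighted vote of three stumps classifies every training document correctly. For Arabic text, the token lists would presumably come from a morphological analysis or stemming step, which is outside the scope of this sketch.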

Subjects by Vocabulary

Microsoft Academic Graph classification: Categorization; Decision tree; Statistical classification; Document classification; Computer science; Boosting (machine learning); Artificial intelligence; Support vector machine; Natural language processing; The Internet; Naive Bayes classifier

ACM Computing Classification System: Computing Methodologies: Document and Text Processing; Computing Methodologies: Pattern Recognition


automatic categorization, Arabic documents, decision trees, signal-image technology, [SHS.LANGUE]Humanities and Social Sciences/Linguistics