Arabic Text Classification Using Support Vector Machines.

Gharib, Tarek Fouad; Habib, Mena Badieh; Fayed, Zaki Taha



https://research.utwente.nl/en...

Article . 2009

Data sources: University of Twente Research Information


Publication: Article, 01 Dec 2009, Netherlands. Publisher: International Society for Computers and Their Applications (ISCA). Journal: Int. J. Comput. Their Appl., volume 16, pages 192-199


Abstract

Text classification (TC) is the process of assigning documents to a predefined set of categories based on their content. Arabic is a highly inflectional and derivational language, which makes text mining a complex task. In this paper we apply the Support Vector Machines (SVM) model to the classification of Arabic text documents. The results are compared with those of other traditional classifiers: the Bayes classifier, the K-Nearest Neighbor classifier, and the Rocchio classifier. Two experiments are used to test the different classifiers. The first uses the training set as the test set, and the second uses the leave-one-out testing method. Experimental results on a set of 1132 documents show that the Rocchio classifier gives better results when the feature set is small, while SVM outperforms the other classifiers when the feature set is large enough. The classification rate exceeds 90% when more than 4000 features are used. The leave-one-out method gives more realistic results than using the training set as the test set.
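
The two evaluation protocols named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only a simplified Rocchio-style centroid classifier (one of the baselines mentioned, not the SVM), plain term-frequency weighting, and a tiny hypothetical corpus of English placeholder tokens standing in for preprocessed Arabic features. All function names and data here are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def vectorize(tokens):
    """Build an L2-normalized term-frequency vector as a dict."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def train_centroids(vectors, labels):
    """Rocchio-style training: the mean vector of each class."""
    sums = defaultdict(lambda: defaultdict(float))
    sizes = Counter(labels)
    for vec, label in zip(vectors, labels):
        for term, weight in vec.items():
            sums[label][term] += weight
    return {
        label: {t: w / sizes[label] for t, w in terms.items()}
        for label, terms in sums.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(centroids, vec):
    """Pick the class whose centroid is most similar to the document."""
    return max(sorted(centroids), key=lambda c: cosine(centroids[c], vec))


def resubstitution_accuracy(vectors, labels):
    """Experiment 1: the training set is reused as the test set."""
    centroids = train_centroids(vectors, labels)
    hits = sum(predict(centroids, v) == y for v, y in zip(vectors, labels))
    return hits / len(labels)


def leave_one_out_accuracy(vectors, labels):
    """Experiment 2: each document is tested on a model trained without it."""
    hits = 0
    for i in range(len(labels)):
        train_vecs = vectors[:i] + vectors[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        centroids = train_centroids(train_vecs, train_labels)
        hits += int(predict(centroids, vectors[i]) == labels[i])
    return hits / len(labels)


# Hypothetical toy corpus (placeholder tokens, not data from the paper).
corpus = [
    ("match goal team", "sports"),
    ("team player goal", "sports"),
    ("match player", "sports"),
    ("stadium ticket price", "sports"),
    ("market bank price", "economy"),
    ("bank loan price", "economy"),
    ("market loan", "economy"),
]
vectors = [vectorize(text.split()) for text, _ in corpus]
labels = [label for _, label in corpus]

resub = resubstitution_accuracy(vectors, labels)
loo = leave_one_out_accuracy(vectors, labels)
print(f"training set as test set: {resub:.3f}")
print(f"leave-one-out:            {loo:.3f}")
```

On this toy data the training-set evaluation scores every document correctly, while leave-one-out misclassifies the one document whose sports vocabulary appears nowhere else in its class. This is the kind of optimism in training-set testing that the abstract's last sentence points to.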

Country

Netherlands

Related Organizations

University of Twente
Netherlands

Keywords

IR-75679, EWI-19331, Text Mining, Support Vector Machines, Arabic language, text categorization

Impact by BIP!

selected citations: 0
popularity: Average
influence: Average
impulse: Average


Related to Research communities

Netherlands Research Portal
