
Until recently, only two Arabic corpora were commonly available to researchers: the Agence France-Presse (AFP) Arabic newswire from the Linguistic Data Consortium (LDC) and the Al-Hayat newspaper collection from the European Language Resources Distribution Agency (ELDA). The availability of a suitable corpus is key to much objective research in language engineering and any other natural language-related field. This paper presents experimental results from comparing corpora for Modern Standard Arabic (MSA) collected from samples of online newspapers published in different Arab countries. The results show significant differences in vocabulary and style across regions. Comprehensive studies of these differences will allow a better understanding of the language and have implications for various areas of computational and linguistic research. Developing adequate resources is more crucial than ever to carry this task further.
Fine Arts, Modern Standard Arabic (MSA), Language variation, N
