
arXiv: 2402.10311
handle: 2117/428665
The word order of a sentence is shaped by multiple principles. The principle of syntactic dependency distance minimization is in conflict with the principle of surprisal minimization (or predictability maximization) in single-head syntactic dependency structures: while the former predicts that the head should be placed at the center of the linear arrangement, the latter predicts that the head should be placed at one of the ends (either first or last). A critical question is when surprisal minimization (or predictability maximization) should surpass syntactic dependency distance minimization. In the context of single-head structures, it has been predicted that this is more likely to happen when two conditions are met, namely (a) fewer words are involved and (b) words are shorter. Here we test the prediction on the noun phrase when it is composed of a demonstrative, a numeral, an adjective and a noun. We find that, across preferred orders in languages, the noun tends to be placed at one of the ends, confirming the theoretical prediction. We also show evidence of anti-locality effects: syntactic dependency distances in preferred orders are longer than expected by chance.
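The conflict described in the abstract can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes a single-head structure in which every non-head word depends directly on the head, measures distance as the difference in word positions, and uses a chance baseline in which the head is equally likely to occupy any position. The function names are hypothetical.

```python
from fractions import Fraction


def star_distance_sum(n, head_pos):
    """Sum of dependency distances for a single-head structure of n words.

    Assumption: the head is at head_pos (1-indexed) and each of the other
    n - 1 words depends directly on it; distance = |position difference|.
    """
    return sum(abs(i - head_pos) for i in range(1, n + 1) if i != head_pos)


def distance_sums(n):
    """Map each possible head position to its sum of dependency distances."""
    return {p: star_distance_sum(n, p) for p in range(1, n + 1)}


def chance_expectation(n):
    """Expected sum of distances if the head position is uniformly random.

    Assumption: this uniform baseline is an illustrative choice, not
    necessarily the baseline used in the paper.
    """
    sums = distance_sums(n)
    return Fraction(sum(sums.values()), n)


# Noun phrase with a demonstrative, a numeral, an adjective and a noun: n = 4.
n = 4
print(distance_sums(n))        # {1: 6, 2: 4, 3: 4, 4: 6}
print(chance_expectation(n))   # 5
```

Under these assumptions, placing the noun at either end gives a sum of 6, above the chance value of 5, while a central placement gives 4. This is consistent with the abstract's observation that noun-at-the-end orders yield dependency distances longer than expected by chance.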
typos corrected; in press in the Journal of Quantitative Linguistics
Surprisal minimization, FOS: Computer and information sciences, Compression, FOS: Physical sciences, Physics and Society (physics.soc-ph), Computation and Language (cs.CL), UPC subject areas::Computer science::Artificial intelligence::Natural language, Word order, Zipf’s law of abbreviation
