Global linguistic diversity

Linguistic diversity is under threat, with projections suggesting that over 20% of languages risk disappearing by the turn of the century. While ecology has developed statistical techniques to quantify diversity, linguists often resort to simple language counts. To extend the methods available to researchers, this thesis adapts the Leinster-Cobbold framework from ecology to create a unified notion of diversity on a global scale, taking the richness, relative abundance, and similarity of languages into account. Large-scale cross-linguistic databases (PHOIBLE, Grambank, ASJP) were assessed for their suitability in modeling global linguistic diversity, using speaker data from Ethnologue and the Joshua Project as a baseline. Measures of linguistic similarity were derived and analyzed through cluster and correlation analysis. The results indicate that only the ASJP database offers sufficient coverage to derive similarity measures on a global scale. While lexical and morphosyntactic similarity measures both display strong phylogenetic classificatory power, correlations between the different similarity measures are weak, so no single metric can serve as a reliable proxy for the others. Applying the Leinster-Cobbold framework in a global context using lexical similarity demonstrates that accounting for similarity has a substantial effect in regions with dialect continua but only a modest effect globally. The study concludes that incorporating similarity is essential to accurately model linguistic diversity.

Linguistic diversity is under threat; by the end of this century, 20% of all languages could be extinct. While statistical methods for measuring biodiversity are well established in ecology, linguists often rely on simple language counts. This thesis therefore adapts the Leinster-Cobbold framework from ecology to create a unified notion of diversity that accounts for the total number, relative abundance, and similarity of languages. Large cross-linguistic databases (PHOIBLE, Grambank, ASJP) were examined, using speaker numbers from Ethnologue and the Joshua Project as a reference, and assessed for their suitability for modeling global linguistic diversity. Measures of linguistic similarity were derived and evaluated by means of cluster and correlation analyses. The results show that only the ASJP database offers sufficient global coverage to derive similarity measures. While the measures reliably model linguistic similarities, they correlate only weakly with one another, so no single measure can serve as a reliable proxy for the others. Applying the Leinster-Cobbold framework shows that accounting for similarities has a considerable effect in countries with dialect continua, while the global effect remains small. The thesis concludes that linguistic similarities must be taken into account in order to model linguistic diversity.
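
The idea of combining richness, relative abundance, and similarity in one number can be illustrated with a minimal sketch of the similarity-sensitive diversity of order q as it is usually stated for the Leinster-Cobbold framework: with relative abundances p and a similarity matrix Z, the diversity is (sum_i p_i (Zp)_i^(q-1))^(1/(1-q)), with the usual limits at q = 1 and q = infinity. This is not the thesis's own implementation. The function name, the speaker counts, and the similarity values below are hypothetical and chosen only for illustration; the matrix is assumed to have ones on its diagonal and entries between 0 and 1.

```python
import math


def similarity_sensitive_diversity(abundances, similarity, q=1.0):
    """Similarity-sensitive diversity of order q (illustrative sketch).

    abundances: non-negative weights, e.g. speaker counts per language.
    similarity: square matrix (list of lists), similarity[i][i] == 1 assumed.
    q: sensitivity to abundance; q = 0 weights rare and common languages alike.
    """
    total = sum(abundances)
    p = [a / total for a in abundances]  # relative abundances
    n = len(p)

    # "Ordinariness" of each language: expected similarity between that
    # language and a randomly chosen speaker's language, i.e. (Zp)_i.
    zp = [sum(similarity[i][j] * p[j] for j in range(n)) for i in range(n)]

    # Languages with zero abundance do not contribute.
    pairs = [(pi, zi) for pi, zi in zip(p, zp) if pi > 0]

    if math.isinf(q):
        return 1.0 / max(zi for _, zi in pairs)
    if math.isclose(q, 1.0):
        # Limit q -> 1: exponential of a similarity-aware Shannon entropy.
        return math.exp(-sum(pi * math.log(zi) for pi, zi in pairs))
    return sum(pi * zi ** (q - 1.0) for pi, zi in pairs) ** (1.0 / (1.0 - q))


# Hypothetical example: three equally large languages, two of which are
# closely related varieties (similarity 0.9), the third unrelated.
speakers = [100, 100, 100]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
with_similarity = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.0], [0.0, 0.0, 1.0]]

naive = similarity_sensitive_diversity(speakers, identity, q=0)
adjusted = similarity_sensitive_diversity(speakers, with_similarity, q=0)
print(naive, adjusted)
```

With the identity matrix every language is treated as completely distinct, so the order-0 value reduces to a plain language count. With the second matrix the two related varieties count as little more than one language, which is the kind of effect the abstract reports for regions with dialect continua.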
