A consolidated lexical dataset for Dogon languages

This is a staged release of a consolidated lexical dataset for Dogon languages. It brings together heterogeneous source layers: RefLex-derived data, Dogon and Bangime Linguistics materials, CLDF/LexiBank-derived working files, and subsequent BANG project curation. The workflow comprises transcription standardization, source and language-name normalization, staged merging, Concepticon and part-of-speech enrichment, manual revision, source-village and GPS verification, doculect construction, and Glottolog alignment. This release should be treated as provisional. Three issues remain open: resolving duplicates, auditing language-level attribution, and verifying that earlier manual curation of verbal paradigms has been incorporated. Full source-level attribution and contributor roles are documented in ATTRIBUTION.md.

Funded by

EC | BANG