
The ALMAZ Resource Roster is a curated, versioned catalog of Azerbaijani NLP artifacts, including corpora, datasets, pretrained models, evaluation benchmarks, and tools. It serves as the shared reference dataset for the ALMAZ (Advanced Language Model for AZerbaijan) paper series and is intended to support reproducible research in low-resource Azerbaijani language technology.
If you use this roster in your research, please cite it.
Keywords: resource catalog, Azerbaijani NLP, NLP resources, text corpus, language models, Turkic languages, ALMAZ, azerbaijani, LLM pretraining, natural language processing, low-resource languages, data pipeline
