NUML journal of critical inquiry
Article . 2025 . Peer-reviewed
Data sources: Crossref, DOAJ

The Development of Nominal Synsets for the Saraiki Language: A Corpus-based Analysis

Authors: Madya Asgher; Musarrat Azher;

Abstract

This paper develops nominal synsets for the Saraiki language (SL), a lesser-studied language spoken in Pakistan. Nominal synsets are groups of nouns that share semantic characteristics; they are crucial for natural language processing (NLP) tasks such as information retrieval, machine translation, and text classification. The research aims to create Saraiki Nominal Synsets (SNS) using the Gurumukhi Punjabi WordNet. The study employs a hybrid approach that combines merge and expansion techniques, and it gathers data from PDF textbooks, online sources, and the Saraiki Wikimedia Incubator. The collected data is limited to texts published between 2000 and 2019 and is tagged manually, using the AntConc 3.4.4.0 wordlist, because no tagger is available for Saraiki. The study builds a 2.2-million-word Saraiki corpus and a list of 750 nouns, then categorizes and semantically organizes the SNS based on that list. A corpus-based approach is used to identify and classify nouns in SL by their semantic properties, and the synsets are constructed using a combination of manual and automatic methods. The quality of the synsets is evaluated by comparing them to existing lexical resources and by conducting a semantic similarity analysis. The results demonstrate that the approach captures semantic relations among nouns in SL and produces synsets useful for various NLP applications. Overall, this study contributes to the development of linguistic resources for lesser-studied languages and supports researchers and developers working on NLP tasks involving SL.
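
The abstract names three steps without showing them: a corpus wordlist, an expansion step from an existing WordNet, and a merge step with manually built synsets. The following is a minimal Python sketch of how such a pipeline might look, not the authors' implementation. All function names, synset IDs, and tokens (`src_*`, `tgt_*`) are hypothetical placeholders rather than real Saraiki or Punjabi data, and the regex tokenizer is a naive stand-in that would need script-aware handling for real Saraiki text.

```python
import re
from collections import Counter


def build_wordlist(texts):
    """Return (token, frequency) pairs, most frequent first (naive tokenizer)."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"\w+", text.lower()))
    return counts.most_common()


def expand_synsets(source_synsets, lexicon):
    """Expansion: project source-WordNet synsets onto target nouns.

    `lexicon` maps a source-language noun to its target-language equivalents
    (a hypothetical bilingual noun list). Synsets with no translation are dropped.
    """
    projected = {}
    for synset_id, members in source_synsets.items():
        translated = sorted({t for m in members for t in lexicon.get(m, [])})
        if translated:
            projected[synset_id] = translated
    return projected


def merge_synsets(projected, manual):
    """Merge: union manually built target synsets into the projected ones."""
    merged = {sid: set(members) for sid, members in projected.items()}
    for synset_id, members in manual.items():
        merged.setdefault(synset_id, set()).update(members)
    return {sid: sorted(members) for sid, members in merged.items()}


# Toy inputs; every value below is an invented placeholder.
texts = ["tgt_river tgt_water tgt_river", "tgt_water tgt_stream tgt_river"]
source_synsets = {
    "n001": ["src_river", "src_stream"],
    "n002": ["src_water"],
    "n003": ["src_mountain"],  # no lexicon entry, so it is not projected
}
lexicon = {
    "src_river": ["tgt_river"],
    "src_stream": ["tgt_stream"],
    "src_water": ["tgt_water"],
}
manual = {"n002": ["tgt_water_alt"], "n900": ["tgt_local_noun"]}

wordlist = build_wordlist(texts)
projected = expand_synsets(source_synsets, lexicon)
merged = merge_synsets(projected, manual)

print(wordlist)
print(merged)
```

In this sketch the manually built synset `n900` stands for a concept with no counterpart in the source WordNet, which is the case a pure expansion approach would miss and a hybrid approach is meant to cover.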

Keywords

Language. Linguistic theory. Comparative grammar (P101-410); Computational linguistics. Natural language processing (P98-98.5); Nouns Categorization; Saraiki language; Saraiki Nominal Synsets; AntConc; NLP; Corpus; WordNet; Lexical Relations

  • Impact by BIP!
    selected citations: 0
    Derived from selected sources; an alternative to the "Influence" indicator.
    popularity: Average
    Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average
    Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average
    Reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
Open access route: gold