ZENODO
Dataset · 2023
License: CC BY
Data sources: ZENODO

This Research product is the result of merged Research products in OpenAIRE.


AGREE: a New Benchmark for the Evaluation of Semantic Models of Ancient Greek

Authors: Stopponi, Silvia; Peels-Matthey, Saskia; Nissim, Malvina


Abstract

AGREE (Ancient Greek Relatedness Embeddings Evaluation) is a benchmark for the evaluation of semantic models of Ancient Greek, created at the University of Groningen (The Netherlands). More information about it can be found in the following publication: Silvia Stopponi, Saskia Peels-Matthey, Malvina Nissim, AGREE: a new benchmark for the evaluation of distributional semantic models of ancient Greek, Digital Scholarship in the Humanities, Volume 39, Issue 1, April 2024, Pages 373–392, https://doi.org/10.1093/llc/fqad087

1. Overview of the repository

This benchmark was created from a mix of expert judgements about the relatedness of Ancient Greek words and model outputs validated by human experts. The evaluation items are pairs of Ancient Greek lemmas with high semantic relatedness. The human judgements were collected via two questionnaires, which presented two different tasks to the experts. The evaluation items included in the AGREE benchmark are a selection of the most strictly related pairs of lemmas obtained from the two tasks. Here is an overview of the contents of the repository:

1_agree_task1.json includes all the data collected with the first task. The following labels are used:
- 'pair': two Ancient Greek lemmas;
- 'frequency': the number of times the pair was suggested as related by an expert;
- 'POS1': part of speech of the first lemma;
- 'POS2': part of speech of the second lemma;
- 'benchmark': inclusion of the pair in the AGREE benchmark ('yes'/'no').

2_agree_task2.json includes all the data collected with the second task. The following labels are used:
- 'pair': two Ancient Greek lemmas;
- 'origin': provenance of the pair:
  - 'common_pair' = one of the two pairs proposed to all participants in the second task;
  - 'task1' = pair proposed by experts in the first task;
  - 'models_easy_rel' = output of word2vec models, pair considered strictly related;
  - 'models_task1' = pair proposed by experts in the first task and also output by word2vec models;
  - 'models' = output of word2vec language models;
  - 'unrelated' = made-up pair of unrelated lemmas (control pair);
- 'respondents': number of experts who evaluated the pair;
- 'score': average relatedness score given by the experts on a 0-100 scale;
- 'agreement': inter-annotator agreement between all experts who evaluated the block of pairs to which the current pair belongs (available only when the block was presented to more than one participant);
- 'benchmark': inclusion of the pair in the AGREE benchmark ('yes'/'no').

3_agree_final_benchmark.json includes the final selection of items that constitutes AGREE. The following labels are used:
- 'pair': two Ancient Greek lemmas;
- 'origin':
  - 'task1' = pair either proposed more than once in the first task, or proposed only once but scored >= 70 in the second task;
  - 'task2' = pair scored by more than one respondent in the second task, with an average score >= 70.

This updated version of the repository also includes the individual answers to the two questionnaires (see the files 'answers_Task1_postprocessed.xlsx' and 'raw_answers_Task2.xlsx').

2. Acknowledgements

This work was partially supported by the Young Academy Groningen through the PhD scholarship of Silvia Stopponi. We acknowledge the financial support of Anchoring Innovation. Anchoring Innovation is the Gravitation Grant research agenda of the Dutch National Research School in Classical Studies, OIKOS. It is financially supported by the Dutch Ministry of Education, Culture and Science (NWO project number 024.003.012). For more information about the research programme and its results, see the website www.anchoringinnovation.nl.

We want to thank the experts in Ancient Greek around the world who shared their knowledge of Ancient Greek semantics and donated some of their precious time; without them the creation of this benchmark would not have been possible. We also thank the many colleagues from the University of Groningen, the National Research School OIKOS, and other universities abroad who contributed to this work with discussion and advice.

3. Citation

Silvia Stopponi, Saskia Peels-Matthey, Malvina Nissim, AGREE: a new benchmark for the evaluation of distributional semantic models of ancient Greek, Digital Scholarship in the Humanities, Volume 39, Issue 1, April 2024, Pages 373–392, https://doi.org/10.1093/llc/fqad087
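All three JSON files described above share the 'pair' and 'origin' labels, so the benchmark can be grouped by provenance with a few lines of Python. Below is a minimal sketch; it assumes each file deserializes to a list of records carrying the fields listed above (the exact JSON layout is not specified in this description, and the placeholder lemmas are invented for illustration):

```python
import json

# Sample records mimicking the structure described for
# 3_agree_final_benchmark.json. The field names 'pair' and 'origin'
# come from the dataset description; the layout (a JSON array of
# objects) and the lemma placeholders are assumptions.
sample = json.loads("""
[
  {"pair": ["LEMMA_A", "LEMMA_B"], "origin": "task1"},
  {"pair": ["LEMMA_C", "LEMMA_D"], "origin": "task2"}
]
""")

# Group evaluation items by provenance: 'task1' pairs come from the
# expert-elicitation task, 'task2' pairs from the 0-100 rating task.
by_origin = {}
for item in sample:
    by_origin.setdefault(item["origin"], []).append(tuple(item["pair"]))

for origin, pairs in sorted(by_origin.items()):
    print(origin, pairs)
```

To work with the real data, replace the inline sample with `json.load(open("3_agree_final_benchmark.json"))`; the same grouping applies to the task-1 and task-2 files, which add fields such as 'score' and 'benchmark' for filtering.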

Keywords

word embeddings, evaluation, benchmark, human judgements, Ancient Greek, semantics

  • Impact by BIP!: citations 0, popularity Average, influence Average, impulse Average
  • OpenAIRE UsageCounts: 19 views, 12 downloads