ZENODO
Dataset . 2026
License: CC BY
Data sources: ZENODO
ZENODO
Dataset . 2024
License: CC BY
Data sources: ZENODO
ZENODO
Dataset . 2026
License: CC BY
Data sources: Datacite
ZENODO
Dataset . 2024
License: CC BY
Data sources: Datacite
ZENODO
Dataset . 2026
License: CC BY
Data sources: Datacite

Leaderboard Spanish Language Benchmark for Artificial Intelligence Models (TELEIA)

Authors: Mayor-Rocher, Marina; Melero, Nina; Merino-Gómez, Elena; Grandury, María; Conde, Javier; Reviriego, Pedro

Abstract

TELEIA Datasets Leaderboard

This dataset contains the answers of different LLMs to the TELEIA (Spanish Language Benchmark for Artificial Intelligence Models) dataset.

LLMs evaluated:
  • Yi-6B-Chat
  • Meta-Llama-3-8B-Instruct
  • Llama-2-7b-chat-hf
  • gemma-7b-it
  • Mistral-7B-Instruct-v0.1
  • occiglot-7b-es-en-instruct
  • GPT3.5
  • GPT4

Files:
  • TELEIA_Cervantes_AVE_results.xlsx: vocabulary and grammatical structures, following the format of the Cervantes AVE exam
  • TELEIA_PCE_results.xlsx: test on morphology and semantics in the style of the PCE exam, consisting of short questions or sentences to be completed
  • TELEIA_SIELE_results.xlsx: different texts with questions related to them, based on the reading comprehension task of the SIELE exam

Each .xlsx file contains a sheet with the results of each model, with the following columns:
  • question: question from TELEIA
  • option_a: possible answer from TELEIA
  • option_b: possible answer from TELEIA
  • option_c: possible answer from TELEIA
  • option_d: possible answer from TELEIA
  • correct_answer: correct answer from TELEIA
  • llm_question: complete question posed to the LLM
  • tokens_in: list of tokens that make up the question
  • tokens_in_count: number of tokens in the question
  • llm_answer: raw answer from the LLM
  • llm_answer_filtered: answer from the LLM in the format {A,B,C,D}
  • tokens_out: list of tokens that make up the raw answer
  • tokens_out_count: number of tokens in the raw answer
  • word_count: number of words in the raw answer
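The column layout described above suggests a simple way to derive a per-model score: compare llm_answer_filtered with correct_answer row by row. The following is a minimal sketch, not the authors' evaluation code. It assumes that correct_answer holds a letter directly comparable to llm_answer_filtered (after trimming and upper-casing) and that each workbook has one sheet per model; the function names `normalise`, `accuracy`, and `load_sheets` are hypothetical, and the sample rows are made up for illustration.

```python
from typing import Iterable, Mapping


def normalise(value) -> str:
    """Reduce a cell value to a trimmed, upper-case string ('' if missing)."""
    # value != value is True only for NaN, which pandas uses for empty cells.
    if value is None or value != value:
        return ""
    return str(value).strip().upper()


def accuracy(rows: Iterable[Mapping[str, object]]) -> float:
    """Fraction of rows whose llm_answer_filtered matches correct_answer.

    Assumption: both columns hold a letter such as 'A'..'D'.
    Rows with a missing correct_answer count as misses.
    """
    rows = list(rows)
    if not rows:
        return 0.0
    hits = 0
    for row in rows:
        expected = normalise(row.get("correct_answer"))
        given = normalise(row.get("llm_answer_filtered"))
        if expected and expected == given:
            hits += 1
    return hits / len(rows)


def load_sheets(path: str) -> dict:
    """Read every sheet of a results workbook into lists of row dicts.

    Assumption: one sheet per model. Requires pandas and openpyxl, which are
    imported lazily so the scoring code above stays dependency-free.
    """
    import pandas as pd

    sheets = pd.read_excel(path, sheet_name=None)  # dict: sheet name -> DataFrame
    return {name: df.to_dict(orient="records") for name, df in sheets.items()}


# Made-up rows for illustration only; these are not taken from the dataset.
sample = [
    {"correct_answer": "B", "llm_answer_filtered": "B"},
    {"correct_answer": "A", "llm_answer_filtered": "a "},
    {"correct_answer": "D", "llm_answer_filtered": "C"},
    {"correct_answer": "D", "llm_answer_filtered": "D"},
]
print(accuracy(sample))  # 0.75
```

With the files downloaded, something like `{model: accuracy(rows) for model, rows in load_sheets("TELEIA_PCE_results.xlsx").items()}` would give one score per sheet, provided the assumptions above hold for the actual workbooks.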

  • Impact by BIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average