Powered by OpenAIRE graph
Software
Data sources: ZENODO

llm-ontology-evaluation

Authors: Konys, Agnieszka

Abstract

LLM Ontology Evaluation Benchmark is a software package for assessing large language models on ontology‑grounded reasoning tasks. It provides structured test cases, ontology‑based prompts, and evaluation scripts that enable reproducible measurement of model performance across semantic, hierarchical, and relational reasoning categories. The benchmark supports consistency checking, concept placement, relation inference, and other ontology‑driven evaluation scenarios. This release contains the full codebase, test data, and instructions required to run the benchmark and compare model outputs across tasks.
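
As a rough illustration of what per-category scoring over structured test cases could look like, here is a minimal Python sketch. It does not reproduce the package's own code: the `OntologyCase` schema, the `accuracy_by_category` function, the exact-match scoring rule, and the sample cases are all hypothetical and chosen only for illustration. The released codebase may define its test cases and metrics differently.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyCase:
    # Hypothetical schema; the benchmark's real test-case format may differ.
    case_id: str
    category: str  # e.g. "semantic", "hierarchical", "relational"
    prompt: str
    expected: str


def normalize(answer: str) -> str:
    # Lowercase and collapse whitespace so trivial formatting differences
    # do not count as errors.
    return " ".join(answer.strip().lower().split())


def accuracy_by_category(cases, outputs):
    # Return exact-match accuracy per reasoning category.
    # `outputs` maps case_id -> model answer; a missing answer counts as wrong.
    correct = defaultdict(int)
    total = defaultdict(int)
    for case in cases:
        total[case.category] += 1
        if normalize(outputs.get(case.case_id, "")) == normalize(case.expected):
            correct[case.category] += 1
    return {category: correct[category] / total[category] for category in total}


# Illustrative cases only; these are not taken from the benchmark's test data.
cases = [
    OntologyCase("h1", "hierarchical", "Is Dog a subclass of Mammal?", "yes"),
    OntologyCase("h2", "hierarchical", "Is Mammal a subclass of Dog?", "no"),
    OntologyCase("r1", "relational", "Which property links Author to Book?", "wrote"),
]
outputs = {"h1": "Yes", "h2": "yes", "r1": " wrote "}
scores = accuracy_by_category(cases, outputs)
print(scores)
```

Reporting scores per category rather than as a single aggregate keeps hierarchical and relational results separately visible, which matches the category-wise comparison the abstract describes.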
