llm-ontology-evaluation

The LLM Ontology Evaluation Benchmark is a software package for assessing large language models on ontology-grounded reasoning tasks. It provides structured test cases, ontology-based prompts, and evaluation scripts for reproducible measurement of model performance across semantic, hierarchical, and relational reasoning categories. Supported evaluation scenarios include consistency checking, concept placement, and relation inference, among other ontology-driven tasks. This release contains the full codebase, the test data, and the instructions needed to run the benchmark and compare model outputs across tasks.
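As a rough illustration of how structured test cases might be scored per reasoning category, the Python sketch below computes exact-match accuracy for each category. It is not taken from the benchmark's codebase: the `TestCase` fields, the category labels, the `normalize` helper, and the exact-match criterion are all assumptions made for this example, and the actual evaluation scripts may use a different schema and different metrics.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TestCase:
    # Hypothetical record layout; the benchmark's real schema may differ.
    case_id: str
    category: str  # assumed labels, e.g. "concept_placement"
    prompt: str
    expected: str


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(answer.strip().lower().split())


def accuracy_by_category(cases, outputs):
    """Return exact-match accuracy per category.

    `outputs` maps case_id to the model's raw answer; a missing answer counts as wrong.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for case in cases:
        totals[case.category] += 1
        predicted = outputs.get(case.case_id, "")
        if normalize(predicted) == normalize(case.expected):
            correct[case.category] += 1
    return {category: correct[category] / totals[category] for category in totals}


# Made-up example data, for illustration only.
cases = [
    TestCase("c1", "concept_placement", "Is 'Poodle' a subclass of 'Dog' or 'Cat'?", "Dog"),
    TestCase("c2", "relation_inference", "If A partOf B and B partOf C, is A partOf C?", "yes"),
    TestCase("c3", "relation_inference", "If A subClassOf B, is B subClassOf A?", "no"),
]
outputs = {"c1": "  dog ", "c2": "Yes", "c3": "yes"}

scores = accuracy_by_category(cases, outputs)
print(scores)
```

Reporting one score per category, rather than a single aggregate, keeps weaknesses in one kind of reasoning (for example, relation inference) from being hidden by strong results in another.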
