
ArsipKG is a resource package for research on automated knowledge graph (KG) population from Indonesian government regulatory archives. It contains six components: (1) ArsipDataset, a curated corpus of 614 archival regulations issued by 133 institutions between 1961 and 2025; (2) ArsipOnto, a formal OWL 2 DL ontology aligned with LKIF, Dublin Core Terms, SKOS, and FOAF; (3) ArsipKG-Auto, a populated knowledge graph with 1,211 individuals and 1,911 axioms; (4) ArsipQA-v1, a benchmark of 90 question-answer pairs covering 7 question types; (5) a validation dataset of 26 documents, selected by stratified sampling and independently annotated by two domain experts (Cohen's kappa = 0.9504); and (6) the full implementation of the experimental pipeline, including the supervised baselines (rule-based, TF-IDF+SVM, and fine-tuned IndoBERT) and the four-stage LLM-driven pipeline.
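
For readers unfamiliar with the agreement statistic reported for the validation dataset, the following is a minimal sketch of how Cohen's kappa is computed for two annotators. The function name `cohens_kappa` and the example labels are hypothetical illustrations; they are not taken from the ArsipKG data or its pipeline code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over labels of the product of marginal rates.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    all_labels = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[c] * counts_b[c] for c in all_labels) / (n * n)

    if p_e == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for six items (NOT drawn from the ArsipKG annotations).
annotator_1 = ["ORG", "DATE", "ORG", "LAW", "DATE", "LAW"]
annotator_2 = ["ORG", "DATE", "ORG", "LAW", "LAW", "LAW"]
kappa = cohens_kappa(annotator_1, annotator_2)  # approximately 0.75
```

In this toy example the annotators agree on 5 of 6 items (p_o = 5/6) and chance agreement is 1/3, so kappa is about 0.75. Values above roughly 0.8 are conventionally read as very strong agreement.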
