Name: Learning Logic Specifications for Soft Policy Guidance in POMCP
Keywords: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Logic in Computer Science, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Partially Observable Markov Decision Processes, Planning Under Uncertainty, Inductive Logic Programming, Answer Set Programming, Explainable AI, Machine Learning (cs.LG), Logic in Computer Science (cs.LO)

Publication type: Article, Preprint, Conference object. Date: 01 Jan 2023. Embargo end date: 01 Jan 2023. Publisher: arXiv

Authors: Giulio Mazzi; Daniele Meli; Alberto Castellini; Alessandro Farinelli

doi: 10.48550/arxiv.2303.09172

arXiv: 2303.09172

handle: 11562/1095998


Abstract

Partially Observable Monte Carlo Planning (POMCP) is an efficient solver for Partially Observable Markov Decision Processes (POMDPs). It scales to large state spaces by computing an approximation of the optimal policy locally and online, using a strategy based on Monte Carlo Tree Search. However, POMCP suffers from sparse reward functions, i.e., rewards obtained only when the final goal is reached, particularly in environments with large state spaces and long horizons. Recently, logic specifications have been integrated into POMCP to guide exploration and to satisfy safety requirements. However, such policy-related rules must be defined manually by domain experts, especially in real-world scenarios. In this paper, we use inductive logic programming to learn logic specifications from traces of POMCP executions, i.e., sets of belief-action pairs generated by the planner. Specifically, we learn rules expressed in the paradigm of answer set programming. We then integrate them into POMCP to provide a soft policy bias toward promising actions. In two benchmark scenarios, rocksample and battery, we show that integrating rules learned from small task instances can improve performance both with fewer Monte Carlo simulations and in larger task instances. We make our modified version of POMCP publicly available at https://github.com/GiuMaz/pomcp_clingo.git.
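As an illustration of what a soft policy bias can look like, the following is a minimal sketch and not the authors' implementation. It assumes a belief represented as a list of state particles and replaces answer set programming rules with plain Python predicates. All names (`rule_support`, `soft_action_weights`, `sample_action`, `threshold`, `bonus`) and the weighting scheme are hypothetical and chosen only for this example.

```python
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical representations, assumed for illustration only.
State = Dict[str, float]        # one belief particle
Rule = Callable[[State], bool]  # stand-in for the body of a learned rule


def rule_support(belief: Sequence[State], rule: Rule) -> float:
    """Return the fraction of belief particles in which the rule holds."""
    if not belief:
        return 0.0
    return sum(1 for state in belief if rule(state)) / len(belief)


def soft_action_weights(
    belief: Sequence[State],
    actions: Sequence[str],
    rules: Dict[str, Rule],
    threshold: float = 0.5,
    bonus: float = 4.0,
) -> List[float]:
    """Give every action weight 1.0 and add `bonus` to rule-suggested ones.

    No action is ever excluded, so the bias is soft: suggested actions are
    only more likely to be sampled.
    """
    weights = []
    for action in actions:
        rule = rules.get(action)
        suggested = rule is not None and rule_support(belief, rule) >= threshold
        weights.append(1.0 + (bonus if suggested else 0.0))
    return weights


def sample_action(
    belief: Sequence[State],
    actions: Sequence[str],
    rules: Dict[str, Rule],
    rng: random.Random,
) -> str:
    """Sample one action according to the soft weights."""
    weights = soft_action_weights(belief, actions, rules)
    return rng.choices(list(actions), weights=weights, k=1)[0]


# Toy usage: a made-up rule suggesting "sample" when the rock is close.
belief = [{"dist": 1.0}, {"dist": 1.0}, {"dist": 3.0}]
actions = ["north", "sample"]
rules = {"sample": lambda s: s["dist"] <= 1.0}
chosen = sample_action(belief, actions, rules, random.Random(0))
```

Because unsuggested actions keep a nonzero weight, a rule that generalizes poorly to a new instance can still be overridden by the search, which is the usual motivation for a soft rather than a hard constraint.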

To appear in the Proceedings of the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2023

Related Organizations

University of Verona
Italy


Related research

pomcp_clingo software on GitHub (relation: IsRelatedTo)

Impact by BIP!

selected citations: 0
popularity: Average
influence: Average
impulse: Average
