
Approximate Dynamic Programming with Parallel Stochastic Planning Operators

Child, C. H. T.
Open Access English
  • Published: 12 Sep 2012
  • Publisher: City University London
  • Country: United Kingdom
Abstract
This report presents an approximate dynamic programming (ADP) technique for environment modelling agents. The agent learns a set of parallel stochastic planning operators (P-SPOs) by evaluating changes in its environment in response to actions, using an association rule mining approach. An approximate policy is then derived by iteratively improving state value aggregation estimates attached to the operators using the P-SPOs as a model in a Dyna-Q-like architecture. Reinforcement learning and dynamic programming are powerful techniques for automated agent decision making in stochastic environments. Dynamic programming is effective when there is a known environmen...
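The abstract describes a Dyna-Q-like architecture: the agent learns a model of its environment from experience, then interleaves direct reinforcement-learning updates with simulated "planning" updates drawn from that model. The sketch below illustrates this loop on a hypothetical 5-state chain task; it is a minimal illustration only, using a simple tabular model in place of the report's parallel stochastic planning operators (P-SPOs), whose learning procedure is not given in the abstract.

```python
import random

# Hypothetical toy task: a 5-state chain where state 4 is the goal.
# This stands in for the stochastic environments the report targets.
N_STATES = 5          # states 0..4
ACTIONS = [-1, +1]    # move left / move right

def step(s, a):
    """Chain dynamics: reward 1.0 on reaching the goal state."""
    s2 = max(0, min(N_STATES - 1, s + a))
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

def dyna_q(episodes=50, planning_steps=10, alpha=0.5, gamma=0.9,
           eps=0.1, seed=0):
    """Dyna-Q: Q-learning plus planning sweeps over a learned model."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    model = {}  # learned model: (s, a) -> (s', r)
    for _ in range(episodes):
        s = 0
        while s != N_STATES - 1:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[(s, x)])
            s2, r = step(s, a)
            # direct RL update from real experience
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, x)] for x in ACTIONS)
                                  - Q[(s, a)])
            # model learning (the report learns P-SPOs here instead)
            model[(s, a)] = (s2, r)
            # planning: extra value updates from simulated experience
            for _ in range(planning_steps):
                ps, pa = rng.choice(list(model))
                ps2, pr = model[(ps, pa)]
                Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, x)]
                                        for x in ACTIONS) - Q[(ps, pa)])
            s = s2
    return Q
```

After training, the greedy policy derived from `Q` moves right from every non-goal state; the planning sweeps let the value estimates propagate far faster than direct experience alone, which is the advantage a learned operator model buys.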
Subjects
free text keywords: T1