Using Decision Trees as the Answer Networks in Temporal Difference-Networks

Publication: Part of book or chapter of book, Conference object, Article. 01 Jan 2008. Belgium. Publisher: IOS Press

Authors: Antanas, Laura; Driessens, Kurt; Croonenborghs, Tom; Ramon, Jan

doi: 10.3233/978-1-58603-891-5-847

Abstract

State representation for intelligent agents is a continuing challenge, as the need for abstraction is unavoidable in large state spaces. Predictive representations offer one way to obtain state abstraction by replacing a state with a set of predictions about future interactions with the world. One such formalism is the Temporal-Difference Networks (TD-Nets) framework [2]. It splits the representation of knowledge into a question network and an answer network. The question network defines which questions (interactions) about future experience are of interest. It contains nodes, each corresponding to a single scalar prediction about a future observation given a certain sequence of interactions with the environment. The nodes are connected by links annotated with action labels; these links represent temporal relationships between the predictions made by the nodes, conditioned on the action labels (more details in [2]). The answer network provides the predictive models that update the answers to the defined questions. These answers are expected values of the scalar quantities in the nodes and can be seen as estimates of probabilities. With each action the agent executes, the predictions are updated using the answer network models to obtain a description of the new state. Classical TD-Nets use logistic regression models whose weight vectors are obtained by gradient-based learning. We propose the use of probability-valued decision trees [1] in the answer network of TD-Nets. We believe that decision trees are a particularly good choice to investigate, as they offer a different yet powerful form of generalization. Moreover, studying them aids a better understanding of the strengths and weaknesses of TD-Nets and represents an important first step towards using TD-Nets in worlds with more extensive observations. Furthermore, decision tree induction can be regarded as a prototypical example of a non-gradient learning approach.
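To make the idea of a probability-valued tree as an answer model more concrete, the following is a minimal Python sketch and not the authors' algorithm. It assumes a fixed, hand-built tree over binary features (for example, "the last action was forward" or "a previous prediction exceeded 0.5") and only shows how a leaf can turn the targets supplied by the question network into a probability estimate by averaging, without any gradient step. All names here (`Leaf`, `Split`, `predict`, `update`, the `conditioned` flag) are hypothetical, and the tree induction itself (choosing the splits) is left out.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    """Leaf that keeps a running mean of the targets routed to it."""
    total: float = 0.0
    count: int = 0

    def predict(self) -> float:
        # Assumption: with no data yet, answer with an uninformative 0.5.
        return self.total / self.count if self.count else 0.5


@dataclass
class Split:
    """Internal node that tests one binary feature of the input vector."""
    feature: int
    if_false: "Node"
    if_true: "Node"


Node = Union[Leaf, Split]


def find_leaf(tree: Node, x: Sequence[int]) -> Leaf:
    """Follow the feature tests from the root down to a leaf."""
    node = tree
    while isinstance(node, Split):
        node = node.if_true if x[node.feature] else node.if_false
    return node


def predict(tree: Node, x: Sequence[int]) -> float:
    """Answer for one question node: the probability estimate at the leaf."""
    return find_leaf(tree, x).predict()


def update(tree: Node, x: Sequence[int], target: float,
           conditioned: bool = True) -> None:
    """Fold one target into the matching leaf.

    `conditioned` is a hypothetical flag: it is False when the action label
    on the question-network link did not match the executed action, in
    which case this sketch simply skips the update.
    """
    if not conditioned:
        return
    leaf = find_leaf(tree, x)
    leaf.total += target
    leaf.count += 1


if __name__ == "__main__":
    # One split on feature 0; each branch ends in its own leaf.
    tree = Split(feature=0, if_false=Leaf(), if_true=Leaf())
    for target in (1.0, 1.0, 1.0, 0.0):
        update(tree, [1], target)
    print(predict(tree, [1]))  # mean of the four targets seen at this leaf
```

In a full system each question node would have its own tree, and the splits would be learned from data rather than fixed in advance; this sketch only illustrates the prediction and leaf-update steps.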

Country

Belgium

Related Organizations

Katholieke Universiteit Leuven (KU Leuven), Belgium

Keywords

Technology; Science & Technology; temporal-difference networks; Computer Science; Computer Science, Artificial Intelligence; probability trees

Impact by BIP!

Selected citations: 0
Popularity: Average
Influence: Average
Impulse: Average