Powered by OpenAIRE graph
Lirias
Conference object . 2008
Data sources: Lirias
https://doi.org/10.3233/978-1-...
Part of book or chapter of book . 2008 . Peer-reviewed
Data sources: Crossref
mEDRA
Part of book or chapter of book . 2008
Data sources: mEDRA
DBLP
Conference object
Data sources: DBLP

Using Decision Trees as the Answer Networks in Temporal Difference-Networks

Authors: Antanas, Laura; Driessens, Kurt; Croonenborghs, Tom; Ramon, Jan

Abstract

State representation for intelligent agents is a continuing challenge, as the need for abstraction is unavoidable in large state spaces. Predictive representations offer one way to obtain state abstraction by replacing a state with a set of predictions about future interactions with the world. One such formalism is the Temporal-Difference Networks (TD-Nets) framework [2]. It splits the representation of knowledge into a question network and an answer network. The question network defines which questions (interactions) about future experience are of interest. It contains nodes, each corresponding to a single scalar prediction about a future observation given a certain sequence of interactions with the environment. The nodes are connected by links annotated with action labels; each link represents a temporal relationship between the predictions made by the nodes it connects, conditioned on its action label (more details in [2]). The answer network provides the predictive models that update the answers to the defined questions. These answers are the expected values of the scalar quantities in the nodes and can be seen as estimates of probabilities. With each action the agent executes, the predictions are updated using the answer network models to obtain a description of the new state. Classical TD-Nets use logistic regression models whose weight vectors are learned by gradient descent. We propose the use of probability-valued decision trees [1] in the answer network of TD-Nets. We believe that decision trees are a particularly good choice to investigate, as they offer a different yet powerful form of generalization. Moreover, studying them aids a better understanding of the strengths and weaknesses of TD-Nets and represents an important first step towards using TD-Nets in worlds with more extensive observations. Furthermore, decision tree induction can be regarded as a prototypical example of a non-gradient learning approach.
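The classical setting described in the abstract, a question network answered by logistic regression models trained with gradient updates, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-node question network, the feature encoding, the toy world in which the observation is 1 exactly after action "a", the learning rate, and the use of the plain cross-entropy gradient step. The names `QUESTION_NET`, `features`, `predict`, `td_step` and `train` are hypothetical.

```python
import math
import random

ACTIONS = ["a", "b"]

# Hypothetical question network with two prediction nodes.
# Each node is (action label on its link, target kind, target node index).
QUESTION_NET = [
    ("a", "obs", None),  # node 0: next observation, if action "a" is taken
    ("a", "node", 0),    # node 1: node 0's next prediction, if "a" is taken
]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def features(prev_preds, prev_action, obs):
    """Answer-network input: bias, observation, previous predictions, one-hot action."""
    one_hot = [1.0 if prev_action == a else 0.0 for a in ACTIONS]
    return [1.0, float(obs)] + list(prev_preds) + one_hot


def predict(weights, x):
    """One logistic-regression output per question-network node."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def td_step(weights, x_t, action, obs_next, alpha=0.1):
    """Update the weights in place after taking `action` and seeing `obs_next`.

    Returns the feature vector describing the new state.
    """
    y_t = predict(weights, x_t)
    x_next = features(y_t, action, obs_next)
    y_next = predict(weights, x_next)  # targets use the pre-update weights
    for i, (label, kind, j) in enumerate(QUESTION_NET):
        if label != action:
            continue  # the link's action condition is not met: no update
        target = float(obs_next) if kind == "obs" else y_next[j]
        error = target - y_t[i]
        for k, x_k in enumerate(x_t):
            weights[i][k] += alpha * error * x_k  # cross-entropy gradient step
    return x_next


def train(steps=2000, seed=0):
    """Toy world (an assumption): the observation is 1 exactly after action "a"."""
    rng = random.Random(seed)
    n_inputs = 2 + len(QUESTION_NET) + len(ACTIONS)
    weights = [[0.0] * n_inputs for _ in QUESTION_NET]
    x = features([0.5] * len(QUESTION_NET), "b", 0)
    for _ in range(steps):
        action = rng.choice(ACTIONS)
        obs_next = 1 if action == "a" else 0
        x = td_step(weights, x, action, obs_next)
    return weights, predict(weights, x)
```

Calling `train()` returns the learned weights and the current predictions for both nodes. In this sketch the answer network is the pair `predict` and the weight update inside `td_step`; the proposal in the abstract amounts to replacing that logistic model with a probability-valued decision tree, which is not implemented here.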

Country
Belgium
Keywords

Technology, Science & Technology, Computer Science, Artificial Intelligence, temporal-difference networks, probability trees
