Feature-Discovering Approximate Value Iteration Methods

Publication: Part of book or chapter of book, Conference object, Article, Other literature type | 01 Jan 2005 | United States | Publisher: Springer Berlin Heidelberg

Authors: Wu, Jia-Hong; Givan, Robert

doi: 10.1007/11527862_25


Abstract

Sets of features in Markov decision processes can play a critical role in approximately representing value and in abstracting the state space. Selection of features is crucial to the success of a system and is most often conducted by a human. We study the problem of automatically selecting problem features, and propose and evaluate a simple approach reducing the problem of selecting a new feature to standard classification learning. We learn a classifier that predicts the sign of the Bellman error over a training set of states. By iteratively adding new classifiers as features with this method, training between iterations with approximate value iteration, we find a Tetris feature set that outperforms randomly constructed features significantly, and obtains a score of about three-tenths of the highest score obtained by using a carefully hand-constructed feature set. We also show that features learned with this method outperform those learned with the previous method of Patrascu et al. [4] on the same SysAdmin domain used for evaluation there.
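The feature-selection step described in the abstract (label each training state with the sign of its Bellman error, then learn a classifier and use it as a new feature) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the deterministic model `chain_step`, the toy chain domain, the decision-stump learner `train_stump`, and all function names. The sketch covers a single discovery step only; retraining the weights with approximate value iteration between iterations is omitted.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]            # a state is a tuple of raw numeric attributes
Feature = Callable[[State], float]   # a feature maps a state to a number


def bellman_error(s, value, actions, step, gamma=0.9):
    """Return max_a [r + gamma * V(s')] - V(s) for a deterministic model `step`."""
    backups = []
    for a in actions:
        s_next, r = step(s, a)
        backups.append(r + gamma * value(s_next))
    return max(backups) - value(s)


def train_stump(states: Sequence[State], labels: List[int]) -> Feature:
    """Fit a one-attribute threshold classifier (decision stump) to +1/-1 labels.

    A stump is only a stand-in here; any standard classifier could be used.
    """
    best = None  # (training errors, attribute index, threshold, polarity)
    for i in range(len(states[0])):
        observed = sorted({s[i] for s in states})
        # Candidate thresholds: midpoints between consecutive observed values.
        for t in [(lo + hi) / 2 for lo, hi in zip(observed, observed[1:])]:
            for polarity in (1, -1):
                errors = sum(
                    1 for s, y in zip(states, labels)
                    if (polarity if s[i] > t else -polarity) != y
                )
                if best is None or errors < best[0]:
                    best = (errors, i, t, polarity)
    if best is None:
        # Every attribute is constant on the training set: predict the majority sign.
        majority = 1.0 if sum(labels) >= 0 else -1.0
        return lambda s: majority
    _, best_i, best_t, best_pol = best
    return lambda s: float(best_pol if s[best_i] > best_t else -best_pol)


def discover_feature(train_states, value, actions, step, gamma=0.9) -> Feature:
    """Label states by the sign of their Bellman error and learn a classifier."""
    labels = [
        1 if bellman_error(s, value, actions, step, gamma) > 0 else -1
        for s in train_states
    ]
    return train_stump(train_states, labels)


# Hypothetical toy domain: positions 0..9 on a line, moves of -1 or +1,
# reward +1 for landing on a position >= 5 and -1 otherwise.
def chain_step(s, a):
    pos = min(9, max(0, s[0] + a))
    return (pos,), (1.0 if pos >= 5 else -1.0)


def zero_value(s):
    """Initial value estimate: no features yet, so V(s) = 0 everywhere."""
    return 0.0


train_states = [(p,) for p in range(10)]
new_feature = discover_feature(train_states, zero_value, (-1, 1), chain_step)
```

In this toy domain the Bellman error under the zero value function is negative for positions 0 to 3 and positive for positions 4 to 9, so the learned stump splits at 3.5 and `new_feature` returns -1.0 or +1.0 accordingly. In a full loop, the new feature would be appended to the feature set, the weights refit, and the step repeated.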

Country

United States

Related Organizations

Purdue University West Lafayette, United States
Purdue University System, United States

Keywords

004

Impact by BIP!

Selected citations: 1. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

