Uncertainty and exploration.

Publication: Article, Other literature type, Preprint

Date: 14 Feb 2018

Language: English

Publisher: American Psychological Association (APA)

Journal: Decision, volume 6, pages 277-286 (issn: 2325-9965, eissn: 2325-9973)

Authors: Gershman, Samuel J.

doi: 10.1037/dec0000101, 10.1101/265504

pmid: 33768122

pmc: PMC7989061

Abstract

In order to discover the most rewarding actions, agents must collect information about their environment, potentially foregoing reward. The optimal solution to this “explore-exploit” dilemma is often computationally challenging, but principled algorithmic approximations exist. These approximations utilize uncertainty about action values in different ways. Some random exploration algorithms scale the level of choice stochasticity with the level of uncertainty. Other directed exploration algorithms add a “bonus” to action values with high uncertainty. Random exploration algorithms are sensitive to total uncertainty across actions, whereas directed exploration algorithms are sensitive to relative uncertainty. This paper reports a multi-armed bandit experiment in which total and relative uncertainty were orthogonally manipulated. We found that humans employ both exploration strategies, and that these strategies are independently controlled by different uncertainty computations.
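The distinction between total and relative uncertainty can be illustrated with a small sketch. This is not taken from the paper: it assumes a two-armed bandit with independent Gaussian beliefs about each arm, uses Thompson sampling as a stand-in for random exploration and an upper confidence bound (UCB) rule as a stand-in for directed exploration, and all function and parameter names (`thompson_choice_prob`, `ucb_choice`, `gamma`) are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def thompson_choice_prob(mu1, sd1, mu2, sd2):
    """Probability that Thompson sampling picks arm 1.

    Assumes independent Gaussian beliefs N(mu, sd^2) per arm. The
    difference of two samples is Gaussian, so the choice probability
    depends on the value difference scaled by TOTAL uncertainty.
    """
    total_uncertainty = math.sqrt(sd1 ** 2 + sd2 ** 2)
    return normal_cdf((mu1 - mu2) / total_uncertainty)


def ucb_choice(mu1, sd1, mu2, sd2, gamma=1.0):
    """Arm (1 or 2) picked by a UCB rule with bonus gamma * sd.

    Comparing mu1 + gamma*sd1 against mu2 + gamma*sd2 is the same as
    checking the value difference plus gamma times RELATIVE uncertainty.
    Ties go to arm 2 (an arbitrary choice for this sketch).
    """
    relative_uncertainty = sd1 - sd2
    return 1 if (mu1 - mu2) + gamma * relative_uncertainty > 0 else 2


# Raise total uncertainty while keeping relative uncertainty at zero:
# Thompson sampling becomes more random, UCB is unaffected.
p_low = thompson_choice_prob(1.0, 1.0, 0.0, 1.0)
p_high = thompson_choice_prob(1.0, 4.0, 0.0, 4.0)
ucb_low = ucb_choice(1.0, 1.0, 0.0, 1.0)
ucb_high = ucb_choice(1.0, 4.0, 0.0, 4.0)

# Raise relative uncertainty in favor of the lower-valued arm:
# the UCB bonus flips the choice toward the more uncertain arm.
ucb_equal_sd = ucb_choice(0.0, 1.0, 0.5, 1.0)
ucb_uncertain_arm1 = ucb_choice(0.0, 2.0, 0.5, 1.0)
```

Under these assumptions, `p_high` lies closer to 0.5 than `p_low` while the UCB choice stays the same, and the UCB choice switches from arm 2 to arm 1 once arm 1 becomes the more uncertain option.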

Related Organizations

Harvard University
United States

Impact by BIP!

- selected citations: 98. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 1%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Access routes: Green, hybrid