Stochastic Stability of Perturbed Learning Automata in Positive-Utility Games

Article · Preprint · English · Open Access
Chasparis, Georgios C. (2017)

This paper considers a class of reinforcement-based learning dynamics (namely, perturbed learning automata) and provides a stochastic-stability analysis in repeatedly played, positive-utility, finite strategic-form games. Prior work on this class of learning dynamics primarily analyzes asymptotic convergence through stochastic approximations, where convergence can be associated with the limit points of an ordinary differential equation (ODE). However, analyzing global convergence through an ODE approximation requires the existence of a Lyapunov or potential function, which naturally restricts the analysis to a narrow class of games. To overcome these limitations, this paper introduces an alternative framework for analyzing asymptotic convergence based upon an explicit characterization of the invariant probability measure of the induced Markov chain. We further provide a methodology for computing the invariant probability measure in positive-utility games, together with an illustration in the context of coordination games.
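To make the setting concrete, the following is a minimal sketch (not the paper's exact scheme) of a perturbed learning automaton with a constant step size in a 2x2 positive-utility coordination game: each player reinforces its mixed strategy toward its chosen action in proportion to the received utility, and with a small perturbation probability it experiments uniformly at random. All parameter names and values here are illustrative assumptions.

```python
import random

# Illustrative parameters (hypothetical, not taken from the paper)
STEP = 0.05      # constant step size (epsilon)
PERTURB = 0.01   # perturbation probability (lambda)

def utility(a1, a2):
    # positive-utility coordination game: matching pays 1.0, mismatching 0.1
    return 1.0 if a1 == a2 else 0.1

def choose(x, lam):
    # with probability lam, experiment uniformly; otherwise sample from strategy x
    if random.random() < lam:
        return random.randrange(len(x))
    r, acc = random.random(), 0.0
    for a, p in enumerate(x):
        acc += p
        if r < acc:
            return a
    return len(x) - 1

def update(x, a, u, eps):
    # reinforcement update: move the strategy toward the chosen action,
    # scaled by the (positive) utility received
    return [p + eps * u * ((1.0 if i == a else 0.0) - p)
            for i, p in enumerate(x)]

random.seed(0)
x1, x2 = [0.5, 0.5], [0.5, 0.5]
for _ in range(5000):
    a1, a2 = choose(x1, PERTURB), choose(x2, PERTURB)
    u = utility(a1, a2)
    x1, x2 = update(x1, a1, u, STEP), update(x2, a2, u, STEP)

# over many plays the strategy pair concentrates near one of the
# pure coordination equilibria, while the perturbation keeps the
# induced Markov chain irreducible
print(max(x1), max(x2))
```

Because the perturbation keeps every joint action reachable, the induced Markov chain over strategy pairs admits a unique invariant probability measure; the paper's stochastic-stability analysis characterizes where that measure concentrates as the perturbation vanishes.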