Concurrent bandits and cognitive radio networks

Preprint English OPEN
Avner, Orly; Mannor, Shie
(2014)
  • Subject: Computer Science - Multiagent Systems | Computer Science - Learning

We consider the problem of multiple users targeting the arms of a single multi-armed stochastic bandit. The motivation for this problem comes from cognitive radio networks, where selfish users need to coexist without any side communication between them, implicit coopera...