Maximum Entropy based Independent Learning in Anonymous Multi-Agent Settings

Preprint English OPEN
Verma, Tanvi; Varakantham, Pradeep; Lau, Hoong Chuin;
(2018)
  • Subject: Statistics - Machine Learning | Computer Science - Machine Learning | Computer Science - Artificial Intelligence

With the advent of sequential matching (of supply and demand) systems (Uber, Lyft, Grab for taxis; UberEats, Deliveroo, etc. for food; Amazon Prime, Lazada, etc. for groceries) across many online and offline services, individuals (taxi drivers, delivery boys, delivery van...