handle: 2117/364236
The field of Reinforcement Learning (RL) has received much attention in recent years as a new paradigm for solving complex problems. However, one of the main issues with the current state of the art is its computational cost. Compared with other paradigms such as supervised learning, RL requires constant interaction with the environment, which is both expensive and hard to parallelize. In this work we explore a more scalable alternative to conventional RL through the use of Evolution Strategies (ES). ES iteratively modifies the current solution by adding Gaussian noise to it, evaluates these modifications, and uses their scores to guide the improvement of the solution. The advantage of ES lies in the fact that creating and evaluating these modifications can be parallelized. After introducing the network routing scenario, we used it to compare how ES performed against PPO, an RL policy gradient method. Ultimately, ES took advantage of an increasing number of workers to eventually overtake PPO, training faster while also generating better results overall. However, it was also clear that for this to occur ES must have access to a considerable amount of hardware resources, and is hence viable only within high performance computing environments.
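The ES loop described above (perturb the solution with Gaussian noise, score each perturbation, combine the scores into an update) can be sketched as follows. This is a minimal illustrative implementation, not the thesis code; the hyperparameter values and the toy fitness function are assumptions chosen for the example.

```python
import numpy as np

def evolution_strategies(f, theta, sigma=0.1, alpha=0.03,
                         population=50, iterations=200, seed=0):
    """Basic ES loop: perturb, evaluate, update.

    f: fitness function to maximize (higher is better).
    theta: initial parameter vector.
    Hyperparameter values are illustrative only.
    """
    rng = np.random.default_rng(seed)
    for _ in range(iterations):
        # Sample Gaussian perturbations of the current solution.
        eps = rng.standard_normal((population, theta.size))
        # Evaluate each perturbed candidate. This loop is the part
        # that ES can distribute across many parallel workers.
        scores = np.array([f(theta + sigma * e) for e in eps])
        # Normalize scores, then move theta toward the perturbations
        # that scored better than average.
        scores = (scores - scores.mean()) / (scores.std() + 1e-8)
        theta = theta + alpha / (population * sigma) * eps.T @ scores
    return theta

# Toy fitness: maximize the negative squared distance to a target.
target = np.array([1.0, -2.0, 0.5])
fitness = lambda x: -np.sum((x - target) ** 2)
theta = evolution_strategies(fitness, np.zeros(3))
```

Note that only the scalar scores need to be communicated between workers, which is what makes the evaluation step cheap to parallelize compared with the gradient exchange of a distributed policy gradient method such as PPO.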
Network Routing, Deep Reinforcement Learning, Reinforcement Learning, Evolution Strategies, Evolutionary Computation, Neural Networks (Computer Science), Graph Neural Networks, Message Passing Neural Networks, High Performance Computing, Àrees temàtiques de la UPC::Informàtica::Enginyeria del software
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | Citations derived from selected sources; an alternative to the "Influence" indicator, which reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 32 |
| downloads | | 64 |

Views provided by UsageCounts
Downloads provided by UsageCounts