Doctoral thesis . 2024
Data sources: INRIA2

Multi-agent reinforcement learning for dynamic wind farm control

Authors: Monroc, Claire;

Abstract

This thesis studies the use of multi-agent reinforcement learning (MARL) for wind farm control. Conventional model-based control strategies require tractable models of the complex aerodynamic interactions between wind turbines and suffer from the curse of dimensionality as the number of turbines increases. To bypass these issues, we frame wind farm control as a cooperative MARL problem, which allows model-free optimization. This lets us design two delay-aware MARL algorithms based on an independent learning approach, which handle dynamic wake propagation in wind farms through delayed updates. We propose an imitation and transfer learning approach that exploits optimal policies learned with static models to allow online adaptation to dynamic wind conditions. We show that wind farm control can be framed as a transition-independent decentralized Markov Decision Process, in which the interdependence of the agents' dynamics can be represented by a directed acyclic graph. This allows us to exploit results from stochastic approximation to design a multiscale algorithm: the algorithm uses the location of wind turbines in the field to assign different learning rates, which guarantees convergence of the policies learned locally by each agent.
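As a rough illustration of the multiscale idea in the abstract, the sketch below computes each turbine's depth in a directed acyclic wake graph and derives a depth-dependent step-size schedule. It is not the thesis's algorithm. The function names, the (upstream, downstream) edge convention, the exponent values, and the choice to make deeper (more downstream) turbines learn on a slower timescale are all assumptions made for illustration; the abstract does not say which direction the ordering takes.

```python
from collections import deque


def dag_depths(n_turbines, wake_edges):
    """Longest-path depth of each turbine in a wake DAG.

    wake_edges holds hypothetical (upstream, downstream) index pairs.
    Turbines with no upstream neighbour get depth 0.
    """
    children = {i: [] for i in range(n_turbines)}
    indegree = [0] * n_turbines
    for up, down in wake_edges:
        children[up].append(down)
        indegree[down] += 1

    depth = [0] * n_turbines
    queue = deque(i for i in range(n_turbines) if indegree[i] == 0)
    visited = 0
    # Kahn's topological traversal, tracking the longest path to each node.
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            depth[child] = max(depth[child], depth[node] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    if visited != n_turbines:
        raise ValueError("wake graph contains a cycle")
    return depth


def learning_rate(step, depth, base_exponent=0.6, spacing=0.1):
    """Step size 1 / step**alpha, with alpha increasing with depth.

    Exponents are kept in (0.5, 1], the usual range for decaying
    stochastic-approximation step sizes. ASSUMPTION: deeper turbines
    use a larger exponent and therefore learn on a slower timescale.
    Depths whose exponent hits the cap of 1.0 share one timescale.
    """
    alpha = min(base_exponent + spacing * depth, 1.0)
    return 1.0 / (step ** alpha)


if __name__ == "__main__":
    # Hypothetical 4-turbine layout: turbine 0 wakes 1 and 2, which both wake 3.
    edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
    depths = dag_depths(4, edges)
    for turbine, d in enumerate(depths):
        print(turbine, d, learning_rate(1000, d))
```

With these schedules, the ratio between the step size of a deeper turbine and that of a shallower one tends to zero as the step count grows, which is the separation of timescales that multi-timescale stochastic approximation arguments typically rely on.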

Country
France
Keywords

Multi-agent systems, Reinforcement Learning, Wind Farm, Decentralized Markov Decision Process, [INFO] Computer Science [cs]

  • Impact indicators (provided by BIP!, based on the underlying citation network)
    citations: 0
    popularity (current impact or attention in the research community at large): Average
    influence (overall impact in the research community at large, measured diachronically): Average
    impulse (initial momentum directly after publication): Average