IEEE Transactions on Neural Networks and Learning Systems
Article . 2024 . Peer-reviewed
License: IEEE Copyright
Data sources: Crossref, DBLP

This Research product is the result of merged Research products in OpenAIRE.

Scalable-MADDPG-Based Cooperative Target Invasion for a Multi-USV System

Authors: Cheng-Cheng Wang; Yu-Long Wang; Peng Shi; Fei Wang

Abstract

This article proposes a scalable deep reinforcement learning (DRL) method that enables a multiple unmanned surface vehicle (multi-USV) system to perform cooperative target invasion. The multi-USV system, which is made up of multiple invaders, must invade target areas within a specified time. A novel scalable reinforcement learning (RL) method called Scalable-MADDPG is proposed for the first time. In this method, the scale of the multi-USV system can be changed at any time without interrupting the training process. To mitigate the policy oscillation that arises after applying Scalable-MADDPG, a bi-directional long short-term memory (Bi-LSTM) network is constructed. Moreover, an improved ϵ-greedy strategy is proposed to help balance exploration and exploitation in RL. To enhance the robustness of the optimal policy, Ornstein-Uhlenbeck (OU) noise is added to this improved ϵ-greedy strategy during training. Finally, the scalable RL method is used to help the multi-USV system perform cooperative target invasion in complex marine environments. The effectiveness of Scalable-MADDPG is demonstrated through three experiments.
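The abstract does not specify the form of the improved ϵ-greedy strategy, so the following is only a generic sketch of the idea it names: an ϵ-greedy rule whose exploration step perturbs a continuous action with OU noise. The linear decay schedule, the parameter values, and the names `OUNoise`, `epsilon_at`, and `select_action` are all assumptions made for illustration and are not taken from the article.

```python
import random


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process (Euler-Maruyama step).

    x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
    Parameter defaults are illustrative assumptions, not the article's values.
    """

    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.dim, self.mu, self.theta = dim, mu, theta
        self.sigma, self.dt = sigma, dt
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Restart the process at its long-run mean.
        self.state = [self.mu] * self.dim

    def sample(self):
        # Mean-reverting drift plus Gaussian diffusion, per dimension.
        self.state = [
            x + self.theta * (self.mu - x) * self.dt
            + self.sigma * (self.dt ** 0.5) * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=1000):
    """Linearly anneal epsilon from eps_start to eps_end (assumed schedule)."""
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return eps_start + frac * (eps_end - eps_start)


def select_action(policy_action, step, noise, rng, low=-1.0, high=1.0):
    """With probability epsilon, add OU noise to the policy's action.

    Otherwise the policy's action is used unchanged (exploitation).
    The result is clipped to the assumed action bounds [low, high].
    """
    action = list(policy_action)
    if rng.random() < epsilon_at(step):
        action = [a + n for a, n in zip(action, noise.sample())]
    return [min(max(a, low), high) for a in action]


if __name__ == "__main__":
    rng = random.Random(0)
    noise = OUNoise(dim=2, seed=0)
    for step in (0, 500, 2000):
        print(step, round(epsilon_at(step), 3),
              select_action([0.3, -0.2], step, noise, rng))
```

Because OU noise is temporally correlated, consecutive exploratory actions drift smoothly instead of jumping independently, which is one common reason it is paired with deterministic-policy methods such as MADDPG.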

Country
Australia
Keywords

reinforcement learning, 4007 Control engineering, mechatronics and robotics, multiple unmanned surface vehicle, Institute for Sustainable Industries and Liveable Cities, long–short-term memory, ϵ-greedy strategy, robustness, scalability

Impact indicators (provided by BIP!)

  • Selected citations: 8. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
  • Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
  • Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
  • Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.