
Abstract: Automated portfolio management via deep reinforcement learning (DRL) has demonstrated competitive risk-adjusted returns in controlled backtesting environments, yet its deployment in regulated financial institutions is impeded by two intertwined deficiencies: the opacity of monolithic neural policies and the inadequacy of existing multi-agent coordination mechanisms for capturing the heterogeneous reasoning processes that underpin professional trading decisions. This paper introduces the Multi-Agent Explainable Trading System (MAETS), a cooperative multi-agent reinforcement learning (MARL) framework comprising four domain-specialized agents: a Fundamental Analysis Agent (FAA), a Technical Analysis Agent (TAA), a Sentiment Analysis Agent (SAA), and a Risk Management Agent (RMA). The agents are coordinated through a Graph Attention Network (GAT)-based centralized critic operating under the Centralized Training with Decentralized Execution (CTDE) paradigm. Each agent's policy is parameterized by a Proximal Policy Optimization (PPO) backbone with multi-head cross-attention over learned inter-agent message embeddings. Post-hoc explainability is provided through a three-stage pipeline: KernelSHAP attribution, counterfactual perturbation, and a FinBERT-conditioned natural language generation module. We also introduce the Fidelity-Completeness-Understandability (FCU) composite metric as a principled measure for evaluating the quality of AI-generated financial explanations. Backtested on a decade of data (2014–2023) from S&P 500, NIFTY 50, and CSI 300 constituents under realistic transaction cost and slippage assumptions, MAETS achieves an annualized return of 31.6% (95% CI: ±0.9%), a Sharpe Ratio of 1.74, a Calmar Ratio of 2.82, and a maximum drawdown of 11.2%, outperforming the strongest DRL baseline by 7.5 Sharpe points and 6.1 percentage points in annualized return. The FCU score of 0.91 represents a 68.5% improvement over post-hoc-attributed single-agent alternatives, with analyst understandability ratings averaging 4.5/5.0. These results establish that trading transparency and financial performance are not competing objectives but can be jointly optimized through principled multi-agent decomposition.

Index Terms: Multi-agent reinforcement learning, explainable artificial intelligence, algorithmic trading, proximal policy optimization, SHAP attribution, cooperative agents, portfolio optimization, centralized training with decentralized execution.
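The performance figures quoted in the abstract (annualized return, Sharpe Ratio, Calmar Ratio, maximum drawdown) can be related to one another with a minimal sketch. It assumes the common textbook definitions, a series of simple per-period returns, and 252 trading periods per year; the abstract does not state which conventions the authors used, so the function names and defaults below are illustrative assumptions, not the paper's evaluation code.

```python
import math


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (positive fraction)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def annualized_return(returns, periods_per_year=252):
    """Geometric (compounded) return scaled to one year. 252 periods/year is an assumption."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth ** (periods_per_year / len(returns)) - 1.0


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized mean excess return over its sample standard deviation.

    Needs at least two returns with non-zero variance.
    """
    rf_per_period = risk_free_rate / periods_per_year
    excess = [r - rf_per_period for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def calmar_ratio(returns, periods_per_year=252):
    """Annualized return divided by maximum drawdown (undefined if there is no drawdown)."""
    return annualized_return(returns, periods_per_year) / max_drawdown(returns)
```

Under this definition of the Calmar Ratio, the reported figures are mutually consistent: 0.316 / 0.112 ≈ 2.82.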
