
Reinforcement Learning (RL) has emerged as a powerful paradigm for training intelligent agents in sequential decision-making. As RL architectures grow in complexity, making informed decisions about training strategies and their consequences for the software architecture becomes increasingly difficult. This work addresses this challenge by presenting the outcomes of a qualitative, in-depth study of best practices and patterns in training strategies for RL architectures, as articulated by practitioners. Using a model-based qualitative research method, we analyze 33 knowledge sources to identify established industrial practices, patterns, relationships, and decision drivers, with the aim of improving the understanding of practitioners' approaches to RL architecture and bridging the gap between scientific insights and practical implementation. Based on this knowledge, we introduce a formal Architectural Design Decision (ADD) model comprising 6 decisions, 29 decision options, and 19 decision drivers, which provides decision-making support for this critical facet of RL-based software architectures.
Machine Learning, Software Architecture, Grounded Theory, Reinforcement Learning, Design Decisions, 102019 Machine Learning, 102001 Artificial Intelligence
