A Microservice-based Architecture for Reproducible AI Pipelines

As Artificial Intelligence (AI) systems move from prototypes to deployment, reliability and trustworthiness are increasingly limited by the data and AI pipeline lifecycle rather than by model training alone. Existing platforms offer strong support for individual lifecycle functions such as orchestration, experiment tracking, validation, or documentation, yet these capabilities are often adopted as separate tools, making provenance, observability, governance, and safe adaptation difficult to manage end-to-end. This paper presents the microservice-based reference architecture of the AI-DAPT project for reproducible AI pipelines. Its core contribution is the unification of data- and model-centric operations within a single lifecycle-oriented platform. The architecture is organized around five iterative phases, namely data design, data sculpting, data generation, model delivery, and data/model optimization, and it treats metadata, lineage, explainability, security, and human oversight as native platform services rather than peripheral add-ons. The paper further contributes a trustworthiness-by-design architectural view that connects data preparation, valuation, synthetic data generation, orchestration, monitoring, adaptive retraining, and AI security through interoperable services. Finally, it presents the current realization of this architecture using widely adopted orchestration, storage, monitoring, and experimentation technologies, thereby offering a practical blueprint for building adaptable, observable, and governable AI pipelines in real-world settings.
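To make the idea of lineage as a native service concrete, the following minimal Python sketch runs one iteration over the five phases named above and records a content hash before and after each phase. It is an illustration under stated assumptions, not the AI-DAPT implementation: the names `Phase`, `LineageRecord`, `Pipeline`, and `fingerprint` are hypothetical, artifacts are assumed to be JSON-serializable, and the real platform distributes these responsibilities across separate microservices.

```python
import hashlib
import json
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    # The five iterative phases named in the abstract, in order.
    DATA_DESIGN = "data design"
    DATA_SCULPTING = "data sculpting"
    DATA_GENERATION = "data generation"
    MODEL_DELIVERY = "model delivery"
    OPTIMIZATION = "data/model optimization"


@dataclass
class LineageRecord:
    # Hypothetical lineage entry: which phase ran, on what, producing what.
    phase: Phase
    input_hash: str
    output_hash: str


def fingerprint(artifact) -> str:
    """Deterministic content hash of a JSON-serializable artifact."""
    payload = json.dumps(artifact, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class Pipeline:
    # Hypothetical mapping: Phase -> callable(artifact) -> new artifact.
    steps: dict
    lineage: list = field(default_factory=list)

    def run_iteration(self, artifact):
        """Run every phase once, in order, recording lineage for each."""
        for phase in Phase:
            # Phases without a registered step pass the artifact through.
            step = self.steps.get(phase, lambda a: a)
            input_hash = fingerprint(artifact)  # hash before the step runs
            artifact = step(artifact)
            self.lineage.append(
                LineageRecord(phase, input_hash, fingerprint(artifact))
            )
        return artifact
```

Because each record's output hash equals the next record's input hash, the lineage list forms a chain that can be checked after the fact, and rerunning the same deterministic steps on the same input reproduces the same hashes. For example, `Pipeline(steps={Phase.DATA_SCULPTING: lambda a: {**a, "clean": True}}).run_iteration({"rows": 3})` yields five lineage records, of which only the data sculpting record has differing input and output hashes.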

Funded by: EC | AI-DAPT