
Large-scale simulations executed on ARCHER2 produce substantial data volumes that must be transferred, archived and managed reliably across national research facilities. Manual data movement introduces operational risk, delays downstream analysis, and increases the likelihood of storage bottlenecks. This work presents a production workflow that automates the end-to-end data movement pipeline from ARCHER2 to the JASMIN data facility and onward to long-term Elastic Tape storage. Implemented using Cylc8, the workflow detects newly generated datasets via external triggers, coordinates secure transfers using Globus, verifies successful archival through the Near-Line Data Store (NLDS), and performs controlled clean-up of intermediate storage. Operational resilience is a primary design objective. The workflow incorporates configurable polling intervals, retry mechanisms to tolerate transient network failures, and structured logging to support traceability and debugging. A modular configuration allows the workflow to be reused across modelling activities and adapted to differing data management requirements. Automating cross-facility data handling reduces time-to-archive, alleviates storage pressure on ARCHER2, and improves the reliability of scientific data preservation. This work highlights the role of workflow automation in supporting sustainable data management practices for Tier-1 supercomputing environments and provides a practical approach for handling the increasing data demands of contemporary simulation workloads.
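The ordering described above (transfer with retries, poll for archival confirmation, then clean up) can be illustrated with a minimal Python sketch. It is not the workflow's actual Cylc configuration. Every name in it (`TransientError`, `retry`, `poll_until`, `move_dataset`) and every default interval is a hypothetical stand-in, and the Globus transfer and NLDS check are passed in as plain callables rather than real client calls.

```python
import time
from typing import Callable


class TransientError(Exception):
    """Hypothetical marker for a failure worth retrying (e.g. a network drop)."""


def retry(step: Callable[[], None], attempts: int, delay_s: float,
          sleep: Callable[[float], None] = time.sleep) -> None:
    """Run `step`, retrying on TransientError up to `attempts` times in total."""
    for attempt in range(1, attempts + 1):
        try:
            step()
            return
        except TransientError:
            if attempt == attempts:
                raise  # retries exhausted: surface the failure
            sleep(delay_s)


def poll_until(check: Callable[[], bool], polls: int, interval_s: float,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `check` up to `polls` times, waiting `interval_s` between calls."""
    for i in range(polls):
        if check():
            return True
        if i < polls - 1:
            sleep(interval_s)
    return False


def move_dataset(transfer: Callable[[], None],
                 is_archived: Callable[[], bool],
                 cleanup: Callable[[], None],
                 *, attempts: int = 3, delay_s: float = 60.0,
                 polls: int = 10, interval_s: float = 300.0,
                 sleep: Callable[[float], None] = time.sleep) -> bool:
    """Transfer, confirm archival, then clean up; return True on success.

    The defaults are illustrative assumptions, not values from the workflow.
    """
    retry(transfer, attempts, delay_s, sleep)
    if not poll_until(is_archived, polls, interval_s, sleep):
        return False  # archival unconfirmed: keep the intermediate copy
    cleanup()
    return True
```

The property the sketch is meant to show is that `cleanup` is reachable only after `is_archived` has returned true, so an unconfirmed archive never leads to deletion of the intermediate copy.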
ARCHER2, Cylc8, Elastic Tape, JASMIN, Celebration of Science, Globus, HPC Infrastructure, Workflow Automation, Research Computing, Data Management
