ZENODO
Conference object . 2025
License: CC BY
Data sources: ZENODO
ZENODO
Article . 2025
License: CC BY
Data sources: Datacite

EXPLORE: A Scalable Infrastructure for LHC Open Data Analysis and FAIR Data provisioning

Authors: Achkar, Baida; Giffels, Manuel; Quadt, Arnulf; Wozniewski, Sebastian;

Abstract

EXPLORE is a research data infrastructure deployed at the GoeGrid Compute Resource Center, University of Göttingen, as part of the PUNCH4NFDI project [1], funded by the German Research Foundation (DFG). The project aims to establish FAIR (Findable, Accessible, Interoperable, and Reusable) data management solutions and to provide dynamically allocated compute resources for physics communities.

EXPLORE integrates up to 200 CPU cores on an HTCondor Overlay Batch System (OBS) [2]. A dedicated login node hosts both the Central Manager and the Submitter, coordinating job submissions. Compute resources are provided dynamically through virtual worker nodes with 8 CPU cores each, enabling flexible and efficient execution. To optimize resource allocation in real time, EXPLORE uses COBalD [3]/TARDIS [4], a modular provisioning framework. COBalD acts as a flexible decision-making layer for managing heterogeneous resources, while TARDIS scales resources in real time by launching or terminating worker nodes according to current demand. This ensures efficient utilization under varying loads.

The system employs containerized environments via CVMFS [5] and Apptainer, providing users with pre-configured operating systems and software tailored for LHC Open Data analysis. Real-time monitoring and performance tracking are handled using Prometheus, Node Exporter, and Grafana [6].

To support LHC Open Data analysis by a broad user base, including public and educational audiences, an independent login node has been deployed. While the PUNCH4NFDI ecosystem relies on a federated Authentication and Authorization Infrastructure (AAI) based on OpenID Connect (OIDC) [7] and Helmholtz AAI [8], EXPLORE currently uses an interim email-based registration mechanism for user onboarding and SSH-key provisioning. This lightweight access method has proven effective for early-stage user testing, particularly with high school students and non-institutional users. However, it does not yet align fully with the standard PUNCH AAI integration model. Efforts are underway to align EXPLORE with the broader PUNCH AAI framework by enabling simplified identity onboarding and token-based authorization. EXPLORE may serve as a pilot for extending AAI usability to users beyond traditional institutional affiliations. Public users can register at https://punchlogin.goegrid.gwdg.de/ using a valid email address.

Following an initial alpha phase, beta testing was conducted with high school students in High-Energy Physics (HEP) Masterclasses in Lower Saxony. These tests led to improvements in performance, accessibility, and usability. The infrastructure is now fully operational, supporting researchers, educators, students, and HEP enthusiasts in performing scalable and reproducible analysis of CERN Open Data [9].

EXPLORE promotes FAIR Science by ensuring the Findability, Accessibility, Interoperability, and Reusability of CERN Open Data across a wide range of users. Through dynamic resource allocation, containerized environments, and open-access registration, the system fosters open, reproducible, and collaborative research. This approach makes resources accessible not only to the core PUNCH community but also to non-HEP researchers, educators, and newcomers to high-energy physics.

This paper presents the technical architecture of the deployed infrastructure, including integration with COBalD/TARDIS, access control mechanisms, and its operational impact since transitioning to production. Preliminary usage statistics, operational insights, and future directions for expanding interoperability and accessibility in research data infrastructure are also discussed.
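
The resource figures quoted in the abstract (up to 200 CPU cores, delivered as 8-core virtual worker nodes) imply an upper bound of 25 worker nodes. The following minimal Python sketch illustrates that arithmetic and a simple demand-driven scaling decision. It is not the COBalD or TARDIS API, and the abstract does not describe the actual decision logic; the function names `target_nodes` and `scaling_action` and the idea of sizing the pool from a requested core count are assumptions made purely for illustration.

```python
import math

CORES_PER_NODE = 8   # from the abstract: virtual worker nodes with 8 CPU cores each
MAX_CORES = 200      # from the abstract: up to 200 CPU cores
MAX_NODES = MAX_CORES // CORES_PER_NODE  # upper bound on worker nodes


def target_nodes(requested_cores: int) -> int:
    """Hypothetical helper: worker nodes needed to cover the requested cores,
    capped at the pool's maximum size."""
    if requested_cores <= 0:
        return 0
    needed = math.ceil(requested_cores / CORES_PER_NODE)
    return min(needed, MAX_NODES)


def scaling_action(running_nodes: int, requested_cores: int) -> int:
    """Hypothetical helper: positive result = nodes to launch,
    negative = nodes to terminate, zero = no change."""
    return target_nodes(requested_cores) - running_nodes
```

For example, under these assumptions a request for 40 cores while 3 nodes are running yields `scaling_action(3, 40) == 2`, meaning two more 8-core nodes would be launched, while an empty queue with all nodes running yields a negative value, meaning the pool would be drained.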

Keywords

Dynamic Resource Allocation, Containerization, COBalD/TARDIS, LHC Open Data Analysis, EXPLORE, HTCondor, Alpha and Beta Testing
