ZENODO
Presentation . 2023
License: CC BY
Data sources: Datacite
ZENODO
Other literature type . 2023
License: CC BY
Data sources: ZENODO
https://doi.org/10.5281/zenodo...
Article . 2023
License: CC BY
Data sources: Sygma

This Research product is the result of merged Research products in OpenAIRE.

Open big-data geoscience with PANGEO@EOSC

Authors: Iaquinta, Jean; Fouilloux, Anne; Odaka, Tina; Coca-Castro, Alejandro; Eynard-Bontemps, Guillaume; Luna-Valero, Sebastian; Caballer, Miguel; +3 Authors

Abstract

The benefits of Open Science (OS) and the FAIR foundational principles (Findable, Accessible, Interoperable and Reusable) are increasingly valued by academia, although what OS and FAIR entail in practice is largely misunderstood. Even once researchers manage to grasp the OS and FAIR principles, they often run into practical difficulties. The European Open Science Cloud (EOSC) is the main initiative in Europe for providing a federated and open multi-disciplinary environment where European researchers, innovators, companies and citizens can share, publish, find and re-use data, tools and services for research, innovation and educational purposes. One of the goals of the EOSC is to co-design with communities the tools and services that are useful for their day-to-day research work, to facilitate collaboration and to foster wider adoption of Open Science practices. The Pangeo (https://pangeo.io/) community is a worldwide community of scientists and developers that strives to facilitate the deployment of ready-to-use, community-driven platforms for big data geoscience. While a number of services based on Jupyter Notebooks were already available, no public Pangeo deployments providing fast access to large amounts of data and compute resources were accessible on EOSC. Most existing cloud-based Pangeo deployments are USA-based, and members of the Pangeo community in Europe did not have a shared platform where scientists or technologists could exchange know-how and experiences. Pangeo teamed up with two EOSC projects, namely EGI-ACE (https://www.egi.eu/project/egi-ace/) and C-SCALE (https://c-scale.eu/), to demonstrate how to deploy and use Pangeo on EOSC and to emphasise the benefits for the European community.
The Pangeo Europe Community, together with EGI, deployed a DaskHub, composed of a Dask Gateway (https://gateway.dask.org/) and a JupyterHub (https://jupyter.org/hub), with a Kubernetes cluster backend on EOSC using the infrastructure of the EGI Federation (https://www.egi.eu/egi-federation/). The Pangeo EOSC JupyterHub deployment makes use of 1) the EGI Check-In to enable user registration (and thereby authenticated and authorised access to the Pangeo JupyterHub portal and to the underlying distributed compute infrastructure); and 2) the EGI Cloud Compute and the cloud-based EGI Online Storage (to distribute the computational tasks to a scalable compute platform and to store intermediate results produced by the user jobs). To facilitate future Pangeo deployments on top of a wide range of cloud providers (AWS, Google Cloud, Microsoft Azure, EGI Cloud Computing, OpenNebula, OpenStack, and many more), the Pangeo EOSC JupyterHub can now be deployed through the Infrastructure Manager (IM) Dashboard (https://im.egi.eu/). All the computing and storage resources are currently supplied by CESNET (https://www.cesnet.cz/?lang=en) within the framework of the EGI-ACE project (https://www.egi.eu/project/egi-ace/). Several deployments have been made to serve the geoscience community, both for teaching and for research work. One major advantage of these deployments for teaching and on-boarding researchers is the possibility to train them with realistic, large and complex data analysis problems that are similar to, or directly part of, their research work. Participants are taught how to use Xarray and Dask and, more generally, how to efficiently access and analyse large online datasets. With this approach, attendees have the opportunity to ask questions, collaborate with other researchers as well as Research Software Engineers, and apply Open Science practices without the burden of trying and (sometimes) failing alone and without having to build their own infrastructure.
To date, more than 100 researchers have been trained on Pangeo@EOSC deployments. As the community grew, discussions arose about the need to define clear practices for writing and publishing FAIR Jupyter Notebooks that can be reused and built upon for new research. This is where the community of practice comes into play, putting the focus back on actual Open Science practices. Pangeo teamed up with the Environmental Data Science Book (or EDS Book), a pan-European, community-driven resource hosted on GitHub and powered by Jupyter Book that provides practical guidelines and templates to help researchers translate research outputs into curated, interactive, shareable and reproducible executable notebooks. The quality of the FAIR notebooks is ensured by a collaborative and transparent reviewing process supported by GitHub-related technologies. Thanks to the EDS Book, Jupyter Notebooks created and published by researchers are much more modular and reusable. In this presentation, we will provide details on the different deployments, how to get access to the JupyterHub deployments and contribute to the EDS Book, and more generally how to contribute to Pangeo@EOSC.
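
As an illustration of the kind of analysis the training sessions cover, the sketch below shows the chunked "split, reduce, combine" pattern that libraries such as Dask automate for large datasets. It is not the Pangeo@EOSC code and does not use the Xarray or Dask APIs: it relies only on the Python standard library, and the function names (split_into_chunks, partial_stats, chunked_mean) and the in-memory list standing in for a large online dataset are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, chunk_size):
    """Split a sequence into consecutive chunks of at most chunk_size items."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_stats(chunk):
    """Reduce one chunk to a small summary: (sum, count)."""
    return sum(chunk), len(chunk)


def chunked_mean(values, chunk_size=1000, max_workers=4):
    """Compute a mean by reducing each chunk independently, then combining."""
    chunks = split_into_chunks(values, chunk_size)
    # Each chunk is summarised by a separate task; only the small
    # (sum, count) summaries are gathered for the final combination.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(partial_stats, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


if __name__ == "__main__":
    # Hypothetical stand-in for a large dataset read from online storage.
    temperatures = list(range(10_000))
    print(chunked_mean(temperatures, chunk_size=1_000))
```

On a deployment such as the one described above, the same idea would typically be expressed with Xarray objects backed by Dask arrays, with the per-chunk tasks scheduled on a cluster rather than on local threads.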

Keywords

EOSC, Open Science, Pangeo, EGI, FAIR, Jupyter Notebooks
