
In the petabyte era of climate research, harmonising diverse environmental and geoscientific datasets is critical to improving data interoperability and supporting effective interdisciplinary studies. This paper proposes the design of an LLM-based tool to extract and standardise metadata from climate research repositories. The solution leverages the adaptability of LLMs, which are able to understand contextual nuances. By addressing common inconsistencies such as varying parameters (observation types), units, and definitions, the proposed tool is expected to significantly improve data integration. It will be a first step towards creating a unified metadata schema that adheres to the FAIR principles.
This position paper was accepted for publication at the First AAAI Bridge on Artificial Intelligence for Scholarly Communication (AI4SC), 25-26 February 2025, Philadelphia, Pennsylvania, USA, co-located with the 39th AAAI Conference on Artificial Intelligence (AAAI-25).
This short publication consists of two pages of main body together with two pages of references and an appendix.
climate research, metadata standardisation, large language models, digitalisation, information extraction
| Indicator | Description | Value |
| --- | --- | --- |
| citations | An alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
