
The Helmholtz Metadata Collaboration (HMC) is a platform that supports researchers and data professionals in making their research data findable, accessible, interoperable, and reusable (FAIR [1]) through the collaborative development of tools, recommendations and other services. HMC also monitors FAIR (meta)data management practices in the Helmholtz Association through community surveys. The first survey, conducted in 2021 and targeted at researchers in the association, revealed the status of their data management and data publication practices; its results can be found in [2]. This submission focuses on the second survey conducted by HMC, the Data Professionals Survey 2024. The survey targeted all Helmholtz employees who work on research data management (RDM), henceforth referred to as "data professionals". This target group includes not only data stewards, librarians and data managers, but also researchers and technicians who are actively involved in managing the research data generated in their department or Helmholtz center. The goals of the survey were to identify the data professionals within Helmholtz and to understand their RDM practices, their application of the FAIR principles, the alignment of their work with the data policies of their center, and the challenges and needs they have in the area of RDM. The questionnaire consisted of 17 questions addressing these goals. The survey was implemented in LimeSurvey [3], an open-source web-based tool for online surveys. It was anonymous, and participation was voluntary. It was disseminated at different Helmholtz centers over a six-month period between April and October 2024. The data were analyzed with the open-source framework HIFIS-surveyval [4], and Llama 3.3 [5] was used to identify common categories among free-text answers.
Python scripts were created for data cleaning and analysis as well as for generating an anonymized dataset for publication. The survey was answered by 156 data professionals working at different Helmholtz centers. The results reveal that researchers form the largest group involved in RDM, at close to 39% of the respondents, and that they spend about 33% of their working time on RDM tasks. In contrast, 31% of the respondents are explicitly employed as data professionals and spend close to 80% of their working time on RDM tasks. The respondents operate at various organizational levels, ranging from division level to overarching level. However, only one-fourth of the respondents reported having received formal training for their RDM tasks. Their reported RDM practices adhere more closely to the findability (61%) and accessibility (49%) aspects of FAIR than to interoperability (39%) and reusability (41%). Furthermore, just over 75% of the respondents stated that they are aware of the guidelines and policies of their center, and 70% of them also reported following those guidelines at least partially. Over 90% of the respondents stated that they face at least one challenge in their daily RDM work, and over 85% expressed interest in at least one of the service formats. These results help HMC align its services with the needs reported by the data professionals, thus contributing to the improvement of data management practices in Helmholtz. A complete report on the survey responses as well as a cleaned dataset will be published soon.
ddc:004, Research Data Practices, DATA processing & computer science, Research Data Management, Data Professionals, Helmholtz Association, info:eu-repo/classification/ddc/004, FAIR
