International audience; DARIAH, the Digital Research Infrastructure for the Arts and Humanities, is committed to advancing the digital revolution that has captured the arts and humanities. As more legacy primary and secondary sources become digital, as more digital content is produced and as more digital tools are deployed, we see a next generation of digitally aware scholars in the humanities emerge. DARIAH aims to connect these resources, tools and scholars, ensuring that the state of the art in research is sustained and integrated across European countries. To do so, it is important to understand the actual role that proper data modelling and standards could play in making digital content sustainable. Even if it does not seem obvious at first sight that the arts and humanities would be fit for taking up the technological prerequisites of standardisation, we want to show in this paper that we can and should integrate standardisation issues at the core of our DARIAH infrastructural work. This analysis may lead us to a wider understanding of the role of scholars within a digital infrastructure and consequently of how DARIAH could better integrate a variety of research communities in the arts and humanities.
Publication . Part of book or chapter of book . 2019
International audience; The reflections in this chapter stem from the perspective of the DARIAH-ERIC, a distributed infrastructure for the arts and humanities. They explore how impact can take a variety of forms not always considered when the term is applied in a strictly technocratic sense, and the idea that focussing on the user of a research infrastructure may not describe an optimal relationship from an impact perspective. The chapter concludes by presenting three frames of reference in which an infrastructure like DARIAH can have impact: to foster excellence through impact on researchers, promote fluidity through impact on policymakers, and support efficiency through impact on our partner organisations.
Publication . Part of book or chapter of book . Conference object . 2011
International audience; This paper concerns epistemology and the understanding of research processes in Humanities, such as Archaeology. We believe that to properly understand research processes, it is essential to trace them. The collected traces depend on the process model established, which has to be as accurate as possible to exhaustively record the traces. In this paper, we briefly explain why the existing process models for Humanities are not sufficient to represent traces. We then present different process models from Information Systems Engineering that allow tracing processes according to different perspectives such as activities, decisions or strategies. We assume these process models can be useful to represent research processes in Humanities coherently and thoroughly.
Publication . Part of book or chapter of book . 2017
International audience; The humanities have convincingly argued that they need transnational research opportunities and that, through the digital transformation of their disciplines, they now have the means to pursue them on a hitherto unknown scale. The digital transformation of research and its resources means that many of the artifacts, documents, materials, etc. that interest humanities research can now be combined in new and innovative ways. Due to these digital transformations, (big) data and information have become central to the study of culture and society. Humanities research infrastructures manage, organise and distribute this kind of information and many more data objects as they become relevant for social and cultural research.
International audience; This paper describes the workflow of the Grammateus project, from gathering data on Greek documentary papyri to the creation of a web application. The first stage is the selection of a corpus and the choice of metadata to record: papyrology specialists gather data from printed editions, existing online resources and digital facsimiles. In the next step, this data is transformed into the EpiDoc standard of XML TEI encoding, to facilitate its reuse by others, and processed for HTML display. We also reuse existing text transcriptions available on . Since these transcriptions may be regularly updated by the scholarly community, we aim to access them dynamically. Although the transcriptions follow the EpiDoc guidelines, the wide diversity of the papyri as well as small inconsistencies in encoding make data reuse challenging. Currently, our data is available on an institutional GitLab repository, and we will archive our final dataset according to the FAIR principles.
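The transformation step described in this abstract (EpiDoc TEI XML processed for HTML display) can be illustrated with a minimal sketch. It is not the Grammateus project's actual pipeline: the sample document, the function names `edition_lines` and `to_html`, and the HTML layout are assumptions made for illustration. The only EpiDoc conventions relied on are the TEI namespace, the `<div type="edition">` container, and `<lb n="…"/>` line-break markers. Nested markup inside a line (for example editorial supplements) is deliberately ignored here.

```python
import html
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical, heavily simplified EpiDoc-style document (not real project data).
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <text><body>
    <div type="edition">
      <ab><lb n="1"/>first line <lb n="2"/>second &amp; last line</ab>
    </div>
  </body></text>
</TEI>"""


def edition_lines(tei_xml: str) -> list[tuple[str, str]]:
    """Return (line number, text) pairs from the edition division.

    Simplification: only the plain text directly following each <lb/>
    is kept; text inside nested elements is not collected.
    """
    root = ET.fromstring(tei_xml)
    lines = []
    path = f".//{{{TEI_NS}}}div[@type='edition']/{{{TEI_NS}}}ab"
    for ab in root.iterfind(path):
        for lb in ab.iter(f"{{{TEI_NS}}}lb"):
            lines.append((lb.get("n", ""), (lb.tail or "").strip()))
    return lines


def to_html(lines: list[tuple[str, str]]) -> str:
    """Render the lines as an ordered HTML list, escaping all text."""
    items = "\n".join(
        f'  <li value="{html.escape(n)}">{html.escape(text)}</li>'
        for n, text in lines
    )
    return f'<ol class="edition">\n{items}\n</ol>'


if __name__ == "__main__":
    print(to_html(edition_lines(SAMPLE)))
```

Real EpiDoc transcriptions contain much richer markup, so a production workflow would more likely rely on dedicated stylesheets than on a hand-written extractor such as this one.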
Publication . Part of book or chapter of book . 2019
International audience; How can political roadmaps, action plans and principles on open science be translated into pragmatic and realistic research data policy on a French university campus? How can an open science ecosystem be implemented in the specific field of social sciences and humanities? After a couple of scientific projects on research data conducted since 2013 at the University of Lille, we carried out interviews with about 50 researchers, PhD students, data engineers, laboratory and project managers, with three objectives: 1. To place the researchers at the heart of the implementation of the open science ecosystem on the campus, with their needs, priorities and doubts. 2. To identify opportunities and obstacles for a data policy. 3. To recommend ten actions to develop the data culture on the campus. Conducted as an audit on the human and social sciences campus of the University of Lille, our study has a pragmatic scope: to identify the essential elements for a coherent policy of the production, management and reuse of research data on a campus in the humanities and social sciences, and thus contribute to the appropriation of the concept of open science by the development of a “culture of the data”. The national action plan states that there is still a lot of work to be done to make open science a part of scientific practice. To succeed, such an approach requires knowledge of the reality of the field; it needs the support of research communities, the coordination of all actors on the campus, and institutional and scientific steering. It will take time. But it is a necessary investment to maintain excellence in research. This paper makes ten proposals on how to get there.
Publication . Part of book or chapter of book . 2012
International audience; The goal of the present chapter is to explore the possibility of providing the research (but also the industrial) community that commonly uses spoken corpora with a stable portfolio of well-documented standardised formats that allow a high re-use rate of annotated spoken resources and, as a consequence, better interoperability across tools used to produce or exploit such resources.
Publication . Part of book or chapter of book . 2016
International audience; This chapter gives an overview of one possible staged methodology for structuring LCI data by presenting a new scientific object, LEarning and TEaching Corpora (LETEC). Firstly, the chapter clarifies the notion of corpora, used in so many different ways in language studies, and underlines how corpora differ from raw language data. Secondly, using examples taken from actual online learning situations, the chapter illustrates the methodology that is used to collect, transform and organize data from online learning situations in order to make them sharable through open-access repositories. The ethics and rights for releasing a corpus as OpenData are discussed. Thirdly, the authors suggest how the transcription of interactions may become more systematic, and what benefits may be expected from analysis tools, before opening the CALL research perspective applied to LCI towards its applications to teacher-training in Computer-Mediated Communication (CMC), and the common interests the CALL field shares with researchers in the field of Corpus Linguistics working on CMC.
International audience; One of the funded project proposals under DARIAH’s Open Humanities call 2015 was “Open History: Sustainable digital publishing of archival catalogues of twentieth-century history archives”. Based on the experiences of the Collaborative EuropeaN Digital Archival Research Infrastructure (CENDARI) and the European Holocaust Research Infrastructure (EHRI), the main goal of the “Open History” project was to enhance the dialogue between (meta-)data providers and research infrastructures. Integrating archival descriptions – when they were already available – held at a wide variety of twentieth-century history archives (from classic archives to memorial sites, libraries and private archives) into research infrastructures has proven to be a major challenge, which could not be met without pre-processing or other preparatory work, ranging from limited to extensive. The “Open History” project organized two workshops and developed two tools: an easily accessible and general article on why the practice of standardization and sharing is important and how this can be achieved; and a model which provides checklists for self-analyses of archival institutions. The text that follows is the article we have developed. It intentionally remains at a general level, without much jargon, so that it can be easily read by those who are neither archivists nor IT specialists. Hence, we hope it will be easy to understand both for those who are describing the sources at various archives (with or without IT or archival sciences degrees) and for decision-makers (directors and advisory boards) who wish to understand the benefits of investing in standardization and sharing of data. It is important to note that this text is a first step, not a static, final result. Not all aspects of standardization and publication of (meta-)data are discussed, nor are updates or feedback mechanisms for annotations and comments.
The idea is that this text can be used in full or in part and that it will include further chapters and section updates as time goes by and as other communities begin using it. Some archives will read through much of this and see confirmation of what they have already been implementing; others – especially the smaller institutions, such as private memory institutions – will find this a low-key and hands-on introduction to help them in their efforts.