64 Research products, page 1 of 7

  • Publications
  • Research data
  • Other research products
  • 2017-2021
  • English
  • Hal-Diderot
  • DARIAH EU
  • Digital Humanities and Cultural Heritage

  • Open Access English
    Authors: 
    Frank Uiterwaal; Franco Niccolucci; Sheena Bassett; Steven Krauwer; Hella Hollander; Femmy Admiraal; Laurent Romary; George Bruseker; Carlo Meghini; Jennifer Edmond; +1 more
    Publisher: HAL CCSD
    Countries: France, Italy, Netherlands
    Project: EC | PARTHENOS (654119)

    This article has been accepted for publication by EUP in the IJHAC: International Journal of Humanities and Arts Computing (https://www.euppublishing.com/loi/ijhac). Since the first ESFRI roadmap in 2006, multiple humanities Research Infrastructures (RIs) have been set up all over the European continent, supporting archaeologists (ARIADNE), linguists (CLARIN-ERIC), Holocaust researchers (EHRI), cultural heritage specialists (IPERION-CH) and others. These examples only scratch the surface of the breadth of research communities that have benefited from close cooperation in the European Research Area. While each field developed discipline-specific services over the years, common themes can also be distinguished. All humanities RIs address, in varying degrees, questions around research data management, the use of standards and the desired interoperability of data across disciplinary boundaries. This article sheds light on how the cluster project PARTHENOS developed pooled services and shared solutions for its audience of humanities researchers, RI managers and policymakers. At a time when the convergence of existing infrastructure is becoming ever more important, with the construction of a European Open Science Cloud as an audacious, ultimate goal, we hope that our experiences inform future work and provide inspiration on how to exploit synergies in interdisciplinary, transnational, scientific cooperation.

  • Open Access English
    Authors: 
    Maryl, Maciej; Błaszczyńska, Marta; Zalotyńska, Agnieszka; Taylor, Laurence; Avanço, Karla; Balula, Ana; Buchner, Anna; Caliman, Lorena; Clivaz, Claire; Costa, Carlos; +21 more
    Publisher: HAL CCSD
    Countries: France, Croatia
    Project: EC | OPERAS-P (871069)

    This report discusses the scholarly communication issues in Social Sciences and Humanities that are relevant to the future development and functioning of OPERAS. The outcomes collected here can be divided into two groups of innovations regarding 1) the operation of OPERAS, and 2) its activities. The “operational” issues include the ways in which an innovative research infrastructure should be governed (Chapter 1) as well as the business models for open access publications in Social Sciences and Humanities (Chapter 2). The other group of issues is dedicated to strategic areas where OPERAS and its services may play an instrumental role in providing, enabling, or unlocking innovation: FAIR data (Chapter 3), bibliodiversity and multilingualism in scholarly communication (Chapter 4), the future of scholarly writing (Chapter 5), and quality assessment (Chapter 6). Each chapter provides an overview of the main findings and challenges with emphasis on recommendations for OPERAS and other stakeholders like e-infrastructures, publishers, SSH researchers, research performing organisations, policy makers, and funders. Links to data and further publications stemming from work concerning particular tasks are located at the end of each chapter.

  • Open Access English
    Authors: 
    Clivaz, Claire; Allen, Garrick;
    Publisher: HAL CCSD
    Country: France

    Ancient Manuscripts and Virtual Research Environments Lausanne, 10–11 September 2020 - Conference report

  • Open Access English
    Authors: 
    Stefan Buddenbohm; Maaike A. de Jong; Jean-Luc Minel; Yoann Moranville;
    Publisher: HAL CCSD
    Country: France
    Project: EC | HaS-DARIAH (675570)

    How can researchers identify suitable research data repositories for the deposit of their research data? Which repository best matches the technical and legal requirements of a specific research project? To this end, and from a humanities perspective, the Data Deposit Recommendation Service (DDRS) has been developed as a prototype. It serves not only as a functional service for selecting humanities research data repositories but also, in particular, as a technical demonstrator illustrating the potential of re-using an already existing infrastructure, in this case re3data, and the feasibility of setting up this kind of service for other research disciplines. The documentation and the code of this project can be found in the DARIAH GitHub repository: https://dariah-eric.github.io/ddrs/.

  • English
    Authors: 
    Edmond, Jennifer; Basaraba, Nicole; Doran, Michelle; Garnett, Vicky; Grile, Courtney Helen; Papaki, Eliza; Tóth-Czifra, Erzsébet;
    Publisher: HAL CCSD
    Country: France

  • English
    Authors: 
    Khemakhem, Mohamed;
    Publisher: HAL CCSD
    Country: France
    Project: EC | PARTHENOS (654119), ANR | BASNUM (ANR-18-CE38-0003)

    Dictionaries could be considered the most comprehensive reservoir of human knowledge: they carry not only the lexical description of words in one or more languages, but also the common awareness of a certain community about every known piece of knowledge in a given time frame. Print dictionaries are the principal resources which enable the documentation and transfer of such knowledge. They already exist in abundant numbers, while new ones are continuously compiled, even with the recent strong move to digital resources. However, a majority of these dictionaries, even when available digitally, are still not fully structured due to the absence of scalable methods and techniques that can cover the variety of the corresponding material. Moreover, the relatively few existing structured resources offer limited exchange and query alternatives, given the discrepancy of their data models and formats. In this thesis we address the task of parsing lexical information in print dictionaries through the design of computer models that enable their automatic structuring. Solving this task goes hand in hand with finding a standardised output for these models to guarantee maximum interoperability among resources and usability for downstream tasks. First, we present different classifications of dictionary resources to delimit the category of print dictionaries we aim to process. Second, we introduce the parsing task by providing an overview of the processing challenges and a study of the state of the art. Then, we present a novel approach based on a top-down parsing of the lexical information. We also outline the architecture of the resulting system, called GROBID-Dictionaries, and the methodology we followed to close the gap between the conception of the system and its applicability to real-world scenarios. After that, we survey the landscape of the leading standards for structured lexical resources. In addition, we provide an analysis of two ongoing initiatives, TEI-Lex-0 and LMF, that aim at unifying the modelling of lexical information in print and electronic dictionaries. Based on that, we present a serialisation format that is in line with the schemes of the two standardisation initiatives and fits the approach implemented in our parsing system. After presenting the parsing and standardised serialisation facets of our lexical models, we provide an empirical study of their performance and behaviour. The investigation is based on a specific machine learning setup and a series of experiments carried out with a selected pool of varied dictionaries. In this study we try to present different approaches to feature engineering and to exhibit the strengths and limits of the best resulting models. We also dedicate two series of experiments to exploring the scalability of our models with regard to the processed documents and the employed machine learning technique. Finally, we sum up this thesis by presenting the major conclusions and opening new perspectives for extending our investigations in a number of research directions for parsing entry-based documents.

  • Publication . Report . 2020
    English
    Authors: 
    Bertrand, Loïc; Anglos, Demetrios; Castillejo, Marta; Charbonnel, Bénédicte; David, Sophie; de Clercq, Hilde; Dubray, Fanny; Spring, Marika;
    Publisher: HAL CCSD
    Country: France
    Project: EC | E-RIHS PP (739503)

    The study and preservation of tangible cultural and natural heritage is a global challenge for science and society at large. The European Research Infrastructure for Heritage Science (E-RIHS) will play a leading role in research on the interpretation, preservation, documentation and management of heritage. As an interdisciplinary infrastructure, E-RIHS will interconnect knowledge and methodologies to address key scientific questions in the field of heritage as a whole. The infrastructure is built on ten core pillars. It will provide a structured and unified input of large-scale instruments, portable devices, physical and digital archives. Its implementation will focus on scientific excellence, interdisciplinarity and cooperation. In doing so, it will offer unprecedented research opportunities to a wide range of interdisciplinary scientific communities.

  • Publication . Article . Conference object . 2020
    Open Access English
    Authors: 
    Stefan Bornhofen; Marten Düring;
    Publisher: HAL CCSD
    Country: France
    Project: ANR | BLIZAAR (ANR-15-CE23-0002)

    The paper presents Intergraph, a graph-based visual analytics technical demonstrator for the exploration and study of content in historical document collections. The designed prototype is motivated by a practical use case on a corpus of circa 15,000 digitized resources about European integration since 1945. The corpus allowed generating a dynamic multilayer network which represents different kinds of named entities appearing and co-appearing in the collections. To our knowledge, Intergraph is one of the first interactive tools to visualize dynamic multilayer graphs for collections of digitized historical sources. Graph visualization and interaction methods have been designed based on user requirements for content exploration by non-technical users without a strong background in network science, and to compensate for common flaws in the annotation of named entities. Users work with self-selected subsets of the overall data by interacting with a scene of small graphs which can be added, altered and compared. This allows interest-driven navigation of the corpus and the discovery of the interconnections of its entities across time.

  • English
    Authors: 
    Blandine Nouvel; Evelyne Sinigaglia; Véronique Humbert;
    Publisher: HAL CCSD
    Country: France

    The aim of the talk is to present the methodology used to reorganise the PACTOLS thesaurus of Frantiq, launched within the framework of the MASA consortium. PACTOLS is a multilingual and open repository about archaeology from Prehistory to the present and for Classics. It is organized into six micro-thesauri at the root of its name (Peuples, Anthroponymes, Chronologie, Toponymes, Oeuvres, Lieux, Sujets). The goal is to turn it into a tool interoperable with information systems beyond its original documentary purpose, and usable by archaeologists as a repository for managing scientific data. During the talk, we will describe the choice of tools, the organisation of work within the steering group and the collaborations with specialists for the upgrading and development of the vocabulary, while showing the strengths and limitations of some experiments. Above all, it will show how the introduction of the conceptual categories of the BackBone Thesaurus of DARIAH, modelled on the CIDOC-CRM ontology, through a progressive deconstruction/reconstruction process, eventually had an impact on all micro-thesauri and questioned the organisation of knowledge so far proposed.

  • Publication . Preprint . Conference object . Contribution for newspaper or weekly magazine . Article . 2020
    Open Access English
    Authors: 
    Rehm, Georg; Marheinecke, Katrin; Hegele, Stefanie; Piperidis, Stelios; Bontcheva, Kalina; Hajic, Jan; Choukri, Khalid; Vasiljevs, Andrejs; Backfried, Gerhard; Prinz, Christoph; +37 more
    Countries: France, Denmark
    Project: SFI | ADAPT: Centre for Digital... (13/RC/2106), EC | BDVe (732630), EC | ELG (825627), EC | AI4EU (825619), FCT | PINFRA/22117/2016 (PINFRA/22117/2016), EC | X5gon (761758),...

    Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality. However, language barriers impacting business, cross-lingual and cross-cultural communication are still omnipresent. Language Technologies (LTs) are a powerful means to break down these barriers. While the last decade has seen various initiatives that created a multitude of approaches and technologies tailored to Europe's specific needs, there is still an immense level of fragmentation. At the same time, AI has become an increasingly important concept in the European Information and Communication Technology area. For a few years now, AI, including many opportunities, synergies but also misconceptions, has been overshadowing every other topic. We present an overview of the European LT landscape, describing funding programmes, activities, actions and challenges in the different countries with regard to LT, including the current state of play in industry and the LT market. We present a brief overview of the main LT-related activities on the EU level in the last ten years and develop strategic guidance with regard to four key dimensions. Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). To appear
