Powered by OpenAIRE graph
Padua research Archi...

Challenges of a multilingual corpus (Old French/Old Venetian): The example of the MICLE project

Authors: Mathieu Goux; Francesco Pinzin;

Abstract

Digital formats and data visualization are key aspects in the creation of a multilingual corpus. Nonetheless, they have received considerably less attention than other important factors, such as the organization of the workflow and the selection of the tagset. In this contribution we show how these two apparently separate aspects are inextricably intertwined and how we approached these issues in the MICLE project (Micro Cues for Language Evolution, ANR/DFG) in terms of inclusiveness. More specifically, we show how including multiple PoS tagsets (UD, UPENN, PRESTO) in the same corpus by means of conversion scripts makes the data easier to use and the workflow easier to organize. Furthermore, we show how adopting the XML-TEI format for the final version of the data provides enough flexibility to accommodate all the different PoS tags and the various kinds of syntactic information (encoded in turn in the dependency-based UD format and the constituency-based UPENN format). This has a clear payoff in terms of comparability of the data from the two languages of the corpus, Old French and Old Venetian, as we show in the last section, where we compare the results of an ongoing investigation into the phenomenon of Infinitival Inversion and its relationship with the Verb Second word-order constraint.
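
As a rough illustration of the idea of carrying several PoS tagsets on the same token, the following minimal Python sketch derives tags for two further tagsets from a UD tag and stores all of them on one XML word element. It is not the MICLE project's conversion script: the mapping table is a small placeholder, and the attribute names (`pos-ud`, `pos-upenn`, `pos-presto`) and the function names are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Placeholder mapping: UD tag -> tags in two other tagsets.
# These pairs are illustrative assumptions, not the project's actual table.
UD_TO_OTHER = {
    "NOUN": {"upenn": "N", "presto": "Nc"},
    "VERB": {"upenn": "VB", "presto": "Vvc"},
    "ADP": {"upenn": "P", "presto": "S"},
}


def make_word(form, ud_tag):
    """Build a <w> element carrying one PoS attribute per tagset."""
    w = ET.Element("w")
    w.text = form
    w.set("pos-ud", ud_tag)  # hypothetical attribute name
    # Add the converted tags; UD tags missing from the table get none.
    for tagset, tag in UD_TO_OTHER.get(ud_tag, {}).items():
        w.set(f"pos-{tagset}", tag)  # hypothetical attribute names
    return w


def serialize(w):
    """Return the element as an XML string."""
    return ET.tostring(w, encoding="unicode")


if __name__ == "__main__":
    print(serialize(make_word("roi", "NOUN")))
```

Keeping every tagset on the token itself means a query can be written against whichever tagset a given tool or researcher prefers, without re-running the conversion.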

Country
Italy
Keywords

Natural Language Processing, Old Venetian, Old French, Verb Second, Stylistic Fronting

  • BIP!
    Impact by BIP!
    selected citations: 0
    popularity: Average
    influence: Average
    impulse: Average