Research data . Dataset . 2018

Corpus of Occitan Written Traditional Folktales Annotated with Part-Of-Speech (OWT-Tag)

Vergez-Couret, Marianne
Open Access
Occitan (post 1500); Provençal
Published: 11 Oct 2018
Publisher: Zenodo
Abstract
This resource contains five extracts of Occitan texts that were manually annotated with lemmas and parts of speech, following the Grace standard. It was produced during the ExpressioNarration project, funded by a Marie Curie Individual Fellowship, in order to evaluate how well an Occitan part-of-speech tagger, Talismane, handles the specificities of the project's corpus, Oral Occitan (OcOr), also available at https://zenodo.org/record/1451753#.W78FJWOYSpo. Each extract contains around 1500 words. The extracts are taken from 'Contes et proverbes populaires recueillis en armagnac et Contes populaires recueillis en agenais' by J.-F. Bladé, 'Coundes biarnés, couéilhuts aüs parsàas miéytadès dou péys dé Biarn' by J.-V. Lalanne, 'Contes populaires du Languedoc' by L. Lambert and 'Contes populaires recueillis dans la Grande-Lande' by F. Arnaudin. The annotation process is described in the article available at https://www.openscience.fr/IMG/pdf/iste_modocv1n1_2.pdf.
References
Vergez-Couret M. (2017). « Constitution et annotation d'un corpus écrit de contes et récits en occitan », Analyses et méthodes formelles pour les humanités numériques, ISTE OpenScience, 1-1, online publication: https://www.openscience.fr/Constitution-et-annotation-d-un-corpus-ecrit-de-contes-et-recits-en-occitan
Subjects

part-of-speech, annotation, occitan, folktales, written

Related Organizations
Funded by
EC| EXPRESSIONARRATION
Project
EXPRESSIONARRATION
Narration, linguistic expression and discourse structure: explorations of orality in Occitan and French
  • Funder: European Commission (EC)
  • Project Code: 655034
  • Funding stream: H2020 | MSCA-IF-EF-ST