7 Research products, page 1 of 1

  • Publications
  • 2017-2021
  • BASNUM
  • Mémoires en Sciences de l'Information et de la Communication
  • Hal-Diderot
  • INRIA a CCSD electronic archive server

  • English
    Authors: 
    Williams, Geoffrey; Galleron, Ioana; Stincone, Clarissa;
    Publisher: HAL CCSD
    Country: France
    Project: ANR | BASNUM (ANR-18-CE38-0003)

    International audience

  • English
    Authors: 
    Khemakhem, Mohamed;
    Publisher: HAL CCSD
    Country: France
    Project: ANR | BASNUM (ANR-18-CE38-0003), EC | PARTHENOS (654119)

    Dictionaries could be considered as the most comprehensive reservoir of human knowledge, which carry not only the lexical description of words in one or more languages, but also the common awareness of a certain community about every known piece of knowledge in a time fr...

  • Publication . Article . Conference object . Preprint . 2020 . Embargo End Date: 01 Jan 2019
    Open Access
    Authors: 
    Louis Martin; Benjamin Muller; Pedro Javier Ortiz Suárez; Yoann Dupont; Laurent Romary; Éric Villemonte de la Clergerie; Djamé Seddah; Benoît Sagot;
    Publisher: arXiv
    Country: France
    Project: ANR | BASNUM (ANR-18-CE38-0003), ANR | PARSITI (ANR-16-CE33-0021), ANR | SoSweet (ANR-15-CE38-0011), ANR | PRAIRIE (ANR-19-P3IA-0001)

    Pretrained language models are now ubiquitous in Natural Language Processing. Despite their success, most available models have either been trained on English data or on the concatenation of data in multiple languages. This makes practical use of such models --in all la...

  • Publication . Conference object . Article . Preprint . 2020
    Open Access English
    Authors: 
    Pedro Javier Ortiz Suárez; Laurent Romary; Benoît Sagot;
    Country: France
    Project: ANR | BASNUM (ANR-18-CE38-0003), ANR | PRAIRIE (ANR-19-P3IA-0001)

    International audience; We use the multilingual OSCAR corpus, extracted from Common Crawl via language classification, filtering and cleaning, to train monolingual contextualized word embeddings (ELMo) for several mid-resource languages. We then compare the performance ...

  • French
    Authors: 
    Martin, Louis; Muller, Benjamin; Ortiz Suárez, Pedro Javier; Dupont, Yoann; Romary, Laurent; Villemonte de la Clergerie, Eric; Sagot, Benoît; Seddah, Djamé;
    Publisher: HAL CCSD
    Project: ANR | PARSITI (ANR-16-CE33-0021), ANR | BASNUM (ANR-18-CE38-0003), ANR | SoSweet (ANR-15-CE38-0011), ANR | PRAIRIE (ANR-19-P3IA-0001)

    Contextual neural language models are now ubiquitous in natural language processing. Until recently, most available models were trained either on English data or on the concatenation of data in several la...

  • French
    Authors: 
    Martin, Louis; Muller, Benjamin; Ortiz Suárez, Pedro Javier; Dupont, Yoann; Romary, Laurent; Villemonte de la Clergerie, Eric; Sagot, Benoît; Seddah, Djamé;
    Publisher: HAL CCSD
    Country: France
    Project: ANR | PRAIRIE (ANR-19-P3IA-0001), ANR | BASNUM (ANR-18-CE38-0003), ANR | SoSweet (ANR-15-CE38-0011), ANR | PARSITI (ANR-16-CE33-0021)

    National audience; Contextual word embeddings have become ubiquitous in Natural Language Processing. Until recently, most available models were trained on English data or on the concatenation of corpora in multiple languages. This made the practical use of models in all l...

  • Publication . Preprint . Article . Conference object . 2020
    Open Access English
    Authors: 
    Suárez, Pedro Javier Ortiz; Dupont, Yoann; Muller, Benjamin; Romary, Laurent; Sagot, Benoît;
    Publisher: HAL CCSD
    Country: France
    Project: ANR | BASNUM (ANR-18-CE38-0003), ANR | PRAIRIE (ANR-19-P3IA-0001)

    Due to the COVID-19 pandemic, the 12th edition is cancelled. The LREC 2020 Proceedings are available at http://www.lrec-conf.org/proceedings/lrec2020/index.html; International audience; The French TreeBank developed at the University Paris 7 is the main source of morphosynta...