4 Research products, page 1 of 1

  • Publications
  • 2018-2022
  • Open Access
  • FR
  • IE
  • English
  • CLARIN

  • Publication . Preprint . Conference object . Contribution for newspaper or weekly magazine . Article . 2020
    Open Access English
    Authors: 
    Rehm, Georg; Marheinecke, Katrin; Hegele, Stefanie; Piperidis, Stelios; Bontcheva, Kalina; Hajic, Jan; Choukri, Khalid; Vasiljevs, Andrejs; Backfried, Gerhard; Prinz, Christoph; +37 more
    Countries: France, Denmark
    Project: SFI | ADAPT: Centre for Digital... (13/RC/2106), EC | BDVe (732630), EC | ELG (825627), EC | AI4EU (825619), FCT | PINFRA/22117/2016 (PINFRA/22117/2016), EC | X5gon (761758),...

    Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality. However, language barriers impacting business, cross-lingual and cross-cultural communication are still omnipresent. Language Technologies (LTs) are a powerful means to break down these barriers. While the last decade has seen various initiatives that created a multitude of approaches and technologies tailored to Europe's specific needs, there is still an immense level of fragmentation. At the same time, AI has become an increasingly important concept in the European Information and Communication Technology area. For a few years now, AI, including many opportunities, synergies but also misconceptions, has been overshadowing every other topic. We present an overview of the European LT landscape, describing funding programmes, activities, actions and challenges in the different countries with regard to LT, including the current state of play in industry and the LT market. We present a brief overview of the main LT-related activities on the EU level in the last ten years and develop strategic guidance with regard to four key dimensions. Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). To appear

  • Publication . Conference object . Preprint . Article . 2020
    Open Access English
    Authors: 
    Khojasteh, H. A.; Ansari, E.; Mahdi Bohlouli;
    Publisher: HAL CCSD
    Country: France

    Language recognition has been significantly advanced in recent years by modern machine learning methods such as deep learning and by benchmarks with rich annotations. However, research is still limited in low-resource formal languages. This constitutes a significant gap in describing colloquial language, especially for low-resource languages such as Persian. To address this gap, we propose a "Large Scale Colloquial Persian Dataset" (LSCP). LSCP is hierarchically organized in a semantic taxonomy that focuses on multi-task informal Persian language understanding as a comprehensive problem. This encompasses the recognition of multiple semantic aspects in human-level sentences, which are naturally captured from real-world sentences. We believe that further investigation and processing, as well as the application of novel algorithms and methods, can strengthen computerized understanding and processing of low-resource languages. The proposed corpus consists of 120M sentences resulting from 27M tweets, annotated with parsing tree, part-of-speech tags, sentiment polarity and translation in five different languages. Comment: 6 pages, 2 figures, 3 tables. Accepted at the 12th International Conference on Language Resources and Evaluation (LREC 2020)

  • Open Access English
    Authors: 
    Van Der Eycken, Johan; Styven, Dorien; Gheldof, Tom; Depoortere, Rolande;
    Publisher: HAL CCSD
    Countries: France, Belgium

    This article shows that metadata plays a central role in our society and concludes that through collaborative work, it is possible to pool solutions and to establish relationships of cooperation, both at the level of practical tool development and with regard to sharing and creating knowledge and know-how. Published in: ABB: Archives et Bibliothèques de Belgique - Archief- en Bibliotheekwezen in België, vol. 106, pp. 135-144.

  • Publication . Article . Conference object . Preprint . 2018
    Open Access English
    Authors: 
    Ondřej Cífka; Ondřej Bojar;

    One possible way of obtaining continuous-space sentence representations is to train neural machine translation (NMT) systems. The recent attention mechanism, however, removes the single point in the neural network from which the source sentence representation can be extracted. We propose several variations of the attentive NMT architecture that bring this meeting point back. Empirical evaluation suggests that the better the translation quality, the worse the learned sentence representations serve in a wide range of classification and similarity tasks. ACL 2018; 10 pages + 2 page supplementary
