5 Research products, page 1 of 1

  • Publications
  • Research software
  • 2012-2021
  • Open Access
  • BE
  • Lirias
  • CLARIN

  • Open Access English
    Authors: 
    Van Der Eycken, Johan; Styven, Dorien; Gheldof, Tom; Depoortere, Rolande;
    Publisher: HAL CCSD
    Countries: France, Belgium

    This article shows that metadata plays a central role in our society and concludes that through collaborative work, it is possible to pool solutions and to establish relationships of cooperation, both at the level of practical tool development and with regard to sharing and creating knowledge and know-how.

    Published in: ABB: Archives et Bibliothèques de Belgique - Archief- en Bibliotheekwezen in België, vol. 106, pp. 135-144
    Status: published

  • Open Access
    Authors: 
    Frank Van Eynde; Liesbeth Augustinus; Vincent Vandeghinste;
    Publisher: Elsevier BV
    Country: Belgium

    This paper has both a theoretical and a methodological objective. The theoretical one concerns the modeling of number agreement in copular constructions. For that purpose it adopts the distinction, familiar from Head-driven Phrase Structure Grammar, between morpho-syntactic agreement (also known as concord) and index agreement. The methodological objective concerns the demonstration of how treebanks can be exploited in order to guide the formulation of relevant generalizations. For that purpose we crucially rely on tools and resources that have recently been developed in the framework of the Dutch-Flemish STEVIN program (2004–2011) and the European CLARIN infrastructure.

    Title: Number agreement in copular constructions: A treebank-based investigation
    Journal: Lingua
    Link: http://dx.doi.org/10.1016/j.lingua.2016.02.001
    Content type: article
    Copyright: © 2016 The Authors. Published by Elsevier B.V.
    Published in: Lingua: International Review of General Linguistics, vol. 178, pp. 104-126
    Status: published

  • Publication . Conference object . 2015
    Open Access English
    Authors: 
    Ineke Schuurman; Menzo Windhouwer; Odrun Ohren; Dan Zeman;
    Publisher: , Linköping
    Country: Belgium

    The CLARIN Concept Registry (clarin.eu/conceptregistry) is the place in the CLARIN Infrastructure where common and shared semantics of, but not limited to, linguistic concepts are defined. This is important to achieve semantic interoperability, and to overcome to a degree the diversity in data structures, either in metadata or linguistic resources, encountered within the infrastructure. Whereas in the past, CLARIN has been using the ISOcat registry for these purposes, nowadays this new registry is being used, as ISOcat turned out to have some serious drawbacks as far as its use in the CLARIN community is concerned. The main difference between the two semantic registries is that the CCR is a concept registry whereas ISOcat is a data category registry. In this paper we describe why the decision to switch to a concept registry has been made. We also describe the most important other characteristics of the new (Open)SKOS-based registry, as well as the management procedures used to prevent a recurrent proliferation of entries, as was the case with ISOcat.

    Published in: Selected Papers from the CLARIN Annual Conference 2015, October 14–16, 2015, Wroclaw, Poland, pp. 62-70
    Conference: CLARIN Annual Conference, Wroclaw, Poland, 15 Oct - 17 Oct 2015
    Status: published

  • Open Access English
    Authors: 
    Wright, S. E.; Menzo Windhouwer; Schuurman, I.; Broeder, D.;
    Publisher: HAL CCSD
    Countries: Belgium, France

    The terminology Community of Practice has long standardized data categories in the framework of ISO TC 37. ISO 12620:2009 specifies the data model and procedures for a Data Category Registry (DCR), which has been implemented by the Max Planck Institute for Psycholinguistics as the ISOcat DCR. The DCR has been used not only by ISO TC 37 but also by the CLARIN research infrastructure. This paper describes how the needs of these communities have started to diverge and the process of segueing from a DCR to a Data Concept Registry in order to meet the needs of both communities.

    Published in: Proceedings of the 11th international conference on Terminology and Knowledge Engineering 2014, vol. 11, pp. 177-187
    Conference: 11th international conference on Terminology and Knowledge Engineering, Berlin, 19 Jun - 21 Jun 2014
    Status: published

  • Publication . Conference object . Article . 2012
    Open Access
    Authors: 
    Reynaert, Martin; Schuurman, Ineke; Hoste, Veronique; Oostdijk, Nelleke; Van Gompel, Maarten;
    Countries: Netherlands, Belgium

    In this paper we report on the experiences gained in the recent construction of the SoNaR corpus, a 500 MW reference corpus of contemporary, written Dutch. It shows what can realistically be done within the confines of a project setting where there are limitations to the duration in time as well as to the budget, employing current state-of-the-art tools, standards and best practices. By doing so we aim to pass on insights that may be beneficial for anyone considering undertaking an effort towards building a large, varied yet balanced corpus for use by the wider research community. Various issues are discussed that come into play while compiling a large corpus, including approaches to acquiring texts, the arrangement of IPR, the choice of text formats, and steps to be taken in the preprocessing of data from widely different origins. We describe FoLiA, a new XML format geared towards rich linguistic annotations. We also explain the rationale behind the investment in the high-quality semi-automatic enrichment of a relatively small (1 MW) subset with very rich syntactic and semantic annotations. Finally, we present some ideas about future developments and the direction corpus development may take, such as setting up an integrated workflow between web services and the potential role for ISOcat. We list tips for potential corpus builders, tricks they may want to try and further recommendations regarding technical developments future corpus builders may wish to hope for.

    Published in: Proceedings of the Eighth International conference on Language Resources and Evaluation (LREC), vol. 8, pp. 2897-2904
    Conference: International conference on Language Resources and Evaluation (LREC), Istanbul (Turkey), 21 May - 27 May 2012
    Status: published
