Publication . Part of book or chapter of book . 2014

Linguistic Resources and Cats: How to Use ISOcat, RELcat and SCHEMAcat

Windhouwer, Menzo; Schuurman, Ineke; Calzolari, Nicoletta (Conference Chair); Choukri, Khalid; Declerck, Thierry; Loftsson, Hrafn; Maegaard, Bente; +4 Authors
Open Access
Published: 01 May 2014
Publisher: European Language Resources Association (ELRA)

Within the European CLARIN infrastructure, ISOcat is used to enable both humans and computer programs to find specific resources even when they use different terminology or data structures. For this to work, it must be clear which concepts are used in these resources, both in the resource's metadata and in its content, and what is meant by them. These concepts can be specified in ISOcat. SCHEMAcat enables us to relate the concepts used by a resource, while RELcat enables us to type these relationships and to add relationships beyond resource boundaries. Together, these three registries allow us (and our programs) to find what we are looking for.
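The division of labour between the three registries can be illustrated with a small sketch. It is not based on the actual ISOcat, SCHEMAcat or RELcat interfaces: the concept identifiers, resource names, tag names and the `subClassOf` relation type are all hypothetical, and plain in-memory dictionaries stand in for the registries. It only shows how a search for one concept might reach resources that use different local terms.

```python
# Illustrative sketch only. Every identifier, resource name and relation
# type below is hypothetical; none is an actual registry entry or API.

# Concept registry (the role ISOcat plays): concept id -> definition.
CONCEPTS = {
    "dc:noun": "a word class naming entities",
    "dc:commonNoun": "a noun that is not a proper name",
}

# Schema registry (the role SCHEMAcat plays):
# resource -> local term used in that resource -> concept id.
SCHEMAS = {
    "corpusA": {"N": "dc:noun"},
    "corpusB": {"NCOM": "dc:commonNoun"},
}

# Relation registry (the role RELcat plays): typed relations between
# concepts, stored as (subject, relation type, object).
RELATIONS = [
    ("dc:commonNoun", "subClassOf", "dc:noun"),
]


def concepts_matching(target, relation_type="subClassOf"):
    """Return target plus every concept that reaches it via relation_type."""
    matched = {target}
    changed = True
    while changed:  # repeat until no new concept is added (transitive closure)
        changed = False
        for subject, rel, obj in RELATIONS:
            if rel == relation_type and obj in matched and subject not in matched:
                matched.add(subject)
                changed = True
    return matched


def find_terms(target):
    """List (resource, local term) pairs whose concept matches target."""
    wanted = concepts_matching(target)
    hits = []
    for resource, terms in SCHEMAS.items():
        for term, concept in terms.items():
            if concept in wanted:
                hits.append((resource, term))
    return sorted(hits)


print(find_terms("dc:noun"))
```

In this toy setup, a search for the hypothetical concept `dc:noun` returns the tag `N` from `corpusA` and also the tag `NCOM` from `corpusB`, because the relation registry records that the concept behind `NCOM` is a subclass of the one searched for.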


Providers: NARCIS