Contributions about the morphosyntactic structure of terminological units and about hybridization between terminology and topic models

Name: Contributions about the morphosyntactic structure of terminological units and about hybridization between terminology and topic models
Creator: Delamaire, Amaury
Keywords: Morphosyntaxe, Theme models, Modèles de thèmes, Morphosyntax, Évaluation, Hybridation, [INFO] Computer Science [cs], Evaluation, Terminology, Terminologie

Doctoral thesis . 2020

Data sources: HAL Clermont Université

Publication: Doctoral thesis, 01 Jan 2020, French

Authors: Delamaire, Amaury

Abstract

We present several experiments and hypotheses on automatic terminographic extraction and its potential hybridization with topic models. In natural language processing (NLP), there is little consensus on how terminologies should be built automatically. Researchers pursue different goals, which leads to disagreement about what does or does not constitute a terminological unit, and these disagreements arise at different levels of the task. On the linguistic level, researchers have reached a relative consensus on the morphosyntactic structure of terminological units. New proposals appear regularly, but they complement the consensus rather than invalidate it. Although the structure of the units is broadly agreed upon, there is no such agreement on how to characterize a candidate as a terminological unit. The terminological status of a unit is determined from several internal and external factors. Our experiments first examine the contexts in which terminological units occur, using topic models. We investigate whether and how terminological units can benefit the construction of topic models. This benefit is assessed through the relevance of the resulting models and through statistical measures. We then propose an extension of the morphosyntactic structure of terminological units and evaluate it experimentally.

Related Organizations

CLERMONT AUVERGNE INP
France
Clermont Université
France
French National Centre for Scientific Research
France
University of Clermont Auvergne
France
Mines Saint-Etienne
France

Keywords

Morphosyntaxe, Theme models, Modèles de thèmes, Morphosyntax, Évaluation, Hybridation, [INFO] Computer Science [cs], Evaluation, Terminology, Terminologie, Hybridization, Lexicon, Lexique

Impact by BIP!

selected citations: 0 (derived from selected sources; an alternative to the "Influence" indicator)
popularity: Average (the "current" impact or attention of the work in the research community, based on the underlying citation network)
influence: Average (the overall impact of the work in the research community, based on the underlying citation network, diachronically)
impulse: Average (the initial momentum of the work directly after its publication, based on the underlying citation network)

