Defining discourse formulae: computational approach

Publication: Article, Other literature type. Date: 18 Mar 2019. Publisher: EasyChair. Journal: EPiC Series in Language and Linguistics, volume 4, pages 61-51 (ISSN: 2398-5283)

Authors: Ekaterina Gerasimenko; Svetlana Puzhaeva; Elena Zakharova; Ekaterina Rakhilina

doi: 10.29007/k5q2


Abstract

In this paper, we address the problem of automatically extracting discourse formulae. By discourse formulae (DF) we mean a special type of construction at the discourse level that has a fixed form and serves as a typical response in dialogue. Unlike traditional constructions [4, 5, 6], DF do not contain variables within the sequence; their slots are found in the left-hand or right-hand statements of the speech act. We have developed a system that extracts DF from drama texts. We compared token-based and clause-based approaches and found that the latter performs better. The clause-based model uses a uniformly weighted vote of four classifiers and currently achieves a precision of 0.30 and a recall of 0.73 (F1-score 0.42). The resulting module was used to extract a list of DF from 420 drama texts of the 19th-21st centuries [1, 7]. The final list contains 3000 DF, 1800 of which are unique. Further development of the project includes enhancing the module by extracting left-context features and applying other models, as well as exploring what the DF concept looks like in other languages.
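To make the evaluation setup in the abstract more concrete, the following is a minimal sketch of a uniformly weighted vote over several binary classifiers, followed by the precision, recall and F1 computation. It is an illustration only, not the authors' implementation: the abstract does not name the four classifiers, their features, or the tie-breaking rule, so the function names `uniform_vote` and `precision_recall_f1`, the 0.5 threshold, the "ties count as not-DF" rule, and the toy labels are all assumptions.

```python
from typing import List, Sequence, Tuple


def uniform_vote(predictions: Sequence[Sequence[int]],
                 threshold: float = 0.5) -> List[int]:
    """Combine binary labels from several classifiers with equal weights.

    predictions[k][i] is classifier k's label for clause i
    (1 = discourse formula, 0 = not a discourse formula).
    A clause is labelled 1 only when the mean vote is strictly above
    `threshold`; this tie-breaking rule is an assumption of the sketch.
    """
    n_classifiers = len(predictions)
    n_items = len(predictions[0])
    return [
        1 if sum(p[i] for p in predictions) / n_classifiers > threshold else 0
        for i in range(n_items)
    ]


def precision_recall_f1(gold: Sequence[int],
                        pred: Sequence[int]) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy data: four hypothetical classifiers voting on five clauses.
votes = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1],
]
gold_labels = [1, 0, 0, 0, 1]

ensemble = uniform_vote(votes)
precision, recall, f1 = precision_recall_f1(gold_labels, ensemble)
```

F1 is the harmonic mean of precision and recall. Applied to the rounded figures in the abstract, 2 × 0.30 × 0.73 / (0.30 + 0.73) is roughly 0.425, which agrees with the reported F1-score of 0.42 up to rounding of the inputs.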

Related Organizations

National Research University Higher School of Economics
Russian Federation
Russian Academy of Sciences
Russian Federation


Fields of Science

medical and health sciences

other medical science
