Survey on evaluation methods for dialogue systems


Publication: Article, Other literature type, Preprint
25 Jun 2020
Embargo end date: 01 Jan 2019
France
Publisher: Springer Science and Business Media LLC
Journal: Artificial Intelligence Review, volume 54, pages 755-810 (issn: 0269-2821, eissn: 1573-7462,

Authors: Deriu, Jan; Rodrigo, Alvaro; Otegi, Arantxa; Echegoyen, Guillermo; Rosset, Sophie; Agirre, Eneko; Cieliebak, Mark

doi: 10.1007/s10462-020-09866-x , 10.48550/arxiv.1905.04071 , 10.21256/zhaw-20318

pmid: 33505103

pmc: PMC7817575

arXiv: http://arxiv.org/abs/1905.04071

Abstract

In this paper, we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation, in and of itself, is a crucial part during the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires. However, this tends to be very cost- and time-intensive. Thus, much work has been put into finding methods which allow a reduction in involvement of human labour. In this survey, we present the main concepts and methods. For this, we differentiate between the various classes of dialogue systems (task-oriented, conversational, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for the dialogue systems and then present the evaluation methods regarding that class.

Country

France

Related Organizations

National University of Distance Education (Spain)
Sorbonne University (France)
Laboratoire d'informatique de Paris 6 (France)
University of Paris-Saclay (France)
Applied Science University (Bahrain)

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial intelligence, Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Conversational, Computer Science - Human-Computer Interaction, Deep learning, [INFO] Computer Science [cs], 006: Spezielle Computerverfahren, Article, Human-Computer Interaction (cs.HC), Machine Learning (cs.LG), Artificial Intelligence (cs.AI), Evaluation metrics, Chatbots, Dialogue systems, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL], Discourse model, Evaluation, Computation and Language (cs.CL)

Impact by BIP!

citations: 163. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 1%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 1%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Top 1%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.