Towards automatic evaluation of customer-helpdesk dialogues

Zhaohao Zeng, Cheng Luo, Lifeng Shang, Hang Li, Tetsuya Sakai

Research output: Contribution to journal › Article

Abstract

We attempt to tackle the problem of evaluating textual, multi-round, task-oriented dialogues between a customer and a helpdesk, such as those that take the form of online chats. As an initial step towards the automatic evaluation of helpdesk agent systems, we have constructed a test collection, DCH-1, comprising 3,700 real customer-helpdesk multi-round dialogues mined from Weibo, a major Chinese microblogging platform. Each dialogue has been annotated with multiple subjective quality annotations and nugget annotations, where a nugget is a minimal sequence of posts by the same utterer that helps the customer advance towards problem solving. In addition, 34% of the dialogues have been manually translated into English. We first propose a nugget-based dialogue quality evaluation measure called Utility for Customer and Helpdesk (UCH). We then propose a simple neural network-based approach to predicting dialogue quality scores from the entire dialogue, which we call the Neural Evaluation Machine (NEM). According to our experiments with DCH-1, UCH correlates better with the appropriateness of utterances than with customer satisfaction. In contrast, because NEM leverages the natural language expressions within the dialogue, it correlates relatively well with customer satisfaction.
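The paper defines UCH formally, but that definition is not reproduced on this record page. The sketch below is therefore only an illustration of what a nugget-based utility measure could look like, assuming position-discounted nugget gains normalized per utterer role and averaged over the two roles; the `Nugget` fields and the discounting scheme are assumptions for illustration, not the paper's actual formula.

```python
from dataclasses import dataclass

@dataclass
class Nugget:
    position: int   # 1-based index of the post within the dialogue
    utterer: str    # "customer" or "helpdesk"
    gain: float     # annotated contribution towards problem solving

def uch_sketch(nuggets: list[Nugget]) -> float:
    """Illustrative nugget-based utility (NOT the paper's exact UCH):
    position-discounted gains, normalized by the undiscounted ideal,
    averaged over the customer and helpdesk roles."""
    def side_utility(side: str) -> float:
        total = sum(n.gain / n.position for n in nuggets if n.utterer == side)
        ideal = sum(n.gain for n in nuggets if n.utterer == side)
        return total / ideal if ideal > 0 else 0.0
    return 0.5 * (side_utility("customer") + side_utility("helpdesk"))
```

Under these assumptions, a dialogue whose helpful posts appear early scores close to 1.0, while one whose nuggets are buried late in a long exchange scores lower, which matches the intuition of rewarding efficient problem solving.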

Original language: English
Pages (from-to): 768-778
Number of pages: 11
Journal: Journal of Information Processing
Volume: 26
DOI: 10.2197/ipsjjip.26.768
Publication status: Published - 1 Jan 2018

Keywords

  • Dialogue
  • Evaluation
  • Helpdesk
  • Neural network
  • Nugget

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Towards automatic evaluation of customer-helpdesk dialogues. / Zeng, Zhaohao; Luo, Cheng; Shang, Lifeng; Li, Hang; Sakai, Tetsuya.

In: Journal of Information Processing, Vol. 26, 01.01.2018, p. 768-778.

@article{134c5943ff0c4234b2d2080d3040c525,
title = "Towards automatic evaluation of customer-helpdesk dialogues",
keywords = "Dialogue, Evaluation, Helpdesk, Neural network, Nugget",
author = "Zhaohao Zeng and Cheng Luo and Lifeng Shang and Hang Li and Tetsuya Sakai",
year = "2018",
month = "1",
day = "1",
doi = "10.2197/ipsjjip.26.768",
language = "English",
volume = "26",
pages = "768--778",
journal = "Journal of Information Processing",
issn = "0387-5806",
publisher = "Information Processing Society of Japan",

}

TY - JOUR

T1 - Towards automatic evaluation of customer-helpdesk dialogues

AU - Zeng, Zhaohao

AU - Luo, Cheng

AU - Shang, Lifeng

AU - Li, Hang

AU - Sakai, Tetsuya

PY - 2018/1/1

Y1 - 2018/1/1

KW - Dialogue

KW - Evaluation

KW - Helpdesk

KW - Neural network

KW - Nugget

UR - http://www.scopus.com/inward/record.url?scp=85063871962&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063871962&partnerID=8YFLogxK

U2 - 10.2197/ipsjjip.26.768

DO - 10.2197/ipsjjip.26.768

M3 - Article

AN - SCOPUS:85063871962

VL - 26

SP - 768

EP - 778

JO - Journal of Information Processing

JF - Journal of Information Processing

SN - 0387-5806

ER -