JP5910194B2

JP5910194B2 - Voice dialogue summarization apparatus, voice dialogue summarization method and program

Info

Publication number: JP5910194B2
Application number: JP2012056645A
Authority: JP
Inventors: 石川　開; 開石川; 貴士大西; 正明土田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2012-03-14
Filing date: 2012-03-14
Publication date: 2016-04-27
Anticipated expiration: 2032-03-14
Also published as: JP2013190991A

Description

The present invention relates to a voice dialogue summarizing apparatus, a voice dialogue summarizing method, and a program.

A call center serves as the "face of the company": it retains customers and expands revenue. Because response quality strongly affects customer loyalty, managers must not only make call center operations more efficient (response rate, utilization rate, handling time) but also improve and standardize the quality of responses.

At call center operation sites, response problems must be checked and response quality evaluated by listening to recordings of each communicator's actual calls. The points checked include the communicator's basic response skills, whether the dialogue follows the script, and whether the communicator made inappropriate remarks to the customer or caused discomfort.

However, playing back call recordings to check and evaluate their content takes time, so call centers need a support method that makes response monitoring (evaluation by listening to the audio) more efficient.

Non-Patent Document 1 describes an example of a summarization system. The "TRUE TELLER VOICE Digest" described there provides a function for generating response quality evaluation reports. FIG. 11 is a block diagram showing the configuration of the summarization system described in Non-Patent Document 1. Referring to FIG. 11, the summarization system includes speech recognition means 101, topic transition graph generation means 102, and script matching degree calculation means 103. The speech recognition means 101 performs speech recognition on telephone responses at a call center and outputs a dialogue text set as the recognition result. The topic transition graph generation means 102 matches the dialogue text set of the telephone responses against the talk script used in the call center (that is, a collection of expressions gathered for each topic) and generates a topic transition graph. FIG. 12 shows an example of a topic transition graph. The script matching degree calculation means 103 evaluates response quality based on the rate of matching with expressions in the talk script and generates an evaluation report.

Non-Patent Document 1: Nobuo Hori and Kazuaki Takehara, "Utilizing the 'Voice of the Customer' through Dialogue Summarization: Automatic Summarization of Telephone Responses and Monitoring of All Calls," NRI IT Solution Frontier, September 2010, pp. 10-13.

The following analysis was made by the present inventors.

The summarization system described in Non-Patent Document 1 cannot classify and organize response cases that do not match the dialogue flow of the talk script. For such response cases, therefore, the system does not make monitoring work more efficient.

The problem to be solved is therefore to make monitoring work more efficient, in a support method for call center response monitoring (evaluation by listening to the audio), even for sets of responses that deviate from the dialogue flow of the talk script. An object of the present invention is to provide a voice dialogue summarizing apparatus, a voice dialogue summarizing method, and a program that solve this problem.

A voice dialogue summarizing apparatus according to a first aspect of the present invention extracts a plurality of utterances from a set of dialogue texts and extracts, as an utterance cluster, those of the extracted utterances that are connected to one another by an implication relationship.

A voice dialogue summarizing method according to a second aspect of the present invention includes a step in which a computer extracts a plurality of utterances from a set of dialogue texts, and a step in which the computer extracts, as an utterance cluster, those of the extracted utterances that are connected to one another by an implication relationship.

A program according to a third aspect of the present invention causes a computer to execute a process of extracting a plurality of utterances from a set of dialogue texts, and a process of extracting, as an utterance cluster, those of the extracted utterances that are connected to one another by an implication relationship.
A voice dialogue summarizing apparatus according to a fourth aspect of the present invention includes: an utterance partial expression extraction unit that extracts, from a plurality of utterances included in a dialogue text set, the partial expressions implied by each utterance as utterance partial expressions; a directed graph generation unit that generates a directed graph whose vertices are the extracted utterance partial expressions and in which utterance partial expressions having an implication relationship with each other are connected by directed edges; and an utterance cluster generation unit that obtains a subgraph consisting of the vertices reachable by following directed edges in the forward direction from one vertex in the directed graph, and generates, as an utterance cluster, the set of utterances from which the utterance partial expressions corresponding to the vertices of the obtained subgraph were extracted.
A voice dialogue summarizing method according to a fifth aspect of the present invention includes: a step in which a computer extracts, from a plurality of utterances included in a dialogue text set, the partial expressions implied by each utterance as utterance partial expressions; a step of generating a directed graph whose vertices are the extracted utterance partial expressions and in which utterance partial expressions having an implication relationship with each other are connected by directed edges; and a step of obtaining a subgraph consisting of the vertices reachable by following directed edges in the forward direction from one vertex in the directed graph, and generating, as an utterance cluster, the set of utterances from which the utterance partial expressions corresponding to the vertices of the obtained subgraph were extracted.
A program according to a sixth aspect of the present invention causes a computer to execute: a process of extracting, from a plurality of utterances included in a dialogue text set, the partial expressions implied by each utterance as utterance partial expressions; a process of generating a directed graph whose vertices are the extracted utterance partial expressions and in which utterance partial expressions having an implication relationship with each other are connected by directed edges; and a process of obtaining a subgraph consisting of the vertices reachable by following directed edges in the forward direction from one vertex in the directed graph, and generating, as an utterance cluster, the set of utterances from which the utterance partial expressions corresponding to the vertices of the obtained subgraph were extracted.
The program can be provided as a program product recorded on a non-transient computer-readable storage medium.

With the voice dialogue summarizing apparatus, voice dialogue summarizing method, and program according to the present invention, a support method for call center response monitoring (evaluation by listening to the audio) can make monitoring work more efficient even for sets of responses that deviate from the dialogue flow of the talk script.

FIG. 1 is a block diagram showing an example of the configuration of the voice dialogue summarizing system according to the embodiment.
FIG. 2 is a flowchart showing an example of the operation of the voice dialogue summarizing system according to the embodiment.
FIG. 3 is a diagram showing an example of a dialogue text set in the example.
FIG. 4 is a diagram showing the dependency relationships between morphemes in the example.
FIG. 5 is a diagram showing partial expressions of an utterance in the example and whether each of them is implied by the utterance.
FIG. 6 is a diagram showing an example of the utterance partial expressions of the second sentence in the example.
FIG. 7 is a diagram showing the implication relationships between the utterance partial expressions of the first sentence and the second sentence in the example.
FIG. 8 is a diagram showing an example of a directed graph in the example, in which utterance partial expressions are vertices connected by directed edges pointing toward the implying side.
FIG. 9 is a diagram showing an example of extracting, from the directed graph in the example, a subgraph that does not include the starting vertex.
FIG. 10 is a diagram showing an example of an utterance cluster in the example.
FIG. 11 is a block diagram showing the configuration of the voice dialogue summarization (topic transition graph generation) system described in Non-Patent Document 1.
FIG. 12 is a diagram showing an example of the output (topic transition graph) of the voice dialogue summarization system described in Non-Patent Document 1.

First, an outline of one embodiment is described. The drawing reference numerals attached to this outline are examples given solely to aid understanding and are not intended to limit the present invention to the illustrated mode.

FIG. 1 is a block diagram showing an example of the configuration of the voice dialogue summarizing apparatus according to the present disclosure. Referring to FIG. 1, the voice dialogue summarizing apparatus includes an utterance partial expression extraction unit (11), an implication determination unit (12), a directed graph generation unit (13), and an utterance cluster generation unit (14). The utterance partial expression extraction unit (11) extracts, from each utterance in the dialogue text set, the partial expressions implied by that utterance as utterance partial expressions. When an implication relationship holds between utterance partial expressions extracted from different utterances, the directed graph generation unit (13) generates a directed graph in which those utterance partial expressions are vertices connected by a directed edge pointing toward the implying side. The utterance cluster generation unit (14) extracts the subgraph that can be reached by following directed edges in the forward direction from a vertex in the directed graph, and outputs, as an utterance cluster, the set of utterances from which the utterance partial expressions at the vertices of that subgraph were extracted.

With this voice dialogue summarizing apparatus, a support method for call center response monitoring (evaluation by listening to the audio) can make monitoring work more efficient even for response cases that do not match the dialogue flow of the talk script. The reason is that, even for such response cases, generating and presenting utterance clusters allows a set of utterances with the same content to be browsed together and their audio to be checked.

In the present invention, the following modes are possible.
[Mode 1]
As in the voice dialogue summarizing apparatus according to the first aspect.
[Mode 2]
The voice dialogue summarizing apparatus may include an utterance partial expression extraction unit that extracts, from each utterance extracted from the dialogue text set, a partial expression implied by the utterance as an utterance partial expression.
[Mode 3]
The voice dialogue summarizing apparatus may include a directed graph generation unit that, when an implication relationship holds between a first utterance partial expression extracted from a first utterance extracted from the dialogue text set and a second utterance partial expression extracted from a second utterance extracted from the dialogue text set, generates a graph in which the first utterance partial expression is a first vertex, the second utterance partial expression is a second vertex, and the first vertex and the second vertex are connected by an edge.
[Mode 4]
The voice dialogue summarizing apparatus may include an utterance cluster generation unit that, when an implication relationship holds between the first utterance partial expression and the second utterance partial expression, extracts an utterance set including the first utterance and the second utterance as the utterance cluster.
[Mode 5]
When the first utterance partial expression implies the second utterance partial expression, the directed graph generation unit may generate, as the graph, a directed graph in which the vertices are connected by a directed edge from the first vertex to the second vertex.
[Mode 6]
The utterance cluster generation unit may extract a subgraph obtained by following the directed graph in the forward direction, and extract, as the utterance cluster, the set of utterances from which the utterance partial expressions corresponding to the vertices included in the subgraph were extracted.
[Mode 7]
The utterance cluster generation unit may exclude, from the vertices included in the subgraph, any vertex shared with another subgraph, and extract, as the utterance cluster, the set of utterances from which the utterance partial expressions corresponding to the vertices included in the subgraph were extracted.
[Mode 8]
As in the voice dialogue summarizing method according to the second aspect.
[Mode 9]
The voice dialogue summarizing method may include a step in which the computer extracts, from each utterance extracted from the dialogue text set, a partial expression implied by the utterance as an utterance partial expression.
[Mode 10]
The voice dialogue summarizing method may include a step in which, when an implication relationship holds between a first utterance partial expression extracted from a first utterance extracted from the dialogue text set and a second utterance partial expression extracted from a second utterance extracted from the dialogue text set, the computer generates a graph in which the first utterance partial expression is a first vertex, the second utterance partial expression is a second vertex, and the first vertex and the second vertex are connected by an edge.
[Mode 11]
In the voice dialogue summarizing method, when an implication relationship holds between the first utterance partial expression and the second utterance partial expression, the computer may extract an utterance set including the first utterance and the second utterance as the utterance cluster.
[Mode 12]
In the voice dialogue summarizing method, when the first utterance partial expression implies the second utterance partial expression, the computer may generate, as the graph, a directed graph in which the vertices are connected by a directed edge from the first vertex to the second vertex.
[Mode 13]
The voice dialogue summarizing method may include a step in which the computer extracts a subgraph obtained by following the directed graph in the forward direction, and the set of utterances from which the utterance partial expressions corresponding to the vertices included in the subgraph were extracted may be extracted as the utterance cluster.
[Mode 14]
As in the program according to the third aspect.
[Mode 15]
The program may cause the computer to execute a process of extracting, from each utterance extracted from the dialogue text set, a partial expression implied by the utterance as an utterance partial expression.
[Mode 16]
The program may cause the computer to execute a process of generating, when an implication relationship holds between a first utterance partial expression extracted from a first utterance extracted from the dialogue text set and a second utterance partial expression extracted from a second utterance extracted from the dialogue text set, a graph in which the first utterance partial expression is a first vertex, the second utterance partial expression is a second vertex, and the first vertex and the second vertex are connected by an edge.
[Mode 17]
The program may cause the computer to execute a process of extracting, when an implication relationship holds between the first utterance partial expression and the second utterance partial expression, an utterance set including the first utterance and the second utterance as the utterance cluster.
[Mode 18]
The program may cause the computer to execute a process of generating, when the first utterance partial expression implies the second utterance partial expression, a directed graph in which the vertices are connected by a directed edge from the first vertex to the second vertex, as the graph.
[Mode 19]
The program may cause the computer to execute a process of extracting a subgraph obtained by following the directed graph in the forward direction, and a process of extracting, as the utterance cluster, the set of utterances from which the utterance partial expressions corresponding to the vertices included in the subgraph were extracted.

(Embodiment)
Next, the voice dialogue summarizing apparatus according to the embodiment is described in detail with reference to the drawings. FIG. 1 is a block diagram showing an example of the configuration of the voice dialogue summarizing apparatus according to the present embodiment. Referring to FIG. 1, the voice dialogue summarizing apparatus includes a computer (central processing unit; processor; data processing device) 10 that operates under program control, and a storage unit 20.

The storage unit 20 includes a dialogue text set storage unit 21 and an utterance cluster storage unit 22. The dialogue text set storage unit 21 stores a set of dialogue texts, and the utterance cluster storage unit 22 stores utterance clusters.

The computer (central processing unit; processor; data processing device) 10 includes an utterance partial expression extraction unit 11, an implication determination unit 12, a directed graph generation unit 13, and an utterance cluster generation unit 14.
The utterance partial expression extraction unit 11 extracts, from each utterance in the dialogue text set stored in the dialogue text set storage unit 21, the partial expressions implied by that utterance as utterance partial expressions.

The implication determination unit 12 determines whether an implication relationship exists between two utterance partial expressions. When such a relationship exists, the implication determination unit 12 also determines which of the two utterance partial expressions is the implying side.

When an implication relationship holds between utterance partial expressions extracted from different utterances, the directed graph generation unit 13 generates a directed graph in which those utterance partial expressions are vertices connected by a directed edge pointing toward the implying side.

The utterance cluster generation unit 14 extracts the subgraph obtained by following directed edges in the forward direction from a vertex in the directed graph, and outputs, as an utterance cluster, the set of utterances from which the utterance partial expressions at the vertices of that subgraph were extracted.

FIG. 2 is a flowchart showing an example of the operation of the voice dialogue summarizing apparatus of the present embodiment. The operation is described in detail below with reference to the block diagram of FIG. 1 and the flowchart of FIG. 2.

First, the utterance partial expression extraction unit 11 extracts a plurality of partial expressions, as candidates for utterance partial expressions, from each utterance in the dialogue text set stored in the dialogue text set storage unit 21, and uses the implication determination unit 12 to determine whether the utterance implies the content of each partial expression. The utterance partial expression extraction unit 11 then extracts the partial expressions determined to be implied by the utterance as the utterance partial expressions for that utterance (step A1).
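Step A1 can be illustrated with a minimal Python sketch. It assumes the implication determination unit is available as a callable; the hand-labelled `JUDGMENTS` table, the function names, and the English stand-in strings are hypothetical and serve only to show the filtering step, not how implication is actually judged.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical, hand-labelled judgments standing in for the implication
# determination unit: (utterance, partial expression) pairs judged "implied".
JUDGMENTS: Set[Tuple[str, str]] = {
    ("is it always the same place that jams",
     "is it the same place that jams"),
    ("is it always the same place that jams",
     "is it always the same place that jams"),
}


def table_entails(utterance: str, partial: str) -> bool:
    """Stand-in judge: looks the pair up in the hand-labelled table."""
    return (utterance, partial) in JUDGMENTS


def extract_utterance_partial_expressions(
    utterance: str,
    candidates: List[str],
    entails: Callable[[str, str], bool] = table_entails,
) -> List[str]:
    """Keep only the candidates that the utterance is judged to imply."""
    return [c for c in candidates if entails(utterance, c)]


utterance = "is it always the same place that jams"
candidates = [
    "is it the same place that jams",         # implied
    "is it always the same place that jams",  # implied (the utterance itself)
    "is it always that jams",                 # hypothetical non-implied candidate
]
selected = extract_utterance_partial_expressions(utterance, candidates)
print(selected)
```

Passing the judge as a parameter keeps the filtering step independent of whichever implication determination technique is actually used.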

Next, the directed graph generation unit 13 uses the implication determination unit 12 to determine whether an implication relationship holds between utterance partial expressions extracted from different utterances. When an implication relationship is determined to hold, the directed graph generation unit 13 generates a directed graph in which those utterance partial expressions are vertices connected by a directed edge pointing toward the implying side (step A2).
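A minimal Python sketch of step A2 follows. Two things are assumptions made for illustration: the judge `toy_entails` is a crude token-subset test standing in for the implication determination unit, and each edge is drawn from the implied expression to the implying one, following the "toward the implying side" wording of this embodiment. The utterance identifiers and English strings are hypothetical.

```python
from itertools import permutations
from typing import Callable, Dict, List, Set, Tuple

Vertex = Tuple[str, str]  # (utterance id, utterance partial expression)


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Toy stand-in judge: every token of the hypothesis occurs in the premise."""
    return set(hypothesis.split()) <= set(premise.split())


def build_directed_graph(
    partials: Dict[str, List[str]],
    entails: Callable[[str, str], bool] = toy_entails,
) -> Dict[Vertex, Set[Vertex]]:
    """Return an adjacency map with edges pointing toward the implying side."""
    vertices = [(uid, e) for uid, exprs in partials.items() for e in exprs]
    graph: Dict[Vertex, Set[Vertex]] = {}
    for a, b in permutations(vertices, 2):
        if a[0] == b[0]:
            continue  # only expressions from different utterances are compared
        if entails(a[1], b[1]):
            # a implies b, so the edge runs from b (implied) to a (implying).
            graph.setdefault(b, set()).add(a)
            graph.setdefault(a, set())
    return graph


partials = {
    "u1": ["jam always same place", "jam same place"],
    "u2": ["every time jam place same", "place same"],
}
graph = build_directed_graph(partials)
for vertex, successors in graph.items():
    print(vertex, "->", sorted(successors))
```

Vertices that take part in no implication relationship are left out of the map, which matches the description that only related expressions become connected vertices.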

Next, the utterance cluster generation unit 14 extracts the subgraph obtained by following directed edges in the forward direction from a vertex in the directed graph, and stores, in the utterance cluster storage unit 22, the set of utterances from which the utterance partial expressions at the vertices of that subgraph were extracted, as an utterance cluster (step A3).
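Step A3 amounts to a forward reachability search followed by a lookup of the source utterances. The sketch below assumes the directed graph is held as an adjacency map keyed by (utterance id, partial expression) pairs; that representation, the function names, and the sample data are illustrative assumptions.

```python
from collections import deque
from typing import Dict, Set, Tuple

Vertex = Tuple[str, str]  # (utterance id, utterance partial expression)


def reachable_subgraph(graph: Dict[Vertex, Set[Vertex]], start: Vertex) -> Set[Vertex]:
    """Breadth-first search along directed edges in the forward direction."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for successor in graph.get(current, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def utterance_cluster(graph: Dict[Vertex, Set[Vertex]], start: Vertex) -> Set[str]:
    """Collect the utterances whose partial expressions lie in the subgraph."""
    return {uid for uid, _ in reachable_subgraph(graph, start)}


# Hypothetical graph: u1 and u2 are connected, u3 is unrelated.
graph = {
    ("u2", "place same"): {("u1", "jam same place"), ("u1", "jam always same place")},
    ("u1", "jam same place"): {("u2", "every time jam place same")},
    ("u3", "toner is low"): set(),
}
cluster = utterance_cluster(graph, ("u2", "place same"))
print(sorted(cluster))
```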

The voice dialogue summarizing apparatus of the present embodiment is configured so that, even for response cases that do not match the dialogue flow of the talk script, generating and presenting utterance clusters allows a set of utterances with the same content to be browsed together and their audio to be checked. The apparatus can therefore make monitoring work more efficient even for response cases that do not match the dialogue flow of the talk script.

Next, the operation of the voice dialogue summarizing apparatus of the present embodiment is described using a specific example, with reference to the block diagram of FIG. 1 and the flowchart of FIG. 2.

First, the utterance partial expression extraction unit 11 extracts a plurality of partial expressions, as candidates for utterance partial expressions, from each utterance in a dialogue text set such as the one shown in FIG. 3, which is stored in the dialogue text set storage unit 21.

For example, when the utterance is 「詰まるのはいつも同じところなんでしょうか。」 ("Is it always the same place that jams?"), the utterance partial expression extraction unit 11 performs morphological analysis and dependency (syntactic) structure analysis to generate the dependency relationships between morphemes shown in FIG. 4, and extracts, from subsets of those dependency relationships, a plurality of partial expressions such as those shown in the right column of FIG. 5.
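One way to enumerate candidate partial expressions from a dependency structure is sketched below in Python. The segmentation into four chunks and the head indices are assumptions made for illustration and are not taken from FIG. 4; treating "a subset of the dependency relationships" as a set of chunks that contains the root and is closed under heads is likewise an assumed reading.

```python
from itertools import combinations
from typing import List, Optional


def candidate_partial_expressions(
    chunks: List[str], heads: List[Optional[int]]
) -> List[str]:
    """Enumerate chunk subsets that keep the root and every kept chunk's head."""
    root = heads.index(None)
    others = [i for i in range(len(chunks)) if i != root]
    results = []
    for size in range(len(others) + 1):
        for picked in combinations(others, size):
            kept = set(picked) | {root}
            # A chunk may stay only if the chunk it depends on also stays.
            if all(heads[i] in kept for i in picked):
                results.append("".join(chunks[i] for i in sorted(kept)))
    return results


# Assumed segmentation and dependency heads (index of the governing chunk).
chunks = ["詰まるのは", "いつも", "同じ", "ところなんでしょうか。"]
heads = [3, 2, 3, None]

candidates = candidate_partial_expressions(chunks, heads)
for candidate in candidates:
    print(candidate)
```

Under these assumptions the enumeration yields six candidates, including the full utterance and 「詰まるのは同じところなんでしょうか。」; each candidate would then be passed to the implication determination unit.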

Next, the utterance partial expression extraction unit 11 uses the implication determination unit 12 to determine whether the utterance 「詰まるのはいつも同じところなんでしょうか。」 implies the content of each of these partial expressions, and obtains the implication results shown in the left column of FIG. 5.

The utterance partial expression extraction unit 11 therefore extracts the partial expressions determined to be implied by the utterance, 「詰まるのは同じところなんでしょうか。」 ("Is it the same place that jams?") and 「詰まるのはいつも同じところなんでしょうか。」 ("Is it always the same place that jams?"), as the utterance partial expressions for that utterance (step A1 in FIG. 2).

Next, the directed graph generation unit 13 uses the implication determination unit 12 to determine whether an implication relationship holds between utterance partial expressions extracted from different utterances.

Here, the different utterances are assumed to be two sentences: the first sentence, 「詰まるのはいつも同じところなんでしょうか。」 ("Is it always the same place that jams?"), and the second sentence, 「毎回紙詰まりの場所は一緒でしょうか。」 ("Is the paper jam in the same place every time?"). Two sentences are used for convenience of explanation; the same applies to three or more. The utterance partial expression extraction unit 11 likewise obtains utterance partial expressions for the second sentence, as shown in FIG. 6.

有向グラフ生成部13は、第１文と第２文の各発話部分表現間の含意関係を、図７に示すように求める。また、有向グラフ生成部13は、発話部分表現の間に含意関係が成り立つと判定された場合、それらの発話部分表現を頂点とし、その間を含意する側に向かう有向辺で結んだ有向グラフを生成する（ステップA2）。図８は、有向グラフ生成部13によって生成された有向グラフを一例として示す。 The directed graph generation unit 13 obtains the implication relations between the utterance partial expressions of the first sentence and those of the second sentence, as shown in FIG. 7. When an implication relation is determined to hold between utterance partial expressions, the directed graph generation unit 13 generates a directed graph in which those utterance partial expressions are vertices connected by directed edges pointing toward the implying side (step A2). FIG. 8 shows an example of the directed graph generated by the directed graph generation unit 13.
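Step A2 can be sketched as building an adjacency map over (utterance, expression) pairs. This is a minimal sketch under stated assumptions: the function name `build_entailment_graph`, the vertex representation, and the hand-labelled judgments in `JUDGED_TRUE` are hypothetical and do not reproduce FIG. 7 or FIG. 8. Following the description, each edge runs from the implied expression toward the implying one, and only expressions from different utterances are compared.

```python
from itertools import permutations

def build_entailment_graph(vertices, entails):
    """vertices: list of (utterance_id, expression) pairs.
    Returns {implied vertex: set of implying vertices}."""
    graph = {v: set() for v in vertices}
    for a, b in permutations(vertices, 2):
        if a[0] == b[0]:
            continue  # compare only expressions from different utterances
        if entails(a[1], b[1]):  # a implies b
            graph[b].add(a)      # edge points toward the implying side
    return graph

S1_SHORT = (1, "詰まるのは同じところなんでしょうか。")
S1_FULL = (1, "詰まるのはいつも同じところなんでしょうか。")
S2_SHORT = (2, "場所は一緒でしょうか。")
S2_FULL = (2, "毎回紙詰まりの場所は一緒でしょうか。")

# Hypothetical judgments standing in for the implication determination unit 12.
JUDGED_TRUE = {
    (S1_SHORT[1], S2_SHORT[1]),
    (S1_FULL[1], S2_SHORT[1]),
}

def toy_entails(premise, hypothesis):
    return (premise, hypothesis) in JUDGED_TRUE

graph = build_entailment_graph([S1_SHORT, S1_FULL, S2_SHORT, S2_FULL], toy_entails)
```

Keeping the utterance identifier in each vertex means the source utterance of every expression is still available when clusters are formed later.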

次に、発話クラスタ生成部14は、有向グラフ中のある頂点から順方向で有向辺を辿れる部分グラフを抽出する。有向グラフが図８に示す構造を有する場合、発話クラスタ生成部14は、「場所は一緒でしょうか。」という発話部分表現に対応する頂点から順方向で有向辺を辿れる、破線で囲った部分グラフを抽出する。ここで、部分グラフの抽出の際、順方向で有向辺を辿る際の起点となる頂点は、別の部分グラフと共有される可能性があるため、発話クラスタ生成部14は、図９に示すように、起点を含まない部分グラフを抽出してもよい。 Next, the utterance cluster generation unit 14 extracts a subgraph that can be reached by following directed edges in the forward direction from a certain vertex in the directed graph. When the directed graph has the structure shown in FIG. 8, the utterance cluster generation unit 14 extracts the subgraph enclosed by the broken line, which can be reached by following directed edges in the forward direction from the vertex corresponding to the utterance partial expression “Is the location the same?”. Because the vertex that serves as the starting point when following directed edges in the forward direction may be shared with another subgraph, the utterance cluster generation unit 14 may instead extract a subgraph that does not include the starting point, as shown in FIG. 9.
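The subgraph extraction described here is forward reachability from a chosen vertex, which can be sketched with a breadth-first search. The function name `reachable_vertices`, the `include_start` flag, and the small example graph are assumptions for illustration. The edges do not reproduce FIG. 8.

```python
from collections import deque

def reachable_vertices(graph, start, include_start=True):
    """Return the vertices reachable from `start` by following directed
    edges forward. `graph` maps each vertex to its successor vertices."""
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for successor in graph.get(vertex, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    if not include_start:
        seen.discard(start)  # variant that leaves out the starting point
    return seen

# Hypothetical graph: edges run from the implied to the implying expression.
graph = {
    "場所は一緒でしょうか。": ["詰まるのは同じところなんでしょうか。"],
    "詰まるのは同じところなんでしょうか。": ["詰まるのはいつも同じところなんでしょうか。"],
    "詰まるのはいつも同じところなんでしょうか。": [],
}
with_start = reachable_vertices(graph, "場所は一緒でしょうか。")
without_start = reachable_vertices(graph, "場所は一緒でしょうか。", include_start=False)
```

The `seen` set also guards against cycles, which can arise when two expressions imply each other.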

抽出された部分グラフ中の各頂点である発話部分表現の抽出元である発話集合は、第１文「詰まるのはいつも同じところなんでしょうか。」と、第２文「毎回紙詰まりの場所は一緒でしょうか。」の２文である。したがって、発話クラスタ生成部14は、これらの発話を、図１０に示すように、発話クラスタとして発話クラスタ記憶部22に記憶する（ステップA3）。 The set of utterances from which the utterance partial expressions at the vertices of the extracted subgraph were extracted consists of two sentences: the first sentence, “Is it always the same place that jams?”, and the second sentence, “Is the paper jam in the same location every time?”. Therefore, the utterance cluster generation unit 14 stores these utterances in the utterance cluster storage unit 22 as an utterance cluster, as shown in FIG. 10 (step A3).
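Step A3 maps the vertices of the extracted subgraph back to the utterances they came from. The sketch below assumes vertices are stored as (utterance_id, expression) pairs and that `utterances` maps each identifier to its text. Both conventions, the function name `utterance_cluster`, and the example vertex set are hypothetical.

```python
def utterance_cluster(subgraph_vertices, utterances):
    """Return the source utterances of the given vertices, without
    duplicates, ordered by utterance identifier."""
    source_ids = sorted({utterance_id for utterance_id, _ in subgraph_vertices})
    return [utterances[i] for i in source_ids]

utterances = {
    1: "詰まるのはいつも同じところなんでしょうか。",
    2: "毎回紙詰まりの場所は一緒でしょうか。",
}

# Assumed vertex set of an extracted subgraph (illustrative only).
subgraph_vertices = {
    (2, "場所は一緒でしょうか。"),
    (1, "詰まるのは同じところなんでしょうか。"),
    (1, "詰まるのはいつも同じところなんでしょうか。"),
}
cluster = utterance_cluster(subgraph_vertices, utterances)
```

Several vertices can share one source utterance, so the identifiers are collected into a set before the texts are looked up.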

本実施形態に係る音声対話要約装置は、トークスクリプトの対話フローと一致しない応対事例であっても、発話クラスタを生成・提示することで、同じ発話内容に関する発話の集合を一括で閲覧、音声を確認可能とするように構成されている。したがって、本実施形態の音声対話要約装置によると、トークスクリプトの対話フローと一致しない応対事例であっても、モニタリング業務を効率化することができる。 The voice dialogue summarizing apparatus according to the present embodiment is configured to generate and present utterance clusters even for response cases that do not match the dialogue flow of the talk script, so that a set of utterances concerning the same utterance content can be browsed collectively and their audio checked. Therefore, according to the voice dialogue summarizing apparatus of the present embodiment, monitoring work can be made more efficient even for response cases that do not match the dialogue flow of the talk script.

本実施形態に係る音声対話要約装置は、コンタクトセンターにおける通話音声からの応対品質評価や、テキストマイニング、検索、ＦＡＱ（Frequently Asked Questions）作成支援といった用途に適用することができる。 The voice dialogue summarizing apparatus according to the present embodiment can be applied to uses such as response quality evaluation from call voice in a contact center, text mining, search, and FAQ (Frequently Asked Questions) creation support.

なお、上記の非特許文献の開示を、本書に引用をもって繰り込むものとする。本発明の全開示（請求の範囲を含む）の枠内において、さらにその基本的技術思想に基づいて、実施形態の変更・調整が可能である。また、本発明の請求の範囲の枠内において種々の開示要素（各請求項の各要素、各実施形態の各要素、各図面の各要素等を含む）の多様な組み合わせ、ないし、選択が可能である。すなわち、本発明は、請求の範囲を含む全開示、技術的思想にしたがって当業者であればなし得るであろう各種変形、修正を含むことは勿論である。 The disclosure of the above non-patent document is incorporated herein by reference. Within the framework of the entire disclosure (including the claims) of the present invention, and based on its basic technical concept, the embodiments may be changed and adjusted. Various combinations or selections of the disclosed elements (including each element of each claim, each element of each embodiment, each element of each drawing, and the like) are possible within the scope of the claims of the present invention. That is, the present invention naturally includes the various variations and modifications that those skilled in the art could make in accordance with the entire disclosure, including the claims, and the technical concept.

１０コンピュータ
１１発話部分表現抽出部
１２含意判定部
１３有向グラフ生成部
１４発話クラスタ生成部
２０記憶部
２１対話テキスト集合記憶部
２２発話クラスタ記憶部
１０１音声認識手段
１０２トピック遷移グラフ生成手段
１０３スクリプト一致度計算手段

DESCRIPTION OF SYMBOLS
10 Computer
11 Utterance partial expression extraction unit
12 Implication determination unit
13 Directed graph generation unit
14 Utterance cluster generation unit
20 Storage unit
21 Dialogue text set storage unit
22 Utterance cluster storage unit
101 Speech recognition means
102 Topic transition graph generation means
103 Script matching degree calculation means

Claims

An utterance partial expression extraction unit that extracts a partial expression implied by each utterance as an utterance partial expression from a plurality of utterances included in the conversation text set;
A directed graph generation unit for generating a directed graph in which the extracted utterance partial representations are vertices and the utterance partial representations that have an implication relationship with each other are connected by directed edges;
An utterance cluster generation unit that obtains a subgraph consisting of vertices that can be reached by following directed edges in the forward direction from one vertex in the directed graph, and generates, as an utterance cluster, the set of utterances from which the utterance partial expressions corresponding to the vertices included in the obtained subgraph were extracted;
A speech dialogue summarizing apparatus characterized by the above.

The directed graph generation unit generates, as the directed graph, a directed graph in which utterance partial expressions that have an implication relationship with each other are connected by a directed edge that runs from the utterance partial expression on the implied side to the utterance partial expression on the implying side.
The speech dialogue summarizing apparatus according to claim 1.

A computer extracting a partial expression implied by each utterance as an utterance partial expression from a plurality of utterances included in the conversation text set;
Generating a directed graph in which the extracted utterance partial representations are vertices and the utterance partial representations that have an implication relationship with each other are connected by directed edges;
Obtaining a subgraph consisting of vertices that can be reached by following directed edges in the forward direction from one vertex in the directed graph, and generating, as an utterance cluster, the set of utterances from which the utterance partial expressions corresponding to the vertices included in the obtained subgraph were extracted;
A speech dialogue summarization method characterized by the above.

The computer generates, as the directed graph, a directed graph in which utterance partial expressions that have an implication relationship with each other are connected by a directed edge that runs from the utterance partial expression on the implied side to the utterance partial expression on the implying side.
The voice dialogue summarizing method according to claim 3.

A process of extracting a partial expression implied by each utterance as an utterance partial expression from a plurality of utterances included in the conversation text set;
A process of generating a directed graph in which the extracted utterance partial representations are vertices and the utterance partial representations that have implication relations are connected by directed edges;
A process of obtaining a subgraph consisting of vertices that can be reached by following directed edges in the forward direction from one vertex in the directed graph, and generating, as an utterance cluster, the set of utterances from which the utterance partial expressions corresponding to the vertices included in the obtained subgraph were extracted;
A program characterized by causing a computer to execute the above processes.

Causing the computer to execute a process of generating, as the directed graph, a directed graph in which utterance partial expressions that have an implication relationship with each other are connected by a directed edge that runs from the utterance partial expression on the implied side to the utterance partial expression on the implying side,
The program according to claim 5.