JP2010021681A - Reception flow creation program, reception flow creation method, and reception flow creating apparatus

Info

Publication number: JP2010021681A
Application number: JP2008178803A
Authority: JP
Inventors: Sachiko Onodera; 佐知子小野寺; Isao Nanba; 功難波
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2008-07-09
Filing date: 2008-07-09
Publication date: 2010-01-28
Anticipated expiration: 2028-07-09
Also published as: JP5136246B2

Abstract

PROBLEM TO BE SOLVED: To provide a reception flow creation program, a reception flow creation method, and a reception flow creation apparatus that can appropriately handle silent intervals occurring during a call and make it easy to identify problems in each procedure.

SOLUTION: A computer is made to function as: a procedure keyword table recording means; an utterance interval extraction means that extracts the utterance intervals of a customer and an agent; a voice recognition means that performs voice recognition on the call recording data; a procedure extraction means that extracts, based on the procedure keyword table, the procedure corresponding to a recognized keyword included in the voice recognition result; a procedure time calculation means that calculates the time from the start to the end of the procedure and, if there is a silent interval outside the utterance intervals, treats the speaker immediately after the silent interval as the originator of that silent interval and adds its duration to the time from the start to the end of the procedure; and a reception flow creation means that creates and outputs the reception flow based on the time from the start to the end of the procedure.

COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to a reception flow creation program, a reception flow creation method, and a reception flow creation apparatus, and more particularly to a program, method, and apparatus that create a reception flow from call recording data in which calls handling customers are recorded.

For example, in a call center such as a help desk that answers questions from customers, the time taken to answer the same question differs from one responder (agent) to another. This is thought to be because agents differ in the procedure they follow to isolate the circumstances and causes of a question and reach its essential problem, and in the procedure they follow to explain the answer.

Meanwhile, a call center manager wants to shorten the time required for each response without impairing its quality, and thereby make call center operations more efficient. To that end, the manager wants to clarify the differences in the procedures for isolating the circumstances and causes of a question and reaching its essential problem, and in the procedures for explaining the answer, so that every agent can answer using the optimal procedure.

In current call centers, the response record retains the content of the customer's question (for example, "the flexible disk cannot be read") and the final answer to that question (for example, "the customer ran xxx.EXE on the program disk and, after rebooting, confirmed that the flexible disk could be opened").

However, the response record does not retain how the final answer was reached. For example, it states that xxx.EXE on the program disk was executed, but not the procedure by which it was executed. Thus the remaining response records reveal neither the procedure for isolating the circumstances and causes of the question and reaching its essential problem (the problem isolation procedure) nor the procedure for explaining the answer (the answer procedure).

Call centers do, however, record every call for each customer response and keep it as call recording data. By listening to this call recording data, the call center can recover the entire problem isolation procedure and answer procedure (explanation procedure) after the fact. In practice, though, it takes an enormous amount of time for a person to listen to call recording data and examine the problem isolation and explanation procedures.

Techniques for extracting characteristic parts from voice dialogue data such as call recording data include, for example, Patent Document 1, which extracts question-response parts using their utterance characteristics. Techniques that use call information in a call center include, for example, Patent Document 2, which measures the processing time for one case from the duration of the calls related to it.

Patent Document 1: JP 2005-242891 A
Patent Document 2: JP 2003-298748 A

To extract each procedure, such as the explanation procedure, from call recording data and obtain the time required for each, three means are needed: means for extracting procedures from the call recording data, means for calculating the time required for each procedure, and means for processing the silent sections (call interruption sections) peculiar to call center calls. In conventional call centers in particular, how to handle these silent sections had not been studied.

During a call between an agent and a customer in a call center, there are silent sections in which the conversation is interrupted. A silent section may arise because the agent keeps the customer waiting while looking up the answer, because the agent keeps the customer waiting while searching customer information (such as past responses), because the customer keeps the agent waiting while checking the details of the question, or because the customer keeps the agent waiting while actually carrying out the remedy the agent has explained. Who caused the silence, and for what purpose, differs each time.

Therefore, when a silent section exists during a call, it is necessary to determine which procedure caused it, which procedure's required time it should be counted in, and who caused it.

One embodiment of the present invention has been made in view of the above points, and its object is to provide a reception flow creation program, a reception flow creation method, and a reception flow creation apparatus that can appropriately handle silent sections occurring during a call and make it easy to identify problems in each procedure.

To solve the above problem, one embodiment of the present invention is a reception flow creation program that, in order to create a reception flow from call recording data in which calls handling customers are recorded, causes a computer to function as: procedure keyword table recording means for recording in advance, as a procedure keyword table, for each procedure of handling a customer, the keywords the agent utters to explain the procedure in association with the keywords the customer utters in response to that explanation; utterance section extraction means for extracting the utterance sections of the customer and the agent based on prosody information extracted from the call recording data; speech recognition means for performing speech recognition on the call recording data and outputting the recognized keywords, speakers, and appearance times as a speech recognition result; procedure extraction means for reading the procedure keyword table and extracting, based on it, the procedure corresponding to each recognized keyword included in the speech recognition result; procedure time calculation means for calculating the time from the start to the end of the procedure based on the utterance sections extracted by the utterance section extraction means, the speech recognition result output by the speech recognition means, and the procedure corresponding to the recognized keyword extracted by the procedure extraction means, and, when there is a silent section outside the utterance sections, treating the speaker immediately after the silent section as the originator of the silent section and adding the duration of the silent section to the time from the start to the end of the procedure; and reception flow creation means for creating and outputting the reception flow based on the time from the start to the end of the procedure.

Applications of the components, expressions, or any combination of the components of one embodiment of the present invention to a method, an apparatus, a system, a computer program, a recording medium, a data structure, or the like are also effective as aspects of the present invention.

As described above, one embodiment of the present invention can provide a reception flow creation program, a reception flow creation method, and a reception flow creation apparatus that can appropriately handle silent sections occurring during a call and make it easy to identify problems in each procedure.

Next, the best mode for carrying out the present invention will be described on the basis of the following embodiment, with reference to the drawings.

FIG. 1 is an explanatory diagram showing an example outline of the present embodiment. The reception flow creation apparatus 1 of the present embodiment takes as input call recording data in which every call for each customer response is recorded. From the input call recording data, the reception flow creation apparatus 1 identifies procedure locations and calculates the time required for each procedure, as described later.

When a silent section exists before or after an identified procedure location, the reception flow creation apparatus 1 determines, as described later, which procedure caused the silent section and which procedure it should be counted in. The reception flow creation apparatus 1 also creates a reception flow (the flow of each procedure, such as the explanation procedure) from the identified procedure locations and the time required for each procedure, as described later. The reception flow creation apparatus 1 outputs the flow of the procedures and the time required for each procedure.

FIG. 2 is a conceptual diagram showing an example of the processing that identifies procedure locations in the input call recording data and calculates the time required for each procedure. To identify a procedure location, the reception flow creation apparatus 1 searches the call recording data for a place where a keyword (KW) indicating an agent procedure appears together with a customer keyword uttered in response to that agent keyword. The reception flow creation apparatus 1 takes the span from the start of the utterance section containing the agent keyword to the end of the utterance section containing the responding customer keyword as the time spent on the procedure indicated by the keyword (the required time).

For example, in FIG. 2, the span from the start of utterance section 10, which contains the agent keywords "program disk, double click", to the end of utterance section 11, which contains the customer keyword "program disk" uttered in response, is taken as the required time of the procedure indicated by the keywords "program disk, double click".
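The required-time rule of FIG. 2 can be sketched in a few lines of Python. This is only an illustrative sketch: the `Utterance` record, the function name, and the matching policy (all agent keywords must appear in one utterance section, any one customer keyword suffices) are assumptions made for the example and are not specified in the source.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str          # "agent" or "customer" (hypothetical labels)
    start: float          # start time of the utterance section, in seconds
    end: float            # end time of the utterance section, in seconds
    keywords: frozenset   # keywords recognized inside this section


def procedure_duration(utterances, agent_kws, customer_kws):
    """Return (start, end, duration) of one procedure, or None if not found.

    The procedure starts at the start of the first agent section that
    contains all agent keywords (assumed policy) and ends at the end of the
    first later customer section containing any responding customer keyword.
    """
    start = None
    for u in utterances:
        if start is None:
            if u.speaker == "agent" and agent_kws <= u.keywords:
                start = u.start
        elif u.speaker == "customer" and customer_kws & u.keywords:
            return start, u.end, u.end - start
    return None
```

With an agent section from 10.0 s to 14.0 s containing "program disk" and "double click", followed by a customer section from 15.0 s to 16.5 s containing "program disk", the function reports a procedure running from 10.0 s to 16.5 s.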

FIG. 3 is a conceptual diagram showing an example of how gaps are handled when calculating the time required for each procedure. Because speech recognition cannot be relied on to extract every keyword, the reception flow creation apparatus 1 estimates which procedure falls into a gap when a gap is left between the identified procedures.

For example, in FIG. 3, the span from the start of utterance section 20, which contains the agent keywords "program disk, double click", to the end of utterance section 21, which contains the customer keyword "yes" uttered in response, is taken as procedure 2, the procedure indicated by the keywords "program disk, double click".

Also in FIG. 3, the span from the start of utterance section 24, which contains the agent keywords "folder, open", to the end of utterance section 25, which contains the customer keyword "opened" uttered in response, is taken as procedure 4, the procedure indicated by the keywords "folder, open".

If it is known that procedure 3 lies between procedure 2 and procedure 4, the reception flow creation apparatus 1 estimates the gap between them, from the start of utterance section 22 to the end of utterance section 23, to be procedure 3.
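The gap estimation can be illustrated with the following sketch. The tuple layouts, the function name, and the restriction to gaps where exactly one procedure is missing from the known order are assumptions made for this example; the source only states that a known intermediate procedure is assigned to the gap.

```python
def fill_gaps(found, order, utterances):
    """Assign a missing procedure to the gap between two identified ones.

    found:      list of (procedure_id, start, end), sorted by start time
    order:      the known procedure sequence, e.g. [1, 2, 3, 4]
    utterances: list of (start, end) utterance sections, sorted by start time
    """
    result = list(found)
    for (p_a, _, end_a), (p_b, start_b, _) in zip(found, found[1:]):
        if p_a not in order or p_b not in order:
            continue
        # Procedures that the known order places strictly between p_a and p_b.
        missing = order[order.index(p_a) + 1:order.index(p_b)]
        if len(missing) != 1:
            continue  # assumption: only an unambiguous single gap is filled
        # Utterance sections lying entirely inside the gap.
        inside = [(s, e) for s, e in utterances if s >= end_a and e <= start_b]
        if inside:
            result.append((missing[0], inside[0][0], inside[-1][1]))
    return sorted(result, key=lambda r: r[1])
```

The estimated procedure spans from the start of the first utterance section in the gap to the end of the last one, mirroring utterance sections 22 and 23 in FIG. 3.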

FIG. 4 is a conceptual diagram showing an example of how silent sections are handled when calculating the time required for each procedure. From where a silent section appears, the reception flow creation apparatus 1 identifies which procedure the silent section corresponds to and whether it is a valid silent section.

For example, in FIG. 4, the span from the start of utterance section 30, which contains the agent keywords "program disk, double click", to the end of utterance section 32, which contains the customer keyword "yes" uttered in response, is taken as the procedure indicated by the keywords "program disk, double click".

The procedure indicated by the keywords "program disk, double click" contains silent section 33 between utterance section 31 and utterance section 32. The reception flow creation apparatus 1 identifies who caused silent section 33 from who the first speaker immediately after it is.

Specifically, because the speaker of utterance section 32, which immediately follows silent section 33, is the customer, the reception flow creation apparatus 1 estimates that silent section 33 was caused by the customer. If the speaker of the utterance section immediately following silent section 33 were the agent, the reception flow creation apparatus 1 would estimate that silent section 33 was caused by the agent.
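The attribution rule above (the first speaker after a silent section is treated as its originator) might be sketched as follows. The tuple layout and the `min_gap` parameter are assumptions made for this example; the source does not state a minimum length for a silent section.

```python
def attribute_silences(utterances, min_gap=0.0):
    """Find silent sections and attribute each to the next speaker.

    utterances: list of (speaker, start, end) covering both channels,
                sorted by start time.
    min_gap:    hypothetical minimum pause length, in seconds, for a pause
                to count as a silent section.
    Returns a list of (silence_start, silence_end, originator).
    """
    silences = []
    if not utterances:
        return silences
    latest_end = utterances[0][2]
    for speaker, start, end in utterances[1:]:
        if start - latest_end > min_gap:
            # Nobody is speaking between latest_end and start; the speaker
            # who breaks the silence is treated as having caused it.
            silences.append((latest_end, start, speaker))
        latest_end = max(latest_end, end)
    return silences
```

Tracking the latest end time rather than the previous utterance's end keeps overlapping speech on the two channels from being mistaken for silence.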

FIG. 5 is a conceptual diagram showing another example of how silent sections are handled when calculating the time required for each procedure. In FIG. 5, silent section 44 appears after utterance section 41, which contains the customer keyword "yes" uttered in response to the agent keywords "program disk, double click".

Depending on whether an agent operation accompanies the procedure indicated by the agent keywords "program disk, double click", the reception flow creation apparatus 1 determines, as described later, whether silent section 44 is valid and whether silent section 44 should be regarded as part of that procedure.

Specifically, if the procedure spanning the start of utterance section 40 to the end of utterance section 41 has associated agent post-work, the reception flow creation apparatus 1 presumes that the post-work is being carried out during silent section 44 and adds silent section 44 to the time required for that procedure. In this case the reception flow creation apparatus 1 also judges silent section 44 to be valid.

If that procedure has no associated agent post-work, and the procedure spanning the start of utterance section 42 to the end of utterance section 43 has no associated agent pre-work, the reception flow creation apparatus 1 still adds silent section 44 to the time required for the procedure spanning the start of utterance section 40 to the end of utterance section 41, but judges silent section 44 to be not valid.
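The two cases described for FIG. 5 can be written as a small decision function. This is a sketch under stated assumptions: the function and argument names are hypothetical, and the remaining case (no post-work on the preceding procedure but pre-work on the following one) is not described in this passage, so the sketch leaves it undecided rather than guessing.

```python
def classify_trailing_silence(prev_has_post_work, next_has_pre_work):
    """Decide how to treat a silent section lying between two procedures.

    Returns (owner, valid), where owner names the procedure whose required
    time the silent section is added to, or None for the case this passage
    does not describe.
    """
    if prev_has_post_work:
        # The agent is presumed to be doing post-work during the silence.
        return ("previous", True)
    if not next_has_pre_work:
        # No operation explains the silence: still counted, but not valid.
        return ("previous", False)
    return None  # not covered by the two cases above


def add_trailing_silence(prev_duration, silence_duration, decision):
    """Add the silence to the preceding procedure when the decision says so."""
    if decision is not None and decision[0] == "previous":
        return prev_duration + silence_duration
    return prev_duration
```

Keeping the validity flag separate from the time accounting reflects the text: an unexplained silence still lengthens the procedure but is marked as not valid.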

FIG. 6 is a hardware configuration diagram of an example of the reception flow creation apparatus. The reception flow creation apparatus 1 comprises an input device 51, an output device 52, a drive device 53, an auxiliary storage device 54, a main storage device 55, an arithmetic processing unit 56, and an interface device 57, all interconnected by a bus B.

The input device 51 consists of a keyboard, a mouse, and the like, and is used to input various signals. The output device 52 consists of a display device and the like, and is used to display various windows, data, and so on. The interface device 57 consists of a modem, a LAN card, or the like, and is used to connect to a network.

The reception flow creation program of the present embodiment is at least part of the various programs that control the reception flow creation apparatus 1. The reception flow creation program is provided, for example, by distribution of a recording medium 58 or by download from a network. Various types of recording media can be used as the recording medium 58 on which the reception flow creation program is recorded, including media that record information optically, electrically, or magnetically, such as a CD-ROM, a flexible disk, or a magneto-optical disk, and semiconductor memories that record information electrically, such as a ROM or a flash memory.

When the recording medium 58 on which the reception flow creation program is recorded is set in the drive device 53, the reception flow creation program is installed from the recording medium 58 into the auxiliary storage device 54 via the drive device 53. A reception flow creation program downloaded from a network is installed into the auxiliary storage device 54 via the interface device 57. The auxiliary storage device 54 stores the installed reception flow creation program together with the necessary files, data, and so on.

When the reception flow creation program is started, the main storage device 55 reads it from the auxiliary storage device 54 and stores it. In accordance with the reception flow creation program stored in the main storage device 55, the arithmetic processing unit 56 implements the various processes described later through processing units (concrete means) in which software and hardware resources cooperate.

FIG. 7 is a configuration diagram showing the relationship between the processing units of the reception flow creation apparatus and its data. The reception flow creation apparatus 1 of FIG. 7 comprises a prosody information extraction unit 61, an utterance section extraction unit 62, a phase estimation unit 63, a speech recognition unit 64, a procedure extraction unit 65, a procedure time calculation unit 66, and a reception flow creation unit 67.

The reception flow creation apparatus 1 of FIG. 7 uses call recording data 71, prosody data 72, utterance section information 73, phase information 74, a speech recognition keyword list 75, a speech recognition result 76, a procedure keyword table 77, procedure extraction data 78, reception flow data 79, and reception procedure time information 80.

The relationship between the processing units of the reception flow creation apparatus 1 and the data is described here following the flowchart of FIG. 8. FIG. 8 is a flowchart showing an example of the processing procedure of the reception flow creation apparatus.

In step S1, the call recording data 71 is input to the prosody information extraction unit 61. The call recording data 71 is, for example, a stereo file in which the agent's voice and the customer's voice are recorded on separate channels. The call recording data 71 contains the entire conversation of each call, from beginning to end. Information such as an incident ID linking it to the response record and the recording start date and time accompanies the call recording data 71.

The prosody information extraction unit 61 extracts the prosody data 72 of the agent and of the customer from the input call recording data 71 and outputs it. The processing of the prosody information extraction unit 61 and the prosody data 72 are described in detail later.

In step S2, the utterance section extraction unit 62 extracts the utterance sections of the agent and of the customer from the input prosody data 72 and outputs them as utterance section information 73. The processing of the utterance section extraction unit 62 and the utterance section information 73 are described in detail later. In step S3, the phase estimation unit 63 estimates a question phase and a problem isolation/answer phase from the input utterance section information 73 and outputs them as phase information 74. The processing by the phase estimation unit 63 is optional. The processing of the phase estimation unit 63 and the phase information 74 are described in detail later.

In step S4, the call recording data 71, the phase information 74, and the speech recognition keyword list 75 are input to the speech recognition unit 64. Using the input phase information 74 and speech recognition keyword list 75, the speech recognition unit 64 performs speech recognition on the call recording data 71 and outputs the speech recognition result 76. The processing of the speech recognition unit 64, the speech recognition keyword list 75, and the speech recognition result 76 are described in detail later.

In step S5, the speech recognition result 76 and the procedure keyword table 77 are input to the procedure extraction unit 65. Using them, the procedure extraction unit 65 identifies the procedure locations and outputs the procedure extraction data 78. The processing of the procedure extraction unit 65, the procedure keyword table 77, and the procedure extraction data 78 are described in detail later.

In step S6, the utterance section information 73, the speech recognition result 76, and the procedure keyword table 77 are input to the procedure time calculation unit 66. Using them, the procedure time calculation unit 66 calculates the procedure times and adds the resulting data to the procedure extraction data 78. The processing of the procedure time calculation unit 66 and the data added to the procedure extraction data 78 are described in detail later.

In step S7, the procedure extraction data 78 is input to the reception flow creation unit 67. Using it, the reception flow creation unit 67 creates and outputs the reception flow data 79 and the reception procedure time information 80. The processing of the reception flow creation unit 67, the reception flow data 79, and the reception procedure time information 80 are described in detail later.

The processing performed by each processing unit of the reception flow creation apparatus 1 shown in FIG. 7 is described in detail below.

(Prosody information extraction unit 61)
The call recording data 71, which is stereo data in which the agent's voice and the customer's voice are recorded on separate channels, is input to the prosody information extraction unit 61. The prosody information extraction unit 61 calculates the prosody information (power value) of the input call recording data 71 at fixed time intervals, and extracts and outputs the prosody data 72 of the agent and of the customer.

FIG. 9 is an explanatory diagram showing an example of the prosody data of the agent and the customer. Each row of the prosody data 72 shown in FIG. 9 is a pair of a time and a power value, so the data represents a sequence of power values at fixed intervals (12.8 msec).
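A minimal sketch of producing such time and power rows for one channel is shown below. The source does not say how the power value is computed, so the mean squared amplitude per frame used here is an assumption, as are the function name, the handling of a trailing partial frame (dropped), and the rounding of frame boundaries to whole samples.

```python
def frame_power(samples, sample_rate, frame_ms=12.8):
    """Return (time_sec, power) rows for one channel of audio samples.

    Power is the mean squared amplitude of each frame (assumed definition).
    A trailing partial frame is dropped.
    """
    rows = []
    i = 0
    while True:
        # Frame boundaries in samples; 12.8 ms need not be a whole number
        # of samples at every sample rate, so boundaries are rounded.
        lo = int(round(i * frame_ms * sample_rate / 1000))
        hi = int(round((i + 1) * frame_ms * sample_rate / 1000))
        if hi <= lo:
            raise ValueError("sample rate too low for the frame length")
        if hi > len(samples):
            break
        frame = samples[lo:hi]
        power = sum(x * x for x in frame) / len(frame)
        rows.append((i * frame_ms / 1000, power))
        i += 1
    return rows
```

For a stereo recording, the function would be called once per channel, giving separate prosody data for the agent and for the customer.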

(Utterance section extraction unit 62)
The prosody data 72 of the agent and of the customer is input to the utterance section extraction unit 62. From the agent's prosody data 72, the utterance section extraction unit 62 extracts, as the agent's utterance sections, the places where power values at or above a threshold continue for at least the minimum utterance time, and outputs them as the agent's utterance section information 73.

また、発話区間抽出部６２は入力された顧客の韻律データ７２から閾値以上のパワー値が連続しており、且つ、その連続時間が最低発話時間以上の箇所を顧客の発話区間として抽出し、顧客の発話区間情報７３として出力する。 Likewise, from the input prosodic data 72 of the customer, the utterance section extraction unit 62 extracts, as customer utterance sections, the portions where power values at or above the threshold continue for at least the minimum utterance time, and outputs them as the customer's utterance section information 73.

図１０はエージェント及び顧客の発話区間情報の一例を示す説明図である。図１０に示した発話区間情報７３は発話ＩＤ，開始時間及び終了時間の組であり、発話区間ごとの開始時間及び終了時間を表している。 FIG. 10 is an explanatory diagram showing an example of agent and customer utterance section information. The utterance section information 73 shown in FIG. 10 is a set of the utterance ID, start time, and end time, and represents the start time and end time for each utterance section.
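The threshold and minimum-duration rule for utterance sections can be sketched as below. The function name `extract_utterances`, the tuple layout, and the choice to end a section one frame after its last above-threshold row are assumptions for illustration; the specification gives only the two conditions (power at or above a threshold, lasting at least the minimum utterance time).

```python
def extract_utterances(rows, threshold, min_duration, frame_sec=0.0128):
    """rows: (time_sec, power) pairs for one speaker, in time order.

    Returns (utterance_id, start_time, end_time) tuples for every run of
    rows whose power is >= threshold and that lasts >= min_duration.
    """
    sections = []
    run_start = None   # start time of the current above-threshold run
    last_time = None   # time of the latest above-threshold row

    def close_run(start, end):
        if end - start >= min_duration:
            sections.append((len(sections) + 1, start, end))

    for time_sec, power in rows:
        if power >= threshold:
            if run_start is None:
                run_start = time_sec
            last_time = time_sec
        elif run_start is not None:
            # Assumption: the run ends one frame after its last row.
            close_run(run_start, last_time + frame_sec)
            run_start = None
    if run_start is not None:
        close_run(run_start, last_time + frame_sec)
    return sections
```

Running the function separately on the agent rows and the customer rows yields two lists that play the role of the utterance section information 73.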

（フェーズ推定部６３） (Phase estimation unit 63)
フェーズ推定部６３が行う処理は必須でない。フェーズ推定部６３が行う処理は、精度向上に寄与するものである。フェーズ推定部６３はエージェント及び顧客の発話区間情報７３を入力される。フェーズ推定部６３は、入力されたエージェント及び顧客の発話区間情報７３から発話区間の主導権話者を推定する。なお、発話区間の主導権話者を推定する方法は周知であるが、例えば特開２００７−１８４６９９号公報に記載されている方法を利用できる。 The processing performed by the phase estimation unit 63 is not essential; it contributes to improved accuracy. The phase estimation unit 63 receives the utterance section information 73 of the agent and the customer. From the input utterance section information 73, the phase estimation unit 63 estimates the initiative speaker of each utterance section. Methods of estimating the initiative speaker of an utterance section are well known; for example, the method described in Japanese Unexamined Patent Application Publication No. 2007-184699 can be used.

フェーズ推定部６３は、通話録音データ７１の開始直後の顧客が主導権話者となっている発話区間を質問フェーズと推定する。また、フェーズ推定部６３は、質問フェーズ後のエージェントが主導権話者となっている発話区間を、問題切り分け・回答フェーズと推定する。フェーズ推定部６３は、推定した質問フェーズと問題切り分け・回答フェーズとをフェーズ情報７４として出力する。 The phase estimation unit 63 estimates that the utterance sections immediately after the start of the call recording data 71 in which the customer is the initiative speaker constitute the question phase. The phase estimation unit 63 also estimates that the utterance sections after the question phase in which the agent is the initiative speaker constitute the problem isolation / answer phase. The phase estimation unit 63 outputs the estimated question phase and problem isolation / answer phase as phase information 74.

図１１は、フェーズ情報の一例を示す説明図である。図１１に示したフェーズ情報７４はフェーズ，開始時間及び終了時間の組である。フェーズ情報７４は、質問フェーズ及び問題切り分け・回答フェーズごとの開始時間及び終了時間を表している。 FIG. 11 is an explanatory diagram showing an example of phase information. The phase information 74 shown in FIG. 11 is a set of phase, start time, and end time. The phase information 74 represents the start time and end time of each of the question phase and the problem isolation / answer phase.

（音声認識部６４） (Voice recognition unit 64)
音声認識部６４は、通話録音データ７１及び音声認識キーワードリスト７５が入力される。なお、フェーズ情報７４を利用する場合、音声認識部６４はフェーズ情報７４も入力される。 The voice recognition unit 64 receives the call recording data 71 and the voice recognition keyword list 75. When the phase information 74 is used, the voice recognition unit 64 also receives the phase information 74.

図１２は音声認識キーワードリストの一例を示す説明図である。音声認識キーワードリスト７５は音声認識辞書である。音声認識キーワードリスト７５は、マニュアル，ＦＡＱや手順書等から後述のように作成される。音声認識キーワードリスト７５は、エージェント及び顧客それぞれに用意される。なお、図１２に示した音声認識キーワードリスト７５はキーワードと読みとの組である。 FIG. 12 is an explanatory diagram showing an example of a voice recognition keyword list. The voice recognition keyword list 75 is a voice recognition dictionary. The speech recognition keyword list 75 is created as described later from a manual, FAQ, procedure manual, and the like. The speech recognition keyword list 75 is prepared for each agent and customer. The voice recognition keyword list 75 shown in FIG. 12 is a set of keywords and readings.

音声認識部６４は、音声認識にワードスポッティングを利用する。音声認識部６４は通話録音データ７１の音声認識を行う。なお、フェーズ情報７４を利用する場合は、通話録音データ７１の切り分け・回答フェーズに限定して音声認識を行う。例えばエージェントの音声を録音した通話音声データ７１の場合、音声認識部６４はエージェント用音声認識キーワードリスト７５を利用する。また、音声認識部６４は顧客の音声を録音した通話音声データ７１の場合、顧客用音声認識キーワードリスト７５を利用する。 The voice recognition unit 64 uses word spotting for voice recognition. The voice recognition unit 64 performs voice recognition on the call recording data 71. When the phase information 74 is used, voice recognition is limited to the problem isolation / answer phase of the call recording data 71. For call voice data 71 in which the agent's voice is recorded, the voice recognition unit 64 uses the voice recognition keyword list 75 for the agent. For call voice data 71 in which the customer's voice is recorded, the voice recognition unit 64 uses the voice recognition keyword list 75 for the customer.

音声認識部６４は通話録音データ７１の音声認識を行った結果である音声認識結果７６を出力する。図１３は音声認識結果の一例を示す説明図である。図１３に示した音声認識結果７６は、認識結果ＩＤ，認識キーワード，話者及び出現時刻の組である。音声認識結果７６は、認識結果ＩＤごとの認識キーワード，話者及び出現時刻を表している。 The voice recognition unit 64 outputs a voice recognition result 76 that is a result of voice recognition of the call recording data 71. FIG. 13 is an explanatory diagram showing an example of a speech recognition result. The speech recognition result 76 shown in FIG. 13 is a set of a recognition result ID, a recognition keyword, a speaker, and an appearance time. The voice recognition result 76 represents a recognition keyword, a speaker, and an appearance time for each recognition result ID.

認識結果ＩＤは、音声認識結果を識別するものである。認識キーワードは音声認識により認識した認識キーワードを表す。話者は認識キーワードを発話したエージェント又は顧客を表す。また、出現時刻は認識キーワードが出現する通話録音データ７１の先頭からの経過時刻を表す。 The recognition result ID is for identifying the voice recognition result. The recognition keyword represents a recognition keyword recognized by voice recognition. The speaker represents an agent or a customer who has spoken the recognition keyword. The appearance time represents the elapsed time from the beginning of the call recording data 71 in which the recognition keyword appears.

（手順抽出部６５） (Procedure extraction unit 65)
手順抽出部６５は、音声認識結果７６及び手順キーワード表７７が入力される。図１４は手順キーワード表の一例を示す説明図である。図１４の手順キーワード表７７は、手順ＩＤ，エージェント操作，抽出キーワード，品詞及び対応語句の組である。手順ＩＤは手順を識別するものである。エージェント操作は、各手順におけるエージェントの操作内容を表すものである。 The procedure extraction unit 65 receives the speech recognition result 76 and the procedure keyword table 77. FIG. 14 is an explanatory diagram showing an example of a procedure keyword table. The procedure keyword table 77 in FIG. 14 is a set of procedure ID, agent operation, extracted keyword, part of speech, and corresponding phrase. The procedure ID identifies the procedure. The agent operation represents the operation content of the agent in each procedure.

抽出キーワードは手順を説明するためにエージェントが発話する、手順を特定するためのキーワードである。品詞は抽出キーワードの品詞情報である。対応語句はエージェントの手順説明に対する顧客応答で発話される語句である。なお、手順キーワード表７７はマニュアル，ＦＡＱや手順書等から後述のように作成される。 The extracted keyword is a keyword for specifying the procedure that the agent speaks to explain the procedure. The part of speech is the part of speech information of the extracted keyword. The corresponding word / phrase is a word / phrase uttered by a customer response to the agent's procedure explanation. The procedure keyword table 77 is created from a manual, FAQ, procedure manual, etc. as described later.

手順抽出部６５は、手順箇所の特定を次のように行う。手順抽出部６５は入力された音声認識結果７６から話者がエージェントの認識キーワードを抽出キーワードとして順番に取り出す。手順抽出部６５は、取り出したエージェントの抽出キーワードに対応する手順ＩＤと対応語句とを手順キーワード表７７から取り出す。 The procedure extraction unit 65 identifies procedure locations as follows. From the input speech recognition result 76, the procedure extraction unit 65 takes out, in order, the recognition keywords whose speaker is the agent, as extracted keywords. It then retrieves from the procedure keyword table 77 the procedure ID and the corresponding phrases associated with each extracted agent keyword.

手順抽出部６５は音声認識結果７６からエージェントの抽出キーワード直後の話者が顧客の認識キーワードで対応語句であるものを顧客キーワードとして取り出す。なお、ここで言うエージェントの抽出キーワード直後とは、次のエージェントの認識キーワードが出現するまで、且つ、予め定めた時間の範囲内を言う。 From the speech recognition result 76, the procedure extraction unit 65 takes out, as the customer keyword, a recognition keyword that immediately follows the agent's extracted keyword, whose speaker is the customer, and that is one of the corresponding phrases. Here, "immediately after the agent's extracted keyword" means before the next agent recognition keyword appears and within a predetermined time.

手順抽出部６５は、エージェントの抽出キーワードと、エージェントの抽出キーワード直後の話者が顧客の認識キーワードで対応語句である顧客キーワードと、抽出キーワードに対応する手順ＩＤと、認識結果ＩＤとを一つの手順箇所を示すものとして特定し、手順抽出データ７８に記入する。 The procedure extraction unit 65 identifies the agent's extracted keyword, the customer keyword (the customer-spoken recognition keyword immediately following it that is a corresponding phrase), the procedure ID corresponding to the extracted keyword, and the recognition result IDs as together indicating one procedure location, and enters them in the procedure extraction data 78.

図１５は手順抽出部により記入された手順抽出データの一例を示す説明図である。図１５の手順抽出データ７８は、エージェントキーワード，エージェントキーワードの認識結果ＩＤ，顧客キーワード，顧客キーワードの認識結果ＩＤ，手順ＩＤ，開始時間，終了時間，無音区間，無音区間の発生者，及び無音区間の妥当性の組である。 FIG. 15 is an explanatory diagram showing an example of procedure extraction data entered by the procedure extraction unit. The procedure extraction data 78 of FIG. 15 is a set of an agent keyword, the recognition result ID of the agent keyword, a customer keyword, the recognition result ID of the customer keyword, a procedure ID, a start time, an end time, a silent section, the generator of the silent section, and the validity of the silent section.

図１５の手順抽出データ７８は、エージェントキーワードと、エージェントキーワードの認識結果ＩＤと、顧客キーワードと、顧客キーワードの認識結果ＩＤと、手順ＩＤとが記入された状態を表している。 The procedure extraction data 78 in FIG. 15 represents a state in which an agent keyword, an agent keyword recognition result ID, a customer keyword, a customer keyword recognition result ID, and a procedure ID are entered.

図１６は手順抽出部の処理手順を表した一例のフローチャートである。ステップＳ１１に進み、手順抽出部６５は音声認識結果７６から話者がエージェントの認識キーワードを抽出キーワードとして順番に取り出す。手順抽出部６５は、取り出したエージェントの抽出キーワードに対応する手順ＩＤと対応語句とを手順キーワード表７７から取り出す。 FIG. 16 is a flowchart illustrating an example of the processing procedure of the procedure extracting unit. In step S11, the procedure extracting unit 65 sequentially extracts the recognition keyword of the agent from the speech recognition result 76 as the extracted keyword. The procedure extracting unit 65 extracts the procedure ID and the corresponding phrase corresponding to the extracted keyword of the extracted agent from the procedure keyword table 77.

ステップＳ１２に進み、手順抽出部６５は音声認識結果７６から話者が顧客である次の認識キーワードの取得を試みる。ステップＳ１３に進み、手順抽出部６５は音声認識結果７６から話者が顧客である次の認識キーワードがあれば、その話者が顧客である次の認識キーワードを取得し、ステップＳ１４の処理に進む。 In step S12, the procedure extraction unit 65 attempts to acquire from the speech recognition result 76 the next recognition keyword whose speaker is the customer. In step S13, if such a keyword exists in the speech recognition result 76, the procedure extraction unit 65 acquires it and proceeds to step S14.

ステップＳ１４に進み、手順抽出部６５は、取得した顧客の次の認識キーワードの出現時刻が、エージェントの認識キーワードの出現時刻＋α（予め定めた時間）の範囲内であり、且つエージェントの次の認識キーワードの出現時刻より前であり、且つ取得した顧客の次の認識キーワードがエージェントの認識キーワードの対応語句であるか否かを判定する。 In step S14, the procedure extraction unit 65 determines whether all of the following hold: the appearance time of the acquired customer recognition keyword is within the appearance time of the agent's recognition keyword + α (a predetermined time); it is earlier than the appearance time of the agent's next recognition keyword; and the acquired customer recognition keyword is a corresponding phrase of the agent's recognition keyword.

取得した顧客の次の認識キーワードの出現時刻が、エージェントの認識キーワードの出現時刻＋α（予め定めた時間）の範囲内であり、且つエージェントの次の認識キーワードの出現時刻より前であり、且つ取得した顧客の次の認識キーワードがエージェントの認識キーワードの対応語句であれば、手順抽出部６５はステップＳ１５に進む。 If the appearance time of the acquired customer recognition keyword is within the appearance time of the agent's recognition keyword + α (the predetermined time), is earlier than the appearance time of the agent's next recognition keyword, and the acquired customer recognition keyword is a corresponding phrase of the agent's recognition keyword, the procedure extraction unit 65 proceeds to step S15.

取得した顧客の次の認識キーワードの出現時刻が、エージェントの認識キーワードの出現時刻＋α（予め定めた時間）の範囲内であり、且つエージェントの次の認識キーワードの出現時刻より前であり、且つ取得した顧客の次の認識キーワードがエージェントの認識キーワードの対応語句でなければ、手順抽出部６５はステップＳ１２に戻る。 If it is not the case that the appearance time of the acquired customer recognition keyword is within the appearance time of the agent's recognition keyword + α (the predetermined time), is earlier than the appearance time of the agent's next recognition keyword, and the acquired customer recognition keyword is a corresponding phrase of the agent's recognition keyword, the procedure extraction unit 65 returns to step S12.

ステップＳ１５に進み、手順抽出部６５は、エージェントキーワードと、エージェントキーワードの認識結果ＩＤと、顧客キーワードと、顧客キーワードの認識結果ＩＤと、手順ＩＤとを一つの手順箇所を示すものとして手順抽出データ７８に記入する。ステップＳ１１〜Ｓ１５の処理は未処理の音声認識結果７６がある限り、繰り返し行われる。 In step S15, the procedure extraction unit 65 enters the agent keyword, the recognition result ID of the agent keyword, the customer keyword, the recognition result ID of the customer keyword, and the procedure ID in the procedure extraction data 78 as indicating one procedure location. Steps S11 to S15 are repeated as long as unprocessed speech recognition results 76 remain.
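Steps S11 to S15 can be sketched as follows. The dictionary keys, the function name `extract_procedures`, and the shape of `keyword_table` (extracted keyword mapped to a procedure ID and a set of corresponding phrases) are hypothetical and chosen only for this example. The sketch also stops scanning customer keywords as soon as one falls outside the time window, which is a shortcut not spelled out in the flowchart.

```python
def extract_procedures(results, keyword_table, alpha):
    """results: dicts with "id", "keyword", "speaker" ("agent"/"customer"), "time".
    keyword_table: extracted keyword -> (procedure_id, set of corresponding phrases).
    alpha: the predetermined time window in seconds.
    """
    results = sorted(results, key=lambda r: r["time"])
    agents = [r for r in results if r["speaker"] == "agent"]
    customers = [r for r in results if r["speaker"] == "customer"]
    extracted = []
    for i, agent in enumerate(agents):            # S11: next agent keyword
        entry = keyword_table.get(agent["keyword"])
        if entry is None:
            continue                              # not a procedure keyword
        procedure_id, responses = entry
        next_agent_time = agents[i + 1]["time"] if i + 1 < len(agents) else float("inf")
        for cust in customers:                    # S12/S13: next customer keyword
            if cust["time"] <= agent["time"]:
                continue
            # S14: within agent time + alpha and before the next agent keyword
            if cust["time"] > agent["time"] + alpha or cust["time"] >= next_agent_time:
                break
            if cust["keyword"] in responses:      # S14: is a corresponding phrase
                extracted.append({                # S15: record one procedure location
                    "agent_keyword": agent["keyword"],
                    "agent_result_id": agent["id"],
                    "customer_keyword": cust["keyword"],
                    "customer_result_id": cust["id"],
                    "procedure_id": procedure_id,
                })
                break
    return extracted
```

Each returned dictionary corresponds to one row of the procedure extraction data 78 before the time and silent section fields are filled in.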

（手順時間算出部６６） (Procedure time calculation unit 66)
手順時間算出部６６は、発話区間情報７３，音声認識結果７６及び手順キーワード表７７が入力される。手順時間算出部６６は、図１７に示す処理手順により各手順の時間算出を行う。 The procedure time calculation unit 66 receives the utterance section information 73, the speech recognition result 76, and the procedure keyword table 77. The procedure time calculation unit 66 calculates the time for each procedure according to the processing procedure shown in FIG. 17.

図１７は手順時間算出部が行う処理手順を表した一例のフローチャートである。ステップＳ２１に進み、手順時間算出部６６は手順抽出部６５によって手順抽出データ７８に記入された各手順の開始時間と終了時間とを求める。なお、ステップＳ２１の処理の詳細は後述する。 FIG. 17 is a flowchart illustrating an example of a processing procedure performed by the procedure time calculation unit. In step S21, the procedure time calculation unit 66 obtains the start time and end time of each procedure entered in the procedure extraction data 78 by the procedure extraction unit 65. Details of the process in step S21 will be described later.

ステップＳ２２に進み、手順時間算出部６６は手順抽出データ７８に記入された各手順のうち、連続している同一手順を重複する手順として同一手順としてまとめ、抽出漏れの手順を推定して挿入する。なお、ステップＳ２２の処理の詳細は後述する。ステップＳ２３に進み、手順時間算出部６６は無音区間の処理を行う。なお、ステップＳ２３の詳細は後述する。 In step S22, among the procedures entered in the procedure extraction data 78, the procedure time calculation unit 66 treats consecutive identical procedures as duplicates and combines them into a single procedure, and estimates and inserts procedures that were missed during extraction. Details of step S22 will be described later. In step S23, the procedure time calculation unit 66 processes silent sections. Details of step S23 will be described later.

ステップＳ２１の処理は次のように行う。ステップＳ２１の処理は、手順抽出部６５によって手順抽出データ７８に記入された全ての手順について開始時間と終了時間とを記入するものである。 The process of step S21 is performed as follows. The process of step S21 is to enter the start time and end time for all the procedures entered in the procedure extraction data 78 by the procedure extraction unit 65.

まず、手順時間算出部６６は、手順抽出部６５によって手順抽出データ７８に記入された全ての手順を順番に取り出す。手順時間算出部６６は、取り出した手順のエージェントキーワードの認識結果ＩＤに基づき、音声認識結果７６からエージェントキーワードの出現時刻を取得する。そして、手順時間算出部６６はエージェントキーワードの出現時刻を含む発話区間をエージェントの発話区間情報７３から取得し、その発話区間の開始時間を手順の開始時間とする。 First, the procedure time calculation unit 66 sequentially extracts all procedures entered in the procedure extraction data 78 by the procedure extraction unit 65. The procedure time calculation unit 66 acquires the appearance time of the agent keyword from the speech recognition result 76 based on the recognition result ID of the agent keyword of the extracted procedure. Then, the procedure time calculation unit 66 acquires an utterance section including the appearance time of the agent keyword from the utterance section information 73 of the agent, and uses the start time of the utterance section as the procedure start time.

手順時間算出部６６は、取り出した手順の顧客キーワードの認識結果ＩＤに基づき、音声認識結果７６から顧客キーワードの出現時刻を取得する。そして、手順時間算出部６６は顧客キーワードの出現時刻を含む発話区間を顧客の発話区間情報７３から取得し、その発話区間の終了時間を手順の終了時間とする。 The procedure time calculation unit 66 acquires the appearance time of the customer keyword from the speech recognition result 76 based on the recognition result ID of the customer keyword of the extracted procedure. Then, the procedure time calculation unit 66 acquires the utterance section including the appearance time of the customer keyword from the customer utterance section information 73, and sets the end time of the utterance section as the end time of the procedure.

そして、手順時間算出部６６は、手順の開始時間と終了時間とを手順抽出データ７８に記入する。図１８は、手順時間算出部により開始時間と終了時間とが記入された手順抽出データの一例を示す説明図である。 Then, the procedure time calculation unit 66 enters the start time and end time of the procedure in the procedure extraction data 78. FIG. 18 is an explanatory diagram showing an example of procedure extraction data in which the start time and end time have been entered by the procedure time calculation unit.

図１９はステップＳ２１の処理手順を表した一例のフローチャートである。ステップＳ３１に進み、手順時間算出部６６は、手順のエージェントキーワードの認識結果ＩＤに基づき、音声認識結果７６からエージェントキーワードの出現時刻を取得する。手順時間算出部６６は、エージェントキーワードの出現時刻を含む発話区間をエージェントの発話区間情報７３から取得し、その発話区間の開始時間を手順の開始時間とする。 FIG. 19 is a flowchart illustrating an example of the processing procedure of step S21. In step S31, the procedure time calculation unit 66 acquires the appearance time of the agent keyword from the speech recognition result 76 based on the recognition result ID of the agent keyword of the procedure. The procedure time calculation unit 66 acquires an utterance section including the appearance time of the agent keyword from the utterance section information 73 of the agent, and uses the start time of the utterance section as the procedure start time.

ステップＳ３２に進み、手順時間算出部６６は、手順の顧客キーワードの認識結果ＩＤに基づき、音声認識結果７６から顧客キーワードの出現時刻を取得する。そして、手順時間算出部６６は顧客キーワードの出現時刻を含む発話区間を顧客の発話区間情報７３から取得し、その発話区間の終了時間を手順の終了時間とする。 In step S32, the procedure time calculation unit 66 acquires the appearance time of the customer keyword from the voice recognition result 76 based on the recognition result ID of the customer keyword of the procedure. Then, the procedure time calculation unit 66 acquires the utterance section including the appearance time of the customer keyword from the customer utterance section information 73, and sets the end time of the utterance section as the end time of the procedure.

ステップＳ３３に進み、手順時間算出部６６は、手順の開始時間と終了時間とを手順抽出データ７８に記入する。ステップＳ３４に進み、手順時間算出部６６は処理対象とする手順を手順抽出データ７８の次の手順に進める。ステップＳ３１〜Ｓ３４の処理は未処理の手順が手順抽出データ７８にある限り、繰り返し行われる。 In step S33, the procedure time calculation unit 66 enters the start time and end time of the procedure in the procedure extraction data 78. In step S34, the procedure time calculation unit 66 advances the procedure to be processed to the next procedure in the procedure extraction data 78. Steps S31 to S34 are repeated as long as unprocessed procedures remain in the procedure extraction data 78.
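Steps S31 to S34 can be sketched as follows, assuming the hypothetical data layouts used here: utterance sections as (utterance ID, start, end) tuples and recognition results indexed by recognition result ID. The specification does not say what happens when no utterance section contains a keyword's appearance time; leaving the field as `None` is an assumption of this sketch.

```python
def find_section(sections, time_sec):
    """Return (start, end) of the utterance section containing time_sec, or None."""
    for _utterance_id, start, end in sections:
        if start <= time_sec <= end:
            return start, end
    return None


def fill_times(procedures, results_by_id, agent_sections, customer_sections):
    """Fill "start" and "end" of each procedure row in place and return the list."""
    for proc in procedures:
        # S31: start = start of the agent utterance containing the agent keyword
        agent_time = results_by_id[proc["agent_result_id"]]["time"]
        agent_sec = find_section(agent_sections, agent_time)
        # S32: end = end of the customer utterance containing the customer keyword
        customer_time = results_by_id[proc["customer_result_id"]]["time"]
        customer_sec = find_section(customer_sections, customer_time)
        # S33: write both values (None when no section matched; an assumption)
        proc["start"] = agent_sec[0] if agent_sec else None
        proc["end"] = customer_sec[1] if customer_sec else None
    return procedures
```

The point of using the enclosing utterance section rather than the keyword's own appearance time is that a procedure then spans the whole explanation and the whole reply, not just the instants at which the keywords were spotted.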

ステップＳ２２の処理は次のように行う。ステップＳ２２の処理は、手順抽出部６５によって手順抽出データ７８に記入された全ての手順について、重複する手順の修正および抽出漏れの手順の挿入するものである。 Step S22 is performed as follows. For all the procedures entered in the procedure extraction data 78 by the procedure extraction unit 65, step S22 corrects duplicate procedures and inserts procedures that were missed during extraction.

まず、手順時間算出部６６は、手順抽出部６５によって手順抽出データ７８に記入された全ての手順を順番に取り出す。手順時間算出部６６は、処理対象の手順の手順ＩＤと次の手順の手順ＩＤとが同じであるかを調べる。処理対象の手順の手順ＩＤと次の手順の手順ＩＤとが同じである場合、手順時間算出部６６は処理対象の手順と次の手順とをマージする。このとき、手順時間算出部６６は手順の開始時間に処理対象の手順の開始時間を採用し、手順の終了時間に次の手順の終了時間を採用する。 First, the procedure time calculation unit 66 sequentially extracts all procedures entered in the procedure extraction data 78 by the procedure extraction unit 65. The procedure time calculation unit 66 checks whether the procedure ID of the procedure to be processed is the same as the procedure ID of the next procedure. When the procedure ID of the procedure to be processed and the procedure ID of the next procedure are the same, the procedure time calculation unit 66 merges the procedure to be processed and the next procedure. At this time, the procedure time calculation unit 66 adopts the start time of the procedure to be processed as the start time of the procedure, and adopts the end time of the next procedure as the end time of the procedure.

なお、手順時間算出部６６は、エージェントキーワード，エージェントキーワードの認識結果ＩＤ，顧客キーワード，顧客キーワードの認識結果ＩＤを処理対象の手順と次の手順とをマージし、マージした次の手順を削除する。 The procedure time calculation unit 66 also merges the agent keywords, agent keyword recognition result IDs, customer keywords, and customer keyword recognition result IDs of the procedure to be processed and the next procedure, and then deletes the merged next procedure.

また、手順時間算出部６６は、処理対象の手順の手順ＩＤと次の手順の手順ＩＤとの間に手順が存在し、且つ、処理対象の手順の終了と次の手順の開始との間に発話区間が存在するかを調べる。処理対象の手順の手順ＩＤと次の手順の手順ＩＤとの間に手順が存在するか否かは、手順キーワード表７７から調べることができる。また、処理対象の手順の終了と次の手順の開始との間に発話区間が存在するか否かは、発話区間情報７３から調べることができる。 The procedure time calculation unit 66 also checks whether a procedure exists between the procedure ID of the procedure to be processed and the procedure ID of the next procedure, and whether an utterance section exists between the end of the procedure to be processed and the start of the next procedure. Whether a procedure exists between the two procedure IDs can be determined from the procedure keyword table 77. Whether an utterance section exists between the end of the procedure to be processed and the start of the next procedure can be determined from the utterance section information 73.

処理対象の手順の手順ＩＤと次の手順の手順ＩＤとの間に手順が存在し、且つ、処理対象の手順の終了と次の手順の開始との間に発話区間が存在する場合、手順時間算出部６６は抽出漏れの手順があると推定し、手順抽出データ７８の処理対象の手順と次の手順との間に抽出漏れの手順を挿入し、処理対象の手順の終了と次の手順の開始との間にある発話区間の開始時間及び終了時間を手順の開始時間及び終了時間として記入する。 If a procedure exists between the procedure ID of the procedure to be processed and the procedure ID of the next procedure, and an utterance section exists between the end of the procedure to be processed and the start of the next procedure, the procedure time calculation unit 66 estimates that a procedure was missed during extraction. It inserts the missed procedure between the procedure to be processed and the next procedure in the procedure extraction data 78, and enters the start time and end time of the utterance section lying between the end of the procedure to be processed and the start of the next procedure as the start time and end time of the inserted procedure.

図２０は、手順時間算出部により重複する手順の修正および抽出漏れの手順の挿入がされた手順抽出データの一例を示す説明図である。図２０の手順抽出データ７８は手順ＩＤが「Ａ−１」の手順が同一手順としてまとめられ、手順ＩＤが「Ａ−３」の手順が抽出漏れの手順として挿入されている。 FIG. 20 is an explanatory diagram illustrating an example of procedure extraction data in which the procedure time calculation unit corrects duplicate procedures and inserts an extraction failure procedure. In the procedure extraction data 78 of FIG. 20, the procedures with the procedure ID “A-1” are collected as the same procedure, and the procedure with the procedure ID “A-3” is inserted as a procedure for omission of extraction.

図２１はステップＳ２２の処理手順を表した一例のフローチャートである。ステップＳ４１に進み、手順時間算出部６６は処理対象の手順と次の手順とが同じであるか否かを判定する。 FIG. 21 is a flowchart illustrating an example of the processing procedure of step S22. In step S41, the procedure time calculation unit 66 determines whether or not the procedure to be processed is the same as the next procedure.

処理対象の手順と次の手順とが同じである場合、手順時間算出部６６はステップＳ４２に進み、処理対象の手順と次の手順とをマージする。このとき、手順時間算出部６６は手順の開始時間に処理対象の手順の開始時間を採用し、手順の終了時間に次の手順の終了時間を採用して、ステップＳ４３に進む。なお、処理対象の手順と次の手順とが同じでない場合、手順時間算出部６６はステップＳ４１からステップＳ４３に進む。 If the procedure to be processed and the next procedure are the same, the procedure time calculation unit 66 proceeds to step S42 and merges the procedure to be processed and the next procedure. At this time, the procedure time calculation unit 66 adopts the start time of the procedure to be processed as the start time of the procedure, adopts the end time of the next procedure as the end time of the procedure, and proceeds to step S43. If the procedure to be processed is not the same as the next procedure, the procedure time calculation unit 66 proceeds from step S41 to step S43.

ステップＳ４３では、手順時間算出部６６が、処理対象の手順と次の手順との間に手順が存在し、且つ、処理対象の手順と次の手順との間に、発話区間が存在するか否かを判定する。 In step S43, the procedure time calculation unit 66 determines whether there is a procedure between the procedure to be processed and the next procedure, and whether there is an utterance section between the procedure to be processed and the next procedure. Determine whether.

処理対象の手順と次の手順との間に手順が存在し、且つ、処理対象の手順と次の手順との間に発話区間が存在する場合、手順時間算出部６６はステップＳ４４に進み、抽出漏れの手順があると推定し、手順抽出データ７８の処理対象の手順と次の手順との間に一行挿入し、処理対象の手順と次の手順との間に存在する手順の手順ＩＤを記入し，処理対象の手順と次の手順との間にある発話区間の開始時間及び終了時間を手順の開始時間及び終了時間として記入して、ステップＳ４５に進む。 If a procedure exists between the procedure to be processed and the next procedure, and an utterance section exists between the procedure to be processed and the next procedure, the procedure time calculation unit 66 proceeds to step S44 and estimates that a procedure was missed during extraction. It inserts one row between the procedure to be processed and the next procedure in the procedure extraction data 78, enters the procedure ID of the procedure that exists between them, enters the start time and end time of the utterance section lying between them as the start time and end time of the inserted procedure, and proceeds to step S45.

なお、処理対象の手順と次の手順との間に手順が存在せず、又は、処理対象の手順と次の手順との間に発話区間が存在しない場合、手順時間算出部６６はステップＳ４３からステップＳ４５に進む。ステップＳ４５に進み、手順時間算出部６６は処理対象とする手順を手順抽出データ７８の次の手順に進める。ステップＳ４１〜Ｓ４５の処理は未処理の手順が手順抽出データ７８にある限り、繰り返し行われる。 When there is no procedure between the procedure to be processed and the next procedure, or when there is no utterance section between the procedure to be processed and the next procedure, the procedure time calculation unit 66 starts from step S43. Proceed to step S45. In step S 45, the procedure time calculation unit 66 advances the procedure to be processed to the procedure following the procedure extraction data 78. The processes of steps S41 to S45 are repeated as long as there are unprocessed procedures in the procedure extraction data 78.
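The merging of duplicate procedures and the insertion of a missed procedure can be sketched as below. This is an illustrative restructuring into two passes rather than the single loop of steps S41 to S45, and it handles only the procedure ID and times (merging of the keyword and recognition result ID fields is omitted). The names `merge_and_fill`, `procedure_order`, and the `"estimated"` flag are hypothetical. When several procedures or several utterance sections lie in the gap, the specification does not say how to assign them; this sketch inserts one row for the first intermediate procedure ID, spanning from the first to the last utterance section in the gap.

```python
def merge_and_fill(procedures, procedure_order, utterances):
    """procedures: dicts with "procedure_id", "start", "end", in call order.
    procedure_order: procedure IDs in the order of the procedure keyword table.
    utterances: (start, end) utterance sections of both speakers, sorted by start.
    """
    # Pass 1 (S41/S42): merge consecutive rows that share a procedure ID,
    # keeping the first start time and the last end time.
    merged = []
    for proc in procedures:
        if merged and merged[-1]["procedure_id"] == proc["procedure_id"]:
            merged[-1]["end"] = proc["end"]
        else:
            merged.append(dict(proc))

    # Pass 2 (S43/S44): insert an estimated row where the table lists a
    # procedure between two neighbours and speech occurs in the gap.
    result = []
    for cur, nxt in zip(merged, merged[1:] + [None]):
        result.append(cur)
        if nxt is None:
            continue
        i = procedure_order.index(cur["procedure_id"])
        j = procedure_order.index(nxt["procedure_id"])
        between_ids = procedure_order[i + 1:j]
        between_utts = [(s, e) for s, e in utterances
                        if s >= cur["end"] and e <= nxt["start"]]
        if between_ids and between_utts:
            result.append({
                "procedure_id": between_ids[0],
                "start": between_utts[0][0],
                "end": between_utts[-1][1],
                "estimated": True,  # hypothetical marker for an inferred row
            })
    return result
```

With rows A-1, A-1, A-2, A-4 and one utterance in the gap before A-4, the sketch yields A-1 (merged), A-2, A-3 (inserted), A-4, which mirrors the situation described for FIG. 20.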

ステップＳ２３の処理は次のように行う。ステップＳ２３の処理は、無音区間の処理を行うものである。まず、手順時間算出部６６は無音区間情報を作成する。手順時間算出部６６は、発話区間情報７３から各話者の発話区間を順番に取得し、発話の存在しない区間が一定時間以上であれば、その一定期間以上の発話の存在しない区間を無音区間として無音区間情報を作成する。なお、無音区間を判定する為の一定時間とは、通常の対話において自然に生じる間として許容される時間である。 Step S23 is performed as follows. Step S23 processes silent sections. First, the procedure time calculation unit 66 creates silent section information. It obtains the utterance sections of each speaker in order from the utterance section information 73 and, where a section containing no utterance lasts for at least a fixed time, records that section as a silent section in the silent section information. The fixed time used to identify a silent section is the length of pause that is acceptable as occurring naturally in ordinary conversation.

図２２は無音区間情報の一例を示す説明図である。図２２に示した無音区間情報は開始時間，終了時間及び区間の組であり、無音区間ごとの開始時間，終了時間及び区間を表している。 FIG. 22 is an explanatory diagram showing an example of silent section information. The silent section information shown in FIG. 22 is a set of start time, end time and section, and represents the start time, end time and section for each silent section.
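Creating the silent section information can be sketched as below: the utterance sections of both speakers are combined, and any gap in which neither speaks for at least the fixed time becomes a silent section. The function name `find_silences` and the parameter `min_silence` are hypothetical, and treating the third field of FIG. 22 (区間) as the length of the silent section in seconds is an assumption.

```python
def find_silences(agent_sections, customer_sections, min_silence):
    """Sections are (utterance_id, start, end) tuples for each speaker.

    Returns (start, end, length) for every gap with no speech from either
    speaker that lasts at least min_silence seconds.
    """
    spans = sorted((start, end) for _id, start, end in agent_sections + customer_sections)
    silences = []
    covered_until = None  # latest end time of any utterance seen so far
    for start, end in spans:
        if covered_until is not None and start - covered_until >= min_silence:
            silences.append((covered_until, start, start - covered_until))
        covered_until = end if covered_until is None else max(covered_until, end)
    return silences
```

Tracking the latest end time seen so far, rather than the end of the previous section, keeps overlapping agent and customer utterances from producing a false silent section.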

次に、手順時間算出部６６は処理対象とする手順の開始時間及び終了時間をみて、処理対象とする手順に無音区間があるかを判定する。無音区間があれば、手順時間算出部６６は無音区間の出現直後の話者および付随する作業（エージェント操作）から、無音区間を生じさせた話者（無音区間の発生者）を判定すると共に、その無音区間が生じることのあり得る手順なのか（妥当性のある手順なのか）を判定する。 Next, the procedure time calculation unit 66 determines whether there is a silent section in the procedure to be processed by looking at the start time and end time of the procedure to be processed. If there is a silent section, the procedure time calculation unit 66 determines the speaker that generated the silent section (the generator of the silent section) from the speaker immediately after the appearance of the silent section and the accompanying work (agent operation). It is determined whether or not the silent section may occur (a procedure with validity).

手順時間算出部６６は判定した結果から手順抽出データ７８の無音区間，無音区間の発生者及び無音区間の妥当性を記入する。 The procedure time calculation unit 66 fills in the silence section of the procedure extraction data 78, the generator of the silence section, and the validity of the silence section from the determined result.

また、手順時間算出部６６は処理対象とする手順の開始時間と次に手順の終了時間とをみて、処理対象とする手順と次の手順との間に無音区間があるかを判定する。無音区間があれば、手順時間算出部６６は無音区間の出現直後の話者、及び、対象とする手順と次の手順とに付随する作業（エージェント操作）から、無音区間を生じさせている手順，それに伴って修正される開始時間及び終了時間，無音区間を生じさせた話者，無音区間の妥当性を判定する。 Further, the procedure time calculation unit 66 determines whether there is a silent section between the procedure to be processed and the next procedure by looking at the start time of the procedure to be processed and the end time of the next procedure. If there is a silent section, the procedure time calculation unit 66 generates a silent section from the speaker immediately after the appearance of the silent section and the work (agent operation) associated with the target procedure and the next procedure. , Start time and end time to be corrected accordingly, the speaker that caused the silent period, and the validity of the silent period.

手順時間算出部６６は判定した結果から手順抽出データ７８の無音区間，無音区間の発生者及び無音区間の妥当性を記入すると共に、必要があれば開始時間及び終了時間を修正する。 Based on the result of the determination, the procedure time calculation unit 66 enters the silent section, the generator of the silent section, and the validity of the silent section in the procedure extraction data 78, and corrects the start time and end time if necessary.

図２３は、無音区間の処理を行った手順抽出データの一例を示す説明図である。図２３の手順抽出データ７８は手順ＩＤが「Ａ−２」の手順に、無音区間，無音区間の発生者及び無音区間の妥当性が記入されている。 FIG. 23 is an explanatory diagram showing an example of procedure extraction data after silent section processing. In the procedure extraction data 78 of FIG. 23, the silent section, the generator of the silent section, and the validity of the silent section are entered for the procedure whose procedure ID is "A-2".

図２４はステップＳ２３の処理手順を表した一例のフローチャートである。ステップＳ５１に進み、手順時間算出部６６は無音区間情報を作成する。ステップＳ５２に進み、手順時間算出部６６は処理対象とする手順の開始時間から終了時間までに無音区間があるか否かを判定する。無音区間があれば、手順時間算出部６６はステップＳ５３に進み、後述の無音区間の処理Ａを行ったあと、ステップＳ５４に進む。 FIG. 24 is a flowchart illustrating an example of the processing procedure of step S23. In step S51, the procedure time calculation unit 66 creates silent section information. In step S52, the procedure time calculation unit 66 determines whether there is a silent section from the start time to the end time of the procedure to be processed. If there is a silent section, the procedure time calculation unit 66 proceeds to step S53, performs a silent section process A described later, and then proceeds to step S54.

ステップＳ５４に進み、手順時間算出部６６は、ステップＳ５３の無音区間の処理Ａの結果から手順抽出データ７８の無音区間，無音区間の発生者及び無音区間の妥当性を記入したあと、ステップＳ５６に進む。処理対象とする手順の開始時間から終了時間までに無音区間が無ければ、手順時間算出部６６はステップＳ５５に進み、手順抽出データ７８の無音区間に、無音区間の時間が０であることを表す「０」を記入したあと、ステップＳ５６に進む。 In step S54, the procedure time calculation unit 66 enters the silent section, the generator of the silent section, and the validity of the silent section in the procedure extraction data 78 based on the result of silent section process A in step S53, and then proceeds to step S56. If there is no silent section between the start time and end time of the procedure to be processed, the procedure time calculation unit 66 proceeds to step S55, enters "0" in the silent section field of the procedure extraction data 78 to indicate that the silent section time is zero, and then proceeds to step S56.

ステップＳ５６に進み、手順時間算出部６６は処理対象とする手順の開始時間と次に手順の終了時間との間に無音区間があるかを判定する。無音区間があれば、手順時間算出部６６はステップＳ５７に進み、後述の無音区間の処理Ｂを行ったあと、ステップＳ５８に進む。ステップＳ５８に進み、手順時間算出部６６は、ステップＳ５７の無音区間の処理Ｂの結果から手順抽出データ７８の無音区間，無音区間の発生者及び無音区間の妥当性を記入すると共に、必要があれば開始時間及び終了時間を修正したあと、ステップＳ６０に進む。 In step S56, the procedure time calculation unit 66 determines whether there is a silent section between the start time of the procedure to be processed and the end time of the next procedure. If there is a silent section, the procedure time calculation unit 66 proceeds to step S57, performs silent section process B described later, and then proceeds to step S58. In step S58, the procedure time calculation unit 66 enters the silent section, the generator of the silent section, and the validity of the silent section in the procedure extraction data 78 based on the result of silent section process B in step S57, corrects the start time and end time if necessary, and then proceeds to step S60.

If there is no silent section between the start time of the procedure being processed and the end time of the next procedure, the procedure time calculation unit 66 proceeds to step S59, enters "0" in the silent section field of the procedure extraction data 78 to indicate that the silent section time is zero, and then proceeds to step S60. In step S60, the procedure time calculation unit 66 advances the procedure being processed to the next procedure in the procedure extraction data 78. Steps S52 to S60 are repeated as long as unprocessed procedures remain in the procedure extraction data 78.
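The loop over steps S52 to S60 can be sketched roughly as follows. This is a minimal illustration, not the implementation of the embodiment: the `Procedure` record, the field names `inner_silence` and `gap_silence`, and the helper `first_silence_within` are hypothetical. The check of step S56 is interpreted here, as an assumption, as looking at the gap between the end of the current procedure and the start of the next one. Processes A and B (originator and validity) are left out so that only the durations are recorded.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # (start, end) of a silent section, in seconds


@dataclass
class Procedure:
    # Hypothetical stand-in for one row of the procedure extraction data 78.
    proc_id: str
    start: float
    end: float
    inner_silence: float = 0.0  # silence inside the procedure (S54 / S55)
    gap_silence: float = 0.0    # silence before the next procedure (S58 / S59)


def first_silence_within(silences: List[Interval], lo: float, hi: float) -> Optional[Interval]:
    """Return the first silent section lying entirely inside [lo, hi], or None."""
    for start, end in silences:
        if lo <= start and end <= hi:
            return (start, end)
    return None


def annotate_silences(procedures: List[Procedure], silences: List[Interval]) -> List[Procedure]:
    for i, proc in enumerate(procedures):  # S60: advance to the next procedure
        # S52: is there a silent section inside the procedure?
        inner = first_silence_within(silences, proc.start, proc.end)
        # S53/S54 record the duration; S55 records 0 when there is none.
        proc.inner_silence = inner[1] - inner[0] if inner else 0.0
        # S56 (assumed interpretation): silent section in the gap before the next procedure.
        gap = None
        if i + 1 < len(procedures):
            gap = first_silence_within(silences, proc.end, procedures[i + 1].start)
        # S57/S58 record the duration; S59 records 0 when there is none.
        proc.gap_silence = gap[1] - gap[0] if gap else 0.0
    return procedures
```

Keeping the two durations in separate fields is a choice made for this sketch so that the "0" written in step S59 does not overwrite a value written in step S54.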

FIG. 25 is a flowchart showing an example of the processing procedure of step S53. In step S71, the procedure time calculation unit 66 determines whether the speaker immediately after the silent section is the agent. If so, the procedure time calculation unit 66 sets the agent as the speaker who caused the silent section and proceeds to step S73.

In step S73, the procedure time calculation unit 66 refers to the procedure keyword table 77 and determines whether the agent operation of the procedure being processed is "none", which indicates that it is empty. If the agent operation is empty, the procedure time calculation unit 66 proceeds to step S74 and determines that the silent section is not valid.

If, in step S71, the speaker immediately after the silent section is the customer, the procedure time calculation unit 66 sets the customer as the speaker who caused the silent section. If, in step S73, the agent operation of the procedure being processed is not empty, the procedure time calculation unit 66 proceeds to step S76.

In step S76, the procedure time calculation unit 66 refers to the procedure keyword table 77 and determines whether the agent operation of the procedure being processed is something other than "intermediate processing". If it is, the procedure time calculation unit 66 proceeds to step S74 and determines that the silent section is not valid. If the agent operation is "intermediate processing", the procedure time calculation unit 66 proceeds to step S77 and determines that the silent section is valid.
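The branches of silent section process A (steps S71 to S77) can be written as a small decision function. This is a sketch under stated assumptions: the function name, the string labels for speakers and agent operations, and the return shape are hypothetical. The text does not say how validity is judged when the customer caused the silence, so that case returns `None` here.

```python
from typing import Optional, Tuple


def silence_process_a(next_speaker: str, agent_operation: str) -> Tuple[str, Optional[bool]]:
    """Return (originator, validity) for a silent section inside a procedure.

    next_speaker:    "agent" or "customer", the speaker right after the silence.
    agent_operation: "none", "pre", "intermediate", or "post" (assumed labels
                     for the agent operation column of the keyword table).
    """
    if next_speaker != "agent":
        # S71, customer branch: the customer is the originator.
        # Validity is not specified in the text for this branch (assumption: None).
        return ("customer", None)
    if agent_operation == "none":
        # S73 -> S74: no agent operation is registered, so the silence is not valid.
        return ("agent", False)
    if agent_operation != "intermediate":
        # S76 -> S74: an operation exists, but not one performed mid-procedure.
        return ("agent", False)
    # S76 -> S77: intermediate processing explains the silence.
    return ("agent", True)
```

For example, `silence_process_a("agent", "intermediate")` yields `("agent", True)`, while any other agent operation yields an invalid silence attributed to the agent.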

FIG. 26 is a flowchart showing an example of the processing procedure of step S57. In step S81, the procedure time calculation unit 66 determines whether the speaker immediately after the silent section is the agent. If so, the procedure time calculation unit 66 sets the agent as the speaker who caused the silent section and proceeds to step S83.

In step S83, the procedure time calculation unit 66 refers to the procedure keyword table 77 and determines whether the agent operation of the procedure being processed is "post-processing". If it is, the procedure time calculation unit 66 proceeds to step S84, corrects the end of the procedure being processed to the end time of the silent section, and adds the silent section to the procedure being processed. It then proceeds to step S85 and determines that the silent section is valid.

If, in step S81, the speaker immediately after the silent section is the customer, the procedure time calculation unit 66 sets the customer as the speaker who caused the silent section. If, in step S83, the agent operation of the procedure being processed is not "post-processing", the procedure time calculation unit 66 proceeds to step S87.

In step S87, the procedure time calculation unit 66 refers to the procedure keyword table 77 and determines whether the agent operation of the procedure being processed is "pre-processing".

If the agent operation of the procedure being processed is "pre-processing", the procedure time calculation unit 66 proceeds to step S88, corrects the start of the procedure being processed to the start time of the silent section, and adds the silent section to the next procedure. It then proceeds to step S85 and determines that the silent section is valid.

If the agent operation of the procedure being processed is not "pre-processing", the procedure time calculation unit 66 proceeds to step S89, corrects the end of the procedure being processed to the end time of the silent section, and adds the silent section to the next procedure. It then proceeds to step S90 and determines that the silent section is not valid.
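Silent section process B (steps S81 to S90) can likewise be sketched as a function that returns a decision rather than editing times directly. The `GapDecision` record and its field names are hypothetical, as are the string labels. The sketch follows the wording of the text literally, including which boundary is corrected and which procedure the silence is added to, and it leaves the customer branch undecided because the text does not describe it further.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GapDecision:
    originator: str          # "agent" or "customer"
    valid: Optional[bool]    # None where the text gives no verdict
    adjust: Optional[str]    # which boundary of the current procedure is corrected
    add_to: Optional[str]    # which procedure the silence is added to


def silence_process_b(next_speaker: str, agent_operation: str) -> GapDecision:
    if next_speaker != "agent":
        # S81, customer branch: only the originator is stated in the text.
        return GapDecision("customer", None, None, None)
    if agent_operation == "post":
        # S83 -> S84 -> S85: move the end to the end of the silence; valid.
        return GapDecision("agent", True, "end", "current")
    if agent_operation == "pre":
        # S87 -> S88 -> S85: move the start to the start of the silence; valid.
        return GapDecision("agent", True, "start", "next")
    # S87 -> S89 -> S90: move the end to the end of the silence; not valid.
    return GapDecision("agent", False, "end", "next")
```

Returning a decision record keeps the branch logic separate from the bookkeeping in the procedure extraction data, which makes each branch easy to check against FIG. 26.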

(Reception flow creation unit 67)
The procedure extraction data 78 is input to the reception flow creation unit 67. Working through the procedure extraction data 78 in order from the top, the reception flow creation unit 67 creates the reception flow data 79 of FIG. 27, drawing each procedure as one box.

FIG. 27 is an image diagram of an example of the reception flow data. The reception flow data 79 uses color coding or a similar device to show whether each procedure includes a silent section and whether that silent section is valid. For example, for a procedure that has a silent section, the reception flow creation unit 67 changes the color of the box frame, and it fills the box with color when the silent section is not valid.
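The color rule described for FIG. 27 amounts to a two-level check. The following is only an illustrative sketch: the function name, the dictionary keys, and the concrete colors are assumptions, since the text says only that the frame color changes and that the box is filled when the silence is not valid.

```python
from typing import Dict, Optional


def box_style(has_silence: bool, silence_valid: bool) -> Dict[str, Optional[str]]:
    """Choose frame and fill for one procedure box of the reception flow."""
    style: Dict[str, Optional[str]] = {"frame": "black", "fill": None}
    if has_silence:
        # A procedure with a silent section gets a different frame color.
        style["frame"] = "orange"
        if not silence_valid:
            # An invalid silent section is emphasized by filling the box.
            style["fill"] = "orange"
    return style
```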

When standard reception flow data (a standard flow) is available in advance, the reception flow data 79 is compared with the standard flow. When the reception flow has more procedures than the standard flow, the reception flow creation unit 67, for example, marks the extra procedures or adds a symbol to their procedure names.

When the reception flow has fewer procedures than the standard flow, the reception flow creation unit 67, for example, marks in the reception flow data 79 the standard flow procedures that should be present, or changes the line style. When the order of the procedures differs from the standard flow, the reception flow creation unit 67, for example, adds a symbol to the procedure names of the procedures that differ.
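The three kinds of deviation from the standard flow (extra procedures, missing procedures, and a different order) can be detected along the following lines. This is a minimal sketch, assuming that each flow is a list of unique procedure IDs; the function name and the positional order check are illustrative choices, not something the text prescribes.

```python
from typing import Dict, List


def compare_with_standard(actual: List[str], standard: List[str]) -> Dict[str, List[str]]:
    """Classify deviations of an observed flow from the standard flow."""
    # Procedures that appear in the call but not in the standard flow.
    extra = [p for p in actual if p not in standard]
    # Procedures the standard flow expects but the call does not contain.
    missing = [p for p in standard if p not in actual]
    # Restrict both flows to their shared procedures and compare position by position.
    common_actual = [p for p in actual if p in standard]
    common_standard = [p for p in standard if p in actual]
    out_of_order = [a for a, s in zip(common_actual, common_standard) if a != s]
    return {"extra": extra, "missing": missing, "out_of_order": out_of_order}
```

For a standard flow `["A", "B", "C", "D"]` and an observed flow `["A", "C", "B", "E"]`, the sketch reports `E` as extra, `D` as missing, and `C` and `B` as out of order.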

The reception flow creation unit 67 calculates the processing time of each procedure from the procedure extraction data 78, sums these processing times to obtain the total processing time, and creates the reception procedure time information 80. FIG. 28 is an explanatory diagram showing an example of the reception procedure time information. Each entry of the reception procedure time information 80 in FIG. 28 is a set consisting of a procedure ID, a time, the presence or absence of a silent section, and the validity of the silent section.

The reception procedure time information 80 gives, for each procedure, the time, the presence or absence of a silent section, and the validity of the silent section. It also gives the total processing time and the overall presence or absence of a silent section (the logical OR of the presence flags of all procedures).
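Building the reception procedure time information can be sketched as below. The dictionary keys and the function name are hypothetical; the sketch assumes that each procedure's start and end times have already been corrected by the silent section processing, so that the per-procedure time is simply end minus start.

```python
from typing import Dict, List


def build_time_info(procedures: List[Dict]) -> Dict:
    """Summarize per-procedure times plus the call-level totals."""
    rows = []
    for proc in procedures:
        rows.append({
            "procedure_id": proc["procedure_id"],
            "time": proc["end"] - proc["start"],
            "has_silence": proc["silence"] > 0,
            "silence_valid": proc["silence_valid"],
        })
    return {
        "rows": rows,
        # Total processing time is the sum over all procedures.
        "total_time": sum(r["time"] for r in rows),
        # Overall silence flag is the OR of the per-procedure flags.
        "any_silence": any(r["has_silence"] for r in rows),
    }
```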

The reception flow creation unit 67 outputs the created reception flow data 79 and reception procedure time information 80. That is, it outputs the reception flow data 79, which represents the explanation procedures extracted from the call recording data 71, and the reception procedure time information 80, which represents the time required for each procedure and the total time required for the explanation phase.

(Creation of the voice recognition keyword list 75 and the procedure keyword table 77)
FIG. 29 is an explanatory diagram showing the procedure for creating the voice recognition keyword list and the procedure keyword table. The procedure keyword extraction unit 100 extracts important phrases from manuals, FAQs, procedure documents, and the like, and links them to the individual procedures.

FIG. 30 is an image diagram showing the processing performed by the procedure keyword extraction unit. Important phrases are selected using a well-known technique for extracting keywords from text. In the case of FIG. 30, "program disk", "double click", "folder", and so on are extracted as important phrases. The procedure keyword extraction unit 100 links the extracted important phrases to the procedures described in the manuals, FAQs, procedure documents, and the like. It also links each procedure to the agent operation performed in that procedure. The procedure keyword extraction unit 100 outputs the procedure keyword table 77.

FIG. 31 is an image diagram showing the processing performed by the corresponding phrase creation unit. The corresponding phrase creation unit 101 creates corresponding phrases for the keywords extracted by the procedure keyword extraction unit 100. For example, when the extracted keyword is a noun, the corresponding phrase creation unit 101 uses the same phrase and more detailed forms of that phrase as corresponding phrases. When the extracted keyword is a verb, it uses the same phrase and response words such as "hai" (yes) and "dekimashita" (done) as corresponding phrases. The corresponding phrase creation unit 101 writes the created corresponding phrases into the procedure keyword table 77.

FIG. 32 is an image diagram showing the processing performed by the voice recognition keyword list creation unit. The voice recognition keyword list creation unit 102 creates the voice recognition keyword list 75 from the procedure keyword table 77. It creates the voice recognition keyword list 75 for the agent from the extracted keywords of the procedure keyword table 77, and the voice recognition keyword list 75 for the customer from the corresponding phrases of the procedure keyword table 77. The voice recognition keyword list creation unit 102 registers every conjugated form of each verb. For response words such as "hai" (yes), it also registers variations such as "ee" and "un".
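The split into an agent list and a customer list can be sketched as follows. This is an illustration under assumptions: the row layout of the keyword table, the function names, and the romanized phrases are hypothetical, and verb conjugation is omitted because it would need a morphological analyzer that the text does not name.

```python
from typing import Dict, List, Tuple

# Variations of response words; only the pair mentioned in the text is listed.
RESPONSE_VARIATIONS: Dict[str, List[str]] = {"hai": ["ee", "un"]}


def _append_unique(target: List[str], phrase: str) -> None:
    """Append phrase while keeping first-seen order and avoiding duplicates."""
    if phrase not in target:
        target.append(phrase)


def build_keyword_lists(keyword_table: List[Dict]) -> Tuple[List[str], List[str]]:
    agent_list: List[str] = []
    customer_list: List[str] = []
    for row in keyword_table:
        # Extracted keywords become the agent-side recognition vocabulary.
        _append_unique(agent_list, row["keyword"])
        # Corresponding phrases become the customer-side vocabulary.
        for phrase in row["responses"]:
            _append_unique(customer_list, phrase)
            for variant in RESPONSE_VARIATIONS.get(phrase, []):
                _append_unique(customer_list, variant)
    return agent_list, customer_list
```

Keeping two separate lists reflects the description above, in which the agent and the customer are recognized against different vocabularies.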

According to this embodiment, problem spots in the reception flow become easy to extract. Specifically, by finding the procedures that include an invalid silent section, in other words the procedures in which the agent keeps the customer waiting, and calculating the time of that silent section, both the place (procedure) where the agent keeps the customer waiting and the length of the wait are obtained. In addition, when a standard flow exists, procedures that differ from the standard flow can be found immediately.

This embodiment also makes it easy to compare reception flows for similar inquiries. Using the reception flow data 79, the reception flows for similar inquiries can be compared, and using the reception procedure time information 80, the time required for each procedure can be compared. By referring to the silent section time in that comparison, the degree of influence of the silent section can be examined.

The reception flow creation device 1 of this embodiment can be used as an optimum explanation procedure extraction device. For example, the optimum explanation procedure extraction device extracts the procedures for each type of customer inquiry, selects from among them the procedure that takes the shortest time, and outputs it as the optimum explanation procedure.

The optimum explanation procedure extraction device extracts procedures using the method of the reception flow creation device 1. When there is a silent section caused by the customer, the device does not include that silent section in the handling time. For each type of inquiry, the device extracts the explanation procedure of the shortest case and outputs it.

FIG. 33 is a flowchart showing an example of the processing procedure of the optimum explanation procedure extraction device. In step S100, the optimum explanation procedure extraction device selects, from the reception records, the call recording data 71 that corresponds to the same inquiry.

In step S101, the optimum explanation procedure extraction device generates the reception flow data 79 for each selected piece of call recording data 71 using the method of the reception flow creation device 1. In step S102, it selects the case with the shortest processing time. In step S103, it outputs that case.
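Steps S100 to S103 can be sketched as a grouping followed by a minimum search. The data layout and the function names are hypothetical, and the sketch assumes that each procedure's recorded time already includes its silent section, so that customer-caused silence has to be subtracted to obtain the handling time.

```python
from typing import Dict, List, Tuple


def handling_time(procedures: List[Dict]) -> float:
    """Sum procedure times, leaving out silence caused by the customer."""
    total = 0
    for proc in procedures:
        total += proc["time"]
        if proc.get("originator") == "customer":
            # Customer-caused silence is not counted in the handling time.
            total -= proc.get("silence", 0)
    return total


def select_shortest_cases(cases: List[Dict]) -> Dict[str, Tuple[str, float]]:
    """Return, per inquiry, the (case_id, handling_time) of the shortest case."""
    best: Dict[str, Tuple[str, float]] = {}
    for case in cases:                      # S100/S101: cases grouped by inquiry
        t = handling_time(case["procedures"])
        current = best.get(case["inquiry"])
        if current is None or t < current[1]:  # S102: keep the shortest case
            best[case["inquiry"]] = (case["case_id"], t)
    return best                              # S103: output
```

Ties are resolved in favor of the case seen first, which is an arbitrary choice of this sketch.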

The present invention may be configured as described in the following appendices.
(Appendix 1)
A reception flow creation program for causing a computer, in order to create a reception flow from call recording data obtained by recording a reception call with a customer, to function as:
procedure keyword table recording means for recording in advance, as a procedure keyword table and for each procedure of the reception for the customer, a keyword uttered by the responder to explain the procedure in association with a keyword uttered by the customer in response to the explanation of the procedure;
utterance section extraction means for extracting utterance sections of the customer and the responder based on prosodic information extracted from the call recording data;
voice recognition means for performing voice recognition on the call recording data and outputting the recognized keyword, the speaker, and the appearance time as a voice recognition result;
procedure extraction means for reading out the procedure keyword table and extracting, based on the procedure keyword table, the procedure corresponding to the recognized keyword included in the voice recognition result;
procedure time calculation means for calculating the time from the start to the end of the procedure based on the utterance section extracted by the utterance section extraction means, the voice recognition result output by the voice recognition means, and the procedure corresponding to the recognized keyword extracted by the procedure extraction means, and, when there is a silent section other than the utterance section, treating the speaker immediately after the silent section as the originator of the silent section and adding the time of the silent section to the time from the start to the end of the procedure; and
reception flow creation means for creating and outputting the reception flow based on the time from the start to the end of the procedure.
(Appendix 2)
The reception flow creation program according to appendix 1, wherein the procedure keyword table recording means further records, for each procedure of the reception for the customer, pre-work or post-work accompanying the procedure in the procedure keyword table in association with the procedure as the operation content of the responder, and
when there is a silent section other than the utterance section, the procedure time calculation means determines, based on the operation content of the responder included in the procedure keyword table, whether pre-work or post-work accompanying the procedure exists before or after the procedure, and selects, based on the determination result, the procedure to which the silent section is added.
(Appendix 3)
The reception flow creation program according to appendix 2, wherein, based on the operation content of the responder included in the procedure keyword table, the silent section is determined to be valid if pre-work or post-work accompanying the procedure exists at the position corresponding to the silent section, and is determined to be invalid if no such pre-work or post-work exists.
(Appendix 4)
The reception flow creation program according to any one of appendices 1 to 3, wherein the procedure time calculation means reads out the procedure keyword table and, when a procedure is missing from the procedures corresponding to the recognized keywords extracted by the procedure extraction means and an utterance section exists at the position corresponding to the missing procedure, adds the missing procedure in accordance with that utterance section.
(Appendix 5)
The reception flow creation program according to any one of appendices 1 to 4, wherein the procedure time calculation means determines the procedure that includes the utterance section based on the recognized keyword included in the voice recognition result output by the voice recognition means, the keyword uttered by the responder to explain the procedure, and the keyword uttered by the customer in response to the explanation of the procedure.
(Appendix 6)
The reception flow creation program according to any one of appendices 1 to 5, further causing the computer to function as phase estimation means for estimating the initiative speaker of each utterance section extracted by the utterance section extraction means, treating as a question phase the utterance sections immediately after the start of the call recording data in which the customer is the initiative speaker, treating the utterance sections after the question phase as an answer phase, and limiting the voice recognition by the voice recognition means to the answer phase.
(Appendix 7)
A reception flow creation method in which a computer creates a reception flow from call recording data obtained by recording a reception call with a customer, the method comprising:
a procedure keyword table recording step in which, for each procedure of the reception for the customer, a keyword uttered by the responder to explain the procedure and a keyword uttered by the customer in response to the explanation of the procedure are recorded in advance in association with each other as a procedure keyword table;
an utterance section extraction step of extracting utterance sections of the customer and the responder based on prosodic information extracted from the call recording data;
a voice recognition step of performing voice recognition on the call recording data and outputting the recognized keyword, the speaker, and the appearance time as a voice recognition result;
a procedure extraction step of reading out the procedure keyword table and extracting, based on the procedure keyword table, the procedure corresponding to the recognized keyword included in the voice recognition result;
a procedure time calculation step of calculating the time from the start to the end of the procedure based on the utterance section extracted in the utterance section extraction step, the voice recognition result output in the voice recognition step, and the procedure corresponding to the recognized keyword extracted in the procedure extraction step, and, when there is a silent section other than the utterance section, treating the speaker immediately after the silent section as the originator of the silent section and adding the time of the silent section to the time from the start to the end of the procedure; and
a reception flow creation step of creating and outputting the reception flow based on the time from the start to the end of the procedure.
(Appendix 8)
A reception flow creation device for creating a reception flow from call recording data obtained by recording a reception call with a customer, the device comprising:
procedure keyword table recording means for recording in advance, as a procedure keyword table and for each procedure of the reception for the customer, a keyword uttered by the responder to explain the procedure in association with a keyword uttered by the customer in response to the explanation of the procedure;
utterance section extraction means for extracting utterance sections of the customer and the responder based on prosodic information extracted from the call recording data;
voice recognition means for performing voice recognition on the call recording data and outputting the recognized keyword, the speaker, and the appearance time as a voice recognition result;
procedure extraction means for reading out the procedure keyword table and extracting, based on the procedure keyword table, the procedure corresponding to the recognized keyword included in the voice recognition result;
procedure time calculation means for calculating the time from the start to the end of the procedure based on the utterance section extracted by the utterance section extraction means, the voice recognition result output by the voice recognition means, and the procedure corresponding to the recognized keyword extracted by the procedure extraction means, and, when there is a silent section other than the utterance section, treating the speaker immediately after the silent section as the originator of the silent section and adding the time of the silent section to the time from the start to the end of the procedure; and
reception flow creation means for creating and outputting the reception flow based on the time from the start to the end of the procedure.

The present invention is not limited to the specifically disclosed embodiments, and various modifications and changes can be made without departing from the scope of the claims. For example, an utterance section for which voice recognition resulted in an error may be estimated by a known technique or may be ignored.

FIG. 1 is an explanatory diagram of an example showing an outline of this embodiment.
FIG. 2 is an image diagram of an example showing the processing that identifies procedure locations in the input call recording data and calculates the time required for each procedure.
FIG. 3 is an image diagram of an example showing, within the processing that calculates the time required for each procedure, the processing performed when there is a gap.
FIG. 4 is an image diagram of an example showing, within the processing that calculates the time required for each procedure, the processing for a silent section.
FIG. 5 is an image diagram of another example showing, within the processing that calculates the time required for each procedure, the processing for a silent section.
FIG. 6 is a hardware configuration diagram of an example of the reception flow creation device.
FIG. 7 is a configuration diagram showing the relationship between the processing units and the data of the reception flow creation device.
FIG. 8 is a flowchart of an example showing the processing procedure of the reception flow creation device.
FIG. 9 is an explanatory diagram showing an example of prosody data of the agent and the customer.
FIG. 10 is an explanatory diagram showing an example of utterance section information of the agent and the customer.
FIG. 11 is an explanatory diagram showing an example of phase information.
FIG. 12 is an explanatory diagram showing an example of the voice recognition keyword list.
FIG. 13 is an explanatory diagram showing an example of a voice recognition result.
FIG. 14 is an explanatory diagram showing an example of the procedure keyword table.
FIG. 15 is an explanatory diagram showing an example of the procedure extraction data entered by the procedure extraction unit.
FIG. 16 is a flowchart of an example showing the processing procedure of the procedure extraction unit.
FIG. 17 is a flowchart of an example showing the processing procedure performed by the procedure time calculation unit.
FIG. 18 is an explanatory diagram showing an example of the procedure extraction data in which the start time and the end time have been entered by the procedure time calculation unit.
FIG. 19 is a flowchart of an example showing the processing procedure of step S21.
FIG. 20 is an explanatory diagram showing an example of the procedure extraction data in which the procedure time calculation unit has corrected overlapping procedures and inserted procedures missed during extraction.
FIG. 21 is a flowchart of an example showing the processing procedure of step S22.
FIG. 22 is an explanatory diagram showing an example of silent section information.
FIG. 23 is an explanatory diagram showing an example of the procedure extraction data after the silent section processing.
FIG. 24 is a flowchart of an example showing the processing procedure of step S23.
FIG. 25 is a flowchart of an example showing the processing procedure of step S53.
FIG. 26 is a flowchart of an example showing the processing procedure of step S57.
FIG. 27 is an image diagram of an example of the reception flow data.
FIG. 28 is an explanatory diagram showing an example of the reception procedure time information.
FIG. 29 is an explanatory diagram showing the procedure for creating the voice recognition keyword list and the procedure keyword table.
FIG. 30 is an image diagram showing the processing performed by the procedure keyword extraction unit.
FIG. 31 is an image diagram showing the processing performed by the corresponding phrase creation unit.
FIG. 32 is an image diagram showing the processing performed by the voice recognition keyword list creation unit.
FIG. 33 is a flowchart showing an example of the processing procedure of the optimum explanation procedure extraction device.

Explanation of symbols

1 Reception flow creation device
51 Input device
52 Output device
53 Drive device
54 Auxiliary storage device
55 Main storage device
56 Arithmetic processing device
57 Interface device
61 Prosodic information extraction unit
62 Utterance section extraction unit
63 Phase estimation unit
64 Speech recognition unit
65 Procedure extraction unit
66 Procedure time calculation unit
67 Reception flow creation unit
71 Call recording data
72 Prosody data
73 Utterance section information
74 Phase information
75 Speech recognition keyword list
76 Speech recognition result
77 Procedure keyword table
78 Procedure extraction data
79 Reception flow data
80 Reception procedure time information

Claims

A reception flow creation program for creating a reception flow from call recording data in which a reception call with a customer is recorded, the program causing a computer to function as:
procedure keyword table recording means for recording in advance, as a procedure keyword table, for each procedure of the customer reception, a keyword uttered by the responder to explain the procedure in association with a keyword uttered by the customer in response to the explanation of the procedure;
utterance section extraction means for extracting the utterance sections of the customer and the responder based on prosodic information extracted from the call recording data;
speech recognition means for performing speech recognition on the call recording data and outputting each recognized keyword, its speaker, and its appearance time as a speech recognition result;
procedure extraction means for reading the procedure keyword table and extracting, based on the procedure keyword table, the procedure corresponding to each recognized keyword included in the speech recognition result;
procedure time calculation means for calculating the time from the start to the end of the procedure based on the utterance sections extracted by the utterance section extraction means, the speech recognition result output by the speech recognition means, and the procedure corresponding to the recognized keyword extracted by the procedure extraction means, and, when there is a silent section other than the utterance sections, treating the speaker immediately after the silent section as the originator of the silent section and adding the duration of the silent section to the time from the start to the end of the procedure; and
reception flow creation means for creating and outputting the reception flow based on the time from the start to the end of the procedure.
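The silent section handling recited in this claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Utterance` and `Silence` classes, the procedure labels, and the rule that a silent section is added to the procedure of the utterance that follows it are all assumptions made for illustration; the claim itself only states that the speaker immediately after the silent section is treated as its originator and that the silent time is added to a procedure's time.

```python
from dataclasses import dataclass

@dataclass
class Utterance:
    speaker: str      # "agent" or "customer"
    start: float      # seconds from the start of the call
    end: float
    procedure: str    # label assigned beforehand (hypothetical input)

@dataclass
class Silence:
    start: float
    end: float
    originator: str   # speaker of the utterance right after the silence
    procedure: str    # procedure of that utterance (assumed attribution)

def find_silences(utterances):
    """Find gaps in which nobody speaks and attribute each gap to the
    speaker who talks immediately after it."""
    ordered = sorted(utterances, key=lambda u: u.start)
    silences = []
    if not ordered:
        return silences
    covered_until = ordered[0].end
    for nxt in ordered[1:]:
        if nxt.start > covered_until:
            silences.append(
                Silence(covered_until, nxt.start, nxt.speaker, nxt.procedure))
        covered_until = max(covered_until, nxt.end)
    return silences

def procedure_times(utterances):
    """Start-to-end time of each procedure plus attributed silent time."""
    spans = {}
    for u in utterances:
        s, e = spans.get(u.procedure, (u.start, u.end))
        spans[u.procedure] = (min(s, u.start), max(e, u.end))
    totals = {name: e - s for name, (s, e) in spans.items()}
    for sil in find_silences(utterances):
        # A silence inside a span is already counted in end - start;
        # only a silence that precedes the span is added here.
        if sil.end <= spans[sil.procedure][0]:
            totals[sil.procedure] += sil.end - sil.start
    return totals

calls = [
    Utterance("agent", 0.0, 5.0, "identity check"),
    Utterance("customer", 5.0, 8.0, "identity check"),
    Utterance("agent", 12.0, 20.0, "address change"),
]
print(find_silences(calls))
print(procedure_times(calls))  # {'identity check': 8.0, 'address change': 12.0}
```

In this example the 4-second gap between 8.0 and 12.0 is attributed to the agent, who speaks next, and is added to the hypothetical "address change" procedure.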

The reception flow creation program according to claim 1, wherein the procedure keyword table recording means records in the procedure keyword table, for each procedure of the customer reception, pre-work or post-work accompanying the procedure as operation content of the responder, and
wherein, when there is a silent section other than the utterance sections, the procedure time calculation means determines, based on the operation content of the responder included in the procedure keyword table, whether pre-work or post-work accompanying a procedure exists before or after the procedure, and selects, based on the determination result, the procedure to which the silent section is added.

The reception flow creation program according to claim 2, wherein, based on the operation content of the responder included in the procedure keyword table, the silent section is determined to be valid if pre-work or post-work accompanying a procedure exists at the position corresponding to the silent section, and is determined to be invalid if no such pre-work or post-work exists.
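One way to picture the selection and validity check described in the two dependent claims above is the following Python sketch. The table contents, the field names `pre_work` and `post_work`, and the choice to check the preceding procedure's post-work before the following procedure's pre-work are assumptions for illustration; the claims do not fix a priority between the two.

```python
from typing import Optional

# Hypothetical excerpt of a procedure keyword table: for each procedure,
# whether the responder's operation content includes pre-work or post-work.
OPERATIONS = {
    "identity check": {"pre_work": False, "post_work": True},
    "address change": {"pre_work": False, "post_work": False},
    "fee explanation": {"pre_work": True, "post_work": False},
}

def assign_silence(prev_procedure: Optional[str],
                   next_procedure: Optional[str],
                   table=OPERATIONS) -> Optional[str]:
    """Return the procedure that absorbs a silent section lying between
    prev_procedure and next_procedure, or None if the silence is invalid
    (no pre-work or post-work at that position)."""
    if prev_procedure is not None and table.get(prev_procedure, {}).get("post_work"):
        return prev_procedure      # silence explained by post-work
    if next_procedure is not None and table.get(next_procedure, {}).get("pre_work"):
        return next_procedure      # silence explained by pre-work
    return None                    # silence treated as invalid

print(assign_silence("identity check", "address change"))
print(assign_silence("address change", "fee explanation"))
print(assign_silence("address change", "address change"))
```

A silent section for which the function returns `None` would simply not be added to any procedure's time under this reading.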

The reception flow creation program according to claim 1, wherein the procedure time calculation means reads the procedure keyword table and, when a procedure is missing from the procedures extracted by the procedure extraction means for the recognized keywords and an utterance section exists at the position corresponding to the missing procedure, adds the missing procedure in accordance with that utterance section.
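The insertion of a missing procedure can be sketched as follows. This is a rough illustration under stated assumptions: the expected order of procedures is taken to be available from the procedure keyword table, the "position corresponding to the missing procedure" is taken to be the interval between its nearest extracted neighbours, and the function name `insert_missing` and the data layout are hypothetical.

```python
def insert_missing(expected_order, extracted, utterances):
    """expected_order: procedure names in their expected order (assumed to
    come from the procedure keyword table).
    extracted: dict mapping procedure name to (start, end) in seconds.
    utterances: list of (start, end) utterance sections.
    Returns a copy of extracted with missing procedures added wherever
    utterance sections exist between the neighbouring procedures."""
    result = dict(extracted)
    for i, name in enumerate(expected_order):
        if name in result:
            continue
        # End of the nearest earlier procedure and start of the nearest later one
        lower = max((result[p][1] for p in expected_order[:i] if p in result),
                    default=float("-inf"))
        upper = min((result[p][0] for p in expected_order[i + 1:] if p in result),
                    default=float("inf"))
        inside = [(s, e) for s, e in utterances if s >= lower and e <= upper]
        if inside:
            # The missing procedure spans the utterance sections found there
            result[name] = (min(s for s, _ in inside), max(e for _, e in inside))
    return result

expected = ["greeting", "identity check", "address change"]
extracted = {"greeting": (0.0, 3.0), "address change": (15.0, 25.0)}
sections = [(0.0, 3.0), (4.0, 9.0), (10.0, 14.0), (15.0, 25.0)]
print(insert_missing(expected, extracted, sections))
```

If no utterance section lies between the neighbouring procedures, the missing procedure is left out rather than guessed.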

A reception flow creation method in which a computer creates a reception flow from call recording data in which a reception call with a customer is recorded, the method comprising:
a procedure keyword table recording step of recording in advance, as a procedure keyword table, for each procedure of the customer reception, a keyword uttered by the responder to explain the procedure in association with a keyword uttered by the customer in response to the explanation of the procedure;
an utterance section extraction step of extracting the utterance sections of the customer and the responder based on prosodic information extracted from the call recording data;
a speech recognition step of performing speech recognition on the call recording data and outputting each recognized keyword, its speaker, and its appearance time as a speech recognition result;
a procedure extraction step of reading the procedure keyword table and extracting, based on the procedure keyword table, the procedure corresponding to each recognized keyword included in the speech recognition result;
a procedure time calculation step of calculating the time from the start to the end of the procedure based on the utterance sections extracted in the utterance section extraction step, the speech recognition result output in the speech recognition step, and the procedure corresponding to the recognized keyword extracted in the procedure extraction step, and, when there is a silent section other than the utterance sections, treating the speaker immediately after the silent section as the originator of the silent section and adding the duration of the silent section to the time from the start to the end of the procedure; and
a reception flow creation step of creating and outputting the reception flow based on the time from the start to the end of the procedure.

A reception flow creation device for creating a reception flow from call recording data in which a reception call with a customer is recorded, the device comprising:
procedure keyword table recording means for recording in advance, as a procedure keyword table, for each procedure of the customer reception, a keyword uttered by the responder to explain the procedure in association with a keyword uttered by the customer in response to the explanation of the procedure;
utterance section extraction means for extracting the utterance sections of the customer and the responder based on prosodic information extracted from the call recording data;
speech recognition means for performing speech recognition on the call recording data and outputting each recognized keyword, its speaker, and its appearance time as a speech recognition result;
procedure extraction means for reading the procedure keyword table and extracting, based on the procedure keyword table, the procedure corresponding to each recognized keyword included in the speech recognition result;
procedure time calculation means for calculating the time from the start to the end of the procedure based on the utterance sections extracted by the utterance section extraction means, the speech recognition result output by the speech recognition means, and the procedure corresponding to the recognized keyword extracted by the procedure extraction means, and, when there is a silent section other than the utterance sections, treating the speaker immediately after the silent section as the originator of the silent section and adding the duration of the silent section to the time from the start to the end of the procedure; and
reception flow creation means for creating and outputting the reception flow based on the time from the start to the end of the procedure.