JP7377668B2

JP7377668B2 - Control device, control method and computer program

Info

Publication number: JP7377668B2
Application number: JP2019183954A
Authority: JP
Inventors: 光記東野; 恭子福田; 哲也中ノ堂
Original assignee: NTT Communications Corp
Current assignee: NTT Communications Corp
Priority date: 2019-10-04
Filing date: 2019-10-04
Publication date: 2023-11-10
Anticipated expiration: 2039-10-04
Also published as: JP2021060490A

Description

The present invention relates to technology used in customer service operations such as those of contact centers.

In recent years, several problems have been pointed out with handling contact center customer service operations manually. For example, securing staff has become difficult because of the shrinking working population and the heavy psychological burden of handling complaints. In addition, conditions such as this psychological burden lead to a very high turnover rate, which drives up training costs.

To address these problems, technologies that use AI for customer service operations have been proposed, as disclosed, for example, in Patent Document 1 and Patent Document 2.

Patent Document 1: Japanese Patent Application Publication No. 2019-144400
Patent Document 2: Japanese Patent Application Publication No. 2018-101941

However, automation technology is not yet perfect, and the accuracy of processing such as recognition and interpretation has in some cases been insufficient.
In view of the above circumstances, an object of the present invention is to provide a technology that enables more accurate processing in customer service operations.

One aspect of the present invention is a control device including a control unit that determines, based on the content of a customer's utterance, a response sentence to be presented to the customer, determines the business content corresponding to the customer's request, and instructs a business execution device to execute the business content, wherein the control unit selects, from among a plurality of devices capable of the same type of processing, the device that is to execute the processing for analyzing the utterance content.

One aspect of the present invention is the above control device, wherein the utterance content is expressed as speech; the control unit requests a speech recognition device that executes speech recognition processing to perform speech recognition on the speech representing the utterance content, and determines the response sentence and the business content based on the recognition result obtained from the speech recognition device; and, when a predetermined condition is satisfied, the control unit selects, from among a plurality of speech recognition devices, the device corresponding to that condition as the device that is to execute the speech recognition processing.

One aspect of the present invention is the above control device, wherein the control unit requests a speech synthesis device that executes speech synthesis processing to generate speech representing the response sentence, and responds to the customer using the speech obtained from the speech synthesis device; and, when a predetermined condition is satisfied, the control unit selects, from among a plurality of speech synthesis devices, the device corresponding to that condition as the device that is to execute the speech synthesis processing.

One aspect of the present invention is a control method including: a first step of determining, based on the content of a customer's utterance, a response sentence to be presented to the customer; a second step of determining the business content corresponding to the customer's request; a third step of instructing a business execution device to execute the business content; and a fourth step of selecting, from among a plurality of devices capable of the same type of processing, the device that is to execute the processing of any one of the first step, the second step, and the third step.

One aspect of the present invention is a computer program for causing a computer to function as the above control device.

The present invention makes it possible to execute more accurate processing in customer service operations.

FIG. 1 is a schematic block diagram showing the system configuration of a customer service business system according to the present invention.
FIG. 2 is a diagram showing a specific example of the functional blocks of the flow control device.
FIG. 3 is a diagram showing a specific example of the flow information table.
FIG. 4 is a diagram showing a specific example of the rule information table.
FIG. 5 is a sequence chart showing a specific example of processing executed by the customer service business system when a voice dialogue takes place.
FIG. 6 is a sequence chart showing a specific example of processing executed by the customer service business system when a voice dialogue takes place.

Specific configuration examples of the present invention are described below with reference to the drawings.
FIG. 1 is a schematic block diagram showing the system configuration of a customer service business system 100 of the present invention. The customer service business system 100 includes a response control device 10, a flow control device 20, a plurality of speech recognition devices 30 (30-1, 30-2), a plurality of intention interpretation devices 40 (40-1, 40-2), a plurality of speech synthesis devices 50 (50-1, 50-2), a business execution device 60, a business system 70, and an operator terminal 80. Thus, for each of the speech recognition device 30, the intention interpretation device 40, and the speech synthesis device 50, a plurality of devices that perform the same type of processing are provided. FIG. 1 shows two devices of each type as a specific example, but three or more may be provided. The processing performed by each of the speech recognition device 30, the intention interpretation device 40, and the speech synthesis device 50 is processing for analyzing utterance content. The devices of the customer service business system 100 are connected via a network 300 so that they can exchange data. The customer service business system 100 performs customer service, by voice dialogue or text dialogue, for the user (customer) of a user terminal 200 that connects via the network 300 or a telephone network 400.

The user terminal 200 is communicably connected to the customer service business system 100. The user terminal 200 may be connected to the customer service business system 100 via the network 300 for data communication, or via the telephone network 400 for voice communication and data communication. When operated by a user who is a customer, the user terminal 200 performs voice communication or text communication with the customer service business system 100. The user terminal 200 is an information processing device such as a smartphone, a telephone, a personal computer, or a smart speaker.

Next, each device of the customer service business system 100 is described.
The response control device 10 functions as a communication interface between the user terminal 200 and the customer service business system 100. For example, the response control device 10 manages communication sessions with the user terminal 200. When the user terminal 200 requests text dialogue, the response control device 10 outputs information indicating the text transmitted from the user terminal 200 to the flow control device 20. When the user terminal 200 requests voice dialogue, the response control device 10 outputs information indicating the speech transmitted from the user terminal 200 to the flow control device 20. On receiving response information from the flow control device 20, the response control device 10 responds to the user terminal 200 based on that information. For text dialogue, it transmits a character string indicating the content of the response to the user terminal 200. For voice dialogue, it responds by voice by transmitting an audio signal to the user terminal 200. These functions of the response control device 10 may be implemented using, for example, IVR (Interactive Voice Response).

The flow control device 20 performs handling processing according to the content of the user's utterance transmitted from the user terminal 200. FIG. 2 is a diagram showing a specific example of the functional blocks of the flow control device 20. The flow control device 20 is configured using an information processing device such as a personal computer or a server device. The flow control device 20 includes a communication unit 21, a call history storage unit 22, a flow information storage unit 23, a rule information storage unit 24, and a control unit 25.

The communication unit 21 is configured using a communication interface. The communication unit 21 performs data communication with other devices via the network 300.

The call history storage unit 22 is configured using a storage device such as a magnetic hard disk device or a semiconductor storage device. The call history storage unit 22 stores call history information. The call history information includes a call start time, a call end time, user identification information, and call content information. The call start time is the time at which the call started; for text dialogue, it is the time at which the text dialogue started. The call end time is the time at which the call ended; for text dialogue, it is the time at which the text dialogue ended. The user identification information indicates the user who took part in the dialogue. It may be, for example, identification information assigned to the user in advance (a user ID), the caller number (telephone number) assigned to the user terminal 200, or other information. The call content information includes what the customer said and what the customer service business system 100 said to the customer during the call between them. When a voice dialogue took place, it may be a recording of the speech. When a text dialogue took place, it may be data in which the character strings are recorded. When a voice dialogue took place, both the speech recording and character string data indicating the speech recognition result may be recorded.

The flow information storage unit 23 is configured using a storage device such as a magnetic hard disk device or a semiconductor storage device. The flow information storage unit 23 stores flow information. FIG. 3 is a diagram showing a specific example of the flow information table. The flow information table is flow information organized in table form, and it stores a plurality of flow information records 231. Each flow information record 231 has a message ID and business content information. The message ID is identification information indicating a message to be delivered to the customer. The business content information indicates the business content to be executed when the message with the message ID recorded in the same flow information record 231 has been delivered to the customer. For example, when the message ID indicates a message conveying that a request has been received (for example, "We have received your request. Thank you."), the business content information defines that order processing is to be performed based on the request details accumulated from the speech up to that point. The business content information may be defined, for example, as information indicating an RPA (Robotic Process Automation) scenario.
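The flow information table described above can be pictured as a simple lookup from message ID to business content. The following is a minimal sketch only: the record class, the message IDs, and the scenario name `place_order` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FlowInfoRecord:
    """One flow information record 231: a message ID plus business content."""
    message_id: str
    business_content: str  # e.g. an RPA scenario name; empty if nothing to run


# Hypothetical table contents, for illustration only.
FLOW_TABLE = (
    FlowInfoRecord("M_GREETING", ""),
    FlowInfoRecord("M_REQUEST_ACCEPTED", "place_order"),
)


def business_content_for(
    message_id: str, table: Sequence[FlowInfoRecord] = FLOW_TABLE
) -> Optional[str]:
    """Return the business content tied to a delivered message, if any."""
    for record in table:
        if record.message_id == message_id and record.business_content:
            return record.business_content
    return None
```

In this sketch, delivering the "request accepted" message yields a scenario name to execute, while messages with no associated business content yield `None`.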

The rule information storage unit 24 is configured using a storage device such as a magnetic hard disk device or a semiconductor storage device. The rule information storage unit 24 stores rule information. FIG. 4 is a diagram showing a specific example of the rule information table. The rule information table is rule information organized in table form, and it stores a plurality of rule information records 241. Each rule information record 241 has a condition, an execution timing, and a device ID. The condition is a condition concerning the processing being carried out with the customer. The execution timing indicates when, once the condition is satisfied, the processing by the device with the device ID is executed. The device ID is identification information indicating the device whose processing result is adopted. The device ID is, for example, identification information that uniquely identifies a device included in the customer service business system 100. Device IDs include, for example, identification information indicating the first speech recognition device 30-1, the second speech recognition device 30-2, the first intention interpretation device 40-1, the second intention interpretation device 40-2, the first speech synthesis device 50-1, and the second speech synthesis device 50-2.

When the condition in a rule information record 241 is satisfied, the processing indicated by its execution timing is executed by the device indicated by its device ID. For example, when message ID "A1" has been output to the customer, control is performed so that the next processing is executed by the device with device ID "20d1". When the execution timing refers to processing that takes place after the condition has been found to be satisfied, as when the execution timing is "next", the executing entity may be limited to the device indicated by the device ID.

For example, when the speech of the customer's answer contains the character string "名前" ("name"), that speech is processed by every device capable of processing it, and the result from device ID "30a2" is then adopted. When the processing must be executed once before it can be determined whether the condition is satisfied, as when the execution timing is "current", several (or all) of the candidate devices may be selected as executing entities, the condition may be evaluated afterwards, and only the result from the device ID corresponding to the satisfied condition may be adopted.
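The two execution timings can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of the specification: the record class, the context dictionary keys, the default device ID "30a1", and the choice to treat a "current" condition as satisfied when any candidate's output matches are all assumptions. The device IDs "20d1" and "30a2" and message ID "A1" are taken from the examples in the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass(frozen=True)
class RuleInfoRecord:
    """One rule information record 241: condition, execution timing, device ID."""
    condition: Callable[[dict], bool]
    execution_timing: str  # "next" or "current"
    device_id: str


RULES = (
    # After message "A1" is output, the next processing goes to device "20d1".
    RuleInfoRecord(lambda ctx: ctx.get("last_message_id") == "A1", "next", "20d1"),
    # If the answer contains "名前", adopt the result of device "30a2".
    RuleInfoRecord(lambda ctx: "名前" in ctx.get("recognized_text", ""), "current", "30a2"),
)


def select_for_next(ctx: dict, default_device_id: str,
                    rules: Sequence[RuleInfoRecord] = RULES) -> str:
    """Timing "next": the condition is known beforehand, so pick one device."""
    for rule in rules:
        if rule.execution_timing == "next" and rule.condition(ctx):
            return rule.device_id
    return default_device_id


def adopt_current_result(results: Dict[str, str], default_device_id: str,
                         rules: Sequence[RuleInfoRecord] = RULES) -> str:
    """Timing "current": every candidate has already run; adopt one result.

    `results` maps device ID to that device's output text.
    """
    for rule in rules:
        if rule.execution_timing != "current" or rule.device_id not in results:
            continue
        # Assumption: the condition counts as met if any candidate output matches.
        if any(rule.condition({"recognized_text": text}) for text in results.values()):
            return results[rule.device_id]
    return results[default_device_id]
```

The split into two functions mirrors the distinction in the text: a "next" rule narrows the executing entity in advance, whereas a "current" rule runs the candidates first and filters their results afterwards.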

The control unit 25 is configured using a processor such as a CPU (Central Processing Unit) and a memory. By executing a predetermined program, the control unit 25 functions as a session control unit 251, a history control unit 252, a speech recognition control unit 253, an intention interpretation control unit 254, a speech synthesis control unit 255, and a business control unit 256.

The session control unit 251 controls the communication session between the user terminal 200 and the customer service business system 100. For example, the session control unit 251 determines whether an escalation condition is satisfied. An escalation condition is a condition indicating that it may be difficult to continue with the customer handling currently in progress. For example, the escalation condition may be that the speech received from the user terminal 200 contains a predetermined character string indicating that the customer may be upset. Escalation means transferring the customer handling to another handler or to another customer-handling function. In this embodiment, as a specific example of escalation, customer handling is transferred to a human operator. When the business control unit 256 determines that escalation is to be performed, the session control unit 251 changes the connection destination of the voice dialogue from the flow control device 20 to the operator terminal 80.
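As a rough illustration of the escalation condition, the check below tests recognized text against a list of trigger phrases and returns the connection destination. The phrases, the function names, and the destination labels are hypothetical; the specification only states that a predetermined character string is used.

```python
# Hypothetical trigger phrases suggesting the customer may be upset.
ESCALATION_PHRASES = ("speak to a manager", "this is unacceptable")


def should_escalate(recognized_text: str, phrases=ESCALATION_PHRASES) -> bool:
    """Return True if the text contains any predetermined character string."""
    lowered = recognized_text.lower()
    return any(phrase in lowered for phrase in phrases)


def connection_target(recognized_text: str) -> str:
    """Pick the destination of the voice dialogue for the current utterance."""
    if should_escalate(recognized_text):
        return "operator_terminal_80"
    return "flow_control_device_20"
```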

The history control unit 252 generates call history information about calls made between the user terminal 200 and the customer service business system 100. The history control unit 252 records the generated call history information in the call history storage unit 22.

When a voice dialogue is taking place between the customer service business system 100 and the user terminal 200, the speech recognition control unit 253 requests a speech recognition device 30 to perform speech recognition processing on the speech transmitted from the user terminal 200. On receiving the speech recognition result from the speech recognition device 30, the speech recognition control unit 253 outputs the result to the intention interpretation control unit 254. When a condition stored in the rule information storage unit 24 is satisfied and the device indicated by the device ID associated with that condition is a speech recognition device, the speech recognition control unit 253 selects, at the timing indicated by the execution timing, the speech recognition device 30 indicated by the device ID as the executing entity and obtains the result of the speech recognition processing performed by that device.

For example, the condition that a message ID indicating a predetermined message likely to elicit the customer's name has been output may be associated with the device ID of a speech recognition device 30 whose accuracy in recognizing names is relatively high. A predetermined message likely to elicit the customer's name is, for example, a message asking for the name (for example, "Please tell me your name.").

For example, the condition that an answer likely to contain the customer's name has been obtained may be associated with the device ID of a speech recognition device 30 whose accuracy in recognizing names is relatively high. An answer likely to contain the customer's name is, for example, an answer that contains the character string "名前" ("name").

For example, the condition that a message ID indicating a predetermined message likely to elicit the customer's address has been output may be associated with the device ID of a speech recognition device 30 whose accuracy in recognizing addresses is relatively high. A predetermined message likely to elicit the customer's address is, for example, a message asking for the address (for example, "Please tell me your address.").

For example, the condition that an answer likely to contain the customer's address has been obtained may be associated with the device ID of a speech recognition device 30 whose accuracy in recognizing addresses is relatively high. An answer likely to contain the customer's address is, for example, an answer that contains either the character string "住所" ("address") or the name of one of the 47 prefectures.

When a voice dialogue is taking place between the customer service business system 100 and the user terminal 200, the intention interpretation control unit 254 requests an intention interpretation device 40 to perform intention interpretation processing on the character string obtained as the result of the speech recognition processing. On receiving the intention interpretation result from the intention interpretation device 40, the intention interpretation control unit 254 outputs the character string of the result to the session control unit 251. When a condition stored in the rule information storage unit 24 is satisfied and the device indicated by the device ID associated with that condition is an intention interpretation device, the intention interpretation control unit 254 selects, at the timing indicated by the execution timing, the intention interpretation device 40 indicated by the device ID as the executing entity and obtains the result of the intention interpretation processing performed by that device.

For example, the condition that speech expressing a question is likely to be obtained from the customer may be associated with the device ID of an intention interpretation device 40 whose accuracy in interpreting the intent of questions is relatively high. A situation in which speech expressing a question is likely to be obtained is, for example, one in which the customer has selected the option indicating a question from among options indicating the purpose of the inquiry.

For example, the condition that speech expressing a complaint is likely to be obtained from the customer may be associated with the device ID of an intention interpretation device 40 whose accuracy in interpreting the intent of complaints is relatively high. A situation in which speech expressing a complaint is likely to be obtained is, for example, one in which the customer has selected the option indicating a complaint from among options indicating the purpose of the inquiry.

When the data (response sentence information) obtained from the intention interpretation device 40 contains a person's name, the intention interpretation control unit 254 compares the person's name obtained in advance with the person's name in the intention interpretation result. If the two are identical, the intention interpretation control unit 254 proceeds using the name in the intention interpretation result. If they are not identical but satisfy a predetermined condition indicating that they are very likely the same, the intention interpretation control unit 254 proceeds with the name obtained in advance as the correct name. The predetermined condition may be, for example, that the two names differ in only one character. It may instead be that the names differ in only one character and the differing characters have the same vowel. It may also be that the names differ in only one character and the vowels of the differing characters are in a predetermined relationship in which recognition errors commonly occur (for example, "あ" ("a") and "え" ("e")).
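The three alternative "probably the same name" conditions can be sketched for names written in hiragana. This is a sketch under several assumptions not found in the specification: names are compared as equal-length hiragana strings, the vowel table covers only basic and voiced kana, the function and mode names are hypothetical, and the confusable set holds only the "a"/"e" pair given as an example in the text.

```python
from typing import Optional

# Kana rows in a-i-u-e-o order, used to look up the vowel of each character.
_ROWS = ["あいうえお", "かきくけこ", "さしすせそ", "たちつてと", "なにぬねの",
         "はひふへほ", "まみむめも", "らりるれろ", "がぎぐげご", "ざじずぜぞ",
         "だぢづでど", "ばびぶべぼ", "ぱぴぷぺぽ"]
VOWEL = {ch: v for row in _ROWS for ch, v in zip(row, "aiueo")}
VOWEL.update({"や": "a", "ゆ": "u", "よ": "o", "わ": "a", "を": "o"})

# Vowel pairs assumed to be easily confused; only the pair from the text.
CONFUSABLE_VOWELS = {frozenset(("a", "e"))}


def resolve_name(registered: str, recognized: str,
                 mode: str = "confusable") -> Optional[str]:
    """Return the name to proceed with, or None if the names do not match.

    mode: "one_char"   - exactly one character differs
          "same_vowel" - one character differs and the vowels are equal
          "confusable" - one character differs and the vowels are a confusable pair
    """
    if registered == recognized:
        return recognized
    if len(registered) != len(recognized):
        return None
    diffs = [(a, b) for a, b in zip(registered, recognized) if a != b]
    if len(diffs) != 1:
        return None
    if mode == "one_char":
        return registered
    va, vb = VOWEL.get(diffs[0][0]), VOWEL.get(diffs[0][1])
    if va is None or vb is None:
        return None
    if mode == "same_vowel":
        return registered if va == vb else None
    if mode == "confusable":
        return registered if frozenset((va, vb)) in CONFUSABLE_VOWELS else None
    raise ValueError(f"unknown mode: {mode}")
```

When a near match is accepted, the registered name is returned, which corresponds to treating the name obtained in advance as the correct one.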

The person's name obtained in advance is, for example, a name stored in advance in association with a user ID that the customer has provided by voice or by operating the user terminal 200. If a user ID has been obtained by voice or by operation of the user terminal 200 before the above processing by the intention interpretation control unit 254, the customer name associated with that user ID may be retrieved from a customer database registered in the device itself or in another device (for example, the business system 70).

On receiving a speech synthesis instruction from the session control unit 251, the speech synthesis control unit 255 requests a speech synthesis device 50 to perform speech synthesis processing on the specified character string. On receiving an audio signal from the speech synthesis device 50, the speech synthesis control unit 255 outputs the audio signal to the session control unit 251. When a condition stored in the rule information storage unit 24 is satisfied and the device indicated by the device ID associated with that condition is a speech synthesis device 50, the speech synthesis control unit 255 selects, at the timing indicated by the execution timing, the speech synthesis device 50 indicated by the device ID as the executing entity and obtains the result of the speech synthesis processing performed by that device.

For example, the device ID of a speech synthesis device 50 that performs normal speech synthesis may be associated with the condition that a voice indicating a question is likely to be obtained from the customer. Such a situation arises, for example, when the customer selects the option indicating a question from among options indicating the purpose of the inquiry.

For example, the device ID of a speech synthesis device 50 that satisfies a predetermined condition suitable for complaints may be associated with the condition that a voice indicating a complaint is likely to be obtained from the customer. A speech synthesis device 50 that satisfies the predetermined condition suitable for complaints may be, for example, a speech synthesis device 50 created for the purpose of outputting apology speech, or a speech synthesis device 50 capable of outputting speech with a relatively low pitch and a relatively slow reading speed. A situation in which a voice indicating a complaint is likely to be obtained arises, for example, when the customer selects the option indicating a complaint from among options indicating the purpose of the inquiry.
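The rule-based selection of a speech synthesis device described above might look like the following minimal sketch. The record layout mirrors the condition, device ID, and execution timing mentioned in the description, but the class name `Rule`, the device IDs, the `purpose` key, and the timing label are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """One record of the rule information (illustrative layout)."""
    condition: Callable[[Dict[str, str]], bool]  # evaluated on the call context
    device_id: str                               # device to use when it holds
    timing: str                                  # hypothetical timing label


# Hypothetical rules: the purpose the customer selected decides the voice.
RULES: List[Rule] = [
    Rule(lambda ctx: ctx.get("purpose") == "complaint",
         "tts-apology", "on_synthesis"),
    Rule(lambda ctx: ctx.get("purpose") == "question",
         "tts-normal", "on_synthesis"),
]


def select_device(ctx: Dict[str, str], rules: List[Rule],
                  default_id: str = "tts-normal") -> str:
    """Return the device ID of the first rule whose condition is satisfied."""
    for rule in rules:
        if rule.condition(ctx):
            return rule.device_id
    return default_id  # no rule matched: fall back to the default device
```

Evaluating rules in order and falling back to a default keeps the behavior defined even when the customer has not yet selected a purpose.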

Upon receiving a business control instruction from the session control unit 251, the business control unit 256 determines the business content to be executed based on the message ID accompanying the instruction. The business content is determined, for example, based on the flow information records stored in the flow information storage unit 23. If there is business content to be executed, the business control unit 256 executes it. For example, the business control unit 256 instructs the business execution device 60 to execute the business in a format that the business execution device 60 can process. When multiple types of business execution devices 60 are provided, the format for each type of business execution device 60 may be stored in advance in the flow information storage unit 23. The business control unit 256 generates business instruction information in the format corresponding to the business execution device 60 to be instructed. The business execution device 60 to be instructed may be determined based on, for example, the telephone number dialed from the user terminal 200 or a push-button number selected afterward.

Returning to FIG. 1, the description of each device of the customer service business system 100 continues. The speech recognition device 30 is configured using an information processing device such as a personal computer or a server device. The speech recognition device 30 provides a speech recognition service to other devices. Upon receiving an instruction to perform speech recognition processing, the speech recognition device 30 executes the processing and returns character string information obtained as the recognition result.

The intention interpretation device 40 is configured using an information processing device such as a personal computer or a server device. The intention interpretation device 40 holds learning results for natural language analysis and for dialogue with customers, and executes intention interpretation processing using those learning results. Such learning results can be obtained by using machine learning techniques such as an SVM (Support Vector Machine) or deep learning. The intention interpretation device 40 may be implemented using AI (Artificial Intelligence). The learning results may be held as multiple types of scenarios. For example, learning results for dialogue about ordering, learning results for dialogue about reservation confirmation, and learning results for dialogue about sales may be held as separate learning results. Which learning result to use may be determined by the flow control device 20 based on, for example, the telephone number dialed from the user terminal 200 or a push-button number selected afterward. The intention interpretation device 40 determines the content of the voice dialogue by analyzing the recognition result of the customer's speech based on the learning results. If the intention interpretation device 40 cannot determine the content of the voice dialogue, it may transmit a response sentence that asks the customer again. If not all of the information that is predetermined to be acquired has been obtained, the intention interpretation device 40 may generate a response sentence that asks an additional question to elicit the missing information. The intention interpretation device 40 determines the response sentence to be provided to the customer based on the determined content of the voice dialogue, and returns information indicating the determined response sentence to the flow control device 20.
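The re-asking and follow-up behavior described above can be illustrated with a small sketch. It does not model the learned interpretation itself; it only shows how a response sentence could be chosen once a scenario and the already-collected items are known. The scenario name, the required items, and all response sentences are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical: items that must be collected for each scenario.
REQUIRED_ITEMS: Dict[str, List[str]] = {
    "order": ["product", "quantity"],
}

# Hypothetical follow-up questions, one per missing item.
FOLLOW_UP: Dict[str, str] = {
    "product": "Which product would you like to order?",
    "quantity": "How many would you like?",
}

RE_ASK = "I'm sorry, could you say that again?"


def next_response(scenario: Optional[str],
                  collected: Dict[str, str]) -> Optional[str]:
    """Return a re-ask or follow-up sentence, or None if nothing is missing."""
    if scenario is None or scenario not in REQUIRED_ITEMS:
        return RE_ASK  # the dialogue content could not be determined
    for item in REQUIRED_ITEMS[scenario]:
        if item not in collected:
            return FOLLOW_UP[item]  # ask for the first missing item
    return None  # everything needed has been collected
```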

The speech synthesis device 50 is configured using an information processing device such as a personal computer or a server device. The speech synthesis device 50 provides a speech synthesis service to other devices. Upon receiving an instruction to perform speech synthesis processing, the speech synthesis device 50 generates a speech signal representing the specified character string. The speech synthesis device 50 returns the speech signal obtained as a result of the speech synthesis processing (response message speech) to the flow control device 20.

The business execution device 60 is configured using an information processing device such as a personal computer or a server device. The business execution device 60 may be implemented as, for example, an RPA device. Upon receiving business instruction information from the flow control device 20, the business execution device 60 performs input operations on the business system 70 based on the content of the received business instruction information. Through the processing of the business execution device 60, various processes are executed in the existing business system 70. Taking product ordering as an example, the information necessary for the ordering process is input to the business system 70, and the processing up to execution of the order is carried out by the business execution device 60. The information necessary for the ordering process includes, for example, identification information of the product to be ordered (for example, product name, model number, or ID), the order quantity, and orderer information (customer information).

The business system 70 is a system introduced by a corporation such as a company, or by an individual, in order to carry out business. The business system 70 is used to provide services to customers who are users of the user terminals 200. Specific examples of the business system 70 include an ordering system for processing product orders, a reservation system for reserving lodging or stores, and a complaint response system for accepting complaints.

The operator terminal 80 is a terminal device used by an operator, a person who handles customers. The operator terminal 80 includes a user interface for text input/output and audio input/output. When escalation is executed by the flow control device 20, a communication path is formed between the user terminal 200 and the operator terminal 80. Thereafter, the operator uses the operator terminal 80 to communicate with the customer using the user terminal 200. Upon receiving a business request from the customer, the operator operates the operator terminal 80 to instruct the business execution device 60 to execute the business.

FIGS. 5 and 6 are sequence charts showing a specific example of the processing executed by the customer service business system 100 when a voice dialogue takes place. First, a communication path for voice dialogue is formed between the user terminal 200 used by the customer and the customer service business system 100. When the customer speaks into the user terminal 200, the user terminal 200 converts the speech into an electrical signal and transmits it to the response control device 10 (step S100). The response control device 10 receives the electrical speech signal and records the received speech (step S101). The response control device 10 transmits the electrical speech signal to the flow control device 20 (step S102).

Upon receiving the electrical speech signal, the session control unit 251 of the flow control device 20 outputs it to the history control unit 252 and the speech recognition control unit 253. The history control unit 252 temporarily records the content of the received speech. The speech recognition control unit 253 transmits the received speech signal to the speech recognition device 30 (step S103). At this time, if a condition registered in the rule information storage unit 24 is satisfied, the speech recognition control unit 253 selects, according to the timing, the speech recognition device 30 to which the speech signal is sent. The speech recognition device 30 executes speech recognition processing (step S104). The speech recognition device 30 transmits a speech recognition result indicating the character string obtained by the speech recognition processing to the flow control device 20 (step S105).

Upon receiving the speech recognition result, the speech recognition control unit 253 of the flow control device 20 outputs it to the intention interpretation control unit 254. The intention interpretation control unit 254 transmits the received speech recognition result to the intention interpretation device 40, thereby requesting it to determine a response sentence based on the speech recognition result (step S106). At this time, if a condition registered in the rule information storage unit 24 is satisfied, the intention interpretation control unit 254 selects, according to the timing, the intention interpretation device 40 to which the request is sent. Upon receiving the speech recognition result, the intention interpretation device 40 analyzes it to determine the content of the voice dialogue and determines a response sentence (step S107). For example, the intention interpretation device 40 performs intention interpretation processing on the speech recognition result to estimate the question sentence closest to the utterance content. The intention interpretation device 40 then determines, from an FAQ list registered in advance in a database, the response sentence appropriate to the estimated question. The intention interpretation device 40 transmits information indicating the determined response sentence to the flow control device 20 (step S108).
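The FAQ lookup in step S107 can be pictured with the sketch below. The description does not specify how the closest question is estimated; plain string similarity from Python's standard `difflib` module is substituted here purely for illustration, in place of the learned intention interpretation. The FAQ entries and the function name `answer_for` are hypothetical.

```python
import difflib
from typing import Dict, Optional

# Hypothetical FAQ list: question sentence -> response sentence.
FAQ: Dict[str, str] = {
    "what are your opening hours": "We are open from 9:00 to 18:00.",
    "how do i cancel my order": "You can cancel an order from your order history.",
}


def answer_for(recognized_text: str, faq: Dict[str, str],
               cutoff: float = 0.6) -> Optional[str]:
    """Pick the response for the FAQ question closest to the recognized text.

    Returns None when no question is similar enough, in which case the
    caller could send a sentence that asks the customer again.
    """
    matches = difflib.get_close_matches(
        recognized_text.lower(), list(faq), n=1, cutoff=cutoff)
    return faq[matches[0]] if matches else None
```

A recognition result with a small error, such as "what are you opening hours", still maps to the opening-hours entry, while an unrelated string yields `None`.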

Upon receiving the information indicating the response sentence, the intention interpretation control unit 254 of the flow control device 20 requests the speech synthesis device 50 to synthesize speech representing the received response sentence (step S109). At this time, if the response sentence includes a person's name, the intention interpretation control unit 254 determines whether that name is identical to the name obtained in advance and, if not, whether the predetermined condition indicating that the two are highly likely to be the same is satisfied. In addition, if a condition registered in the rule information storage unit 24 is satisfied, the speech synthesis control unit 255 selects, according to the timing, the speech synthesis device 50 to which the speech synthesis request is sent. The speech synthesis device 50 generates speech data of the response message by performing speech synthesis processing on the requested response sentence (step S110). The speech synthesis device 50 transmits the generated speech data of the response message to the flow control device 20 (step S111).

Upon receiving the speech data of the response message, the session control unit 251 of the flow control device 20 outputs the received speech data to the history control unit 252. The history control unit 252 temporarily records the speech content of the received response message. After the communication path with the user terminal 200 is disconnected, the history control unit 252 records the temporarily recorded speech content in the call history storage unit 22 as call history information. The session control unit 251 transmits the speech of the response message to the response control device 10 (step S112). Upon receiving the speech from the flow control device 20, the response control device 10 records the received speech (step S113). The response control device 10 transmits the recorded speech to the user terminal 200 (step S114).

The session control unit 251 determines, at a predetermined timing, whether an escalation condition is satisfied (step S115). For example, the condition may be that the speech received from the user terminal 200 contains a predetermined character string indicating that the customer may be upset. If the escalation condition is not satisfied (step S115: NO), escalation is not executed. If the escalation condition is satisfied (step S115: YES), the session control unit 251 instructs the response control device 10 to execute escalation. In response to this instruction, the response control device 10 executes escalation by changing the connection destination of the user terminal 200 to the operator terminal 80 (step S116). At this time, the session control unit 251 may transmit information about the call being escalated to the operator terminal 80.
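The example escalation condition in step S115 amounts to checking the recognized text for predetermined character strings. A minimal sketch follows, with hypothetical phrases and a hypothetical function name:

```python
from typing import Iterable

# Hypothetical strings suggesting that the customer may be upset.
ESCALATION_PHRASES = ("speak to a person", "this is unacceptable")


def should_escalate(recognized_text: str,
                    phrases: Iterable[str] = ESCALATION_PHRASES) -> bool:
    """Return True if the text contains any predetermined string."""
    lowered = recognized_text.lower()
    return any(phrase in lowered for phrase in phrases)
```

When the function returns `True`, the caller would correspond to the step S115: YES branch and request the switch to the operator terminal.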

The business control unit 256 of the flow control device 20 determines the business content to be executed based on the response sentence information notified from the intention interpretation device 40 (step S201). For example, when the response sentence information includes a message ID indicating the content or type of the response message, the business control unit 256 may determine the business content by referring to the business content information recorded in the flow information storage unit 23 in association with that message ID. If there is business to be executed, the business control unit 256 generates business instruction information (step S202). The business instruction information is generated in a format that can be processed by the business execution device 60 to be instructed. The business control unit 256 transmits the generated business instruction information to the business execution device 60 (step S203). Upon receiving the business instruction information, the business execution device 60 issues a business instruction to the business system 70 in accordance with it (step S204). The business system 70 executes the business according to the instructed content (step S205).
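Steps S201 to S203 can be sketched as a lookup from message ID to business content, followed by formatting for the target device. Everything concrete here is assumed for illustration: the message IDs, the device IDs, the two formats (JSON and a semicolon-separated key=value string), and the function name `build_instruction`.

```python
import json
from typing import Any, Dict, Optional, Tuple

# Hypothetical flow information: message ID -> business content and target device.
FLOW_INFO: Dict[str, Dict[str, str]] = {
    "MSG_ORDER_CONFIRMED": {"task": "place_order", "device": "rpa-a"},
    "MSG_RESERVATION_CONFIRMED": {"task": "make_reservation", "device": "rpa-b"},
}

# Hypothetical per-device formats stored alongside the flow information.
DEVICE_FORMATS: Dict[str, str] = {"rpa-a": "json", "rpa-b": "kv"}


def build_instruction(message_id: str,
                      params: Dict[str, Any]) -> Optional[Tuple[str, str]]:
    """Return (device ID, instruction body), or None if there is no business."""
    record = FLOW_INFO.get(message_id)
    if record is None:
        return None  # S201: no business is associated with this message
    payload = {"task": record["task"], **params}
    device = record["device"]
    # S202: generate the instruction in the format the device can process.
    if DEVICE_FORMATS[device] == "json":
        body = json.dumps(payload, sort_keys=True)
    else:
        body = ";".join(f"{key}={value}" for key, value in sorted(payload.items()))
    return device, body  # S203: the caller sends body to the device
```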

Although FIGS. 5 and 6 describe the flow of processing on the premise of a voice dialogue, when a text dialogue takes place, steps S103 to S105 and steps S109 to S111 are omitted, and in steps S101 and S113 the recording is made as a character string rather than as speech.

In the customer service business system 100 configured in this way, the device best suited to executing a process is selected according to the conditions registered in the rule information storage unit 24, and the result produced by the selected device is used in subsequent processing. This makes it possible to execute more accurate processing in customer service operations.

In addition, in the customer service business system 100 configured in this way, even if the person's name obtained in advance differs from the person's name in the intention interpretation result, processing proceeds when the predetermined condition is satisfied. This prevents subsequent processing from stopping (for example, from being terminated as an error) because of an error in the speech recognition processing or the intention interpretation processing.

Further, in the customer service business system 100, a response sentence to a spoken or written statement from the customer is determined automatically based on experiential knowledge obtained in advance through machine learning. In a voice dialogue the response is converted into speech and sent to the customer, and in a text dialogue it is sent to the customer as text. This makes it possible to carry out, without human intervention, part of the customer service work that has so far been done manually. As a result, the manpower required for customer service operations can be reduced.

Further, the content of so-called follow-up business is determined automatically based on the dialogue with the customer. Business instructions are then given to business execution devices such as RPA devices in a format that each device can interpret. Each business execution device executes the business in the business system according to the instruction. This makes it possible to automatically execute not only the response to the customer but also the follow-up business. As a result, the manpower required for customer service operations can be reduced further.

Further, the business systems and business execution devices already held by the company or individual that requires customer service operations can be used as they are. This makes it possible to reduce the cost and time required to introduce the customer service business system 100.

(Modifications)
The system configuration of the customer service business system 100 need not be limited to that shown in FIG. 1. Specifically, the following variations are possible.
Some of the devices included in the customer service business system 100 may be communicably connected to the other devices via a network different from the network 300. For example, the business system 70 may be communicably connected to the other devices via a network using a dedicated line.
Some of the devices included in the customer service business system 100 may be configured integrally with other devices. For example, the speech recognition device 30 and the intention interpretation device 40 may be configured as a single device.

Although an embodiment of the present invention has been described above in detail with reference to the drawings, the specific configuration is not limited to this embodiment and also includes designs and the like that do not depart from the gist of the present invention.

100... customer service business system, 10... response control device, 20... flow control device, 30... speech recognition device, 40... intention interpretation device, 50... speech synthesis device, 60... business execution device, 70... business system, 80... operator terminal, 200... user terminal, 21... communication unit, 22... call history storage unit, 23... flow information storage unit, 25... control unit, 251... session control unit, 252... history control unit, 253... speech recognition control unit, 254... intention interpretation control unit, 255... speech synthesis control unit, 256... business control unit

Claims

1. A control device comprising:
a control unit that determines, based on the content of a customer's utterance, a response sentence to be presented to the customer, determines business content corresponding to the customer's request, and instructs a business execution device to execute the business content,
wherein the control unit selects, from among a plurality of devices capable of the same type of processing, the device that is to execute the processing of analyzing the utterance content, and
wherein, in a situation where a voice indicating a complaint is likely to be obtained from the customer, the control unit selects, as the device that is to execute the processing, an intention interpretation device having relatively high accuracy in interpreting the intention of a complaint, selects a speech synthesis device that satisfies a predetermined condition suitable for complaints, and acquires the execution result of the speech synthesis processing.

2. The control device, wherein:
the utterance content is expressed as speech,
the control unit requests a speech recognition device that performs speech recognition processing to perform speech recognition processing on the speech representing the utterance content, and determines the response sentence and the business content based on the recognition result obtained from the speech recognition device, and
when a predetermined condition is satisfied, the control unit selects, from among a plurality of speech recognition devices, a device that satisfies that condition as the device that is to execute the speech recognition processing.

3. The control device, wherein:
the control unit requests a speech synthesis device that performs speech synthesis processing to generate speech representing the response sentence, and responds to the customer using the speech obtained from the speech synthesis device, and
when a predetermined condition is satisfied, the control unit selects, from among a plurality of speech synthesis devices, a device that satisfies that condition as the device that is to execute the speech synthesis processing.

4. A control method comprising:
a first step of determining, based on the content of a customer's utterance, a response sentence to be presented to the customer;
a second step of determining business content corresponding to the customer's request;
a third step of instructing a business execution device to execute the business content;
a fourth step of selecting, from among a plurality of devices capable of the same type of processing, a device that is to execute any of the processing in the first step, the second step, and the third step; and
a fifth step of, in a situation where a voice indicating a complaint is likely to be obtained from the customer, selecting, as the device that is to execute the processing, an intention interpretation device having relatively high accuracy in interpreting the intention of a complaint, selecting a speech synthesis device that satisfies a predetermined condition suitable for complaints, and acquiring the execution result of the speech synthesis processing.

5. A computer program for causing a computer to function as the control device according to claim 1.