JP2019139280A

JP2019139280A - Text analyzer, text analysis method and text analysis program

Info

Publication number: JP2019139280A
Application number: JP2018018897A
Authority: JP
Inventors: 晃浩新藤; Akihiro Shindo
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2018-02-06
Filing date: 2018-02-06
Publication date: 2019-08-22
Anticipated expiration: 2038-02-06
Also published as: JP6873935B2

Abstract

To improve the accuracy of automatic summarization of text information converted from voice information of a telephone response. SOLUTION: A text analysis device comprises: a text acquisition unit that acquires response text information obtained by converting voice information of a telephone response between a user and an operator into text; an identification unit 433 that identifies, in the response text information, a character string indicating the start or end of the exchange in which the operator checks the user's identity verification information; and a specifying unit 434 that specifies, in the response text information, a front text portion preceding the character string identified by the identification unit 433 or a rear text portion following that character string. SELECTED DRAWING: Figure 2

Description

The present invention relates to a text analysis device and a text analysis method relating to voice information of telephone responses, and to a text analysis program for causing a computer to function as the text analysis device.

In a call center, after finishing a telephone response such as handling a customer inquiry, the operator records a history of the response. There is demand to reduce the burden that this recording work places on operators. Recordings of telephone responses are converted into text information by speech recognition technology (see, for example, Patent Document 1).

Patent Document 1: JP 2001-211245 A

To make the content of telephone responses easier to list and easier for a third party to review, the transcribed content of a telephone response is automatically summarized. Let the identity verification portion be the text portion of the transcript that corresponds to the exchange in which the operator checks whether the customer is the service contract holder. In the identity verification portion, words related to identity verification information, such as the user's name, appear frequently. Many automatic summarization methods use algorithms that preferentially pick up words appearing frequently in the text. Consequently, if the identity verification portion is also subject to summarization, content that is not the customer's inquiry itself, such as words related to identity verification information, is picked up preferentially, and the accuracy of the summary falls.

The present invention has been made in view of the above circumstances, and an object thereof is to provide a text analysis device, a text analysis method, and a text analysis program capable of improving the accuracy of automatic summarization of text information converted from voice information of telephone responses.

A text analysis device according to a first aspect of the present invention includes: a text acquisition unit that acquires response text information obtained by converting voice information of a telephone response between a user and an operator into text; an identification unit that identifies, in the response text information, a character string indicating the start or end of the exchange in which the operator checks the user's identity verification information; and a specifying unit that specifies, in the response text information, a front text portion preceding the character string identified by the identification unit or a rear text portion following the character string identified by the identification unit.

The text analysis device may further include a summarization unit that inputs the front text portion or the rear text portion to a machine learning model and outputs the summary data produced by the machine learning model.

The identification unit may identify a start character string indicating the start of the exchange for checking the identity verification information and an end character string indicating the end of that exchange, and may identify the end character string from the text portion following the identified start character string.

The specifying unit may specify a front text portion preceding the start character string identified by the identification unit and a rear text portion following the end character string identified by the identification unit, and the summarization unit may input the rear text portion and the front text portion to the machine learning model and output the summary data of the rear text portion in association with the summary data of the front text portion. When the identification unit cannot identify the start character string, the specifying unit may specify the text portion from the beginning of the response text information up to a predetermined ratio as the front text portion.

When the identification unit cannot identify the start character string, the specifying unit may specify, as the front text portion, the text portion up to the point at which the number of speaker switches counted from the start of the telephone response, as a ratio of the number of speaker switches in the entire telephone response, reaches a predetermined value. When the identification unit cannot identify the start character string, the specifying unit may specify, as the front text portion, the text portion from the start of the telephone response until a predetermined number of sentences is reached. When the identification unit cannot identify the start character string, the specifying unit may specify, as the front text portion, the text portion from the start of the telephone response until a predetermined time is reached.
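The speaker-switch fallback can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `front_by_switch_ratio`, the `(speaker, text)` tuple layout, the default threshold, and the choice to stop just before the utterance at which the ratio reaches the threshold are all hypothetical.

```python
def front_by_switch_ratio(utterances, threshold=0.3):
    """Return the leading utterances up to the point where the share of
    speaker switches, counted from the start of the call, reaches threshold.

    utterances: list of (speaker, text) tuples in chronological order
    (hypothetical layout, not taken from the source).
    """
    if not utterances:
        return []
    speakers = [speaker for speaker, _ in utterances]
    # Total number of speaker switches over the whole call.
    total = sum(a != b for a, b in zip(speakers, speakers[1:]))
    if total == 0:
        return list(utterances)  # only one speaker: nothing to split on
    front = [utterances[0]]
    switches = 0
    for i in range(1, len(utterances)):
        if speakers[i] != speakers[i - 1]:
            switches += 1
        # Assumption: stop just before the utterance at which the ratio is reached.
        if switches / total >= threshold:
            break
        front.append(utterances[i])
    return front


call = [("operator", "a"), ("user", "b"), ("operator", "c"),
        ("user", "d"), ("operator", "e")]
print(front_by_switch_ratio(call, threshold=0.5))
```

The sentence-count and elapsed-time variants would follow the same pattern, replacing the switch counter with a sentence counter or a timestamp comparison.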

The text analysis device may further include an extraction unit that extracts user information from the text portion sandwiched between the start character string and the end character string, and the summarization unit may output the summary data in association with the user information extracted by the extraction unit.

When the identification unit cannot identify the end character string, the summarization unit may output the summary data in association with an identifier indicating that the end character string could not be identified. The text analysis device may further include a conversion unit that converts the voice information of the telephone response between the user and the operator into the response text information, and the identification unit may identify the end character string based on the part of the response text information that the conversion unit converted from the voice uttered by the operator.

The text analysis device may further include a storage unit that stores feature information of the operator's voice, and the conversion unit may convert the voice information into the response text information based on the feature information of the operator's voice stored in the storage unit.

A text analysis method according to a second aspect of the present invention includes the steps of: acquiring response text information obtained by converting voice information of a telephone response between a user and an operator into text; identifying, in the response text information, a character string indicating the start or end of the exchange in which the operator checks the user's identity verification information; and specifying, in the response text information, a front text portion preceding the identified character string or a rear text portion following the identified character string.

A text analysis program according to a third aspect of the present invention causes a computer to function as: a text acquisition unit that acquires response text information obtained by converting voice information of a telephone response between a user and an operator into text; an identification unit that identifies, in the response text information, a character string indicating the start or end of the exchange in which the operator checks the user's identity verification information; and a specifying unit that specifies, in the response text information, a front text portion preceding the character string identified by the identification unit or a rear text portion following the character string identified by the identification unit.

According to the present invention, the accuracy of automatic summarization of text information converted from voice information of telephone responses can be improved.

A diagram for explaining the overview of the text analysis system S according to an embodiment of the present invention.
A diagram showing the configuration of the text analysis device.
A diagram showing response text information.
A diagram showing an example of the summary data output by the summarization unit.
A flowchart showing the operation of the text analysis device.

[Overview of Text Analysis System S]
FIG. 1 is a diagram for explaining the overview of the text analysis system S according to an embodiment of the present invention. The text analysis system S automatically summarizes the content of user inquiries in order to support the work of creating user inquiry records in a call center.

The text analysis system S includes a communication terminal 100, a telephone 200, a recording device 300, and a text analysis device 400. The communication terminal 100 is, for example, the user's mobile phone, and performs voice communication with the call center telephone 200 via a network N.

The telephone 200 is an operator terminal for calls installed in the call center. The recording device 300 is connected to the telephone line of the telephone 200 and, while voice communication takes place between the communication terminal 100 and the telephone 200, generates voice information in which the voices of the user and the operator are recorded. The recording device 300 stores the user's voice information, acquired from the telephone output of the telephone 200, in association with identification information of the communication terminal 100, and stores the operator's voice information, acquired from the microphone input of the telephone 200, in association with identification information assigned to the telephone 200. The identification information of the communication terminal 100 is, for example, a mobile phone number. The identification information of the telephone 200 is, for example, the extension number of the telephone 200 within the call center.

The text analysis device 400 acquires the voice information of the user and the operator from the recording device 300, together with the identification information assigned to that voice information, and converts the acquired voice information into response text information. The text analysis device 400 generates summary data of the converted response text information and transmits the generated summary data to a management device (not shown).

The text analysis device 400 determines that words mentioned repeatedly in the response text information are important information and includes them in the summary data. In a conventional text analysis device, words related to identity verification information, such as the user's name, are mentioned repeatedly in the response text information, so they were judged to be important and included in the summary data. As a result, information other than identity verification information was less likely to be judged important, and the accuracy of the summary was low.

In contrast, the text analysis device 400 specifies the identity verification portion, which is the response text information in which the operator verifies the user's identity, and excludes this portion from the summary data. By making words related to identity verification information less likely to appear in the summary data, the text analysis device 400 can improve the accuracy of the summary.

[Configuration of Text Analysis Device 400]
FIG. 2 is a diagram showing the configuration of the text analysis device 400. The text analysis device 400 includes a communication unit 41, a storage unit 42, and a control unit 43. The communication unit 41 is a communication interface for communicating with the recording device 300 and the management device (not shown). The storage unit 42 is a storage medium such as a ROM (Read Only Memory) and a RAM (Random Access Memory), and stores the program executed by the control unit 43. The control unit 43 is, for example, a CPU (Central Processing Unit). By executing the program stored in the storage unit 42, the control unit 43 functions as an acquisition unit 431, a conversion unit 432, an identification unit 433, a specifying unit 434, an extraction unit 435, a summarization unit 436, and a machine learning model 437.

The acquisition unit 431 acquires the voice information of the telephone response between the user and the operator from the recording device 300 via the communication unit 41, and outputs the acquired voice information to the conversion unit 432. The acquisition unit 431 also functions as a text acquisition unit that acquires, from the conversion unit 432, the response text information into which that voice information has been converted. The acquisition unit 431 outputs the acquired response text information to the identification unit 433.

The conversion unit 432 converts the voice information acquired by the acquisition unit 431 into response text information using speech recognition technology. In doing so, the conversion unit 432 converts the user's voice information and the operator's voice information into response text information separately. The conversion unit 432 then generates response text information in which the text corresponding to the user's voice information and the text corresponding to the operator's voice information are combined in chronological order.
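As a rough illustration of combining the two separately converted transcripts in chronological order, the sketch below merges per-speaker segments by start time. The source only says the texts are arranged in time series; the `(start_seconds, text)` segment layout, the timestamps, and the function name `merge_transcripts` are assumptions made for this example.

```python
def merge_transcripts(user_segments, operator_segments):
    """Merge per-speaker segments into one chronologically ordered list.

    Each input segment is (start_seconds, text); each output item is
    (start_seconds, speaker, text). The layout is hypothetical.
    """
    tagged = [(t, "user", text) for t, text in user_segments]
    tagged += [(t, "operator", text) for t, text in operator_segments]
    # Sort on start time only, so the speaker label never affects ordering.
    return sorted(tagged, key=lambda seg: seg[0])


operator = [(0.0, "○○でございます。")]
user = [(2.5, "△△の件で聞きたいことがあるんですけど。")]
for start, speaker, text in merge_transcripts(user, operator):
    print(f"{start:6.1f}s {speaker}: {text}")
```

Keeping the speaker label on every segment is what later allows the identification step to look only at operator utterances.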

The conversion unit 432 converts the voice information into response text information based on, for example, feature information of the operator's voice stored in the storage unit 42. The conversion unit 432 trains the conversion in advance on voice information uttered by the operator and stores the resulting features of the operator's voice in the storage unit 42. When converting the operator's voice information into response text information, the conversion unit 432 reads the feature information of the operator's voice from the storage unit 42 and reflects it in the conversion. Because the conversion unit 432 is trained in advance on pronunciations and the like that are prone to conversion errors, the accuracy of the conversion can be improved.

The identification unit 433 identifies, from the response text information, an end character string indicating the end of the exchange in which the operator checks the user's identity verification information. The identification unit 433 identifies the end character string based on the part of the response text information that the conversion unit 432 converted from the voice uttered by the operator. Identity verification information is information for confirming that the user is the actual contract holder of a service or the like, for example the user's name, address, date of birth, mobile phone number, user ID, or PIN. The operator confirms that the user is the contract holder by checking the identity verification information with the user.

After completing identity verification, the operator notifies the user that verification is complete. The identification unit 433 identifies, for example, part of the message in which the operator notifies the user of the completion of identity verification as the end character string. Because the operator's procedures and phrases such as questions follow a certain regularity, identifying the end character string from the text of the operator's speech improves the accuracy with which the end character string is identified.

The end character string may be a combination of a plurality of character strings used to identify the message in which the operator notifies the user that identity verification is complete. As an example, suppose the first group contains 「暗証」 ("PIN") and 「契約」 ("contract") as elements, the second group contains 「番号」 ("number"), 「内容」 ("details"), and 「情報」 ("information"), and the third group contains 「確認」 ("confirm"), 「調べ」 ("look up"), 「取れ」 ("obtained"), and 「照会」 ("inquiry"). The identification unit 433 determines that it has identified the end character string on the condition that it has identified an element of the first group, an element of the second group, and an element of the third group.

For example, when the conversion unit 432 has converted the message 「契約内容の照会が取れました」 ("Your contract details have been verified") into response text information, the identification unit 433 identifies the element 「契約」 from the first group, the element 「内容」 from the second group, and the element 「照会」 or 「取れ」 from the third group. In this case, because elements of the first, second, and third groups have all been identified, the identification unit 433 determines that it has identified the end character string.

Elements of these groups may be contained in the same word. For example, the identification unit 433 may identify the first-group element 「暗証」 and the second-group element 「番号」 from the word 「暗証番号」 ("PIN number") in the response text information.

The identification unit 433 may also determine that it has identified the end character string on the condition that it has identified elements of two of the first to third groups. For example, the identification unit 433 may determine that it has identified the end character string when it has identified the first-group element 「暗証」 and the second-group element 「内容」.
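The group-based matching rule can be sketched as below. The keyword groups are the ones given in the description; the function name `matches_groups` and the `min_groups` parameter (covering both the all-groups rule and the two-of-three variant) are illustrative assumptions, and plain substring matching is used as one possible way to detect an element.

```python
# Keyword groups for the end character string, as listed in the description.
END_GROUPS = [
    ["暗証", "契約"],                  # first group
    ["番号", "内容", "情報"],          # second group
    ["確認", "調べ", "取れ", "照会"],  # third group
]


def matches_groups(message, groups, min_groups=None):
    """Return True if at least min_groups groups have an element in message.

    With min_groups=None, every group must contribute an element.
    Matching is by substring, so one word may satisfy several groups.
    """
    if min_groups is None:
        min_groups = len(groups)
    hits = sum(any(word in message for word in group) for group in groups)
    return hits >= min_groups


print(matches_groups("契約内容の照会が取れました", END_GROUPS))
print(matches_groups("暗証番号", END_GROUPS, min_groups=2))
```

Because matching is by substring, the single word 「暗証番号」 satisfies both the first and second groups, which mirrors the same-word case described above.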

The identification unit 433 also identifies a start character string indicating the start of the exchange for checking the identity verification information. Before starting identity verification, the operator notifies the user that verification will be performed, so the identification unit 433 identifies, for example, part of the message in which the operator gives this notice as the start character string.

The start character string may be a combination of a plurality of character strings used to identify the message in which the operator notifies the user that identity verification will be performed. For example, when the first group contains 「名前」 ("name"), 「契約内容」 ("contract details"), and 「ID」 as elements, the second group contains 「フルネーム」 ("full name"), and the third group contains 「教えて」 ("tell me"), 「確認」 ("confirm"), and 「調べる」 ("look up"), the identification unit 433 determines that it has identified the start character string on the condition that it has identified an element of the first group, an element of the second group, and an element of the third group.

To reduce misidentification of the end character string, the identification unit 433 may identify the end character string from the text portion following the identified start character string. The identification unit 433 notifies the specifying unit 434 of the identified end character string and start character string. When it cannot identify the start character string, the identification unit 433 notifies the specifying unit 434 to that effect. Likewise, when it cannot identify the end character string, the identification unit 433 notifies the specifying unit 434 to that effect.
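A minimal sketch of this search order is shown below: the start character string is looked for first among operator utterances, and the end character string is then searched only after it. The function name `find_boundaries`, the `(speaker, text)` layout, the use of `None` to signal "could not be identified", and the two substring predicates are assumptions for illustration. Searching for the end from the beginning when no start was found is also an assumption, and the user reply 「はい。」 is a made-up placeholder.

```python
def find_boundaries(utterances, is_start, is_end):
    """Return (start_index, end_index); either may be None if not identified.

    utterances: list of (speaker, text) tuples in chronological order.
    is_start / is_end: predicates applied to operator utterances only.
    """
    start_idx = None
    for i, (speaker, text) in enumerate(utterances):
        if speaker == "operator" and is_start(text):
            start_idx = i
            break
    # Search for the end only after the start (from the top if no start found).
    begin = 0 if start_idx is None else start_idx + 1
    end_idx = None
    for i in range(begin, len(utterances)):
        speaker, text = utterances[i]
        if speaker == "operator" and is_end(text):
            end_idx = i
            break
    return start_idx, end_idx


call = [
    ("operator", "○○でございます。"),
    ("user", "△△の件で聞きたいことがあるんですけど。"),
    ("operator", "お手数ですが、ご本人様確認をさせていただきます。"),
    ("user", "はい。"),  # hypothetical reply, not from the source
    ("operator", "ご本人様確認が完了致しました。ありがとうございました。"),
]
print(find_boundaries(call,
                      lambda t: "確認をさせて" in t,
                      lambda t: "確認が完了" in t))
```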

Note that the identification unit 433 may identify the start character string or the end character string based on the part of the response text information that the conversion unit 432 converted from the voice uttered by the user. For example, when the response text information converted from the user's speech is larger in volume than that converted from the operator's speech, the identification unit 433 may identify the start character string or the end character string based on the response text information converted from the user's speech.

The specifying unit 434 specifies the front text portion, which is the response text information preceding the start character string identified by the identification unit 433. When the start character string is a combination of a plurality of character strings, the specifying unit 434 specifies, as the front text portion, the response text information preceding the foremost of the character strings identified by the identification unit 433.

The specifying unit 434 also specifies the rear text portion, which is the response text information following the end character string identified by the identification unit 433. When the end character string is a combination of a plurality of character strings, the specifying unit 434 specifies, as the rear text portion, the response text information following the rearmost of the character strings identified by the identification unit 433. When the identification unit 433 cannot identify the end character string (that is, when the specifying unit 434 receives a notification to that effect from the identification unit 433), the specifying unit 434 does not specify a rear text portion.
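The foremost/rearmost rule for combined character strings can be illustrated at the character level as follows. This is a sketch under assumptions: the function names, the use of `str.find` to locate each identified string (first occurrence only), the `search_from` parameter, and the bracketed placeholder tokens in the sample text are all hypothetical.

```python
def front_portion(text, start_strings):
    """Text before the foremost identified start string, or None if none occur."""
    positions = [p for p in (text.find(s) for s in start_strings) if p >= 0]
    return text[:min(positions)] if positions else None


def rear_portion(text, end_strings, search_from=0):
    """Text after the rearmost identified end string, or None if none occur.

    search_from lets the caller restrict the search to text after the
    start character string.
    """
    ends = []
    for s in end_strings:
        p = text.find(s, search_from)
        if p >= 0:
            ends.append(p + len(s))  # index just past this string
    return text[max(ends):] if ends else None


sample = "intro [name] [tell] middle [contract] [checked] outro"
print(repr(front_portion(sample, ["[name]", "[tell]"])))
print(repr(rear_portion(sample, ["[contract]", "[checked]"])))
print(rear_portion(sample, ["[missing]"]))
```

Returning `None` when no end string is found corresponds to the case where no rear text portion is specified.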

FIG. 3 is a diagram showing response text information. The storage unit 42 stores the response text information in association with the speaker. For example, the first line shows the message uttered by the operator, 「○○でございます。」 ("This is ○○."), and the second line shows the message uttered by the user, 「△△の件で聞きたいことがあるんですけど。」 ("I have a question about △△.").

In the example of FIG. 3, the specifying unit 434 specifies the span from the first message, 「○○でございます。」, to the operator's message 「要件を承りました。」 ("I have noted your request.") as the front text portion. In the front text portion, for example, the operator elicits from the user what the inquiry is about.

The identity verification portion is the response text information sandwiched between the start character string and the end character string, in which the operator checks the user's identity verification information. The specifying unit 434 specifies the span from the operator's message 「お手数ですが、ご本人様確認をさせていただきます。」 ("Sorry for the trouble, but I will now verify your identity.") to the operator's message 「ご本人様確認が完了致しました。ありがとうございました。」 ("Identity verification is complete. Thank you.") as the identity verification portion.

The specifying unit 434 specifies the span from the operator's message 「□□の件ですが、料金プランはいかが致しましょうか？」 ("Regarding □□, which rate plan would you like?") to the last message, 「ご利用ありがとうございました。」 ("Thank you for using our service."), as the rear text portion. In the rear text portion, for example, the operator makes a specific proposal or gives an answer suited to the user's request. Thus, in the example of FIG. 3, the operator first elicits what the user's inquiry is about, then checks the user's identity verification information, and, after verification is finished, makes a proposal or gives an answer suited to the request. The content of the exchange therefore differs among the front text portion, the identity verification portion, and the rear text portion.

When the identification unit 433 cannot identify the start character string in the response text information, the specifying unit 434 cannot use the start character string to specify the front text portion. In that case (that is, when it receives a notification from the identification unit 433 that the start character string could not be identified), the specifying unit 434 specifies the text portion from the beginning of the response text information up to a predetermined ratio as the front text portion. The ratio is, for example, the ratio of the number of characters from the beginning to the total number of characters in the response text information, the ratio of the elapsed time from the beginning to the total time, or the ratio of the number of sentences from the beginning to the total number of sentences. The number of sentences is the number of segments of the response text information delimited by full stops.

As one example, the specifying unit 434 specifies as the front text portion the text from the beginning up to the point where the ratio of the number of characters from the beginning to the total number of characters in the response text information reaches a ratio (for example, 30%) calculated from the ratio of the length of the front text portion to the total number of characters in previously acquired response text information. In this way, the specifying unit 434 can specify the front text portion even when the identification unit 433 could not identify the start character string because of, for example, an error in the conversion of the voice information by the conversion unit 432.
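A minimal sketch of the character-ratio fallback, assuming the historical ratio is taken as a simple mean over past calls (the text only says the ratio is calculated from past data). The function names and the `(front_chars, total_chars)` history format are hypothetical.

```python
from statistics import mean
from typing import Iterable, Tuple


def ratio_from_history(history: Iterable[Tuple[int, int]]) -> float:
    """Mean of front_chars / total_chars over past calls (the mean is an assumption)."""
    return mean(front / total for front, total in history if total > 0)


def front_by_char_ratio(text: str, ratio: float) -> str:
    """Return the leading part of `text` covering `ratio` of its characters."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    # Truncate toward zero so the cut never overshoots the requested ratio.
    return text[: int(len(text) * ratio)]
```

For instance, `front_by_char_ratio(transcript, ratio_from_history(past_calls))` would apply the learned ratio to a new transcript.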

When the identification unit 433 cannot identify the start character string, the specifying unit 434 may instead specify as the front text portion the text corresponding to the period from the start of the telephone response until a predetermined time is reached. The predetermined time can be obtained as a statistic of the duration of the front text portion in a plurality of past inquiries. With this configuration, the specifying unit 434 can specify the text portion corresponding to the predetermined time as the front text portion without being affected by the length of the response text other than the front text portion.
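The time-based variant might look like the sketch below. It assumes each utterance carries a start timestamp in seconds, which the text does not spell out, and it uses the median as the statistic. Both choices, and all names, are illustrative only.

```python
from statistics import median
from typing import List, Sequence, Tuple

Utterance = Tuple[float, str, str]  # (start time in seconds, speaker, text); assumed layout


def limit_from_history(front_durations: Sequence[float]) -> float:
    """Predetermined time as the median of past front-portion durations (assumption)."""
    return median(front_durations)


def front_by_time(utterances: List[Utterance], limit_s: float) -> List[Utterance]:
    """Keep utterances that start before `limit_s` seconds into the call."""
    return [u for u in utterances if u[0] < limit_s]
```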

The specifying unit 434 may also specify as the front text portion the text from the beginning up to the point where the ratio of the number of sentences from the beginning to the total number of sentences in the response text information reaches a ratio (for example, 30%) calculated from the ratio of the length of the front text portion to the total number of sentences in previously acquired response text information. Further, when the identification unit 433 cannot identify the start character string, the specifying unit 434 may specify as the front text portion the text from the start of the telephone response until a predetermined number of sentences is reached. The predetermined number of sentences is, for example, a statistic of the number of sentences in past front text portions.
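Both sentence-based variants can be sketched as below, assuming sentences are delimited by the Japanese full stop "。" as the description states. The function names are hypothetical.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split on the Japanese full stop, keeping the delimiter with each sentence."""
    return re.findall(r"[^。]+。?", text)


def front_by_sentence_ratio(text: str, ratio: float) -> str:
    """Leading sentences covering `ratio` of all sentences in `text`."""
    sentences = split_sentences(text)
    return "".join(sentences[: int(len(sentences) * ratio)])


def front_by_sentence_count(text: str, count: int) -> str:
    """Leading `count` sentences of `text`."""
    return "".join(split_sentences(text)[:count])
```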

When the identification unit 433 cannot identify the start character string, the specifying unit 434 may specify as the front text portion the text up to the point where the ratio of the number of speaker switches since the start of the telephone response to the number of speaker switches in the entire telephone response reaches a predetermined value. As shown in FIG. 3, the storage unit 42 stores the response text information in association with the speaker. The specifying unit 434 counts the number of times the speaker switches in the response text information. For example, it counts one switch when the user speaks after the operator has spoken, and likewise counts one switch when the operator speaks after the user has spoken.

The specifying unit 434 obtains the number of speaker switches in the entire telephone response, from its start to its end. It then specifies as the front text portion the response text up to the point where the ratio of the number of switches since the start to the number of switches in the entire telephone response reaches the predetermined value. As one example, the specifying unit 434 specifies as the front text portion the text up to the point where this ratio reaches 30%. If the speaker switches 100 times in the entire telephone response, the specifying unit 434 specifies the response text information from the start of the telephone response until the speaker has switched 30 times as the front text portion.
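The speaker-switch criterion can be sketched as follows. The `(speaker, text)` layout and the function names are assumptions. So is the choice to place the cut exactly at the switch that brings the ratio up to the threshold, so that the message spoken after that switch falls outside the front portion.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (speaker, text); assumed layout


def count_switches(messages: List[Message]) -> int:
    """Number of adjacent message pairs whose speakers differ."""
    return sum(1 for prev, cur in zip(messages, messages[1:]) if prev[0] != cur[0])


def front_by_switch_ratio(messages: List[Message], ratio: float) -> List[Message]:
    """Messages up to the switch at which switches-so-far / total reaches `ratio`."""
    total = count_switches(messages)
    if total == 0 or ratio <= 0:
        return []
    switches = 0
    for i in range(1, len(messages)):
        if messages[i][0] != messages[i - 1][0]:
            switches += 1
            if switches / total >= ratio:
                return messages[:i]
    return list(messages)
```

With 100 switches in total and a ratio of 0.3, the cut falls at the 30th switch, as in the example above.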

Every two speaker switches correspond to one utterance by the operator. Because the operator's procedure for handling a call is defined in advance, the ratio of speaker switches in the front text portion to those in the entire telephone response is less affected by individual differences among users, such as how long they talk, than the ratio of characters or the ratio of time in the front text portion is. Therefore, by specifying the front text portion from the ratio of speaker switches, the specifying unit 434 can limit the loss of accuracy in specifying the front text portion when the identification unit 433 cannot identify the start character string.

The extraction unit 435 extracts user information from the identity verification portion, that is, the text portion sandwiched between the start character string and the end character string. As user information, the extraction unit 435 extracts the user's name, address, date of birth, and the like. For example, the extraction unit 435 recognizes a question the operator asked to obtain the user's name or similar information, and extracts the phrase the user uttered immediately after that question as the user's name or similar information. The extraction unit 435 notifies the summary unit 436 of the extracted user information.
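The question-then-answer extraction might be sketched like this. The cue phrases in `QUESTION_CUES`, the field names, and the substring matching are all hypothetical stand-ins, since the description does not say how the operator's question is recognized.

```python
from typing import Dict, List, Optional, Tuple

Message = Tuple[str, str]  # (speaker, text); assumed layout

# Hypothetical cue phrases that mark an operator question for each field.
QUESTION_CUES = {
    "name": "your name",
    "address": "your address",
    "date_of_birth": "date of birth",
}


def extract_user_info(identity_messages: List[Message]) -> Dict[str, str]:
    """Take the user's reply that follows each recognized operator question."""
    info: Dict[str, str] = {}
    pending: Optional[str] = None
    for speaker, text in identity_messages:
        if speaker == "operator":
            lowered = text.lower()
            pending = next(
                (field for field, cue in QUESTION_CUES.items() if cue in lowered),
                None,
            )
        elif speaker == "user" and pending is not None:
            info.setdefault(pending, text)  # keep the first answer per field
            pending = None
    return info
```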

The summary unit 436 generates summary data of the response text. In the example of this specification, the summary unit 436 inputs the response text information to a trained machine learning model 437 and acquires the summary data of the response text information output by the machine learning model 437. For example, the control unit 43 functions as the machine learning model 437 by reading a library such as Summpy, an open-source summarization API, from the storage unit 42 and executing it. Alternatively, an external server (not shown) may hold the machine learning model 437.

The machine learning model 437 is a trained model generated in advance by learning from a large volume (big data) of front text portions or rear text portions as input. The machine learning model 437 rates frequently repeated phrases as highly important and rarely repeated phrases as less important. It also counts repetitions of related phrases toward the repetition frequency. For example, it treats the phrase "なくした" ("lost") and the phrase "紛失した" ("misplaced") as related phrases. On the other hand, the machine learning model 437 rates phrases that appear in any inquiry as less important. For example, because the phrase "Thank you very much" tends to appear in any inquiry, the machine learning model 437 rates it as low in importance.
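The weighting behavior described here resembles a TF-IDF-style score with related phrases merged. The sketch below illustrates that idea only. It is not the trained model 437 and not Summpy's algorithm. The `RELATED` map, the token lists, and the smoothed logarithm are all assumptions.

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical map from a phrase to the related phrase it is counted with.
RELATED = {"misplaced": "lost"}


def normalize(tokens: List[str]) -> List[str]:
    """Replace each token with its related-phrase representative, if any."""
    return [RELATED.get(t, t) for t in tokens]


def importance(call: List[str], past_calls: List[List[str]]) -> Dict[str, float]:
    """Score = repetitions in this call x rarity across past calls."""
    tf = Counter(normalize(call))
    df: Counter = Counter()
    for past in past_calls:
        df.update(set(normalize(past)))
    n = len(past_calls)
    # A phrase found in every past call gets log(1) = 0, i.e. the lowest score.
    return {t: f * math.log((1 + n) / (1 + df[t])) for t, f in tf.items()}
```

In this sketch, "lost" and "misplaced" add up to one count, while a phrase present in every past call scores zero.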

The summary unit 436 inputs the front text portion or the rear text portion to the machine learning model 437 and outputs the summary data that the machine learning model 437 produces. For example, the summary unit 436 outputs the summary data via the communication unit 41 to a computer (not shown) used by an administrator who analyzes the summary data. The summary unit 436 does not output summary data for the identity verification portion.

Because the front text portion and the rear text portion differ in content, summarizing them together may leave the key points of one of them insufficiently reflected in the summary data. The summary unit 436 may therefore input the front text portion and the rear text portion to the machine learning model 437 separately and acquire the summary data that the machine learning model outputs for each. The summary unit 436 outputs the summary data of the front text portion and the summary data of the rear text portion in association with each other.

FIG. 4 shows an example of the summary data output by the summary unit 436, namely the summary data of the front text portion and the summary data of the rear text portion. The summary data of the front text portion shows "Wants to know the details of the rate plan." as a summary of the subject of the user's inquiry, and the summary data of the rear text portion shows, for example, "Informed the user that △△ is available as a discount service." as a summary of how the operator handled the inquiry. Because the machine learning model removes interjections such as "もしもし" ("hello") from the response text information, the summary data contains no interjections.

The summary unit 436 outputs the summary data in association with the user information extracted by the extraction unit 435. For example, the summary unit 436 outputs the user's name, the summary data of the front text portion, and the summary data of the rear text portion in association with one another. With this configuration, the administrator can manage summary data per user, so when the same user has made several inquiries, the administrator can check each of the summary data items associated with that user's information.

The summary unit 436 may output type information indicating the type of the front text portion in association with the summary data of the corresponding rear text portion. A type is a category used to group the front text portions of multiple inquiries. For example, the summary unit 436 may output the type "rate plan confirmation", for inquiries that check the details of a rate plan, in association with the summary data of the rear text portion corresponding to that inquiry. Likewise, the summary unit 436 may output the type "service cancellation", for inquiries about cancelling a service, in association with the summary data of the rear text portion corresponding to that inquiry.

When the identification unit 433 cannot identify the end character string, the summary unit 436 outputs the summary data in association with an identifier indicating that the end character string could not be identified. In that case the specifying unit 434 does not specify a rear text portion. The summary unit 436 inputs the front text portion specified by the specifying unit 434 and the response text information other than the front text portion to the machine learning model 437, and acquires the summary data output by the machine learning model 437 for the front text portion and for the remainder. So that the administrator can tell that no rear text portion was specified, the summary unit 436 associates each piece of summary data output by the machine learning model 437 with the identifier indicating that the end character string could not be identified, and outputs them to the computer used by the administrator.

[Operation of the text analysis device 400]
FIG. 5 is a flowchart showing the operation of the text analysis device 400. The procedure starts when the recording device 300 generates voice information in which the voices of the user and the operator are recorded.

First, the acquisition unit 431 acquires, via the communication unit 41, the voice information of the telephone response between the user and the operator from the recording device 300 (step S101). Next, the conversion unit 432 converts the voice information acquired by the acquisition unit 431 into response text information (step S102). The identification unit 433 determines whether the start character string could be identified in the response text information (step S103). If it could (YES in S103), the specifying unit 434 specifies the response text information preceding the identified start character string as the front text portion (step S104).

The identification unit 433 then determines whether the end character string could be identified in the response text information (step S105). If it could (YES in S105), the specifying unit 434 specifies the response text information following the identified end character string as the rear text portion (step S106). The summary unit 436 inputs the front text portion and the rear text portion to the trained machine learning model 437 and acquires the summary data output for each. The summary unit 436 outputs, via the communication unit 41, the summary data of the front text portion and that of the rear text portion in association with each other (S107), and the processing ends.

If the identification unit 433 could not identify the start character string in step S103 (NO in S103), the specifying unit 434 specifies the text portion from the beginning of the response text information up to the predetermined ratio as the front text portion (S108), and the processing moves to step S105. If the identification unit 433 could not identify the end character string in step S105 (NO in S105), the summary unit 436 inputs the front text portion specified by the specifying unit 434 and the response text information other than the front text portion to the machine learning model 437, and acquires the summary data output for the front text portion and for the remainder. The summary unit 436 associates each piece of summary data with the identifier indicating that the end character string could not be identified, outputs them to the computer used by the administrator, and the processing ends (step S109).
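Steps S103 to S109 of FIG. 5 can be condensed into the sketch below, which starts from already-converted sentences (S101 and S102 are omitted). The `summarize` callable stands in for the machine learning model 437, and the marker strings, the sentence-ratio fallback, and the result keys are hypothetical. Where the end string is searched for when no start string was found is not stated in the description. Searching from the fallback cut point is an assumption.

```python
from typing import Callable, Dict, List


def analyze(
    sentences: List[str],
    start_marker: str,
    end_marker: str,
    summarize: Callable[[List[str]], str],
    fallback_ratio: float = 0.3,
) -> Dict[str, object]:
    # S103: try to identify the start character string.
    start = next((i for i, s in enumerate(sentences) if start_marker in s), None)
    if start is not None:
        front_end = start                                  # S104
        search_from = start + 1
    else:
        front_end = int(len(sentences) * fallback_ratio)   # S108 (sentence ratio assumed)
        search_from = front_end                            # assumption, see lead-in
    front = sentences[:front_end]

    # S105: try to identify the end character string.
    end = next(
        (i for i in range(search_from, len(sentences)) if end_marker in sentences[i]),
        None,
    )
    if end is not None:
        rear = sentences[end + 1:]                         # S106
        return {                                           # S107
            "front_summary": summarize(front),
            "rear_summary": summarize(rear),
            "end_identified": True,
        }
    return {                                               # S109
        "front_summary": summarize(front),
        "rest_summary": summarize(sentences[front_end:]),
        "end_identified": False,
    }
```

The `end_identified` flag plays the role of the identifier that tells the administrator no rear text portion was specified.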

[Preprocessing for machine learning]
This embodiment has described an example in which the summary unit 436 inputs the front text portion or the rear text portion to the trained machine learning model 437 and acquires the summary data output by the machine learning model 437. However, the present invention is not limited to inputting response text information to an already trained machine learning model. For example, the text analysis device 400 may include a learning unit for generating a machine learning model.

The learning unit learns from a plurality of rear text portions specified by the specifying unit 434, thereby generating a machine learning model that takes a rear text portion as input and outputs summary data of that rear text portion. With this configuration, it is possible to generate a machine learning model that summarizes rear text portions more accurately than one generated by having the learning unit learn from the entire response text information of a plurality of calls.

[Effects of the text analysis device 400]
When automatic summarization is performed with a machine learning model, inputting response text information that includes the identity verification portion tends to pull into the summary the identity verification information that is mentioned repeatedly in that portion. In contrast, the text analysis device 400 according to this embodiment specifies the rear text portion that follows the end character string identified by the identification unit 433 in the response text information. The text analysis device 400 can therefore improve the accuracy of the summary produced when the text is automatically summarized with a machine learning model.

[Modification]
This embodiment has described an example in which the text analysis device 400 includes the conversion unit 432 that converts voice information into response text information. However, the present invention is not limited to this. For example, the conversion unit 432 may be provided separately from the text analysis device 400.

The present invention has been described above using an embodiment, but the technical scope of the present invention is not limited to the scope described in that embodiment, and various modifications and changes are possible within the scope of its gist. For example, the specific form in which devices are distributed or integrated is not limited to the embodiment above; all or part of them may be functionally or physically distributed or integrated in arbitrary units. New embodiments arising from any combination of a plurality of embodiments are also included in the embodiments of the present invention. The effect of a new embodiment arising from such a combination includes the effects of the original embodiments.

41 Communication unit
42 Storage unit
43 Control unit
100 Communication terminal
200 Telephone
300 Recording device
400 Text analysis device
431 Acquisition unit
432 Conversion unit
433 Identification unit
434 Specifying unit
435 Extraction unit
436 Summary unit
437 Machine learning model

Claims

A text acquisition unit that acquires response text information in which voice information of a telephone response between a user and an operator is converted into text;
An identification unit that identifies, in the response text information, a character string indicating the start or end of the exchange in which the operator checks the user's identity verification information;
A specifying unit that specifies a front text part before the character string identified by the identifying unit in the response text information or a rear text part after the character string identified by the identifying unit;
Text analysis device.

A summary unit that inputs the front text portion or the back text portion to a machine learning model and outputs summary data output from the machine learning model;
The text analysis apparatus according to claim 1.

The identification unit identifies a start character string indicating the start of the response for confirming the identity confirmation information and an end character string indicating the end of the response for confirming the identity confirmation information,
The identification unit identifies the end character string from a text portion after the identified start character string;
The text analysis apparatus according to claim 2.

The specifying unit specifies a front text portion before the start character string identified by the identification unit, and a rear text portion after the end character string identified by the identification unit,
The summary unit inputs the rear text portion and the front text portion to the machine learning model, and outputs the summary data of the rear text portion and the summary data of the front text portion in association with each other.
The text analysis apparatus according to claim 3.

The specifying unit specifies a text part from the beginning of the response text information to a predetermined ratio as a front text part when the identification unit cannot identify the start character string.
The text analysis apparatus according to claim 3 or 4.

The specifying unit, when the identification unit cannot identify the start character string, specifies as the front text portion the text portion up to the point where the ratio of the number of speaker switches since the start of the telephone response to the number of speaker switches in the entire telephone response reaches a predetermined value,
The text analysis apparatus according to claim 3 or 4.

The specifying unit specifies a text part corresponding to a predetermined number of sentences from the start of a telephone response as a front text part when the identification unit cannot identify the start character string;
The text analysis apparatus according to claim 3 or 4.

The specifying unit specifies a text part corresponding to a predetermined time from the start of telephone reception as a front text part when the identification unit cannot identify the start character string;
The text analysis apparatus according to claim 3 or 4.

An extraction unit that extracts user information from a text portion sandwiched between the start character string and the end character string;
The summary unit outputs the summary data in association with the user information extracted by the extraction unit.
The text analysis apparatus according to any one of claims 3 to 8.

The summary unit outputs the summary data in association with an identifier indicating that the end character string cannot be identified when the identification unit cannot identify the end character string;
The text analysis apparatus according to any one of claims 3 to 9.

A conversion unit that converts voice information of a telephone reception between a user and an operator into the reception text information;
The identification unit identifies the end character string based on the part of the response text information that was converted from the voice uttered by the operator.
The text analysis apparatus according to any one of claims 3 to 10.

It further includes a storage unit that stores feature information of the operator's voice,
The conversion unit converts the voice information into the response text information based on the feature information of the operator's voice stored in the storage unit.
The text analysis apparatus according to claim 11.

Obtaining response text information in which voice information of a telephone response between a user and an operator is converted into text;
Identifying, in the response text information, a character string indicating the start or end of the exchange in which the operator checks the user's identity verification information;
Identifying a front text portion before the character string identified in the response text information or a rear text portion after the identified character string;
A text analysis method comprising:

A text analysis program for causing a computer to function as:
A text acquisition unit that acquires response text information in which voice information of a telephone response between a user and an operator is converted into text;
An identification unit that identifies, in the response text information, a character string indicating the start or end of the exchange in which the operator checks the user's identity verification information; and
A specifying unit that specifies a front text part before the character string identified by the identification unit in the response text information or a rear text part after the character string identified by the identification unit.