JP2018169843A

JP2018169843A - Information processing device, information processing method and information processing program

Info

Publication number: JP2018169843A
Application number: JP2017067206A
Authority: JP
Inventors: 祥史大西; Yoshifumi Onishi; 真寺尾; Makoto Terao
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2017-03-30
Filing date: 2017-03-30
Publication date: 2018-11-01
Anticipated expiration: 2037-03-30
Also published as: JP2022000825A; JP6957933B2; JP7238940B2

Abstract

PROBLEM TO BE SOLVED: To provide an information processing device that can predict churn risk even for customers who make contact for the first time and for whom no behavior history information is available. SOLUTION: An information processing device 100 comprises a call storage unit 101, an utterance amount feature calculation unit 102, an emotion recognition unit 103, and a churn prediction unit 104. The call storage unit 101 acquires and stores calls between customers and operators. The utterance amount feature calculation unit 102 calculates an utterance amount feature of the stored call. The emotion recognition unit 103 recognizes the emotion of each utterance section. The churn prediction unit 104 predicts the churn risk in the call based on the utterance amount feature and the recognized emotion. SELECTED DRAWING: Figure 1

Description

The present invention relates to an information processing apparatus, an information processing method, and an information processing program.

In the above technical field, Patent Document 1 discloses a technique that compares customer information and customer behavior history information against a customer extraction rule, which defines the customers regarded as having a high churn risk, and extracts the customers who satisfy the rule as high-churn-risk customers. Non-Patent Document 1 discloses a method for detecting an utterance section. Non-Patent Document 2 discloses an emotion recognition method. Non-Patent Document 3 discloses a machine learning method. Non-Patent Document 4 discloses a method for detecting an utterance section.

Patent Document 1: JP 2002-334200 A

Non-Patent Document 1: Yusuke Kida and Tatsuya Kawahara, "Voice Activity Detection based on Optimally Weighted Combination of Multiple Features," Proc. INTERSPEECH 2005, pp. 2621-2624, 2005.
Non-Patent Document 2: Florian Eyben, Martin Wollmer, and Bjorn Schuller, "openEAR - Introducing the Munich Open-Source Emotion and Affect Recognition Toolkit," 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops.
Non-Patent Document 3: Pedregosa et al., "Scikit-learn: Machine Learning in Python," JMLR 12, pp. 2825-2830, 2011.
Non-Patent Document 4: J. P. Yamron, I. Carp, L. Gillick, S. Lowe, and P. van Mulbregt, "A Hidden Markov Model Approach to Text Segmentation and Event Tracking," IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 333-336, 1998.

However, because the technique described in the above document relies on customer behavior history information, it cannot predict churn risk for a customer who makes contact for the first time and therefore has no behavior history information.

An object of the present invention is to provide a technique that solves the above problem.

To achieve the above object, an information processing apparatus according to the present invention comprises:
call storage means for acquiring and storing a call between a customer and an operator;
utterance amount feature calculation means for calculating an utterance amount feature of the stored call;
emotion recognition means for recognizing the emotion of an utterance section of the call; and
churn prediction means for predicting a churn risk in the call based on the utterance amount feature and the recognized emotion.

To achieve the above object, an information processing method according to the present invention includes:
a call storage step of acquiring and storing a call between a customer and an operator;
an utterance amount feature calculation step of calculating an utterance amount feature of the stored call;
an emotion recognition step of recognizing the emotion of an utterance section of the call; and
a churn prediction step of predicting a churn risk in the call based on the utterance amount feature and the recognized emotion.

To achieve the above object, an information processing program according to the present invention causes a computer to execute:
a call storage step of acquiring and storing a call between a customer and an operator;
an utterance amount feature calculation step of calculating an utterance amount feature of the stored call;
an emotion recognition step of recognizing the emotion of an utterance section of the call; and
a churn prediction step of predicting a churn risk in the call based on the utterance amount feature and the recognized emotion.

According to the present invention, churn risk can be predicted even for a customer who makes contact for the first time and has no behavior history information.

FIG. 1 is a block diagram showing the configuration of the information processing apparatus according to the first embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the information processing apparatus according to the second embodiment of the present invention.
FIG. 3 is a block diagram showing the configuration of the emotion feature calculation unit of the information processing apparatus according to the second embodiment of the present invention.
FIG. 4 is a diagram showing the hardware configuration of the information processing apparatus according to the second embodiment of the present invention.
FIG. 5 is a flowchart explaining the processing procedure of the information processing apparatus according to the second embodiment of the present invention.
FIG. 6 is a block diagram showing the configuration of the information processing apparatus according to the third embodiment of the present invention.
FIG. 7 is a diagram showing the hardware configuration of the information processing apparatus according to the third embodiment of the present invention.
FIG. 8 is a flowchart explaining the processing procedure of the information processing apparatus according to the third embodiment of the present invention.

DESCRIPTION OF EMBODIMENTS Hereinafter, embodiments for carrying out the present invention will be described in detail by way of example with reference to the drawings. However, the configurations, numerical values, processing flows, functional elements, and the like described in the following embodiments are merely examples and may be freely modified or changed; they are not intended to limit the technical scope of the present invention to the following description.

[First Embodiment]
An information processing apparatus 100 as a first embodiment of the present invention will be described with reference to FIG. 1. The information processing apparatus 100 predicts the churn risk in a call based on the utterance amount features and the emotions in the call.

As illustrated in FIG. 1, the information processing apparatus 100 includes a call storage unit 101, an utterance amount feature calculation unit 102, an emotion recognition unit 103, and a churn prediction unit 104. The call storage unit 101 acquires and stores a call between a customer and an operator. The utterance amount feature calculation unit 102 calculates an utterance amount feature of the stored call. The emotion recognition unit 103 recognizes the emotion of each utterance section. The churn prediction unit 104 predicts the churn risk in the call based on the utterance amount feature and the recognized emotion.

According to the present embodiment, churn risk can be predicted even for a customer who makes contact for the first time and has no behavior history information.

[Second Embodiment]
Next, an information processing apparatus according to a second embodiment of the present invention will be described with reference to FIGS. 2 to 5. FIG. 2 is a block diagram illustrating the configuration of the information processing apparatus according to the present embodiment.

The information processing apparatus according to the present embodiment can be used in a call center or the like to extract, from calls, customers with a high churn risk, customers who are expected to continue, and so on, so that appropriate follow-up can be carried out selectively for each customer.

The information processing apparatus 200 includes a call storage unit 201, an utterance amount feature calculation unit 202, an emotion recognition unit 203, an emotion feature calculation unit 204, a customer extraction model storage unit 205, a churn prediction unit 206, and an extracted customer output unit 207.

The call storage unit 201 acquires a call between a customer and an operator as call voice data, and stores the acquired call voice data as a file or as stream data.

When storing call voice data, the call storage unit 201 assigns to it a call ID (identifier) that identifies the call. The call storage unit 201 also detects the utterance sections in the call voice data and assigns speaker information, that is, operator information or customer information, to each detected utterance section. In addition, it assigns utterance time information, that is, the utterance start time and the utterance end time, to each detected utterance section. The call voice data is therefore stored together with the call ID, operator information, customer information, and utterance time information.
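As an illustration of the annotations described above, the following is a minimal sketch of a stored-call record. The names `UtteranceSection`, `StoredCall`, and `label_sections`, and the assumption that utterance sections have already been detected separately on an operator channel and a customer channel, are hypothetical and not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class UtteranceSection:
    speaker: str   # speaker information: "operator" or "customer"
    start: float   # utterance start time, in seconds from the start of the call
    end: float     # utterance end time, in seconds from the start of the call


@dataclass
class StoredCall:
    call_id: str                                    # identifies the call
    sections: List[UtteranceSection] = field(default_factory=list)


def label_sections(call_id: str,
                   channel_segments: Dict[str, List[Tuple[float, float]]]) -> StoredCall:
    """Attach speaker and time information to already-detected utterance sections.

    channel_segments maps a speaker name to the (start, end) pairs detected on
    that speaker's channel (hypothetical input format).
    """
    call = StoredCall(call_id)
    for speaker, segments in channel_segments.items():
        for start, end in segments:
            call.sections.append(UtteranceSection(speaker, start, end))
    # Keep the sections in chronological order.
    call.sections.sort(key=lambda section: section.start)
    return call


call = label_sections(
    "call-0001",
    {"operator": [(0.0, 4.0), (9.0, 12.5)], "customer": [(4.5, 8.5)]},
)
```

Utterance section detection itself is outside this sketch; any voice activity detection method could supply the `(start, end)` pairs.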

Various methods for detecting utterance sections have been disclosed; for example, the method described in Non-Patent Document 1 may be used.

As for the call ID, in recent call centers, for example, it can be obtained from a CTI (Computer Telephony Integration) system. The customer-side signal and the operator-side signal can also be recorded as separate channels. The call ID and the speaker information can therefore be assigned using the call ID acquired by the CTI system and the channel information.

The utterance time information can be assigned by acquiring, for the start and end of each detected utterance section, either the absolute time or time information defined by the system.

The utterance amount feature calculation unit 202 calculates utterance amount features of the operator and the customer from the time information of each utterance section stored in the call storage unit 201. Specifically, it uses at least one of the following: the total utterance time of the operator and of the customer in each call; the ratio between the operator's and the customer's total utterance time in each call; and the mean, variance, median, mode, or other statistically derived value of the utterance section lengths of the operator and of the customer in each call.
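The statistics listed above can be sketched as follows. This is only an illustration under assumptions: the function name `utterance_amount_features`, the feature key names, the `(speaker, start, end)` tuple format, and the choice of population variance are all hypothetical, since the specification does not fix them.

```python
from statistics import mean, median, pvariance


def utterance_amount_features(sections):
    """Compute per-call utterance amount features.

    sections: iterable of (speaker, start, end) tuples, where speaker is
    "operator" or "customer" and times are in seconds (assumed format).
    """
    durations = {"operator": [], "customer": []}
    for speaker, start, end in sections:
        durations[speaker].append(end - start)

    features = {}
    for speaker, values in durations.items():
        features[f"{speaker}_total"] = sum(values)
        features[f"{speaker}_mean"] = mean(values) if values else 0.0
        features[f"{speaker}_variance"] = pvariance(values) if values else 0.0
        features[f"{speaker}_median"] = median(values) if values else 0.0

    # Ratio of operator to customer total utterance time; infinite when the
    # customer never speaks (an arbitrary convention for this sketch).
    customer_total = features["customer_total"]
    features["operator_customer_ratio"] = (
        features["operator_total"] / customer_total
        if customer_total > 0 else float("inf")
    )
    return features


example_sections = [
    ("operator", 0.0, 4.0),
    ("customer", 4.0, 6.0),
    ("operator", 6.0, 12.0),
    ("customer", 12.0, 14.0),
]
example_features = utterance_amount_features(example_sections)
```

In this example the operator speaks for 10 seconds and the customer for 4 seconds, so `operator_customer_ratio` is 2.5, which hints that the operator led the conversation.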

For each utterance section stored in the call storage unit 201, the emotion recognition unit 203 calculates at least one emotion value from among valence (the emotional valence indicating the quality of the emotion, that is, how positive or negative it is), arousal (the degree of arousal indicating whether the speaker is aroused or calm, that is, how excited the emotion is), a sense of conviction, and a sense of expectation. Various emotion recognition methods have been disclosed; for example, the method described in Non-Patent Document 2 can be used.

Each emotion value is calculated either as a number within a range specified in advance or as one of a finite number of categories. For example, for valence the emotion recognition unit 203 calculates a value in the range -3 to 3, or a category such as {-3, -2, -1, 0, 1, 2, 3}. The same applies to arousal, the sense of conviction, and the sense of expectation, although the ranges and categories specified for the individual emotions need not be identical.

The emotion feature calculation unit 204 calculates emotion features of the operator and the customer from the emotion value of each utterance section calculated by the emotion recognition unit 203. The emotion features include at least one of the following, computed separately for the operator and the customer in each call: the sum of the emotion values, the emotion value ratio, and the mean, variance, median, mode, or other statistically derived value of the emotion values.
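A per-call aggregation of this kind might look like the sketch below. The function name `emotion_features`, the dictionary layout of a section, and the feature key names are assumptions made for illustration. The emotion value ratio is left out because the specification does not define how it is computed.

```python
from statistics import mean, median, pvariance


def emotion_features(sections, emotion):
    """Aggregate one emotion (e.g. "valence") per speaker over a whole call.

    sections: list of dicts with a "speaker" key ("operator" or "customer")
    and one numeric entry per emotion (assumed format).
    """
    values = {"operator": [], "customer": []}
    for section in sections:
        values[section["speaker"]].append(section[emotion])

    features = {}
    for speaker, vals in values.items():
        prefix = f"{speaker}_{emotion}"
        features[f"{prefix}_sum"] = sum(vals)
        features[f"{prefix}_mean"] = mean(vals) if vals else 0.0
        features[f"{prefix}_variance"] = pvariance(vals) if vals else 0.0
        features[f"{prefix}_median"] = median(vals) if vals else 0.0
    return features


scored_sections = [
    {"speaker": "operator", "valence": 2},
    {"speaker": "customer", "valence": -1},
    {"speaker": "operator", "valence": 2},
    {"speaker": "customer", "valence": 1},
    {"speaker": "customer", "valence": 3},
]
valence_features = emotion_features(scored_sections, "valence")
```

Calling the function once per emotion and merging the resulting dictionaries would give one feature vector per call.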

FIG. 3 is a block diagram illustrating the configuration of the emotion feature calculation unit of the information processing apparatus according to the present embodiment. With the configuration shown in FIG. 3, the emotion feature calculation unit 204 may calculate the emotion features of the operator and the customer using only the emotion values of the utterance sections that fall within a specific section of the call voice data. The emotion feature calculation unit 204 illustrated in FIG. 3 includes a topic section detection unit 241, a narrowing unit 242, and a feature calculation unit 243.

The topic section detection unit 241 detects the topic sections contained in the call voice data stored in the call storage unit 201 and assigns a topic start time and a topic end time to each detected topic section. Examples of topics include the opening, the closing, product or service explanation, fee explanation, solicitation, hearing, personal information confirmation, and administrative procedures.

Various methods for detecting topic sections have been disclosed; for example, speech recognition can be combined with the method described in Non-Patent Document 4. Specifically, a topic model trained in advance on the frequency with which words appear in various topics can be used to divide the word sequence obtained by speech recognition into a plurality of topic sections.

The topic section detection unit 241 may also detect topic sections based on time information or the number of utterances. For example, the first 30 seconds or the first five utterances of a call can be detected as the opening topic section, and the last 30 seconds or the last five utterances can be detected as the closing topic section. Furthermore, the last third of the call (for example, the section from 6 to 9 minutes of a 9-minute call) may be detected as a single topic section.
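The time-based and count-based rules above are simple enough to sketch directly. The function names, the topic labels, and the tuple formats below are hypothetical; the 30-second window and the count of five utterances are the example values given in the text.

```python
def time_based_topic_sections(call_duration, window=30.0):
    """Return (topic, start, end) tuples derived from time information only.

    call_duration and window are in seconds.
    """
    opening = ("opening", 0.0, min(window, call_duration))
    closing = ("closing", max(call_duration - window, 0.0), call_duration)
    last_third = ("last_third", call_duration * 2.0 / 3.0, call_duration)
    return [opening, closing, last_third]


def count_based_topic_sections(utterances, count=5):
    """Return the first and last `count` utterances as opening and closing.

    utterances: list of (start, end) tuples (assumed format).
    """
    ordered = sorted(utterances, key=lambda u: u[0])
    return {"opening": ordered[:count], "closing": ordered[-count:]}


# A 9-minute call, as in the example in the text.
topics = time_based_topic_sections(9 * 60.0)

# Twelve back-to-back 10-second utterances (made-up timing).
utterances = [(10.0 * i, 10.0 * i + 10.0) for i in range(12)]
by_count = count_based_topic_sections(utterances)
```

For the 9-minute call, the `last_third` section runs from 360 to 540 seconds, which matches the 6-to-9-minute example.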

Of the utterance sections to which the emotion recognition unit 203 has assigned emotion values, the narrowing unit 242 outputs to the feature calculation unit 243 only those that fall within a specific topic section detected by the topic section detection unit 241. For example, it outputs only the utterance sections contained in the closing topic section, together with their emotion values.

The feature calculation unit 243 calculates the emotion features of the operator and the customer from the emotion values of the utterance sections output by the narrowing unit 242. The emotion features include at least one of the following, computed separately for the operator and the customer within the specific topic section: the sum of the emotion values, the emotion value ratio, and the mean, variance, median, mode, or other statistically derived value of the emotion values. For example, the sum or mean of the valence or sense-of-expectation values of the customer's utterances in the opening or closing topic section is calculated as an emotion feature.
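The narrowing step followed by feature calculation can be sketched as below. The function names and the section dictionary layout are hypothetical, and treating an utterance as "contained in" a topic section only when it lies entirely inside the section is an assumption; the specification does not say how utterances that straddle a boundary are handled.

```python
def narrow_to_topic(sections, topic_start, topic_end):
    """Keep only utterance sections that lie entirely inside the topic section."""
    return [
        s for s in sections
        if s["start"] >= topic_start and s["end"] <= topic_end
    ]


def customer_emotion_summary(sections, emotion):
    """Sum and mean of one emotion over the customer's utterances."""
    values = [s[emotion] for s in sections if s["speaker"] == "customer"]
    total = sum(values)
    return {"sum": total, "mean": total / len(values) if values else 0.0}


# Made-up utterances from a 540-second call.
sections = [
    {"speaker": "operator", "start": 0.0, "end": 5.0, "valence": 1, "expectation": 0},
    {"speaker": "customer", "start": 500.0, "end": 512.0, "valence": 2, "expectation": 1},
    {"speaker": "operator", "start": 512.0, "end": 520.0, "valence": 1, "expectation": 0},
    {"speaker": "customer", "start": 520.0, "end": 530.0, "valence": -1, "expectation": 3},
    {"speaker": "customer", "start": 530.0, "end": 540.0, "valence": 3, "expectation": 2},
]

# Closing topic section: the last 30 seconds of the call.
closing = narrow_to_topic(sections, 510.0, 540.0)
closing_valence = customer_emotion_summary(closing, "valence")
closing_expectation = customer_emotion_summary(closing, "expectation")
```

The customer utterance that starts at 500 seconds is dropped because it begins before the closing section; an overlap-based rule would keep it instead.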

The customer extraction model storage unit 205 stores, as a customer extraction model, a model that predicts customers with a high churn risk, using as features the emotion features of the operator and the customer calculated by the emotion feature calculation unit 204 and the utterance amount features of the operator and the customer calculated by the utterance amount feature calculation unit 202.

The customer extraction model is trained in advance by a machine learning method: each item of call data is given a ground-truth label indicating whether the customer cancelled, and a model that predicts churn risk is learned from these labeled examples. Various machine learning methods have been disclosed; for example, Non-Patent Document 3 discloses the details together with executable programs. Churn prediction can then use the prediction method that corresponds to the learning method of the model.
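The specification does not name a particular learning algorithm, so the following sketch uses a small logistic regression trained by batch gradient descent purely as a stand-in; in practice a library such as the one in Non-Patent Document 3 would normally be used. The two features, the toy training values, and all function names are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, learning_rate=0.1, epochs=2000):
    """Fit logistic regression weights by batch gradient descent.

    features: list of equal-length numeric lists, one per call.
    labels: 1 if the customer cancelled, 0 otherwise.
    """
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict_churn_risk(model, x):
    """Return the predicted churn probability for one call's feature vector."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Toy data (made up): [operator/customer talk-time ratio, customer closing valence mean]
train_x = [[3.0, -2.0], [2.5, -1.0], [2.8, -1.5], [1.0, 2.0], [0.8, 1.5], [1.2, 1.0]]
train_y = [1, 1, 1, 0, 0, 0]

model = train_logistic(train_x, train_y)
risk_high = predict_churn_risk(model, [3.0, -2.0])
risk_low = predict_churn_risk(model, [1.0, 2.0])
```

Any classifier that outputs a score per call could take the place of this stand-in, provided the same feature definitions are used for training and prediction.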

The churn prediction unit 206 uses the customer extraction model to predict (calculate) the churn risk for each call.

The extracted customer output unit 207 compares the predicted churn risk of each call with a churn risk threshold specified in advance, or with a pre-specified proportion of high-churn-risk customers among the target calls, and extracts the customers with a high churn risk, that is, the customers of the calls whose churn risk is high.
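Both extraction criteria can be sketched in a few lines. The function names and the call-ID-to-risk dictionary are hypothetical, and reading the "proportion of high-churn-risk customers among the target calls" as "take the top fraction of calls ranked by risk" is an interpretation, not something the specification states explicitly.

```python
import math


def extract_by_threshold(risks, threshold):
    """Return the call IDs whose predicted churn risk is at or above the threshold."""
    return [call_id for call_id, risk in risks.items() if risk >= threshold]


def extract_top_fraction(risks, fraction):
    """Return the call IDs in the top `fraction` of calls ranked by churn risk."""
    count = math.ceil(len(risks) * fraction)
    ranked = sorted(risks, key=risks.get, reverse=True)
    return ranked[:count]


# Made-up predicted risks keyed by call ID.
predicted = {"c1": 0.9, "c2": 0.2, "c3": 0.6, "c4": 0.4}

above_threshold = extract_by_threshold(predicted, 0.5)
top_quarter = extract_top_fraction(predicted, 0.25)
```

A fixed threshold yields a variable number of customers per batch, whereas the fraction-based rule always yields the same share of the target calls, which may suit a follow-up team with fixed capacity.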

Although the present embodiment has been described using the example of predicting cancellation, it can be applied in the same way to predicting continuation.

FIG. 4 is a block diagram illustrating the hardware configuration of the information processing apparatus 200 according to the present embodiment. A CPU (Central Processing Unit) 410 is a processor for arithmetic control and implements the functional components of the information processing apparatus 200 in FIG. 2 by executing programs. The CPU 410 may have a plurality of processors and execute different programs, modules, tasks, threads, and the like in parallel. A ROM (Read Only Memory) 420 stores fixed data such as initial data and programs, as well as other programs. A network interface 430 communicates with other devices via a network. The CPU 410 is not limited to a single CPU; there may be a plurality of CPUs, and a GPU (Graphics Processing Unit) for image processing may be included. The network interface 430 preferably has a CPU independent of the CPU 410 and writes transmission/reception data to, or reads it from, an area of a RAM (Random Access Memory) 440. It is also desirable to provide a DMAC (Direct Memory Access Controller) that transfers data between the RAM 440 and a storage 450 (not shown). The CPU 410 recognizes that data has been received in or transferred to the RAM 440 and then processes the data. The CPU 410 also prepares processing results in the RAM 440 and leaves the subsequent transmission or transfer to the network interface 430 or the DMAC.

The RAM 440 is a random access memory that the CPU 410 uses as a work area for temporary storage. An area for storing the data needed to realize the present embodiment is reserved in the RAM 440. The call ID 441 is data that identifies a call between an operator and a customer and is assigned to each call. The utterance amount feature 442 is the utterance amount feature calculated for a call between the operator and the customer. The emotion value 443 is the emotion value calculated for each utterance section in each call. The emotion feature 444 is the emotion feature calculated from the emotion values of the utterance sections. The customer extraction model 445 is a model for predicting customers with a high churn risk. The churn prediction 446 is the churn risk predicted for each call using the customer extraction model.

The input/output data 447 is data input and output via an input/output interface 460. The transmission/reception data 448 is data transmitted and received via the network interface 430. The RAM 440 also has an application execution area 449 for executing various application modules.

The storage 450 stores a database, various parameters, and the following data or programs needed to realize the present embodiment. The storage 450 stores an utterance amount feature calculation module 453, an emotion recognition module 454, an emotion feature calculation module 455, and a churn prediction module 457.

The utterance amount feature calculation module 453 calculates the utterance amount features of a call between an operator and a customer. The emotion recognition module 454 calculates the emotion values of the utterance sections of a call between an operator and a customer. The emotion feature calculation module 455 calculates emotion features from the calculated emotion values of the utterance sections. The churn prediction module 457 predicts the churn risk in a call based on the utterance amount features and the recognized emotions. These modules 453 to 457 are read by the CPU 410 into the application execution area 449 of the RAM 440 and executed. The control program 456 is a program for controlling the information processing apparatus 200 as a whole.

The input/output interface 460 interfaces input/output data with input/output devices. A display unit 461 and an operation unit 462 are connected to the input/output interface 460. A storage medium 464 may also be connected to the input/output interface 460. In addition, a speaker 463 serving as an audio output unit, a microphone (not shown) serving as an audio input unit, or a GPS position determination unit may be connected. Note that programs and data relating to general-purpose functions and other realizable functions of the information processing apparatus 200 are not shown in the RAM 440 or the storage 450 in FIG. 4.

FIG. 5 is a flowchart explaining the processing procedure of the information processing apparatus 200 according to the present embodiment. The CPU 410 in FIG. 4 executes this flowchart using the RAM 440, thereby implementing the functional components of the information processing apparatus 200 in FIG. 2.

In step S501, the information processing apparatus 200 acquires a call between an operator and a customer. In step S503, the information processing apparatus 200 detects the utterance sections of the acquired call and assigns speaker information and utterance time information to each detected utterance section. In step S505, the information processing apparatus 200 calculates the utterance amount features of the operator and the customer in the call. In step S507, the information processing apparatus 200 calculates the emotion value of each utterance section. In step S509, the information processing apparatus 200 calculates emotion features from the calculated emotion values of the utterance sections. In step S513, the information processing apparatus 200 predicts the churn risk for each call using the customer extraction model. In step S515, the information processing apparatus 200 extracts the customers of calls with a high churn risk based on the predicted churn risk.

According to the present embodiment, by using call data from the time a customer inquires about or applies for a product or service, churn can be predicted even for a first-time customer for whom no behavior history can be obtained.

Also according to the present embodiment, information such as whether the operator or the customer led the conversation is calculated as utterance features by statistical processing, which captures information that cannot be turned into rules by hand. Likewise, the emotion features are not human-interpretable emotion values such as satisfaction or dissatisfaction; they are calculated from the emotion values by statistical processing, which again captures information that cannot be turned into rules by hand. This makes it possible to train and apply a model that estimates churn risk with high accuracy.

Whether a customer cancels is not necessarily determined by the exchange that takes place when the customer inquires about or applies for a product or service. Some customers, however, tend to cancel after purchasing, as a result of up-selling or cross-selling, a product they did not originally intend to buy. The present inventor found that, in such cases, the conversation exchanged during the inquiry or application is effective for predicting cancellation or continuation, and that utterance features and emotion features are effective for that purpose.

For this reason, the present embodiment compares the predicted (calculated) risk with a cancellation risk threshold specified in advance, or with a pre-specified proportion of high-cancellation-risk customers among the target calls, and extracts the customers judged to have a high cancellation risk. The present embodiment can thereby extract customers whose high cancellation risk stems mainly from how they were handled over the telephone.
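
The two extraction criteria described above can be sketched as follows. This is only an illustration: the function name `extract_high_risk`, the call identifiers, and the convention that risks lie in the range 0 to 1 are assumptions, not details given in the embodiment.

```python
import math


def extract_high_risk(risks, threshold=None, top_ratio=None):
    """Return the IDs of calls judged to carry a high cancellation risk.

    risks     -- dict mapping a call ID to its predicted risk (assumed 0..1)
    threshold -- extract every call whose risk is at least this value
    top_ratio -- extract this fraction of the target calls, highest risk first
    Exactly one of threshold / top_ratio must be given.
    """
    if (threshold is None) == (top_ratio is None):
        raise ValueError("specify exactly one of threshold or top_ratio")
    if threshold is not None:
        return [call_id for call_id, risk in risks.items() if risk >= threshold]
    ranked = sorted(risks, key=risks.get, reverse=True)
    return ranked[: math.ceil(len(ranked) * top_ratio)]


predicted = {"call-a": 0.9, "call-b": 0.2, "call-c": 0.6, "call-d": 0.4}
by_threshold = extract_high_risk(predicted, threshold=0.5)
by_ratio = extract_high_risk(predicted, top_ratio=0.25)
```

A fixed threshold keeps the criterion stable across batches of calls, whereas a ratio keeps the number of extracted customers proportional to the call volume; which is preferable depends on how follow-up work is staffed.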

The present inventor also found that whether a customer will cancel in the future depends on the emotions of the operator and the customer in specific topic sections of the call. For example, the cancellation risk was found to be low when the customer's emotion values for positive emotion and for a sense of expectation are large in the opening and closing topic sections.

For this reason, the present embodiment estimates the cancellation risk based on emotion features that focus on the emotions of the operator and the customer in specific topic sections of the call. The present embodiment can thereby extract customers with a high cancellation risk with high accuracy.
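
A minimal sketch of restricting emotion values to specific topic sections and summarizing them is shown below. The record layout (dictionaries with `speaker`, `topic`, and `emotion` keys), the topic labels, and the choice of mean, variance, and median as the summary statistics are assumptions for illustration; topic section detection itself is taken as already done.

```python
from statistics import mean, median, pvariance


def topic_emotion_features(utterances, topics=("opening", "closing"), speaker="customer"):
    """Summarize one speaker's emotion values within the given topic sections."""
    values = [
        u["emotion"]
        for u in utterances
        if u["topic"] in topics and u["speaker"] == speaker
    ]
    if not values:
        return None  # the speaker did not talk in any of the requested sections
    return {
        "mean": mean(values),
        "variance": pvariance(values),
        "median": median(values),
    }


call = [
    {"speaker": "customer", "topic": "opening", "emotion": 0.75},
    {"speaker": "operator", "topic": "opening", "emotion": 0.90},
    {"speaker": "customer", "topic": "fee explanation", "emotion": 0.10},
    {"speaker": "customer", "topic": "closing", "emotion": 0.25},
]
customer_features = topic_emotion_features(call)
```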

[Third Embodiment]
Next, an information processing apparatus according to a third embodiment of the present invention will be described with reference to FIGS. 6 to 9. FIG. 6 is a block diagram illustrating the configuration of the information processing apparatus according to the present embodiment. The information processing apparatus according to the present embodiment differs from that of the second embodiment in that it includes a phrase storage unit, a phrase recognition unit, and a phrase feature calculation unit. The other configurations and operations are the same as in the second embodiment; the same configurations and operations are therefore denoted by the same reference numerals, and their detailed description is omitted.

The information processing apparatus 600 includes a phrase storage unit 601, a phrase recognition unit 602, and a phrase feature calculation unit 603.

The phrase storage unit 601 stores phrases specified in advance. Specifically, for at least one of acceptance, non-acceptance, pushiness, and concern about price, it stores phrases that can be regarded as indicating that state. For acceptance, examples are 「わかりました」 ("I understand") and 「その通り」 ("That's right"). For non-acceptance, examples are 「ひとまず」, 「とりあえず」, and 「さしあたって」 (all roughly "for the time being"). For pushiness, examples are 「決めましょう」 ("Let's decide") and 「これでいいですね」 ("This is fine, isn't it?"). For concern about price, examples are 「価格」 ("price"), 「料金」 ("fee"), 「送料無料」 ("free shipping"), 「消費税」 ("consumption tax"), and 「もったいない」 ("what a waste").

The phrase recognition unit 602 detects and recognizes occurrences of the phrases stored in the phrase storage unit 601. It does so by performing speech recognition or word spotting.
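
The phrase storage and phrase detection described above can be sketched as follows. The sketch assumes that each utterance has already been transcribed to text, so detection reduces to substring matching; the embodiment itself relies on speech recognition or word spotting on the audio. The English category keys and the record layout are names chosen for this example.

```python
# Phrases per category, taken from the examples given for the phrase storage unit.
PHRASES = {
    "acceptance": ["わかりました", "その通り"],
    "non_acceptance": ["ひとまず", "とりあえず", "さしあたって"],
    "pushiness": ["決めましょう", "これでいいですね"],
    "price_concern": ["価格", "料金", "送料無料", "消費税", "もったいない"],
}


def spot_phrases(utterances, phrases=PHRASES):
    """Return one (category, phrase, speaker, start_time) tuple per occurrence."""
    hits = []
    for u in utterances:
        for category, words in phrases.items():
            for word in words:
                # str.count gives the number of non-overlapping occurrences.
                occurrences = u["text"].count(word)
                hits.extend([(category, word, u["speaker"], u["start"])] * occurrences)
    return hits


transcript = [
    {"speaker": "customer", "start": 12.0, "text": "とりあえず料金を教えてください"},
    {"speaker": "operator", "start": 20.0, "text": "これでいいですね"},
]
hits = spot_phrases(transcript)
```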

The phrase feature calculation unit 603 calculates phrase features for the operator and the customer. Using the recognition (detection) results from the phrase recognition unit 602, it performs the following calculations for each of at least one of acceptance, non-acceptance, pushiness, and concern about price. It calculates the phrase occurrence count, that is, the number of occurrences of the phrases stored in the phrase storage unit 601. It also calculates the normalized phrase occurrence count, obtained by normalizing the phrase occurrence count by the utterance time, and the phrase occurrence count for each pre-specified time interval. The phrase feature calculation unit 603 calculates at least one of these three quantities.
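
The three phrase features can be sketched as below for one speaker and one phrase category. The function and parameter names are hypothetical, and two details are assumptions of this sketch rather than statements of the embodiment: the utterance time is taken as the speaker's total speaking time in seconds, and a detection at or beyond the end of the call is counted in the last interval.

```python
import math


def phrase_features(hit_times, speech_seconds, call_seconds, interval=60.0):
    """Compute phrase features from detection times (seconds from call start).

    hit_times      -- times at which phrases of one category were detected
    speech_seconds -- the speaker's total utterance time, used for normalization
    call_seconds   -- total call length, used to size the per-interval histogram
    interval       -- pre-specified time interval in seconds
    """
    count = len(hit_times)
    normalized = count / speech_seconds if speech_seconds > 0 else 0.0
    n_bins = max(1, math.ceil(call_seconds / interval))
    per_interval = [0] * n_bins
    for t in hit_times:
        # Clamp so that a detection at the very end falls into the last bin.
        per_interval[min(int(t // interval), n_bins - 1)] += 1
    return {"count": count, "normalized": normalized, "per_interval": per_interval}


features = phrase_features([5.0, 70.0, 75.0], speech_seconds=30.0, call_seconds=180.0)
```

The per-interval counts preserve when in the call a phrase appeared, which the plain and normalized counts discard.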

As in the second embodiment, the customer extraction model storage unit 205 uses the operator and customer emotion features calculated by the emotion feature calculation unit 204 and the operator and customer utterance amount features calculated by the utterance amount feature calculation unit 202, and adds a further feature to generate the model. That is, the customer extraction model storage unit 205 adds the operator and customer phrase features calculated by the phrase feature calculation unit 603 to the emotion features and the utterance amount features, and generates and stores a customer extraction model that predicts customers with a high cancellation risk. Training (generation) of the customer extraction model and prediction using the trained model are the same as in the second embodiment, except that the phrase features are additionally used.
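
To make the idea of adding phrase features to the model input concrete, the following sketch concatenates the three feature groups into one vector and trains a small logistic regression by stochastic gradient descent. Logistic regression is chosen here only as a simple stand-in; this passage does not name the learning method, and the feature names, toy data, and hyperparameters are invented for illustration.

```python
import math


def build_vector(utterance_amount, emotion, phrase):
    """Concatenate the three feature groups in a fixed (sorted-key) order."""
    vector = []
    for group in (utterance_amount, emotion, phrase):
        vector.extend(group[key] for key in sorted(group))
    return vector


def predict_risk(weights, bias, x):
    """Logistic model: map a feature vector to a risk between 0 and 1."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=1000):
    """Fit the weights with plain stochastic gradient descent on the log loss."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            error = predict_risk(weights, bias, x) - target
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Toy calls: (customer share of speech, mean customer emotion, non-acceptance rate).
X = [
    build_vector({"customer_share": 0.2}, {"mean": 0.1}, {"normalized": 0.30}),
    build_vector({"customer_share": 0.3}, {"mean": 0.2}, {"normalized": 0.20}),
    build_vector({"customer_share": 0.6}, {"mean": 0.8}, {"normalized": 0.00}),
    build_vector({"customer_share": 0.5}, {"mean": 0.9}, {"normalized": 0.05}),
]
y = [1, 1, 0, 0]  # 1 = the customer later cancelled, 0 = the customer continued
weights, bias = train_logistic(X, y)
```

Calling `predict_risk(weights, bias, build_vector(...))` on the features of a new call then gives the risk score that is compared with the extraction criterion.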

FIG. 7 is a block diagram illustrating the hardware configuration of the information processing apparatus 600 according to the present embodiment. A CPU (Central Processing Unit) 410 is a processor for arithmetic control and, by executing a program, implements the functional components of the information processing apparatus 600 of FIG. 6.

The RAM 740 is a random access memory that the CPU 410 uses as a work area for temporary storage. An area for storing the data needed to implement the present embodiment is reserved in the RAM 740. The stored phrase 741 is a phrase that has been specified and stored in advance.

The storage 750 stores a database, various parameters, and the following data and programs needed to implement the present embodiment. The storage 750 stores a phrase 751, a phrase recognition module 752, and a phrase feature calculation module 753. The phrase 751 is a phrase specified in advance. The phrase recognition module 752 is a module that detects and recognizes occurrences of the stored phrases by speech recognition or word spotting. The phrase feature calculation module 753 is a module that uses the phrase recognition (detection) results to calculate at least one of the phrase occurrence count, the phrase occurrence count normalized by the utterance time, and the phrase occurrence count for each predetermined time interval. The modules 752 and 753 are read by the CPU 410 into the application execution area 449 of the RAM 740 and executed.

FIG. 8 is a flowchart illustrating the processing procedure of the information processing apparatus 600 according to the present embodiment. The CPU 410 of FIG. 7 executes this flowchart using the RAM 740, thereby implementing the functional components of the information processing apparatus 600 of FIG. 6.

In step S801, the information processing apparatus 600 detects and recognizes occurrences of the phrases specified in advance. In step S803, the information processing apparatus 600 calculates features of the recognized phrases. The phrase features are, for example, the phrase occurrence count, the phrase occurrence count normalized by the utterance time, and the phrase occurrence count for each predetermined time interval.

According to the present embodiment, phrases related to acceptance, non-acceptance, pushiness, and concern about price are detected, phrase features are calculated, and the model is generated with these phrase features added. Cancellation and continuation can therefore be predicted with high accuracy, taking the content of what was said into account.

[Other Embodiments]
While the present invention has been described with reference to the embodiments, the present invention is not limited to the above embodiments. Various changes that can be understood by those skilled in the art may be made to the configuration and details of the present invention within its scope. A system or apparatus that combines, in any way, the separate features included in the respective embodiments is also included in the scope of the present invention.

The present invention may be applied to a system composed of a plurality of devices or to a single apparatus. The present invention is also applicable when an information processing program that implements the functions of the embodiments is supplied to a system or apparatus directly or remotely. Accordingly, a program installed on a computer in order to implement the functions of the present invention on that computer, a medium storing the program, and a WWW (World Wide Web) server from which the program is downloaded are also included in the scope of the present invention. In particular, at least a non-transitory computer readable medium storing a program that causes a computer to execute the processing steps included in the above-described embodiments is included in the scope of the present invention.

[Other Expressions of Embodiments]
Some or all of the above embodiments may also be described as in the following supplementary notes, but are not limited to them.
(Supplementary Note 1)
An information processing apparatus comprising:
call storage means for acquiring and storing a call between a customer and an operator;
utterance amount feature calculation means for calculating an utterance amount feature of the stored call;
emotion recognition means for recognizing an emotion of the utterance section; and
cancellation prediction means for predicting a cancellation risk in the call based on the utterance amount feature and the recognized emotion.
(Supplementary Note 2)
The information processing apparatus according to Supplementary Note 1, wherein the emotion recognition means further calculates an emotion value of the recognized emotion,
the information processing apparatus further comprising:
emotion feature calculation means for calculating, from the emotion value, an emotion feature that is a feature of the emotion; and
customer extraction model storage means for storing a customer extraction model, trained in advance, for predicting and extracting customers with a high cancellation risk based on the utterance amount feature and the emotion feature,
wherein the cancellation prediction means predicts the cancellation risk based on the customer extraction model.
(Supplementary Note 3)
The information processing apparatus according to Supplementary Note 2, wherein the emotion feature calculation means further comprises:
topic section detection means for detecting a specific topic section included in the call;
narrowing means for outputting the emotion values of the utterance sections included in the specific topic section; and
feature calculation means for calculating, from the emotion values output by the narrowing means, an emotion feature that is a feature of the emotion.
(Supplementary Note 4)
The information processing apparatus according to any one of Supplementary Notes 1 to 3, further comprising:
phrase storage means for storing phrases specified in advance;
phrase recognition means for detecting and recognizing occurrences, in the call, of the phrases stored in the phrase storage means; and
phrase feature calculation means for calculating a phrase feature that is a feature of the phrases that occurred,
wherein the cancellation prediction means predicts the cancellation risk further based on the phrase feature.
(Supplementary Note 5)
The information processing apparatus according to any one of Supplementary Notes 1 to 4, wherein the call storage means detects utterance sections of the call, adds speaker information and utterance time information to each detected utterance section, and stores the utterances.
(Supplementary Note 6)
The information processing apparatus according to any one of Supplementary Notes 1 to 5, wherein the emotion includes at least one of valence, arousal, a sense of acceptance, and a sense of expectation.
(Supplementary Note 7)
The information processing apparatus according to any one of Supplementary Notes 3 to 5, wherein the topic section includes at least one of an opening, a closing, a product or service explanation, a fee explanation, a solicitation, a customer interview, personal information confirmation, and administrative procedures in the call, and
the emotion includes at least one of valence and a sense of expectation.
(Supplementary Note 8)
The information processing apparatus according to any one of Supplementary Notes 2 to 7, wherein the emotion feature includes at least one of a ratio, an average value, a variance, a median, and a mode of the emotion values.
(Supplementary Note 9)
The information processing apparatus according to any one of Supplementary Notes 4 to 7, wherein the phrase feature includes at least one of a phrase occurrence count that is the number of occurrences of the phrases, a normalized phrase occurrence count obtained by normalizing the phrase occurrence count by the utterance time, and a phrase occurrence count for each pre-specified time interval.
(Supplementary Note 10)
An information processing method comprising:
a call storage step of acquiring and storing a call between a customer and an operator;
an utterance amount feature calculation step of calculating an utterance amount feature of the stored call;
an emotion recognition step of recognizing an emotion of the utterance section; and
a cancellation prediction step of predicting a cancellation risk in the call based on the utterance amount feature and the recognized emotion.
(Supplementary Note 11)
An information processing program that causes a computer to execute:
a call storage step of acquiring and storing a call between a customer and an operator;
an utterance amount feature calculation step of calculating an utterance amount feature of the stored call;
an emotion recognition step of recognizing an emotion of the utterance section; and
a cancellation prediction step of predicting a cancellation risk in the call based on the utterance amount feature and the recognized emotion.

Claims

An information processing apparatus comprising:
call storage means for acquiring and storing a call between a customer and an operator;
utterance amount feature calculation means for calculating an utterance amount feature of the stored call;
emotion recognition means for recognizing an emotion of the utterance section; and
cancellation prediction means for predicting a cancellation risk in the call based on the utterance amount feature and the recognized emotion.

The information processing apparatus according to claim 1, wherein the emotion recognition means further calculates an emotion value of the recognized emotion,
the information processing apparatus further comprising:
emotion feature calculation means for calculating, from the emotion value, an emotion feature that is a feature of the emotion; and
customer extraction model storage means for storing a customer extraction model, trained in advance, for predicting and extracting customers with a high cancellation risk based on the utterance amount feature and the emotion feature,
wherein the cancellation prediction means predicts the cancellation risk based on the customer extraction model.

The information processing apparatus according to claim 2, wherein the emotion feature calculation means further comprises:
topic section detection means for detecting a specific topic section included in the call;
narrowing means for outputting the emotion values of the utterance sections included in the specific topic section; and
feature calculation means for calculating, from the emotion values output by the narrowing means, an emotion feature that is a feature of the emotion.

The information processing apparatus according to claim 1, further comprising:
phrase storage means for storing phrases specified in advance;
phrase recognition means for detecting and recognizing occurrences, in the call, of the phrases stored in the phrase storage means; and
phrase feature calculation means for calculating a phrase feature that is a feature of the phrases that occurred,
wherein the cancellation prediction means predicts the cancellation risk further based on the phrase feature.

The information processing apparatus according to claim 1, wherein the call storage means detects utterance sections of the call, adds speaker information and utterance time information to each detected utterance section, and stores the utterances.

The information processing apparatus according to claim 1, wherein the emotion includes at least one of valence, arousal, a sense of acceptance, and a sense of expectation.

The information processing apparatus according to claim 3, wherein the topic section includes at least one of an opening, a closing, a product or service explanation, a fee explanation, a solicitation, a customer interview, personal information confirmation, and administrative procedures in the call, and
the emotion includes at least one of valence and a sense of expectation.

The information processing apparatus according to claim 2, wherein the emotion feature includes at least one of a ratio, an average value, a variance, a median, and a mode of the emotion values.

An information processing method comprising:
a call storage step of acquiring and storing a call between a customer and an operator;
an utterance amount feature calculation step of calculating an utterance amount feature of the stored call;
an emotion recognition step of recognizing an emotion of the utterance section; and
a cancellation prediction step of predicting a cancellation risk in the call based on the utterance amount feature and the recognized emotion.

An information processing program that causes a computer to execute:
a call storage step of acquiring and storing a call between a customer and an operator;
an utterance amount feature calculation step of calculating an utterance amount feature of the stored call;
an emotion recognition step of recognizing an emotion of the utterance section; and
a cancellation prediction step of predicting a cancellation risk in the call based on the utterance amount feature and the recognized emotion.