JP2013117842A

JP2013117842A - Knowledge amount estimation information generating device, knowledge amount estimating device, method, and program

Info

Publication number: JP2013117842A
Application number: JP2011264747A
Authority: JP
Inventors: Chiaki Miyazaki; 千明宮崎; Ryuichiro Higashinaka; 竜一郎東中; Toshiaki Makino; 俊朗牧野; Yoshihiro Matsuo; 義博松尾
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2011-12-02
Filing date: 2011-12-02
Publication date: 2013-06-13

Abstract

PROBLEM TO BE SOLVED: To provide a knowledge amount estimation information generating device, a knowledge amount estimating device, a method, and a program capable of accurately estimating, on the basis of a conversation result, the knowledge amount of an inquirer with respect to the subject of the conversation. SOLUTION: An inquiry extraction part 2 extracts an inquiry speech string, that is, a speech string of inquiries to the other party, from text data of a speech string in a conversation between an inquirer and a respondent; a feature quantity extraction part 3 extracts, on the basis of the inquiry speech string, an inquiry feature quantity indicating a feature quantity regarding the occurrence state of the inquiry speech string in the text data; an estimation information generation part 4 uses, as learning data, the inquiry feature quantity and knowledge amount information that indicates the knowledge amount of the inquirer and is given in advance to the text data from which the inquiry feature quantity was extracted, to generate estimation information used for estimating the knowledge amount corresponding to the text data; and a knowledge amount estimation part 5 estimates the knowledge amount corresponding to text data for estimation on the basis of the inquiry feature quantity extracted from the text data for estimation and the estimation information generated by the estimation information generation part 4.

Description

The present invention relates to a knowledge amount estimation information generation device, a knowledge amount estimation device, a method, and a program, and in particular to ones suitable for accurately estimating, from text data of an utterance string in a dialogue between an inquirer and a respondent, the inquirer's amount of knowledge about the content of that dialogue.

Conventional techniques for estimating a person's amount of knowledge with a programmed computer include, for example, those described in Patent Documents 1 and 2. These documents describe techniques for estimating the background of a user who performs document searches, such as the amount and depth of the user's knowledge in each field.

In document search, a user who wants to find certain information runs a search by specifying keywords likely to be related to that information. It is known that obtaining information about the background of the searching user is important for improving search accuracy. For this reason, Patent Document 3, for example, describes a technique that estimates a user's fields of preference, that is, which fields the user is interested in, from the history of keywords the user has specified and the history of electronic documents the user has viewed.

However, although the preference estimation technique described in Patent Document 3 can estimate the fields of information that a user frequently looks up, it cannot estimate how well versed the user is in those fields.

In contrast, Patent Documents 1 and 2 use a query log, which is part of the user log of a search system, calculate the degree of expertise of the queries in the log for each field, and estimate the user's background with respect to knowledge amount. This makes it possible to estimate how well versed the user is in a given field.

Non-Patent Document 1 discloses a technique similar to that of Patent Documents 1 and 2. Focusing on the expertise of the vocabulary used, it calculates the expertise of each word in advance from factors such as the word's rarity, computes the average expertise of the words used as queries in the history of search queries entered by a user, and thereby estimates that user's knowledge amount.

Non-Patent Document 2 describes a technique for estimating a user's knowledge amount, aimed at users of a bus operation information guidance system, so that a spoken dialogue system can conduct a dialogue that is cooperative toward the user. Specifically, from the history of bus stops that a user has searched for, it obtains the proportion of each bus stop attribute (bus stops used only by local residents versus all others) and estimates the user's knowledge of the area.

Patent Document 1: JP 2011-170699 A
Patent Document 2: JP 2011-221872 A
Patent Document 3: JP 2000-148773 A

Non-Patent Document 1: Daisuke Sato, Yoshihito Yasuda, Takayuki Mochizuki, Tomoya Suzuki, Yumiko Matsuura, Ryoji Kataoka, "Field-Specific Knowledge Estimation for Search System Users", DEIM Forum 2010 (2010)
Non-Patent Document 2: Shinichi Ueno, Kazunori Komatani, Tatsuya Kawahara, Hiroshi Okuno, "Generating Adaptive Responses Using a User Model in a Bus Operation Information Guidance System", Spoken Language Information Processing 2002, vol. 65, pp. 5-10 (2002)

However, the knowledge amount estimation techniques described in Patent Documents 1 and 2 and Non-Patent Document 1 cannot estimate the knowledge amount of a speaker in a dialogue between humans.

To make knowledge amount estimation possible for speakers in a dialogue as well, speech recognition results must be usable as input to the estimation. However, when speech recognition is combined with the conventional techniques described above, which estimate a user's knowledge amount from the expertise of the vocabulary used, the estimation is vulnerable to the misrecognitions contained in the speech recognition results.

Specifically, if several highly specialized words appear by mistake in the speech recognition result of a speaker's utterances, that speaker's knowledge amount may be wrongly estimated as high.
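The effect can be illustrated with a minimal sketch of a vocabulary-based estimate of the kind attributed to Non-Patent Document 1, in which the knowledge score is the average expertise of the words used. The function name, the word list, and the expertise scores below are all hypothetical and chosen only for illustration; they are not taken from the cited documents.

```python
from typing import Dict, List


def average_expertise(words: List[str], expertise: Dict[str, float]) -> float:
    """Average per-word expertise score; unknown words count as 0.0 (assumption)."""
    if not words:
        return 0.0
    return sum(expertise.get(w, 0.0) for w in words) / len(words)


# Hypothetical expertise scores (higher = more specialized vocabulary).
expertise = {"cannot": 0.0, "connect": 0.1, "internet": 0.1, "PPPoE": 0.9}

spoken = ["cannot", "connect", "internet"]    # what the speaker actually said
recognized = ["cannot", "PPPoE", "internet"]  # one word misrecognized as a technical term

score_spoken = average_expertise(spoken, expertise)
score_recognized = average_expertise(recognized, expertise)
print(score_spoken, score_recognized)
```

In this toy case a single misrecognized word raises the averaged score several times over, which is the failure mode described above.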

Contact center call analysis and spoken dialogue systems are examples of situations in which knowledge amount estimation would be applied to such dialogue speakers. In both cases, speech recognition of the utterances is indispensable, so a knowledge amount estimation technique used in these situations must be robust against the misrecognitions contained in speech recognition results.

Non-Patent Document 2 describes a technique in which, when a user's knowledge amount is estimated in a bus operation information guidance system, speech input by telephone or the like from a user in a certain area is recognized by a spoken dialogue system, and the user's knowledge of that area is estimated.

However, in this technique too, the speech that the user inputs by telephone or the like consists of specific information such as bus stops, boarding places, and alighting places. The technique relies on attributes of this specific information, such as whether a bus stop is used only by local residents or not, and whether the stop is specified by its official name or by the name of the nearest facility. In other words, as in Patent Documents 1 and 2 and Non-Patent Document 1, it estimates the user's knowledge of the area from the specialized words the user uses about that area. Consequently, if several highly specialized words appear by mistake in the speech recognition result of the user's utterances, the user's knowledge amount may be wrongly estimated as high.

The problem to be solved by the present invention is that, when a conventional technique that estimates a user's knowledge amount from the expertise of the vocabulary the user uses is applied to a speaker in a spoken dialogue, the speaker's knowledge amount may not be estimated correctly because of misrecognitions contained in the speech recognition results.

The present invention has been made to solve the above problem, and its object is to provide a knowledge amount estimation information generation device, a knowledge amount estimation device, a method, and a program that make it possible to accurately estimate, on the basis of a dialogue result, the inquirer's knowledge amount with respect to the subject matter of the dialogue.

To achieve the above object, the knowledge amount estimation information generation device according to claim 1 comprises: question extraction means for extracting, from text data of an utterance string in a dialogue between an inquirer and a respondent, a question utterance string, which is an utterance string of questions to the other party; feature quantity extraction means for extracting, on the basis of the question utterance string extracted by the question extraction means, a question feature quantity indicating a feature quantity regarding the occurrence state of the question utterance string in the text data; and knowledge amount estimation information generation means for generating estimation information used to estimate the knowledge amount corresponding to the text data, by using as learning data the question feature quantity extracted by the feature quantity extraction means and knowledge amount information assigned in advance to the text data from which the question feature quantity was extracted, the knowledge amount information indicating the inquirer's knowledge amount as assumed from the dialogue represented by the text data.

According to the knowledge amount estimation information generation device of claim 1, the question extraction means extracts, from text data of an utterance string in a dialogue between an inquirer and a respondent, a question utterance string, which is an utterance string of questions to the other party. The feature quantity extraction means extracts, on the basis of the question utterance string extracted by the question extraction means, a question feature quantity indicating a feature quantity regarding the occurrence state of the question utterance string in the text data. The knowledge amount estimation information generation means then generates estimation information used to estimate the knowledge amount corresponding to text data, by using as learning data the question feature quantity extracted by the feature quantity extraction means together with the knowledge amount information assigned in advance to the text data from which the question feature quantity was extracted, which indicates the inquirer's knowledge amount as assumed from the dialogue represented by that text data.

That is, the present invention uses the questions exchanged between the inquirer and the respondent in the dialogue to estimate the inquirer's knowledge amount. This makes it possible to resolve the problems caused by misrecognition in the conventional techniques that estimate the inquirer's knowledge amount from the expertise of the vocabulary used.

The learning data includes, for example, text data transcribed from the dialogue between the inquirer and the respondent from which the question feature quantity is extracted, and text data created by applying speech recognition to the audio data of that dialogue. The knowledge amount information is assigned manually in advance, for example, using at least one of the text data and the audio data of the dialogue between the inquirer and the respondent.

Thus, the knowledge amount estimation information generation device according to claim 1 extracts, from text data of an utterance string in a dialogue between an inquirer and a respondent, a question utterance string, which is an utterance string of questions to the other party; extracts, on the basis of the extracted question utterance string, a question feature quantity indicating a feature quantity regarding the occurrence state of the question utterance string in the text data; and generates estimation information used to estimate the knowledge amount corresponding to the text data, by using as learning data the extracted question feature quantity and the knowledge amount information assigned in advance to the text data from which it was extracted, which indicates the inquirer's knowledge amount as assumed from the dialogue represented by that text data. Using this estimation information, the inquirer's knowledge amount regarding the subject matter of the dialogue can be estimated accurately.
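The generation of estimation information from learning data can be sketched as follows. This passage does not specify a learning algorithm, so the sketch assumes a simple nearest-centroid model in which the "estimation information" is the mean question feature vector per knowledge label. The function name, the labels "high" and "low", and the feature values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# One learning sample: (question feature vector, manually assigned knowledge label).
Sample = Tuple[Sequence[float], str]


def generate_estimation_info(samples: List[Sample]) -> Dict[str, List[float]]:
    """Return the mean feature vector (centroid) for each knowledge label.

    Assumption: centroids stand in for the estimation information; any
    supervised learner could be substituted here.
    """
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for j, value in enumerate(features):
            sums[label][j] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


# Hypothetical learning data: [inquirer question count, respondent question count].
learning_data = [
    ([1.0, 4.0], "high"),
    ([3.0, 2.0], "high"),
    ([6.0, 1.0], "low"),
]
estimation_info = generate_estimation_info(learning_data)
print(estimation_info)
```

The returned dictionary is what a separate estimation step would store and consult; the feature values here are made up and imply nothing about how question counts relate to knowledge in real calls.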

Further, to achieve the above object, the knowledge amount estimation device according to claim 2 comprises: storage means in which the estimation information generated by the knowledge amount estimation information generation device according to claim 1 is stored in advance; question extraction means for extracting, from text data of an utterance string in a dialogue between an inquirer and a respondent that is the target of knowledge amount estimation, a question utterance string, which is an utterance string of questions to the other party; feature quantity extraction means for extracting, on the basis of the question utterance string extracted by the question extraction means, a question feature quantity indicating a feature quantity regarding the occurrence state of the question utterance string in the text data; and knowledge amount estimation means for estimating the knowledge amount corresponding to the text data to be estimated, using the question feature quantity extracted by the feature quantity extraction means and the estimation information stored in the storage means.

According to the knowledge amount estimation device of claim 2, the storage means stores in advance the estimation information generated by the knowledge amount estimation information generation device according to claim 1. The question extraction means extracts, from text data of an utterance string in the dialogue between the inquirer and the respondent that is the target of knowledge amount estimation, a question utterance string, which is an utterance string of questions to the other party. The feature quantity extraction means extracts, on the basis of the question utterance string extracted by the question extraction means, a question feature quantity indicating a feature quantity regarding the occurrence state of the question utterance string in the text data. The knowledge amount estimation means then estimates the knowledge amount corresponding to the text data to be estimated, using the question feature quantity extracted by the feature quantity extraction means and the estimation information stored in advance in the storage means.

Thus, the knowledge amount estimation device according to claim 2 stores in advance the estimation information generated by the knowledge amount estimation information generation device according to claim 1; extracts, from text data of an utterance string in the dialogue between the inquirer and the respondent to be estimated, a question utterance string, which is an utterance string of questions to the other party; extracts, on the basis of the extracted question utterance string, a question feature quantity indicating a feature quantity regarding the occurrence state of the question utterance string in the text data; and estimates the knowledge amount corresponding to the text data to be estimated using the extracted question feature quantity and the stored estimation information. The inquirer's knowledge amount regarding the subject matter of the dialogue can therefore be estimated accurately.
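The estimation step can be sketched as below, assuming the estimation information is a set of per-label centroid feature vectors that were serialized to storage beforehand. The nearest-centroid rule, the JSON storage format, the labels, and all numeric values are assumptions made for illustration; this passage does not prescribe a particular model or storage format.

```python
import json
import math
from typing import Dict, List, Sequence


def estimate_knowledge(features: Sequence[float],
                       estimation_info: Dict[str, List[float]]) -> str:
    """Return the label whose stored centroid is nearest (Euclidean distance)."""
    return min(estimation_info,
               key=lambda label: math.dist(features, estimation_info[label]))


# Hypothetical estimation information, stored in advance as JSON text.
stored = json.dumps({"high": [1.0, 4.0], "low": [5.0, 1.0]})

# At estimation time: load the stored information, then classify the question
# feature vector extracted from the dialogue to be estimated.
estimation_info = json.loads(stored)
label = estimate_knowledge([2.0, 3.0], estimation_info)
print(label)
```

Keeping the stored information separate from the estimation function mirrors the split between the storage means and the knowledge amount estimation means, so the same extraction code can feed both learning and estimation.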

The question extraction means and the feature quantity extraction means of the knowledge amount estimation device according to claim 2 can be shared with the question extraction means and the feature quantity extraction means of the knowledge amount estimation information generation device according to claim 1.

In the present invention, as in the invention described in claim 3, the question feature quantity may consist of at least one of: the number of questions asked by the inquirer; the number of questions asked by the respondent; the ratio between the number of questions asked by the inquirer and the number asked by the respondent; the timing within the text data at which the inquirer's questions occurred; and the timing within the text data at which the respondent's questions occurred. Because each of these question feature quantities can be extracted relatively easily and with high accuracy, the inquirer's knowledge amount with respect to the subject matter of the dialogue can be estimated more simply and more accurately than when other feature quantities are used.
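The five question feature quantities listed above can be computed along the following lines. This is a minimal sketch under stated assumptions: each utterance already carries a flag saying whether it is a question (the output of a question extraction step), "timing" is represented as the mean relative position of a speaker's questions in the dialogue (0.0 = start, 1.0 = end), and the ratio is expressed as the inquirer's share of all questions to avoid division by zero. The class, function, and key names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Utterance:
    speaker: str       # "inquirer" or "respondent"
    text: str
    is_question: bool  # assumed to be set by a separate question extraction step


def _mean(values: List[float]) -> Optional[float]:
    """Mean of the values, or None when the speaker asked no questions."""
    return sum(values) / len(values) if values else None


def question_features(dialogue: List[Utterance]) -> Dict[str, Optional[float]]:
    n = len(dialogue)

    def position(i: int) -> float:
        # Relative position of utterance i within the dialogue.
        return i / (n - 1) if n > 1 else 0.0

    inq = [position(i) for i, u in enumerate(dialogue)
           if u.is_question and u.speaker == "inquirer"]
    res = [position(i) for i, u in enumerate(dialogue)
           if u.is_question and u.speaker == "respondent"]
    total = len(inq) + len(res)
    return {
        "inquirer_questions": len(inq),
        "respondent_questions": len(res),
        "inquirer_share": len(inq) / total if total else 0.0,
        "inquirer_timing": _mean(inq),
        "respondent_timing": _mean(res),
    }


# Made-up example dialogue.
dialogue = [
    Utterance("respondent", "How may I help you?", True),
    Utterance("inquirer", "My router does not connect.", False),
    Utterance("respondent", "Is the lamp on the front lit?", True),
    Utterance("inquirer", "Which lamp do you mean?", True),
    Utterance("respondent", "The one on the far left.", False),
]
features = question_features(dialogue)
print(features)
```

None of these values depends on which words were recognized, only on who asked a question and when, which is why such features are expected to be less sensitive to individual misrecognized words.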

Further, to achieve the above object, the knowledge amount estimation information generation method according to claim 4 comprises: a question extraction step of extracting, from text data of an utterance string in a dialogue between an inquirer and a respondent, a question utterance string, which is an utterance string of questions to the other party; a feature quantity extraction step of extracting, on the basis of the question utterance string extracted in the question extraction step, a question feature quantity indicating a feature quantity regarding the occurrence state of the question utterance string in the text data; and a knowledge amount estimation information generation step of generating estimation information used to estimate the knowledge amount corresponding to the text data, by using as learning data the question feature quantity extracted in the feature quantity extraction step and knowledge amount information assigned in advance to the text data from which the question feature quantity was extracted, the knowledge amount information indicating the inquirer's knowledge amount as assumed from the dialogue represented by the text data.

Accordingly, the knowledge amount estimation information generation method according to claim 4 operates in the same manner as the invention of claim 1, so that, as with the invention of claim 1, using the generated estimation information makes it possible to accurately estimate the inquirer's knowledge amount with respect to the subject matter of the dialogue.

Further, to achieve the above object, the knowledge amount estimation method according to claim 5 comprises: a storage step of storing in advance, in a storage device, the estimation information generated by the knowledge amount estimation information generation method according to claim 4; a question extraction step of extracting, from text data of an utterance string in a dialogue between an inquirer and a respondent that is the target of knowledge amount estimation, a question utterance string, which is an utterance string of questions to the other party; a feature quantity extraction step of extracting, on the basis of the question utterance string extracted in the question extraction step, a question feature quantity indicating a feature quantity regarding the occurrence state of the question utterance string in the text data; and a knowledge amount estimation step of estimating the knowledge amount corresponding to the text data to be estimated, using the question feature quantity extracted in the feature quantity extraction step and the estimation information stored in the storage step.

Accordingly, the knowledge amount estimation method according to claim 5 operates in the same manner as the invention of claim 2, so that, as with the invention of claim 2, the inquirer's knowledge amount with respect to the subject matter of the dialogue can be estimated accurately.

Further, to achieve the above object, the program according to claim 6 can cause a computer to operate in the same manner as the knowledge amount estimation information generation device according to claim 1 or the knowledge amount estimation device according to claim 2. Therefore, as with those devices, the inquirer's knowledge amount with respect to the subject matter of the dialogue can be estimated accurately.

By estimating the inquirer's knowledge amount regarding the subject matter of a dialogue using feature quantities on the occurrence state of questions, extracted from the questions in the dialogue between the inquirer and the respondent, the inquirer's knowledge amount can be estimated with high accuracy regardless of the vocabulary used or the field of the conversation. In addition, because these feature quantities concern the occurrence state of questions and do not depend on the vocabulary used, the estimation is less susceptible to speech recognition errors. There is also no need to prepare learning data for each vocabulary or each field of conversation.

FIG. 1 is a block diagram (partly a flow diagram) showing an example of the functional configuration of the knowledge amount estimation device according to the embodiment.
FIG. 2 is a block diagram showing an example of the computer configuration of the knowledge amount estimation device according to the embodiment.
FIG. 3 is a flowchart showing the processing flow of the learning processing program executed by the knowledge amount estimation device according to the embodiment.
FIG. 4 is a flowchart showing the processing flow of the estimation processing program executed by the knowledge amount estimation device according to the embodiment.
FIG. 5 is a schematic diagram showing an example of the question pattern information used by the knowledge amount estimation device according to the embodiment.
FIG. 6 is a diagram for explaining the operation of the knowledge amount estimation device according to the embodiment, and is a schematic diagram showing an example of text data consisting of an utterance string.
FIG. 7 is a diagram for explaining the operation of the knowledge amount estimation device according to the embodiment, and is a schematic diagram showing an example of the user's utterance content information.
FIG. 8 is a diagram for explaining the operation of the knowledge amount estimation device according to the embodiment, and is a schematic diagram showing an example of the operator's utterance content information.
FIG. 9 is a diagram for explaining the operation of the knowledge amount estimation device according to the embodiment, and is a schematic diagram showing an example of the determination criterion information.
FIG. 10 is a diagram for explaining the operation of the knowledge amount estimation device according to the embodiment, and is a schematic diagram showing an example of the evaluation result information.

Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings. In this example, the user's knowledge amount is estimated using, as input, the speech recognition result of a dialogue between an operator and a user at a contact center (hereinafter referred to as a contact center call).

The knowledge amount estimation device 10 according to the present embodiment has the functional configuration shown in FIG. 1 and the computer configuration shown in FIG. 2. The computer configuration is described first, with reference to FIG. 2.

As shown in FIG. 2, the knowledge amount estimation device 10 according to the present embodiment includes: a CPU (Central Processing Unit) 22 that controls the overall operation of the knowledge amount estimation device 10; a RAM (Random Access Memory) 24 used as a work area and the like when the CPU 22 executes various programs; a ROM (Read Only Memory) 26 in which various control programs, parameters, and the like are stored in advance; a hard disk 28 (labeled "HDD" 28 in the figure) used to store various kinds of information; a keyboard 14, a mouse 16, and a display 18; and an external interface 30 (labeled "External I/F" 30 in the figure) that handles the exchange of various kinds of information with externally connected devices. These components are interconnected by a system bus BUS. A call device 50, such as a headset that inputs and outputs the audio data exchanged in a dialogue, is connected to the external interface 30, for example.

The CPU 22 can access the RAM 24, the ROM 26, and the hard disk 28; acquire various information via the keyboard 14 and the mouse 16; display various information on the display 18; and input and output dialogue information consisting of the voice data exchanged via the call device 50 connected to the external interface 30.

The functions of the processing units of the knowledge amount estimation device 10 shown in FIG. 1 are realized when the CPU 22 reads the program that controls processing as the knowledge amount estimation device according to the present invention from the hard disk 28 into the RAM 24 and executes it.

As shown in FIG. 1, the knowledge amount estimation device 10 according to the present embodiment is provided with an utterance string extraction unit 1, a question extraction unit 2, a feature amount extraction unit 3, an estimation information generation unit 4, and a knowledge amount estimation unit 5 as functions realized by programmed computer processing.

The processing units of the knowledge amount estimation device 10 configured in this way are broadly divided into a "learning unit" and an "estimation unit".

The learning unit includes the utterance string extraction unit 1, the question extraction unit 2, the feature amount extraction unit 3, and the estimation information generation unit 4. When the dialogue information of speech recognition results (learning calls) 7, prepared in advance for learning from dialogues between an inquirer and a respondent, is input, the utterance string extraction unit 1 first extracts text data consisting of utterance strings.

Next, the question extraction unit 2 extracts question utterance strings, that is, utterances that pose a question to the other party, from the text data of the utterance strings in the dialogue between the inquirer and the respondent extracted by the utterance string extraction unit 1. The feature amount extraction unit 3 then extracts, based on the question utterance strings extracted by the question extraction unit 2, question feature amounts that characterize how the question utterance strings occur in the text data.

The estimation information generation unit 4 then uses, as learning data, the question feature amounts 4a extracted by the feature amount extraction unit 3 (labeled "feature amount 4a" in the figure) together with the knowledge amount information 4b (labeled "knowledge amount label 4b" in the figure). The knowledge amount information 4b is assigned in advance to the text data from which the question feature amounts 4a were extracted and indicates the inquirer's amount of knowledge as judged from the dialogue that the text data represents. From this learning data, the model generation function 4c (labeled "model generation 4c for associating the knowledge amount with the feature amount" in the figure) generates the estimation information used to estimate the knowledge amount corresponding to text data.

The learning unit of this example thus corresponds to the knowledge amount estimation information generation device according to the present invention. By performing this processing on a relatively large number of dialogues prepared in advance, the learning unit generates the estimation information as model information 5a (labeled "model 5a" in the figure) and stores it in a storage device such as the hard disk 28.

The estimation unit, on the other hand, includes the utterance string extraction unit 1, the question extraction unit 2, the feature amount extraction unit 3, and the knowledge amount estimation unit 5. Using the model information 5a generated by the learning unit (the knowledge amount estimation information generation device) and stored in a storage device such as the hard disk 28, it estimates the inquirer's amount of knowledge about the subject of each individual dialogue to be estimated.

That is, when the speech recognition result (estimation target) 6 of a call i, which is an individual dialogue to be estimated, is input, the utterance string extraction unit 1 first extracts text data consisting of utterance strings. A speech recognition result (estimation target) 6 whose recognition accuracy is too low need not be used.

Next, the question extraction unit 2 extracts question utterance strings, that is, utterances that pose a question to the other party, from the text data of the utterance strings in the dialogue to be estimated between the inquirer and the respondent, as extracted by the utterance string extraction unit 1. The feature amount extraction unit 3 then extracts, based on the question utterance strings extracted by the question extraction unit 2, question feature amounts that characterize how the question utterance strings occur in the text data.

The knowledge amount estimation unit 5 then uses its matching function 5c (labeled "estimate the knowledge amount by matching against the model 5c" in the figure) to estimate the knowledge amount corresponding to the text data to be estimated. It does so from the question feature amounts extracted by the feature amount extraction unit 3 and the estimation information generated by the estimation information generation unit 4 of the learning unit and stored as model information 5a in a storage device such as the hard disk 28. The knowledge amount estimated in this way is output as the knowledge amount 8 of the speaker (user) of the call i that was input as the estimation target (labeled "output: knowledge amount label 8 of call i" in the figure).

In this example, the learning unit and the estimation unit share the same utterance string extraction unit 1, question extraction unit 2, and feature amount extraction unit 3, but the learning unit and the estimation unit may instead each use their own utterance string extraction unit 1, question extraction unit 2, and feature amount extraction unit 3.

In the knowledge amount estimation device 10 according to the present embodiment, the programmed processing of the CPU 22 performs speech recognition on the voice data input and output by the call device 50 shown in FIG. 2 and converts it into text files, thereby generating the speech recognition result (estimation target) 6 of call i and the speech recognition results (learning calls) 7.

The utterance string extraction unit 1 then takes out only the portions corresponding to utterances from the speech recognition result (learning call) 7, which is a text file. This step is needed because speech recognition results usually also contain information such as timestamps and recognition confidence.

If the input to the utterance string extraction unit 1 is not a text file of speech recognition results but text data that already consists of utterance strings, the processing in the utterance string extraction unit 1 is unnecessary.

In this example, the question extraction unit 2 determines, for every utterance of the operator and the user in the dialogue, whether that utterance is a "question".

The feature amount extraction unit 3 extracts, as question feature amounts, the number of questions extracted by the question extraction unit 2, the ratio of the user's question count to the operator's question count, the timing at which questions occur within the dialogue, and the like.

The modeling of the correspondence between question feature amounts and knowledge amount in the estimation information generation unit 4, and the automatic estimation of the user-side speaker's knowledge amount in the knowledge amount estimation unit 5, use, for example, a Support Vector Machine (SVM), which is one of the classification methods of the known technique of supervised learning.

The processing performed by the utterance string extraction unit 1, the question extraction unit 2, the feature amount extraction unit 3, the estimation information generation unit 4, and the knowledge amount estimation unit 5 of the knowledge amount estimation device 10 according to the present embodiment is described in detail below with reference to FIGS. 3 and 4.

FIG. 3 shows the processing in the learning unit of the knowledge amount estimation device 10 according to the present embodiment. First, it is determined whether a predetermined amount of voice data has been input (step S101). Once it has been input, the voice data is played back (step S102), speech recognition is performed to transcribe the voice data into text data (step S103), and the utterance string extraction unit 1 extracts the text data of the utterance strings corresponding to utterances from the transcribed speech recognition result (step S104).

For example, if the speech recognition result is an XML file in which each portion corresponding to an utterance is enclosed between <TEXT> and </TEXT>, the utterances can be extracted by using regular-expression pattern matching to locate the portions enclosed by <TEXT> and </TEXT>.
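The extraction in step S104 might be sketched as follows. This is only an illustration: the sample XML, its `CALL` and `UTT` elements and attributes, and the function name `extract_utterances` are assumptions made for the example; the description specifies only that utterances sit between <TEXT> and </TEXT>.

```python
import re

# Hypothetical recognizer output. Only the <TEXT>...</TEXT> wrapping is taken
# from the description; every other tag and attribute is an assumption.
SAMPLE_XML = """<CALL>
  <UTT speaker="operator" start="0.0" conf="0.91"><TEXT>お電話ありがとうございます。</TEXT></UTT>
  <UTT speaker="user" start="2.4" conf="0.84"><TEXT>設定方法を教えてもらえますか。</TEXT></UTT>
</CALL>"""

# Non-greedy match so that each <TEXT> element is captured separately.
TEXT_PATTERN = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)


def extract_utterances(xml_text):
    """Return the utterance strings found between <TEXT> and </TEXT>."""
    return [match.strip() for match in TEXT_PATTERN.findall(xml_text)]


utterances = extract_utterances(SAMPLE_XML)
```

Timestamps and confidence values stay outside the captured group, so they are dropped, which is the purpose the description gives for this step.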

Next, for the text data consisting of the utterance strings extracted by the utterance string extraction unit 1, the question extraction unit 2 determines whether each utterance exchanged between the user and the operator in the dialogue represented by the input voice data is a "question", and extracts the text data of the utterances determined to be questions as question utterance strings (step S105).

Whether each utterance in the dialogue is a "question" may be determined automatically by a statistical method, using a known technique such as that described in Kenji Imamura, Tomoko Izumi, Genichiro Kikui, and Satoshi Sato, "Semantic Label Tagger for Predicate Functional Expressions" (述部機能表現の意味ラベルタガー), Proceedings of the 17th Annual Meeting of the Association for Natural Language Processing (2011), or by pattern matching against linguistic expressions prepared in advance that indicate a question.

This example adopts the latter, pattern matching, and extracts question utterance strings using patterns such as 「ですか。」 and 「ますか。」 (sentence-final question forms), as illustrated by the question pattern information in FIG. 5. The pattern information shown in FIG. 5 is stored in advance on the hard disk 28 or the like in FIG. 2 and is read into the RAM 24 for use.
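A minimal sketch of the pattern-matching approach in step S105 is given below. Only the two patterns 「ですか。」 and 「ますか。」 come from the description; the full pattern list of FIG. 5 is not reproduced here, and the sample dialogue and the names `is_question` and `extract_questions` are hypothetical.

```python
import re

# Only these two patterns are named in the description; FIG. 5 may list more.
QUESTION_PATTERNS = ("ですか。", "ますか。")
QUESTION_RE = re.compile("|".join(re.escape(p) for p in QUESTION_PATTERNS))


def is_question(utterance):
    """True if the utterance contains one of the question patterns."""
    return QUESTION_RE.search(utterance) is not None


def extract_questions(utterances):
    """utterances: list of (speaker, text); keep those judged to be questions."""
    return [(speaker, text) for speaker, text in utterances if is_question(text)]


# Hypothetical dialogue used only to exercise the functions.
dialogue = [
    ("operator", "どのような症状ですか。"),
    ("user", "電源が入りません。"),
    ("user", "ランプは点灯していますか。"),
]
questions = extract_questions(dialogue)
```

Keeping the speaker alongside each utterance lets the later feature extraction count the operator's and the user's questions separately.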

Next, the feature amount extraction unit 3 uses the question utterance strings of the user and of the operator, taken from the question utterance strings extracted by the question extraction unit 2, to extract three kinds of question feature amounts, described in detail later: "1. the number of questions of the user and of the operator", "2. the user-to-operator question ratio", and "3. the timing of the user's and the operator's questions" (step S106). A set of multiple feature amounts such as this is called a feature vector.

Then, when knowledge amount information (knowledge amount label 4b) is input to the estimation information generation unit 4 as the correct label for the dialogue represented by the voice data played back in step S102 (step S107), the question feature amounts (feature vector) extracted by the feature amount extraction unit 3 and the input knowledge amount information (knowledge amount label 4b) are paired one-to-one to form learning data. With this learning data as input, a model is trained using a Support Vector Machine (SVM), a known pattern classification method, to generate the model information 5a (estimation information) used to estimate the knowledge amount from the feature amounts (step S108), after which the processing ends.

A Support Vector Machine (SVM) is a binary classification method. To estimate the knowledge amount at two levels with an SVM, a model is created that discriminates whether a dialogue falls under "large knowledge amount" or not. When creating the model, for example, the learning data that falls under "large knowledge amount" is used as positive learning data and the rest as negative learning data (learning data that falls under "small knowledge amount").

To estimate the knowledge amount at three levels with an SVM, three models are created by training, one per level: model 1, which discriminates whether a dialogue falls under "large knowledge amount" or not; model 2, which discriminates whether it falls under "medium knowledge amount" or not; and model 3, which discriminates whether it falls under "small knowledge amount" or not. For example, model 1 is trained with the learning data that falls under "large knowledge amount" as positive learning data and the rest as negative learning data. Model 2 is trained with the learning data that falls under "medium knowledge amount" as positive learning data and the rest as negative learning data. Model 3 is trained with the learning data that falls under "small knowledge amount" as positive learning data and the rest as negative learning data.
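The way the three training sets are derived from one labeled data set can be sketched as below. The level names, the +1/-1 encoding, and the function name `one_vs_rest_labels` are assumptions chosen for illustration; the SVM training itself is not shown.

```python
# Hypothetical level names for the three-level case.
LEVELS = ("large", "medium", "small")


def one_vs_rest_labels(labels, positive_level):
    """Map each knowledge amount label to +1 (positive data) or -1 (negative data)."""
    return [1 if label == positive_level else -1 for label in labels]


# Hypothetical knowledge amount labels for four training calls.
labels = ["large", "small", "medium", "small"]

# One binary label list per model: model 1 (large), model 2 (medium), model 3 (small).
binary_sets = {level: one_vs_rest_labels(labels, level) for level in LEVELS}
```

Each binary label list would be paired with the same feature vectors and passed to a binary classifier trainer, giving one model per level.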

Although this example trains the model with an SVM for pattern classification, the method is not limited to this; for example, a neural network trained by backpropagation may be used. Alternatively, a table of feature amounts versus knowledge amounts may be created and the knowledge amount determined from that table. The generated model information 5a is stored in a storage device such as the hard disk 28 in FIG. 2.

FIG. 4, on the other hand, shows the processing in the estimation unit of the knowledge amount estimation device 10 according to the present embodiment. When the voice data of a dialogue (call) between a user and an operator at the contact center to be estimated (hereinafter, the "estimation target dialogue") is input (step S201), speech recognition is performed on the voice data (step S203), and then the processing of the utterance string extraction unit 1, the question extraction unit 2, and the feature amount extraction unit 3 is performed (steps S204, S205, and S206) in the same way as steps S104 to S106 of FIG. 3. The knowledge amount estimation unit 5 then uses an SVM, as in the estimation information generation unit 4, to match the question feature amounts (feature vector) obtained in step S206 against the model information 5a generated by the learning unit's processing (FIG. 3). It identifies which knowledge amount the estimation target dialogue falls under and thereby estimates the knowledge amount of the user (speaker) in that dialogue (step S209).
As with the estimation information generation unit 4, the knowledge amount estimation unit 5 is assumed to estimate the knowledge amount with an SVM, but any other estimation method, such as estimation with a neural network, may be used. When the knowledge amount is estimated with an SVM using, for example, models 1 to 3 above, discrimination is performed with each of models 1 to 3, and the knowledge amount corresponding to the positive data of the best-matching model is taken as the speaker's knowledge amount.
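One way to read "the best-matching model" is to take the level whose binary model returns the largest decision value. That reading is an assumption, as are the score values and the name `pick_knowledge_level` in the sketch below; the description does not define how the match is measured.

```python
def pick_knowledge_level(scores):
    """scores: dict mapping a level name to that level's model decision value.

    Returns the level whose model gives the highest value (assumed to mean
    the best match).
    """
    return max(scores, key=scores.get)


# Hypothetical decision values from models 1 to 3 for one estimation target call.
example_scores = {"large": -0.4, "medium": 0.1, "small": 0.7}
level = pick_knowledge_level(example_scores)
```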

Next, the feature vector consisting of the three kinds of question feature amounts extracted by the feature amount extraction unit 3, namely "1. the number of questions of the user and of the operator", "2. the user-to-operator question ratio", and "3. the timing of the user's and the operator's questions", is described in detail.

First, "1. the number of questions of the user and of the operator" is described.

Of the roles that questions play in a contact center dialogue (call), the most basic is to ask the other party to provide information. Specifically, a question from the user asks the operator to supply knowledge that the user lacks, and a question from the operator asks the user to supply information about the user's request and situation.

In other words, the user's question count reflects the amount of knowledge the user lacks, and the operator's question count reflects the effort the operator spent grasping the user's request and situation. Because grasping the request and situation is expected to take more effort in a dialogue with a user who has little knowledge, the user's amount of knowledge can be estimated from the question counts of the user and the operator.

In this example, the raw values of the user's question count and the operator's question count are used as feature amounts. For example, in the dialogue illustrated by the text data 500 consisting of utterance strings in FIG. 6, the utterances containing underlined portions are "questions", and the operator and the user each ask two questions.

Accordingly, the values used as feature amounts here are "2" for both the operator's question count and the user's question count. The text data 500 consisting of the utterance strings shown in FIG. 6 is stored in a storage device such as the RAM 24 or the hard disk 28.
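Counting questions per speaker can be sketched as follows. The dialogue below is a hypothetical stand-in rather than the content of FIG. 6, arranged so that each speaker asks two questions; the speaker names and the function name `count_questions` are assumptions.

```python
from collections import Counter


def count_questions(utterances):
    """utterances: list of (speaker, is_question) in dialogue order.

    Returns (operator_question_count, user_question_count).
    """
    counts = Counter(speaker for speaker, is_q in utterances if is_q)
    # Counter returns 0 for a speaker who asked no questions.
    return counts["operator"], counts["user"]


# Hypothetical dialogue: True marks an utterance already judged to be a question.
dialogue = [
    ("operator", False), ("user", True), ("operator", True), ("user", False),
    ("operator", True), ("user", True), ("operator", False),
]
operator_count, user_count = count_questions(dialogue)
```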

Next, "2. the user-to-operator question ratio" is described.

Users with little knowledge often cannot correctly interpret the questions the operator asks, so they frequently "answer a question with a question", responding to the operator's question with a further question. Therefore, if the ratio of the user's question count to the operator's question count is high, the user's amount of knowledge can be estimated to be small.

Here, the user-to-operator question ratio may be any value that can express the ratio, whether it is the question count of one speaker simply divided by that of the other or a normalized value such as a percentage. In this example, the value calculated by the following formula is used as the feature amount.

User-to-operator question ratio = user's question count ÷ operator's question count
For the dialogue shown in FIG. 6, the question ratio is 2 ÷ 2 = 1, so the value (ratio) used as the feature amount is "1".
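The ratio can be computed as in the sketch below. The description does not say what happens when the operator asks no questions, so the `default` fallback is purely an assumption, as is the function name `question_ratio`.

```python
def question_ratio(user_questions, operator_questions, default=0.0):
    """User-to-operator question ratio: user count divided by operator count.

    The fallback for an operator count of zero is an assumption; the
    description does not cover that case.
    """
    if operator_questions == 0:
        return default
    return user_questions / operator_questions


# The worked example from the text: two questions each.
ratio = question_ratio(2, 2)
```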

Next, "3. the timing of the user's and the operator's questions" is described.

The point in the time series of the dialogue at which the user asks the most questions differs according to the user's amount of knowledge.

Specifically, a user with little knowledge tends to ask many questions at the start of the dialogue and then to spend the rest of the dialogue listening to the operator's explanation. That is, the questions of a user with little knowledge tend to be concentrated at the beginning of the dialogue.

In contrast, a user with a great deal of knowledge often opens the dialogue by explaining his or her own request and situation and, moreover, asks relatively few questions over the whole dialogue, so the questions are seldom concentrated at any particular position in the dialogue.

Therefore, when the peak of the user's question count appears at the beginning of the dialogue, the user's amount of knowledge can be estimated to be small.

As for the timing of the operator's questions, when the peak of the operator's question count appears at the beginning of the dialogue, for example, this is thought to reflect difficulty in grasping the user's request and situation, and the user who is the operator's dialogue partner can be estimated to have little knowledge.

Conversely, when the peak of the operator's question count appears in the second half of the dialogue, this can be taken to mean that the operator could not readily answer the user's inquiry, which requires advanced knowledge, and that it is the operator who is frequently responding with questions.

Therefore, when the peak of the operator's question count appears in the second half of the dialogue, the user who is the dialogue partner can be estimated to have a large amount of knowledge.

As these examples show, the user's amount of knowledge can be estimated by determining when in the dialogue the questions of the user and of the operator occur.

Furthermore, in this example, the time series of the dialogue is measured in units of utterances, and the timing at which questions are uttered is extracted as feature amounts by two methods. The first takes the number of questions in the opening 10% of the dialogue as a feature amount. The second divides the dialogue into five equal segments and takes as the feature amount which segment contains the most questions.

Specifically, for the second method, the feature value corresponding to the segment in which the most questions appear is set to "1", and the values corresponding to the remaining segments are set to "0".
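The two timing features might be computed as in the following sketch. Several details are assumptions not fixed by the description: the input is a list of flags over the dialogue's utterances marking where one speaker asked a question, the opening 10% is rounded up to a whole number of utterances, ties between segments go to the earliest segment, and a dialogue with no questions yields all zeros.

```python
def opening_question_count(flags):
    """Count questions in the opening 10% of the utterances (rounded up)."""
    opening_len = (len(flags) + 9) // 10  # integer ceiling of len / 10
    return sum(1 for is_q in flags[:opening_len] if is_q)


def peak_segment_one_hot(flags, n_segments=5):
    """One-hot vector marking the segment that contains the most questions."""
    counts = [0] * n_segments
    n = len(flags)
    for i, is_q in enumerate(flags):
        if is_q:
            counts[i * n_segments // n] += 1  # segment index of utterance i
    one_hot = [0] * n_segments
    if any(counts):
        one_hot[counts.index(max(counts))] = 1  # earliest segment wins ties
    return one_hot


# Hypothetical 30-utterance dialogue with questions at positions 0, 1, 2 and 10.
flags = [i in (0, 1, 2, 10) for i in range(30)]
opening = opening_question_count(flags)
peak = peak_segment_one_hot(flags)
```

With 30 utterances the opening 10% covers the first three, all of which are questions here, and the first of the five segments holds the most questions.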

Taking the user's utterance content information shown in FIG. 7 as an example, the number of questions in the opening 10% of the dialogue is three, so the value used as the feature amount is "3". The user's utterance content information shown in FIG. 7 is stored in a storage device such as the RAM 24 or the hard disk 28.

Taking the operator's utterance content information shown in FIG. 8 as an example, the most questions are uttered in dialogue segment (1), so the feature values corresponding to dialogue segments (1) to (5) are, in order, "1", "0", "0", "0", "0". The operator's utterance content information shown in FIG. 8 is stored in a storage device such as the RAM 24 or the hard disk 28.

Besides dividing the dialogue into equal parts by number of utterances as in this example, the timing of utterances in a dialogue may be captured by dividing the dialogue into equal parts by time, or by any measure that can express the time series of utterances, such as how many seconds after the start of the dialogue an utterance was made, which utterance it was in sequence, or at what percentage of the way through the dialogue it was made.

The question feature amounts (feature vector) used in this example are therefore: "1: operator's question count", "2: user's question count", "3: user-to-operator question ratio", "4: user's question count in the opening 10% of the dialogue", "5 to 9: timing of the operator's questions", and "10 to 14: timing of the user's questions".
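Assembling the 14 values in the order listed above can be sketched as follows. The function name `build_feature_vector` and the example values are hypothetical; the element order follows the list in the text.

```python
def build_feature_vector(op_count, user_count, ratio, user_opening,
                         op_peak, user_peak):
    """Concatenate the question features into one 14-element vector.

    Order: 1 operator count, 2 user count, 3 ratio, 4 user's opening count,
    5-9 operator timing (one-hot), 10-14 user timing (one-hot).
    """
    vector = [op_count, user_count, ratio, user_opening, *op_peak, *user_peak]
    if len(vector) != 14:
        raise ValueError("expected 14 features, got %d" % len(vector))
    return vector


# Hypothetical values for one call.
features = build_feature_vector(2, 2, 1.0, 3, [1, 0, 0, 0, 0], [1, 0, 0, 0, 0])
```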

The values in this feature vector also vary depending on whether the operator is an experienced veteran or an inexperienced newcomer. It is therefore also effective to apply weighting in advance according to the operator's level of experience.

For example, an experienced operator can put questions to the user precisely, so the user asks fewer questions in response to the operator's questions. In such a case, the values of the question feature amounts "1: operator's question count" and "2: user's question count" are both smaller than they would otherwise be, so weighting can be applied to increase them; likewise, the value of "3: user-to-operator question ratio" is smaller, so weighting can be applied to increase it. The remainder of this example is described on the assumption that no such weighting is applied.

Next, an example of an evaluation experiment on the knowledge amount estimation device 10 according to the present embodiment is described. In this experiment, the knowledge amount was estimated for each person in the user role in calls at multiple contact centers, and the estimates were evaluated.

The input to the knowledge amount estimation device 10 according to the present embodiment is the speech recognition results for both the operator and the user in each contact center call. The various question feature amounts described above are extracted from the text data consisting of the utterance strings obtained as the speech recognition results.

The correct labels (knowledge amount information) used to evaluate the knowledge amount estimates for each call were assigned by human subjective evaluation. The user in each call was judged to have either a "small knowledge amount" or a "large knowledge amount" according to the criteria shown as criterion information in FIG. 9, and knowledge amount information (a knowledge amount label) was assigned to each call. The criterion information of FIG. 9 is stored in advance on the hard disk 28, and is read from the hard disk 28 and stored in the RAM 24 for use.

In this evaluation experiment, pairs consisting of the operator and user speech recognition results and a knowledge amount label were prepared for 115 calls. Of these, 54 calls were labeled "small knowledge amount" and 61 calls were labeled "large knowledge amount".

Training of the model that associates the knowledge amount label assigned to each call with the question feature amounts extracted from that call, and evaluation of the knowledge amount estimation, were carried out by 5-fold cross-validation. The data was divided into five parts, four of which were used as training data and the remaining one as evaluation data, giving five such data sets.
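The 5-fold partitioning described above can be sketched as follows. The specification does not state how the calls were assigned to the five parts, so the seeded shuffle and round-robin assignment here are assumptions, as is the function name `five_fold_splits`.

```python
import random
from typing import List, Tuple


def five_fold_splits(n_items: int, n_folds: int = 5,
                     seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, eval_indices) pairs, one per fold."""
    indices = list(range(n_items))
    # Assumption: calls are shuffled before being dealt into folds.
    random.Random(seed).shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    splits = []
    for k in range(n_folds):
        eval_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train_idx, eval_idx))
    return splits


# With the 115 calls of the experiment, each fold evaluates on 23 calls
# and trains on the remaining 92.
for train_idx, eval_idx in five_fold_splits(115):
    print(len(train_idx), len(eval_idx))
```

Each call appears in exactly one evaluation set, so every call receives exactly one estimate across the five runs.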

In this example, six types of question feature amounts were used to estimate the knowledge amount: (i) the number of questions by the operator, (ii) the number of questions by the user, (iii) the ratio of user questions to operator questions, (iv) the number of questions by the user within the first 10% of the dialogue, (v) the timing at which operator questions appear most frequently, and (vi) the timing at which user questions appear most frequently. A Support Vector Machine (SVM) was used both to learn the association between the question feature amounts extracted from each call and the knowledge amount labels, and to estimate the knowledge amount.

As the evaluation metric, the accuracy evaluation in this example uses precision, which is a measure of correctness. This metric indicates how many of the estimated results match the correct knowledge amount labels.

There is a reason for emphasizing precision. The most distinctive feature of the contact center calls targeted for knowledge amount estimation in this example is the enormous volume of data. Given that a major purpose of using the present invention is to accurately narrow down, from this enormous volume of call data, the calls that should be analyzed, precision is of particular importance.

In this evaluation example, the task was treated as the problem of estimating "small knowledge amount", and precision was calculated using the following formula.

Precision = (number of calls correctly estimated as "small knowledge amount") ÷ (total number of calls estimated as "small knowledge amount")

In this experiment, in addition to precision, recall, which is a measure of coverage, and the F-value, which is the harmonic mean of recall and precision, were also evaluated for reference. Recall was calculated using the following formula.

Recall = (number of calls correctly estimated as "small knowledge amount") ÷ (number of "small knowledge amount" calls in all the data)

The F-value was calculated using the following formula.

F-value = (2 × precision × recall) ÷ (precision + recall)
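The three formulas above translate directly into code. The sketch below is illustrative only: the function names and the counts in the example (30 correct estimates, 10 incorrect estimates, 20 missed calls) are hypothetical and are not results from the experiment.

```python
def precision(correct_small: int, estimated_small: int) -> float:
    """Correctly estimated "small" calls / all calls estimated as "small"."""
    return correct_small / estimated_small if estimated_small else 0.0


def recall(correct_small: int, actual_small: int) -> float:
    """Correctly estimated "small" calls / all calls actually labeled "small"."""
    return correct_small / actual_small if actual_small else 0.0


def f_value(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts: 40 calls estimated as "small", 30 of them correctly,
# out of 50 calls whose correct label is "small".
p = precision(30, 40)
r = recall(30, 50)
print(p, r, f_value(p, r))
```

Returning 0.0 when a denominator is zero is a convention chosen for this sketch; the specification does not address that case.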

In the 5-fold cross-validation, five pairs of training data and evaluation data are created from the data for all 115 calls, and training and accuracy evaluation are performed on each pair. In this experiment, evaluation was based on the micro-average of the precision calculated from the estimation results for the five data sets.
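Micro-averaging is commonly computed by pooling the raw counts from all folds before dividing, rather than averaging the per-fold precision values. The sketch below assumes that definition; the function name and the per-fold counts are hypothetical and are not taken from the experiment.

```python
from typing import List, Tuple


def micro_precision(fold_counts: List[Tuple[int, int]]) -> float:
    """Pool (correct, incorrect) "small" estimates over folds, then divide."""
    correct = sum(c for c, _ in fold_counts)
    incorrect = sum(w for _, w in fold_counts)
    total = correct + incorrect
    return correct / total if total else 0.0


# Hypothetical per-fold counts of (correct, incorrect) "small" estimates.
folds = [(5, 2), (6, 3), (4, 3), (5, 2), (3, 4)]
print(round(micro_precision(folds), 2))
```

Because the counts are pooled, a fold that produces many "small" estimates influences the result more than a fold that produces few.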

In the evaluation of this example, 54 of the 115 prepared calls are labeled "small knowledge amount". The baseline is defined as the accuracy obtained when the "small knowledge amount" label is assigned to all 115 prepared calls. In this case, the precision, recall, and F-value calculated by the method described above are 0.47, 1.00, and 0.64, respectively, as shown in the evaluation result information of FIG. 10.
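The baseline figures can be checked by hand. Assigning "small knowledge amount" to all 115 calls gives 54 correct estimates out of 115 and misses none of the 54 calls whose correct label is "small knowledge amount":

```latex
\begin{align*}
\text{Precision} &= \frac{54}{115} \approx 0.47 \\
\text{Recall}    &= \frac{54}{54} = 1.00 \\
F                &= \frac{2 \times \frac{54}{115} \times 1}{\frac{54}{115} + 1}
                  = \frac{108}{169} \approx 0.64
\end{align*}
```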

In contrast, when performance was evaluated using only (i) the number of questions by the operator and (ii) the number of questions by the user as question feature amounts, the precision, recall, and F-value were 0.64, 0.43, and 0.51, respectively, as shown under "count" in FIG. 10. Precision improved by 0.17 over the baseline.

When (iii) the ratio of user questions to operator questions was added to "count" as a question feature amount, the precision, recall, and F-value were 0.68, 0.43, and 0.52, respectively, as shown under "count + ratio" in FIG. 10. Precision improved by 0.21 over the baseline.

Furthermore, when (iv) the number of questions by the user within the first 10% of the dialogue, (v) the timing at which operator questions appear most frequently, and (vi) the timing at which user questions appear most frequently were added to "count" and "ratio" as question feature amounts, the precision, recall, and F-value were 0.70, 0.43, and 0.53, respectively, as shown under "count + ratio + timing" in FIG. 10. Precision improved by 0.23 over the baseline.

Thus, as shown in FIG. 10, even when only the "count" features are used, precision improves by 0.17 over the baseline, and with "count + ratio + timing", which uses all the question feature amounts proposed in this example, precision improves by 0.23 over the baseline. In terms of the F-value, the proposed technique is 0.11 to 0.13 lower than the baseline, but this example is useful in that it improves precision, which is the important metric in call analysis.

To examine whether the proportions of correct and incorrect estimates differ depending on the question feature amounts used, McNemar's test was performed. The test showed a significant difference (p < 0.05) between the estimates obtained with "count + ratio" or "count + ratio + timing" and the baseline estimates.
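McNemar's test compares two sets of estimates made on the same calls by looking only at the calls on which they disagree. The sketch below uses the exact (binomial) two-sided form of the test; the specification does not say which variant was used, so that choice, the function name, and the discordant counts in the example are all assumptions.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: calls that method A estimated correctly and method B incorrectly.
    c: calls that method A estimated incorrectly and method B correctly.
    """
    n = b + c
    if n == 0:
        return 1.0
    # Under the null hypothesis each discordant call is equally likely
    # to fall in b or c, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, not taken from the experiment.
p_value = mcnemar_exact(5, 15)
print(p_value < 0.05)
```

Calls on which both methods are right, or both are wrong, do not enter the statistic at all.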

This evaluation example shows three cases: one using only the question counts of the inquirer and the respondent; one combining these question counts with the ratio of the inquirer's question count to the respondent's question count; and one combining the question counts and the ratio with the timing at which the inquirer's and respondent's questions occur in the text data. However, learning, estimation, and evaluation may be performed using any one of these features or any combination of them. For example, only the ratio of the inquirer's question count to the respondent's question count may be used, only the timing at which the inquirer's and respondent's questions occur in the text data may be used, or the question counts may be combined with the occurrence timing.

As described above, the knowledge amount estimation device 10 according to the present embodiment estimates the knowledge amount of a speaker (user) by using the "questions" in a dialogue between an operator and a user at a contact center or the like.

Specifically, the device includes, as functions implemented by programmed computer processing, at least a question extraction unit 2, a feature amount extraction unit 3, an estimation information generation unit 4, and a knowledge amount estimation unit 5. The question extraction unit 2 extracts, from the text data of the utterance strings in a dialogue between an inquirer and a respondent, question utterance strings, which are utterance strings that pose a question to the other party. The feature amount extraction unit 3 extracts, based on the question utterance strings extracted by the question extraction unit 2, question feature amounts indicating features of the occurrence state of those question utterance strings in the text data. The estimation information generation unit 4 uses as learning data the question feature amounts extracted by the feature amount extraction unit 3 together with knowledge amount information, which is assigned in advance to the text data from which the question feature amounts were extracted and which indicates the inquirer's knowledge amount as inferred from the dialogue represented by that text data. From this learning data it generates estimation information used to estimate the knowledge amount corresponding to text data.

The estimation information generated in this way by the estimation information generation unit 4 is stored in a storage device such as the hard disk 28.

In addition, the question extraction unit 2 extracts, from the text data of the utterance strings in the dialogue between the inquirer and the respondent that is the target of knowledge amount estimation, question utterance strings, which are utterance strings that pose a question to the other party. The feature amount extraction unit 3 extracts, based on the question utterance strings extracted by the question extraction unit 2, question feature amounts indicating features of the occurrence state of those question utterance strings in the text data. The knowledge amount estimation unit 5 estimates the knowledge amount corresponding to the text data to be estimated, using the question feature amounts extracted by the feature amount extraction unit 3 and the estimation information stored in advance in the storage device.

Here, the question feature amounts consist of the inquirer's question count, the respondent's question count, the ratio of the inquirer's question count to the respondent's question count, the timing at which the inquirer's questions occur in the text data, the timing at which the respondent's questions occur in the text data, and the like.

In this way, this example estimates the speaker's knowledge amount from the dialogue by using the "questions" in the dialogue, and can thereby avoid the problem that arises in the prior art when speech recognition results are used as input for knowledge amount estimation.

That is, in the conventional approach based on the degree of specialization of vocabulary, when speech recognition results are used as input for knowledge amount estimation and several highly specialized words appear by mistake in the recognition result for a speaker's utterances, that speaker's knowledge amount may be wrongly estimated as high. In this example, the speaker's knowledge amount is estimated from the dialogue by using the "questions" in the dialogue, so this problem can be avoided.

Accordingly, by using the knowledge amount estimation device 10 according to the present embodiment, the knowledge amount of the speaker (inquirer) regarding the subject of the dialogue can be estimated accurately on the basis of the dialogue result.

The present invention is not limited to the examples described above, and various modifications and applications are possible without departing from the gist of the invention.

For example, although this example describes a dialogue (call) between a user and an operator at a contact center as the dialogue to be processed, the invention is not limited to this. It is also applicable to systems that estimate a user's knowledge amount in other dialogues that use speech recognition, such as a dialogue with a responder answering a user's inquiry in the bus operation information guidance system of Non-Patent Document 2 mentioned above.

The dialogue may be between a system (respondent) and a human (inquirer) or between a human (respondent) and a human (inquirer); any exchange of utterances between two parties is acceptable.

The text data of the dialogue may be any dialogue that has been converted to text, such as a speech recognition result of the dialogue, a transcription, or a text chat.

In the computer configuration shown in FIG. 2, a program for realizing the functions of the processing units according to the present invention may be recorded on a computer-readable recording medium, and the processing of each unit may be executed by having a computer system read and execute the program recorded on that medium. Alternatively, the program may be loaded using a communication function not shown in the figure.

A computer-readable recording medium here refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or to a storage device such as a hard disk built into a computer system.

The program may also be transmitted from a computer system that stores it in a storage device or the like to another computer system via a transmission medium, or by a transmission wave in a transmission medium. Here, the "transmission medium" that transmits the program refers to a medium having the function of transmitting information, such as a network (communication network) like the Internet or a communication line such as a telephone line.

The program may realize only part of the functions described above. It may also be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.

Although an embodiment of the present invention has been described in detail above with reference to the drawings, the specific configuration is not limited to this embodiment and includes designs and the like within a range that does not depart from the gist of the invention.

DESCRIPTION OF SYMBOLS
1 Utterance string extraction unit
2 Question extraction unit
3 Feature amount extraction unit
4 Estimation information generation unit
4a Question feature amount
4b Knowledge amount information (knowledge amount label)
4c Model generation function
5 Knowledge amount estimation unit
5a Model information
5b Feature amount of call i
5c Matching function
6 Speech recognition result of call i (estimation target)
7 Speech recognition results (training calls)
8 Knowledge amount of the speaker (user) of call i
10 Knowledge amount estimation device
14 Keyboard
16 Mouse
18 Display
22 CPU
24 RAM
26 ROM
28 Hard disk
30 External I/F (interface)
50 Call device

Claims

A question extraction means for extracting a question utterance string, which is an utterance string of a question to the other party, from text data of an utterance string in the dialogue between the inquirer and the respondent;
Based on the question utterance string extracted by the question extraction means, feature quantity extraction means for extracting a question feature quantity indicating a feature quantity related to the occurrence state of the question utterance string in the text data;
The question feature amount extracted by the feature amount extraction means and the knowledge amount of the inquirer assumed from the dialogue indicated by the text data with respect to the text data targeted for extraction of the question feature amount Knowledge amount estimation information generating means for generating estimation information used for estimation of the knowledge amount corresponding to the text data by using knowledge amount information given in advance as learning data;
A knowledge amount estimation information generating apparatus comprising:

Storage means for storing in advance the estimation information generated by the knowledge amount estimation information generation device according to claim 1;
A question extraction means for extracting a question utterance string, which is an utterance string of a question to the other party, from text data of an utterance string in a dialogue between an inquirer and a respondent who is an estimation target of a knowledge amount;
Based on the question utterance string extracted by the question extraction means, feature quantity extraction means for extracting a question feature quantity indicating a feature quantity related to the occurrence state of the question utterance string in the text data;
Knowledge amount estimation means for estimating the knowledge amount corresponding to the text data to be estimated using the question feature amount extracted by the feature amount extraction means and the estimation information stored in the storage means;
A knowledge amount estimation device.

The knowledge amount estimation information generation device according to claim 1, wherein the question feature amount comprises at least one of: the number of questions by the inquirer, the number of questions by the respondent, the ratio of the number of questions by the inquirer to the number of questions by the respondent, the time of occurrence of the inquirer's questions in the text data, and the time of occurrence of the respondent's questions in the text data.

A question extraction step of extracting a question utterance sequence that is a utterance sequence of a question to the other party from text data of an utterance sequence in the dialogue between the inquirer and the respondent;
Based on the question utterance sequence extracted in the question extraction step, a feature amount extraction step for extracting a question feature amount indicating a feature amount relating to the occurrence state of the question utterance sequence in the text data;
The question feature quantity extracted in the feature quantity extraction step and the knowledge quantity of the inquirer assumed from the dialogue indicated by the text data with respect to the text data targeted for extraction of the question feature quantity A knowledge amount estimation information generation step for generating estimation information used for estimation of the knowledge amount corresponding to the text data by using knowledge amount information given in advance as learning data;
A knowledge amount estimation information generation method comprising:

A storage step for storing the estimation information generated by the knowledge amount estimation information generation method according to claim 4 in a storage device;
A question extraction step for extracting a question utterance sequence, which is a utterance sequence of a question to the other party, from text data of an utterance sequence in the dialogue between the inquirer and the respondent as an estimation target of the knowledge amount;
Based on the question utterance sequence extracted in the question extraction step, a feature amount extraction step for extracting a question feature amount indicating a feature amount relating to the occurrence state of the question utterance sequence in the text data;
A knowledge amount estimation step for estimating the knowledge amount corresponding to the text data to be estimated using the question feature amount extracted in the feature amount extraction step and the estimation information stored in the storage step;
A knowledge amount estimation method comprising:

A program for causing a computer to function as the knowledge amount estimation information generation device according to claim 1 or the knowledge amount estimation device according to claim 2.