JP2011113426A - Dictionary generation device, dictionary generating program, and dictionary generation method

Info

Publication number: JP2011113426A
Application number: JP2009271054A
Authority: JP
Inventors: Isao Nanba; 功難波
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2009-11-30
Filing date: 2009-11-30
Publication date: 2011-06-09

Abstract

PROBLEM TO BE SOLVED: To provide a dictionary generation device for generating dictionary information.

SOLUTION: A voice recognition part 11 generates text by voice recognition of voice data. A morphological analysis part 12 extracts, in order of utterance, the words appearing in question solution data from the word group obtained by morphological analysis of the text, to obtain a response history word group. A reference candidate generation part 13 selects, from a question collection storage part 4, each piece of question solution data whose word appearance order matches to a high degree that of the response history word group of a response whose search failed according to search history information, and removes words not included in the selected question solution data from that response history word group to obtain a reference candidate. A successful response detection part 14 removes words not included in the browsed question solution data from the response history word group of a response whose search succeeded according to the search history information, to obtain a success instance word group. A reference-candidate collation part 15 specifies the pair of reference candidate and success instance word group that maximizes word similarity. A dictionary information generation part 16 outputs the search keyword group of the specified pair as dictionary information.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a dictionary creation device, a dictionary creation program, and a dictionary creation method, and more particularly to a technique for creating a dictionary used in information retrieval processing.

In call centers where operators provide solution information in response to customer inquiries, a collection of frequently asked questions (FAQ) is prepared in order to shorten the time needed to respond. While handling a call, the operator enters search keywords picked up from the customer's utterances into a question collection database (FAQ database) that stores frequently asked questions and information on their solutions, runs a search, and consults the solution information presented as the search result in order to give the customer information such as resolution procedures and countermeasures.

Appropriate search keywords must therefore be used so that the search retrieves (hits) the solution information for the inquiry.

For example, in a call center that runs a help desk answering customer inquiries, the operator obtains words related to the inquiry through dialogue with the customer, enters them as search keywords, and searches a question collection database in which solution information for inquiries is stored. The operator then refers to the solution information retrieved by the search and answers with the resolution procedure or countermeasure for the inquiry.

So that information can be retrieved efficiently even when the conversation with the customer goes poorly, there is a known process that performs speech recognition on the voice data of the conversation between the operator and the customer to extract search keywords, and then searches documents with the extracted keywords. Even when a speech recognition error occurs, this process automatically updates the combination of search keywords, for example by changing the combination or by converting some of the keywords into synonyms.

Japanese Unexamined Patent Application Publication No. 2007-304793 (JP 2007-304793 A)

When searching for solution information based on the content of a dialogue with a customer, a skilled operator can infer from that content words and phrases (hereinafter simply "words") of the kind contained in the question collection database, that is, words suited to searching, and use them as search keywords. A skilled operator is therefore more likely to retrieve the solution information for the inquiry from the question collection database and can answer the customer's question in a short time.

A novice operator, by contrast, takes words from the dialogue with the customer as they are and enters them as search keywords to search the question collection database. As a result, appropriate solution information is not retrieved and the search must be repeated, which lengthens the time to an answer. There is also the problem that the operator may answer by referring to inappropriate solution information.

Specifically, the following situation arises. Suppose the question collection database stores a question about the "warning sound" that notifies the user of an abnormality, together with information such as its resolution procedure. Suppose further that a customer calling about an unfamiliar sound coming from a computer says, "The machine is beeping."

Drawing on experience, a skilled operator guesses that "beeping" means "warning sound" and searches with the word "warning sound", as it is written in the question collection database, as the search keyword. The skilled operator can therefore immediately retrieve the solution information associated with the search keyword "warning sound" (information such as the resolution procedure when a warning sound occurs).

A novice operator, by contrast, takes "beeping" from the customer's utterance and searches with it unchanged. If the word "beeping" does not exist in the question collection database, the solution information on "warning sound" cannot be retrieved immediately.

Even when no word usable as a search keyword for the question collection database can be picked up from the dialogue with the customer, the operator's search can be supported by deriving, from the words in the inquiry, replaceable words that are suitable as search keywords. Dictionary information that defines mutually replaceable search keywords is effective for this purpose, but creating it by hand takes an enormous amount of time.

An object of the present invention is to provide a dictionary creation device, a dictionary creation program, and a dictionary creation method as techniques for dictionary creation processing that automatically creates dictionary information indicating groups of replaceable words, using the history information of each response, which includes voice data recording the operator's speech while handling an inquiry.

A representative dictionary creation device disclosed in the present application is outlined as follows. The device includes: a question collection storage unit that stores question solution data including solution information for inquiries; a response voice storage unit that stores, for each response to an inquiry, voice data of the operator's utterances in the dialogue with the customer; a search history storage unit that stores, for each response, search history information including the search keywords used in the search processing, a search result indicating whether the search succeeded or failed, and browsing data indicating the question solution data browsed when the search succeeded; a speech recognition unit; a morphological analysis unit; a reference candidate creation unit; a successful response detection unit; a reference candidate collation unit; and a dictionary information creation unit.

In the dictionary creation device, the speech recognition unit performs speech recognition on each piece of voice data stored in the response voice storage unit, creates speech text data corresponding to that voice data, and stores it in a first storage unit. The morphological analysis unit performs morphological analysis on each piece of speech text data in the first storage unit, extracts in utterance order those of the resulting words that appear in the question solution data of the question collection storage unit, and stores them in a second storage unit as a response history word group.

Based on the search history information stored in the search history storage unit, the reference candidate creation unit retrieves from the second storage unit the response history word group of each response whose search result is failure. It calculates the degree of matching between the order in which the words of the retrieved response history word group appear and the order in which words appear in each piece of question solution data in the question collection storage unit, and selects the question solution data whose degree of matching is equal to or greater than a predetermined value. The reference candidate creation unit then removes from the response history word group of the failed response the words that are not described in the question solution data selected on the basis of that degree of matching, and stores the result in a third storage unit as a reference candidate.

Based on the search history information stored in the search history storage unit, the successful response detection unit detects all responses whose search result is success and identifies the question solution data given as the browsing data of each such response. It then retrieves from the second storage unit the response history word group corresponding to each successful response, removes from it the words that are not described in the identified question solution data, and stores the result in a fourth storage unit as a success case word group.

The reference candidate collation unit compares the reference candidate with each success case word group, calculates the similarity of the words they contain, and identifies the combination of reference candidate and success case word group with the highest similarity. The dictionary information creation unit extracts search keywords from the search history information of both the reference candidate and the success case word group in the identified combination, and outputs the extracted group of search keywords as dictionary information.

According to the dictionary creation device disclosed in the present application, the words contained in the operator's voice data for a response in which the search for appropriate question solution data failed are used to identify, with higher accuracy, a response similar to the failed one. Dictionary information that associates the search keywords of the failed response with those of a similar successful response can thus be created efficiently.

This makes it possible to efficiently create dictionary information for replacing search keywords entered by an operator with little search experience with more appropriate words.

A diagram showing a configuration example of the disclosed dictionary creation device.
A diagram showing an example of the question solution data stored in the question collection storage unit of the disclosed dictionary creation device.
A diagram showing an example of speech text data in the disclosed dictionary creation device.
A diagram explaining the matching of the appearance order of the words contained in speech text data.
A diagram showing an example of removing words from a response history word group.
A diagram showing an example of customer and operator utterances contained in voice data.
A diagram showing a configuration example of one embodiment of a dictionary creation system that includes the dictionary creation device.
A diagram showing the processing flow of the dictionary creation system.
A diagram showing a data structure example of the voice data in the response voice storage unit.
A diagram showing a data structure example of the response history information in the response history storage unit.
A diagram showing a data structure example of the search history information in the search history storage unit.
A diagram showing a data structure example of the question solution data in the question collection storage unit.
A diagram showing a data structure example of response history voice data.
A diagram showing a data structure example of response history voice data in which a solution presentation section is provided.
A diagram showing an example of a dictionary for speech recognition.
A diagram showing an example of a response history word group.
A diagram showing an example of a response history word group.
A diagram showing a more detailed processing flow of step S5.
A diagram showing an example of removing identical words from a response history word group.
A diagram showing an example of removing identical words from a response history word group.
A diagram showing the arrangement of the words of a response history word group used to calculate the matching of word appearance order against question solution data.
A diagram showing an example of removal, based on provisional candidates, from the word group of a response history word group.
A diagram showing an example of a successful response history word group.
A diagram showing an example of dictionary information.
A diagram showing a data structure example of search history information to which a similar-response data item has been added.
A diagram for explaining the calculation of word appearance distance.
A diagram showing a hardware configuration example of the disclosed dictionary creation device.

First, the dictionary creation device disclosed as one aspect of the present invention is outlined.

FIG. 1 is a diagram showing a configuration example of the disclosed dictionary creation device 1.

The dictionary creation device 1 generates dictionary information indicating groups of replaceable words. It includes a response voice storage unit 2, a search history storage unit 3, a question collection storage unit 4, a speech recognition unit 11, a morphological analysis unit 12, a reference candidate creation unit 13, a successful response detection unit 14, a reference candidate collation unit 15, and a dictionary information creation unit 16.

The response voice storage unit 2 stores, for each response to an inquiry, voice data recording the operator's utterances. The voice data includes a solution presentation part, which is the portion of speech in which solution information for the inquiry is presented.

The search history storage unit 3 stores, for each response, search history information including the search keywords used in the search processing, a search result indicating whether the search succeeded or failed, and browsing data indicating the question solution data browsed when the search succeeded.

The voice data in the response voice storage unit 2 and the search history information in the search history storage unit 3 are associated with each other by information that identifies the history information of each response.

The question collection storage unit 4 stores question solution information (question solution data), each item of which includes an inquiry and solution information for it, such as a means of resolution or a countermeasure.

The speech recognition unit 11 performs speech recognition on each piece of voice data stored in the response voice storage unit 2, creates speech text data corresponding to that voice data, and stores it in the first storage unit 171.

The morphological analysis unit 12 performs morphological analysis on each piece of speech text data created by the speech recognition unit 11, extracts in utterance order those of the resulting words that appear in the question solution data of the question collection storage unit 4, and stores them in the second storage unit 172 as a response history word group.

Based on the search history information stored in the search history storage unit 3, the reference candidate creation unit 13 retrieves from the second storage unit 172 the response history word group of a response whose search result is failure. It then calculates the degree of matching between the order in which the words of the retrieved response history word group appear and the order in which words appear in the question solution data of the question collection storage unit 4, and selects the question solution data whose degree of matching is equal to or greater than a predetermined value. Finally, it removes from the response history word group of the failed response the words that are not described in the question solution data selected by that degree of matching, thereby creating a reference candidate. The reference candidate is stored in the third storage unit 173.

Before calculating the degree of matching of word appearance order, the reference candidate creation unit 13 may also refer to the question solution data in the question collection storage unit 4 and remove from the response history word group any word that is not described in any of the question solution data.

Likewise, before calculating the degree of matching, when the same word appears more than once in the response history word group within a predetermined time, the reference candidate creation unit 13 may keep only one occurrence of that word and remove all the others.

Based on the search history information stored in the search history storage unit 3, the successful response detection unit 14 detects all responses whose search result is success and identifies the question solution data given as the browsing data of each such response. It then retrieves from the second storage unit 172 the response history word group corresponding to each successful response and removes from it the words that are not described in the identified question solution data, thereby creating a success case word group. The success case word group is stored in the fourth storage unit 174.
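As a rough illustration of this step, the following Python sketch builds success case word groups from in-memory records. It is not the patented implementation: the record fields (`response_id`, `result`, `browsed`) and the function name `build_success_case_groups` are hypothetical, and the storage units are modeled as plain dictionaries.

```python
def build_success_case_groups(search_history, history_groups, faq_words):
    """Filter each successful response's word group by the FAQ entry it browsed.

    search_history: list of dicts with hypothetical keys
                    "response_id", "result", "browsed".
    history_groups: response_id -> response history word group (list of words).
    faq_words:      FAQ entry id -> words described in that entry.
    """
    groups = {}
    for record in search_history:
        if record["result"] != "success":
            continue  # only successful responses yield success case word groups
        browsed_words = set(faq_words[record["browsed"]])
        words = history_groups[record["response_id"]]
        # Keep utterance order, drop words absent from the browsed entry.
        groups[record["response_id"]] = [w for w in words if w in browsed_words]
    return groups


search_history = [
    {"response_id": "r1", "result": "failure", "browsed": None},
    {"response_id": "r7", "result": "success", "browsed": "D1"},
]
history_groups = {
    "r1": ["sound panel", "open", "start button", "click"],
    "r7": ["sound panel", "start button", "volume", "mute"],
}
faq_words = {"D1": ["sound panel", "open", "volume", "mute"]}

success_groups = build_success_case_groups(search_history, history_groups, faq_words)
print(success_groups)
```

In this invented example only response `r7` succeeded, and "start button" is dropped from its group because the browsed entry `D1` does not mention it.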

The reference candidate collation unit 15 compares each reference candidate in the third storage unit 173 with each success case word group in the fourth storage unit 174, calculates the similarity of the words they contain, and identifies the combination of reference candidate and success case word group with the highest similarity.
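The passage does not specify the similarity measure. The sketch below assumes Jaccard similarity over the two word sets, chosen only because it is simple; any other word-overlap measure could be substituted. The function names and the sample identifiers (`C1`, `r7`, `r9`) are hypothetical.

```python
def jaccard(words_a, words_b):
    """Jaccard similarity of two word lists, treated as sets (an assumed measure)."""
    set_a, set_b = set(words_a), set(words_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def best_pair(reference_candidates, success_groups):
    """Return (candidate_id, success_id, score) for the most similar pair."""
    best = None
    for cand_id, cand_words in reference_candidates.items():
        for succ_id, succ_words in success_groups.items():
            score = jaccard(cand_words, succ_words)
            if best is None or score > best[2]:
                best = (cand_id, succ_id, score)
    return best  # None when either collection is empty


reference_candidates = {"C1": ["sound panel", "open", "volume", "mute"]}
success_groups = {
    "r7": ["sound panel", "volume", "mute"],
    "r9": ["start button", "click"],
}
pair = best_pair(reference_candidates, success_groups)
print(pair)
```

Here `C1` shares three of four distinct words with `r7` and none with `r9`, so the pair (`C1`, `r7`) is selected.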

The dictionary information creation unit 16 extracts search keywords from the search history information of both the reference candidate and the success case word group in the combination identified by the reference candidate collation unit 15, and outputs the extracted group of search keywords as dictionary information.
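A minimal sketch of this output step follows. The layout of the dictionary entry is an assumption made for illustration, since the passage only says that the extracted keyword group is output as dictionary information; the keywords reuse the "beeping" and "warning sound" example given earlier, and all names are hypothetical.

```python
def make_dictionary_entry(failed_id, success_id, keywords_by_response):
    """Pair the keywords of a failed response with those of its matched success.

    keywords_by_response: response_id -> search keywords taken from the
    search history information (hypothetical in-memory form).
    """
    return {
        "failed_keywords": list(keywords_by_response[failed_id]),
        "successful_keywords": list(keywords_by_response[success_id]),
    }


keywords_by_response = {
    "r1": ["beeping"],        # keyword typed in the failed response
    "r7": ["warning sound"],  # keyword typed in the similar successful response
}
entry = make_dictionary_entry("r1", "r7", keywords_by_response)
print(entry)
```

An entry of this shape could later be used to suggest "warning sound" when an operator types "beeping".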

The first storage unit 171 stores the speech text data generated by the speech recognition unit 11. The second storage unit 172 stores the response history word groups generated by the morphological analysis unit 12. The third storage unit 173 stores the reference candidates generated by the reference candidate creation unit 13. The fourth storage unit 174 stores the success case word groups generated by the successful response detection unit 14.

The dictionary creation device 1 operates as follows.

The speech recognition unit 11 of the dictionary creation device 1 performs speech recognition on the voice data stored in the response voice storage unit 2, creates speech text data corresponding to the voice data, and stores it in the first storage unit 171. The question solution data in the question collection storage unit 4 is used as the dictionary information for speech recognition.

FIG. 2 is a diagram showing an example of the question solution information stored in the question collection storage unit 4. Assume, for example, that words such as "sound panel", "open", "volume", and "mute" are described in the solution part of question solution data D1 in the question collection storage unit 4. Assume also, although not illustrated, that "start button", "click", "list", "setting", and so on are described in solution part D2.

FIG. 3 is a diagram showing an example of speech text data.

The speech text data shown in FIG. 3 was created by speech recognition from voice data recording the operator's utterances in a response whose search result was failure. In FIG. 3, the arrow running from top to bottom indicates the passage of time, and only those unit segments of the speech recognition processing that relate to the question solution data are shown. The operator's utterances "Please open the sound panel", "Please click the start button", "Please choose Settings from the list", and "Is the volume muted?" have each been converted into text data.

The morphological analysis unit 12 performs morphological analysis on the speech text data created by the speech recognition unit 11 and extracts the words that satisfy a predetermined extraction condition (for example, being an independent word). From the words extracted by the morphological analysis, it then extracts in utterance order those that appear in the question solution data of the question collection storage unit 4, and stores them in the second storage unit 172 as a response history word group.

Specifically, the morphological analysis unit 12 extracts words by morphological analysis from the four segments of the speech text data shown in FIG. 3. From the extracted words, it then extracts in utterance order those that appear in the solution parts of the question solution data shown in FIG. 2, such as "sound panel, open, start button, click, list, setting, volume, mute", and uses them as the response history word group.
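The filtering step just described can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the patented implementation: each segment is assumed to have been tokenized already by some morphological analyzer, the English words stand in for the Japanese originals, and the names `build_response_history_word_group`, `segments`, and `faq_vocabulary` are hypothetical.

```python
def build_response_history_word_group(segments, faq_vocabulary):
    """Keep, in utterance order, only the words that occur in the FAQ data.

    segments:       list of word lists, one per recognized speech segment,
                    already tokenized (assumption).
    faq_vocabulary: set of words described in the question solution data.
    """
    return [
        word
        for segment in segments
        for word in segment
        if word in faq_vocabulary
    ]


# The four operator utterances of FIG. 3, tokenized by hand for the example.
segments = [
    ["sound panel", "open", "please"],
    ["start button", "click", "please"],
    ["list", "setting", "choose"],
    ["volume", "mute"],
]
faq_vocabulary = {
    "sound panel", "open", "volume", "mute",      # solution part of D1
    "start button", "click", "list", "setting",   # solution part of D2
}

group = build_response_history_word_group(segments, faq_vocabulary)
print(group)
```

Words such as "please" and "choose" are dropped because they do not occur in the assumed FAQ vocabulary, while the remaining words keep their utterance order.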

Based on the search history information stored in the search history storage unit 3, the reference candidate creation unit 13 retrieves from the second storage unit 172 the response history word group of a response whose search result is failure. It calculates the degree of matching between the order in which the words of the retrieved group appear and the order in which words appear in the solution part of each piece of question solution data in the question collection storage unit 4, and selects the solution parts of the question solution data that reach the predetermined degree of matching.

FIG. 4 is a diagram explaining how the appearance order of the words contained in the speech text data is matched.

Of the words "sound panel, open, start button, click, list, setting, volume, mute" contained in the speech text data (FIG. 3), assume that the part "sound panel, open, volume, mute" matches the order of the words described in the solution part of question solution data D1 shown in FIG. 2, and that the part "start button, click, list, setting" matches the order of the words described in the solution part of question solution data D2. If these degrees of matching are equal to or greater than the predetermined value, the reference candidate creation unit 13 selects the solution parts of question solution data D1 and D2.
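The text up to this point does not say how the degree of matching is computed. One plausible measure, used here purely as an assumption, is the length of the longest common subsequence (LCS) between the response history word group and the words of a solution part, divided by the number of words in the solution part. The function names, the threshold of 0.8, and the entry `D3` are invented for the example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two word lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def order_match_degree(history_words, solution_words):
    """Share of the solution part's words found in the same order in the history."""
    if not solution_words:
        return 0.0
    return lcs_length(history_words, solution_words) / len(solution_words)


def select_solution_parts(history_words, solution_parts, threshold=0.8):
    """Return the ids of solution parts whose match degree reaches the threshold."""
    return [
        faq_id
        for faq_id, words in solution_parts.items()
        if order_match_degree(history_words, words) >= threshold
    ]


history = ["sound panel", "open", "start button", "click",
           "list", "setting", "volume", "mute"]
solution_parts = {
    "D1": ["sound panel", "open", "volume", "mute"],
    "D2": ["start button", "click", "list", "setting"],
    "D3": ["mute", "volume", "open"],  # invented entry with the words in another order
}
selected = select_solution_parts(history, solution_parts)
print(selected)
```

With this measure D1 and D2 both score 1.0, because their words occur in the history in the same relative order even though other words are interleaved, whereas the invented D3 scores 1/3 and is rejected.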

Using the selected question solution data, the reference candidate creation unit 13 then creates reference candidate C1, the word group "sound panel, open, volume, mute" that remains after removing from the response history word group the words not present in solution part D1, and reference candidate C2, the word group "start button, click, list, setting" that remains after removing the words not present in solution part D2, and stores both in the third storage unit 173.
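The creation of C1 and C2 amounts to an order-preserving filter, as in the following sketch. The function name `make_reference_candidate` is hypothetical and the word lists are the ones used in the example above.

```python
def make_reference_candidate(history_words, solution_words):
    """Drop history words that the selected solution part does not contain."""
    allowed = set(solution_words)
    return [word for word in history_words if word in allowed]


history = ["sound panel", "open", "start button", "click",
           "list", "setting", "volume", "mute"]
d1 = ["sound panel", "open", "volume", "mute"]
d2 = ["start button", "click", "list", "setting"]

c1 = make_reference_candidate(history, d1)
c2 = make_reference_candidate(history, d2)
print(c1)
print(c2)
```

A single failed response can therefore yield several reference candidates, one per selected solution part.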

さらに，参照候補作成部１３は，単語の出現順序の一致度を計算する前に，質問集記憶部４の質問解決データを参照して，応対履歴単語グループに含まれる単語から，質問解決データに存在しない単語を除去してもよい。 Further, before calculating the degree of coincidence of the word appearance order, the reference candidate creation unit 13 may refer to the question solution data in the question collection storage unit 4 and remove, from the words included in the response history word group, words that do not exist in the question solution data.

また，参照候補作成部１３は，単語の出現順序の一致度を計算する前に，音声テキストデータの所定数のセグメント内（すなわち経過時間内）に出現した同一の単語が複数出現する場合に，応対履歴単語グループの該当する単語の１つのみを残して他の単語を全て除去してもよい。 In addition, before calculating the degree of coincidence of the word appearance order, when the same word appears more than once within a predetermined number of segments (that is, within a given elapsed time) of the speech text data, the reference candidate creation unit 13 may keep only one occurrence of that word in the response history word group and remove all the others.

図５は，応対履歴単語グループの単語の除去例を示す図である。 FIG. 5 is a diagram illustrating an example of removing words from the response history word group.

図５（Ａ）は，除去処理前の応対履歴単語グループ例を示す。参照候補作成部１３は，質問集記憶部４の質問解決データの解決部に「コントロールパネルを開いてください」という記載がない場合には，図５（Ａ）の応対履歴単語グループの単語から，該当する単語（コントロールパネル，開く）を除去する。この除去の結果，応対履歴単語グループの単語は，図５（Ｂ）に示すような単語群になる。 FIG. 5A shows an example of a response history word group before the removal process. If the solution parts of the question solution data in the question collection storage unit 4 contain no description such as “Please open the control panel”, the reference candidate creation unit 13 removes the corresponding words (control panel, open) from the words of the response history word group in FIG. 5A. As a result of this removal, the words of the response history word group become the word group shown in FIG. 5B.

さらに，参照候補作成部１３は，応対履歴単語グループの単語群に，所定数のセグメント間（例えばセグメント数＝５）に出現する同一の単語（サウンドパネル，開く）がある場合に，１つの単語のみを残して他を除去する。この除去の結果，応対履歴単語グループの単語は，図５（Ｃ）に示すような単語群になる。 Further, when the word group of the response history word group contains the same words (sound panel, open) appearing within a predetermined number of segments (for example, number of segments = 5), the reference candidate creation unit 13 keeps only one occurrence of each word and removes the others. As a result of this removal, the words of the response history word group become the word group shown in FIG. 5C.

次に，成功応対検出部１４は，検索履歴記憶部３に記憶された検索履歴情報を参照して，検索結果が成功であった応対全てを取り出して，各応対の検索履歴情報の閲覧データに特定された質問解決データの解決部を特定する。そして，成功応対検出部１４は，成功の応対に対応する応対履歴単語グループを第２記憶部１７２から取り出して，取り出した応対履歴単語グループから，特定された質問解決データの解決部に記述されていない単語を除去し，除去語の単語を成功事例単語グループとして第４記憶部１７４へ格納する。 Next, the successful response detection unit 14 refers to the search history information stored in the search history storage unit 3, retrieves all responses whose search results were successful, and identifies the solution part of the question solution data specified in the browsing data of the search history information of each response. The successful response detection unit 14 then retrieves the response history word group corresponding to each successful response from the second storage unit 172, removes from it the words that are not described in the solution part of the identified question solution data, and stores the words remaining after the removal in the fourth storage unit 174 as a success case word group.

続いて，参照候補照合部１５は，第３記憶部１７３の参照候補Ｃ１，Ｃ２と，第４記憶部１７４の成功事例単語グループとの組み合わせの各々について，それぞれの単語の類似度を計算して，類似度が最高である参照候補と成功事例単語グループとの組み合わせを特定する。 Subsequently, for each combination of the reference candidates C1 and C2 in the third storage unit 173 and the success case word groups in the fourth storage unit 174, the reference candidate matching unit 15 calculates the similarity between their words and identifies the combination of reference candidate and success case word group with the highest similarity.

辞書情報作成部１６は，参照候補照合部１５が特定した組み合わせの参照候補と成功応対事例との検索履歴情報から，検索キーワードをそれぞれ抽出して，抽出した単語群（検索キーワード群）を，置き換え可能な単語同士であることを示す辞書情報として出力する。 The dictionary information creation unit 16 extracts the search keywords from the search history information of the reference candidate and of the successful response case in the combination identified by the reference candidate matching unit 15, and outputs the extracted word group (search keyword group) as dictionary information indicating that the words are interchangeable.

例えば，参照候補Ｃ１と成功事例単語グループＧｘとの組み合わせが特定されたと仮定する。さらに，参照候補Ｃ１の元となる応対の検索履歴情報の検索キーワードが「だんまり」であり，成功事例単語グループＧｘの検索履歴情報の検索キーワードが「音が出ない」であるとする。辞書情報作成部１６は，検索キーワード「だんまり」と「音が出ない」との組を，置換可能な単語群を示す辞書情報として出力する。 For example, assume that the combination of the reference candidate C1 and the success case word group Gx has been identified. Further, assume that the search keyword in the search history information of the response from which the reference candidate C1 originates is “danmari” (a colloquial word for “gone silent”), and that the search keyword in the search history information of the success case word group Gx is “no sound”. The dictionary information creation unit 16 outputs the pair of search keywords “danmari” and “no sound” as dictionary information indicating an interchangeable word group.
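The pairing and output described above can be sketched as follows. This is only an illustration under stated assumptions: the text here does not say which similarity measure the reference candidate matching unit 15 uses, so Jaccard overlap of the word sets is swapped in as a stand-in, and the function names, the group Gy, and its keyword are hypothetical.

```python
from itertools import product

def jaccard(a, b):
    """Word-set overlap in [0, 1]; an assumed stand-in for the similarity measure."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def best_pair(reference_candidates, success_groups):
    """Return the (candidate id, success group id) pair with the highest similarity."""
    return max(
        product(reference_candidates, success_groups),
        key=lambda p: jaccard(reference_candidates[p[0]], success_groups[p[1]]),
    )

# Hypothetical word groups keyed by identifier.
reference_candidates = {
    "C1": ["sound panel", "open", "volume", "mute"],
    "C2": ["start button", "click", "list", "settings"],
}
success_groups = {
    "Gx": ["sound panel", "open", "volume", "mute", "check"],
    "Gy": ["printer", "driver", "install"],
}
# Search keyword recorded for the response each group came from (illustrative).
keywords = {"C1": "だんまり", "C2": "だんまり", "Gx": "音が出ない", "Gy": "印刷できない"}

cand, succ = best_pair(reference_candidates, success_groups)
dictionary_entry = (keywords[cand], keywords[succ])
print(dictionary_entry)
```

Any set- or sequence-based word similarity could replace `jaccard` without changing the surrounding flow.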

このようにして辞書作成装置１によって生成された辞書情報をもとに，応対中のオペレータが検索キーワードを入力する場合に参照する辞書が準備される。 Based on the dictionary information generated by the dictionary creation device 1 in this way, a dictionary to be referred to when the operator in charge inputs a search keyword is prepared.

例えば，初心者オペレータが，顧客の「コンピュータがだんまりなんだけれど」という発話を受けて，検索キーワードとしてそのまま「だんまり」を入力するような場合でも，辞書作成装置１により生成された辞書情報に基づく辞書が参照されて，置換可能な単語「音が出ない」を知ることができ，検索キーワードとして使用することができるようになる。 For example, even when a novice operator hears the customer say “My computer has gone danmari” and enters “danmari” as the search keyword as it is, the dictionary based on the dictionary information generated by the dictionary creation device 1 is consulted, so the operator can learn the interchangeable phrase “no sound” and use it as the search keyword.

図６に，音声データに含まれる顧客とオペレータの発話例を示す。 FIG. 6 shows an example of customer and operator utterances included in the voice data.

図６において，上から下への矢印は時間経過を示し，時間を示す矢印の左側が顧客の発話，右側がオペレータの発話を表す。図６（Ａ）の音声データと図６（Ｂ）の音声データとは，発話総数および発話時間が大きく異なるが，問い合わせ内容および提示される解決情報は，ほぼ同じであることがわかる。一般的に顧客の問い合わせに対して同じ内容の解決情報を提示する場合でも顧客と対話するオペレータの発話時間はさまざまである。すなわち，同様の応対であっても，応対毎に音声データのサイズにばらつきが生じるため，音声データに含まれる単語群同士を直接比較しても，応対の類否を精度良く判断することは難しい。 In FIG. 6, the arrow running from top to bottom indicates the passage of time; the customer's utterances are shown to the left of the arrow and the operator's utterances to the right. The voice data in FIG. 6A and the voice data in FIG. 6B differ greatly in the total number of utterances and in utterance time, yet the content of the inquiry and the solution information presented are almost the same. In general, even when the same solution information is presented in response to a customer inquiry, the utterance time of the operator talking with the customer varies. That is, even for similar responses, the size of the voice data varies from response to response, so it is difficult to judge accurately whether responses are similar by directly comparing the word groups contained in the voice data.

辞書作成装置１は，検索が失敗した応対の音声データが含む単語の出現順序から，検索で引き当てられるべき質問解決データを推定して，検索に必要となる単語を絞り込んだ単語群（参照候補）を作成する。そして，検索が成功した応対での検索キーワードとの類似度にもとづいて応対の類比を判定する。 From the appearance order of the words contained in the voice data of a response whose search failed, the dictionary creation device 1 estimates the question solution data that the search should have retrieved, and creates a word group (reference candidate) narrowed down to the words needed for the search. It then judges how similar the responses are, based on the similarity to the search keywords of responses whose search succeeded.

これにより，辞書作成装置１は，音声データ自体のサイズの相違に影響を受けることなく，失敗の応対の内容に類似する成功の応対を，一般的な単語の類否判定式を用いて検出することができ，よって，精度の高い辞書情報を自動的に作成・出力することが可能となる。 As a result, the dictionary creation device 1 detects a successful response similar to the content of the failed response using a general word similarity determination formula without being affected by the difference in size of the voice data itself. Therefore, it is possible to automatically create and output highly accurate dictionary information.

さらに，一般に，オペレータは，応対中，顧客の理解の程度に応じて同じことを繰り返したり，違う表現へ言い換えて説明したりする。その結果，同内容の応対であっても，オペレータの発話時間にばらつきが生じるだけでなく，発話中の単語の出現状況も異なってくる。辞書作成装置１は，失敗した応対での発話に出現する単語から不要な単語や重複する単語を除去してから，検索に必要となる単語群を作成する。よって，質問解決データとの一致度の照合精度を高めることができる。 Furthermore, in general, the operator repeats the same thing according to the degree of understanding of the customer during the reception, or paraphrases and explains in a different expression. As a result, even if the response is the same, not only will the operator's utterance time vary, but the appearance status of the words being uttered will also differ. The dictionary creation device 1 removes unnecessary words and duplicate words from words appearing in the utterance at the failed response, and then creates a word group necessary for the search. Therefore, the matching accuracy of the degree of coincidence with the question solution data can be increased.

また，辞書情報作成部１６は，参照候補照合部１５が特定した組み合わせの参照候補と成功応対事例との検索履歴情報から，参照候補の元となっている応対（検索が失敗した応対）の類似応対として，成功事例単語グループの元となっている応対（検索が成功した応対）を関連づけた類似応対情報を出力する。 In addition, from the search history information of the reference candidate and the successful response case in the combination identified by the reference candidate matching unit 15, the dictionary information creation unit 16 outputs similar response information that associates the response from which the success case word group originates (a response whose search succeeded) with the response from which the reference candidate originates (a response whose search failed), as a response similar to it.

これにより，検索が成功した検索履歴情報だけではなく、検索が失敗した応対の検索履歴情報までを活用可能情報とすることができる。 As a result, not only the search history information that has been successfully searched but also the search history information of the response that has failed to be searched can be used as usable information.

以下に，辞書作成装置１の一実施形態を，より詳細説明する。 Hereinafter, an embodiment of the dictionary creation device 1 will be described in more detail.

図７は，辞書作成装置１を含む辞書作成システムの一実施形態における構成例を示す図である。 FIG. 7 is a diagram illustrating a configuration example in an embodiment of a dictionary creation system including the dictionary creation device 1.

図７に示す辞書作成システムは，辞書作成装置１，応対音声記憶部２，検索履歴記憶部３，質問集記憶部４，応対履歴管理装置５，談話構造解析装置６，辞書記憶部７，応対履歴記憶部８を備える。 The dictionary creation system shown in FIG. 7 includes a dictionary creation device 1, a response voice storage unit 2, a search history storage unit 3, a question collection storage unit 4, a response history management device 5, a discourse structure analysis device 6, a dictionary storage unit 7, and a response history storage unit 8.

辞書作成装置１，応対音声記憶部２，検索履歴記憶部３および質問集記憶部４は，図１に示す辞書作成装置１，応対音声記憶部２，検索履歴記憶部３および質問集記憶部４にそれぞれ相当するものである。 The dictionary creation device 1, the response voice storage unit 2, the search history storage unit 3, and the question collection storage unit 4 correspond respectively to the dictionary creation device 1, the response voice storage unit 2, the search history storage unit 3, and the question collection storage unit 4 shown in FIG. 1.

図７に示す実施形態の辞書作成システムでは，応対音声記憶部２は，応対ごとに，問い合わせをした顧客と応対したオペレータとが発話した音声を録音した対話録音音声データを記憶している。 In the dictionary creation system of the embodiment shown in FIG. 7, the reception voice storage unit 2 stores, for each reception, dialogue-recorded voice data obtained by recording voices uttered by the customer who made the inquiry and the operator who responded.

検索履歴記憶部３は，顧客への応対における検索処理で使用された検索キーワードと，検索により関連する問題解決データがヒットできたか否かを示す情報（検索結果）と，ヒットできた問題解決データの識別情報（閲覧データ）とを含む検索履歴情報を記憶している。 The search history storage unit 3 includes a search keyword used in the search process in dealing with the customer, information (search result) indicating whether or not the related problem solving data has been hit by the search, and the problem solving data that has been hit. The search history information including the identification information (browsing data) is stored.

応対履歴管理装置５は，応対音声記憶部２と応対履歴記憶部８と検索履歴記憶部３とに記憶される各情報を管理し，各応対（応対履歴ＩＤ）の応対履歴音声データを作成する。 The response history management device 5 manages each information stored in the response voice storage unit 2, the response history storage unit 8, and the search history storage unit 3, and generates response history voice data of each response (response history ID). .

談話構造解析装置６は，応対履歴管理装置５が作成した応対履歴音声データに含まれる音声データから，オペレータが解決情報を提示している発話期間（解決提示部）を特定する。 The discourse structure analyzing apparatus 6 specifies an utterance period (solution presenting section) in which the operator presents solution information from the voice data included in the response history voice data created by the response history management apparatus 5.

辞書記憶部７は，辞書作成装置１が出力する辞書情報を記憶する。 The dictionary storage unit 7 stores dictionary information output from the dictionary creation device 1.

応対履歴記憶部８は，応対ごとに，応対したオペレータの識別情報と，応対の期間を示す情報，検索の成功または失敗を示す情報（検索成否）とを含む応対履歴情報を記憶している。 The response history storage unit 8 stores, for each response, response history information including identification information of the operator who handled the response, information indicating the period of the response, and information indicating whether the search succeeded or failed (search success/failure).

以下に，辞書作成装置１を含む辞書作成システムの処理動作を説明する。 Hereinafter, the processing operation of the dictionary creation system including the dictionary creation device 1 will be described.

図８は，辞書作成システムの処理の流れを示す図である。 FIG. 8 is a diagram showing the flow of processing of the dictionary creation system.

ステップＳ１：応対履歴管理装置５は，応対履歴記憶部８に記憶されている応対履歴情報の全ての応対（応対履歴ＩＤ）について，応対ごとに，応対音声記憶部２，応対履歴記憶部８，検索履歴記憶部３の各レコードを取り出して，応対履歴音声データを作成する。 Step S1: For every response (response history ID) in the response history information stored in the response history storage unit 8, the response history management device 5 retrieves the corresponding records from the response voice storage unit 2, the response history storage unit 8, and the search history storage unit 3, and creates response history voice data for each response.

図９は，応対音声記憶部２の音声データのデータ構成例を示す図である。 FIG. 9 is a diagram illustrating a data configuration example of voice data in the reception voice storage unit 2.

応対音声記憶部２の音声データは，応対履歴ＩＤ，オペレータＩＤ，録音開始時刻，録音終了時刻，およびデータ本体のデータ項目を有する。 The voice data in the reception voice storage unit 2 includes a response history ID, an operator ID, a recording start time, a recording end time, and data items of a data body.

応対履歴ＩＤは，履歴として蓄積されている応対の識別情報である。オペレータＩＤは，応対したオペレータの識別情報である。 The response history ID is identification information of a response accumulated as history. The operator ID is identification information of the operator who handled the response.

録音開始時刻および録音終了時刻は，対話録音音声データの録音の開始と終了の時刻を示す情報である。データ本体は，音声データ本体である２チャンネルのバイナリデータである。 The recording start time and the recording end time are information indicating the start and end times of the recording of the dialog recording voice data. The data body is two-channel binary data that is a voice data body.

図１０は，応対履歴記憶部８の応対履歴情報のデータ構成例を示す図である。 FIG. 10 is a diagram illustrating a data configuration example of the response history information in the response history storage unit 8.

応対履歴記憶部８の応対履歴情報は，応対履歴ＩＤ，オペレータＩＤ，応対開始時刻，応対終了時刻，検索成否のデータ項目を有する。 The response history information in the response history storage unit 8 includes data items such as response history ID, operator ID, response start time, response end time, and search success / failure.

応対開始時刻および応対終了時刻は，応対の開始および終了の時刻を示す情報である。検索成否は，質問集記憶部４の検索が成功したか失敗したかを示す情報（フラグ）である。 The response start time and response end time are information indicating the start and end times of the response. The success or failure of the search is information (flag) indicating whether the search of the question collection storage unit 4 has succeeded or failed.

図１１は，検索履歴記憶部３の検索履歴情報のデータ構成例を示す図である。 FIG. 11 is a diagram illustrating a data configuration example of search history information in the search history storage unit 3.

検索履歴記憶部３の検索履歴情報は，応対履歴ＩＤ，オペレータＩＤ，検索キーワード，検索結果，閲覧データのデータ項目を有する。 The search history information in the search history storage unit 3 includes data items of response history ID, operator ID, search keyword, search result, and browsing data.

検索キーワードは，検索処理で使用された検索キーワードである。検索結果は，前記検索キーワードによる検索で質問解決データが引き当て（ヒット）できたか否かを示す情報である。閲覧データは，検索でヒットして閲覧した質問解決データの識別情報である。 The search keyword is a search keyword used in the search process. The search result is information indicating whether or not the question solution data has been allocated (hit) by the search using the search keyword. The browsing data is identification information of the question solution data browsed after being hit by the search.

図１２は，質問集記憶部４の質問解決データのデータ構成例を示す図である。 FIG. 12 is a diagram illustrating a data configuration example of the question solution data in the question collection storage unit 4.

質問集記憶部４の質問解決データは，ＦＡＱＩＤと質問部と解決部とのデータ項目を有する。 The question solution data in the question collection storage unit 4 includes data items of FAQID, question unit, and solution unit.

ＦＡＱＩＤは，質問解決データの識別情報である。質問部は，顧客から問い合わせられる質問や障害を記述するデータ項目である。解決部は，質問部に記述された問題や障害の解決情報を記述するデータ項目である。 FAQID is identification information of question solution data. The question part is a data item describing a question or a problem that is inquired by a customer. The solution section is a data item that describes solution information for the problem or failure described in the question section.

図１３は，応対履歴音声データのデータ構成例を示す図である。 FIG. 13 is a diagram illustrating a data configuration example of the response history voice data.

応対履歴音声データは，応対履歴ＩＤ，オペレータＩＤ，検索キーワード，検索結果，対応音声（データ本体），対応音声区間（録音開始時刻：録音終了時刻）のデータ項目を有する。 The response history voice data includes data items of response history ID, operator ID, search keyword, search result, corresponding voice (data body), and corresponding voice section (recording start time: recording end time).

応対履歴音声データは，応対履歴ＩＤをキーに収集された音声データ，応対履歴情報および検索履歴情報から抽出されたレコードをもとに作成される。検索キーワードと検索結果とには，検索履歴情報の「検索キーワード」と「検索結果」の情報がそれぞれ格納される。対応音声と対応音声区間とには，音声データの「データ本体」と「録音開始時刻：録音終了時刻」の情報がそれぞれ格納される。 The response history voice data is created based on the voice data collected using the response history ID as a key, the response history information, and the records extracted from the search history information. In the search keyword and search result, information of “search keyword” and “search result” of the search history information is stored, respectively. In the corresponding voice and the corresponding voice section, information of “data body” and “recording start time: recording end time” of the voice data is stored.
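The assembly of response history voice data can be sketched as a join on the response history ID. This is a minimal sketch, not the device's actual implementation: the dictionary keys, the class name, and the sample values are assumptions for illustration, and the response history information of FIG. 10 is left out because none of its own fields appear in the resulting record.

```python
from dataclasses import dataclass

@dataclass
class ResponseHistoryVoiceData:
    response_history_id: str
    operator_id: str
    search_keyword: str
    search_result: str
    voice: str              # reference to the data body of the recording
    voice_interval: tuple   # (recording start time, recording end time)

def build_response_history_voice_data(voice_rows, search_rows):
    """Join voice data and search history rows on the response history ID."""
    voice_by_id = {row["response_history_id"]: row for row in voice_rows}
    records = []
    for s in search_rows:
        v = voice_by_id.get(s["response_history_id"])
        if v is None:
            continue  # assumption: skip searches that have no matching recording
        records.append(ResponseHistoryVoiceData(
            response_history_id=s["response_history_id"],
            operator_id=s["operator_id"],
            search_keyword=s["search_keyword"],
            search_result=s["search_result"],
            voice=v["data_body"],
            voice_interval=(v["recording_start"], v["recording_end"]),
        ))
    return records

# Illustrative rows; the field names are hypothetical.
voice_rows = [{"response_history_id": "Incident001", "operator_id": "op01",
               "recording_start": "15:30", "recording_end": "16:00",
               "data_body": "record001"}]
search_rows = [{"response_history_id": "Incident001", "operator_id": "op01",
                "search_keyword": "だんまり", "search_result": "no hit"}]
records = build_response_history_voice_data(voice_rows, search_rows)
```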

ステップＳ２：談話構造解析装置６は，応対履歴管理装置５が作成した応対履歴音声データを取得して，応対履歴音声データごとに，対応音声（データ本体）の音声データに談話構造解析処理を行って，音声データを質問部（顧客が問い合わせをしている発話区間）と解決提示部（オペレータが解決情報を提示している発話区間）とに分解し，解決提示部を特定する。 Step S2: The discourse structure analysis device 6 acquires the response history voice data created by the response history management device 5, and performs the discourse structure analysis processing on the voice data of the corresponding voice (data body) for each response history voice data. Then, the voice data is divided into a question part (utterance section in which the customer is inquiring) and a solution presentation part (utterance section in which the operator is presenting the solution information), and the solution presentation part is specified.

具体的には，談話構造解析装置６は，顧客とオペレータとの音声が別チャンネルで録音されている音声データの各チャンネルで，単位区間毎に，発話された音声の大きさを示すパワー値を算出する。次に，談話構造解析装置６は，音声データの所定区間において，一定のパワーで発話する時間がより長く，かつ先行して発話するチャンネルを特定し，そのチャンネルで録音されている発話者を「先行主導発話者」と判定して，発話の最初から先行主導発話者が継続して発話する期間を特定し，その先行主導発話者が顧客であれば，当該期間を「質問発話部」と，その質問発話部に後続する他方のチャンネル（オペレータ側）の発話期間を，オペレータの「解決提示部」と判定する。 Specifically, for voice data in which the customer's and the operator's voices are recorded on separate channels, the discourse structure analysis device 6 calculates, for each unit section of each channel, a power value indicating the loudness of the uttered voice. Next, within a predetermined section of the voice data, the discourse structure analysis device 6 identifies the channel that speaks at a certain power for a longer time and speaks first, and determines the speaker recorded on that channel to be the “preceding lead speaker”. It then identifies the period during which the preceding lead speaker continues speaking from the start of the utterance. If the preceding lead speaker is the customer, that period is determined to be the “question utterance part”, and the utterance period of the other channel (the operator side) that follows the question utterance part is determined to be the operator's “solution presentation part”.
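One possible reading of the power-based decision is sketched below. It is an interpretation made for illustration only: mean squared amplitude stands in for the power value, a fixed threshold decides whether a unit section counts as speech, and the function names and parameters are hypothetical. The referenced earlier application describes the actual processing.

```python
def frame_power(samples, frame_len):
    """Mean squared amplitude for each consecutive frame of frame_len samples."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def leading_channel(customer, operator, frame_len, threshold):
    """Return the channel that both speaks longer and speaks first, else None."""
    stats = {}
    for name, samples in (("customer", customer), ("operator", operator)):
        active = [p >= threshold for p in frame_power(samples, frame_len)]
        # Index of the first speech frame; len(active) if the channel is silent.
        first = active.index(True) if True in active else len(active)
        stats[name] = (sum(active), first)
    for name, other in (("customer", "operator"), ("operator", "customer")):
        longer = stats[name][0] > stats[other][0]
        earlier = stats[name][1] < stats[other][1]
        if longer and earlier:
            return name
    return None

# Toy signal: the customer speaks for two frames, then the operator for one.
customer = [1.0] * 8 + [0.0] * 4
operator = [0.0] * 8 + [1.0] * 4
print(leading_channel(customer, operator, frame_len=4, threshold=0.5))
```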

談話構造解析装置６が実行する処理の詳細は，特願２００８−９９９９２７「音声データの質問発話部抽出処理プログラム，方法および装置，ならびに音声データの質問発話部を用いた顧客問い合わせ傾向推定処理プログラム，方法および装置」に記載されているとおりである。 The details of the processing executed by the discourse structure analysis device 6 are as described in Japanese Patent Application No. 2008-999927, “Program, method, and apparatus for extracting a question utterance part from voice data, and program, method, and apparatus for estimating customer inquiry tendencies using the question utterance part of voice data”.

談話構造解析装置６は，応対履歴音声データにデータ項目「解決提示区間」を追加して，特定した解決提示区間を示す情報（開始時刻：期間）を記録する。 The discourse structure analyzing apparatus 6 adds a data item “solution presentation section” to the response history voice data, and records information (start time: period) indicating the identified solution presentation section.

図１４は，解決提示区間が設けられた応対履歴音声データのデータ構成例を示す図である。 FIG. 14 is a diagram showing a data configuration example of the response history voice data provided with the solution presentation section.

図１４に示す応対履歴音声データでは，応対履歴ＩＤ＝Ｉｎｃｉｄｅｎｔ００１の応対の音声データは，音声データｒｅｃｏｒｄ００１に録音された時刻１５：３０から１６：００までの区間の音声データであり，時刻１５：３２から開始して５秒間継続する発話の区間が解決提示部の区間であることを示す。 The response history voice data shown in FIG. 14 indicates that the voice data of the response with response history ID = Incident001 is the voice data of the section from 15:30 to 16:00 recorded in voice data record001, and that the utterance section starting at 15:32 and continuing for 5 seconds is the section of the solution presentation part.

ステップＳ３：音声認識部１１は，談話構造解析装置６から応対履歴音声データを受け取って，応対履歴音声データの解決提示区間に対応する音声データを取り出し，所定の発話セグメント単位で音声認識処理用辞書を用いた音声認識処理を行って，音声テキストデータを作成して第１記憶部１７１へ格納する。 Step S3: The speech recognition unit 11 receives the response history voice data from the discourse structure analysis device 6, extracts the voice data corresponding to the solution presentation section of the response history voice data, performs speech recognition on it in units of predetermined utterance segments using the speech recognition dictionary, and creates speech text data, which it stores in the first storage unit 171.

図１５は，音声認識用辞書の例を示す図である。 FIG. 15 is a diagram illustrating an example of a speech recognition dictionary.

音声認識用辞書は，形態素解析部１２が，質問集記憶部４の問題解決データの解決部から，主要な自立語を抽出して作成される。 The speech recognition dictionary is created by the morphological analysis unit 12, which extracts the main independent words from the solution parts of the problem solving data in the question collection storage unit 4.

ステップＳ４：形態素解析部１２は，第１記憶部１７１に格納されている音声テキストデータ全てについて形態素解析処理を行って単語に分割し，分割した単語から，所定の条件（例えば，自立語である単語，例えば，名詞，動詞，形容詞，形容動詞，副詞などの各単語等）に合致する単語を抽出する。さらに，形態素解析部１２は，抽出した単語群の中から，質問集記憶部４の質問解決データの解決部に出現する単語を発話順に抽出して応対履歴単語グループを作成して第２記憶部１７２へ格納する。 Step S4: The morphological analysis unit 12 performs morphological analysis on all the speech text data stored in the first storage unit 171 to divide it into words, and extracts from the divided words those that meet a predetermined condition (for example, independent words such as nouns, verbs, adjectives, adjectival nouns, and adverbs). Further, from the extracted word group, the morphological analysis unit 12 extracts, in order of utterance, the words that appear in the solution parts of the question solution data in the question collection storage unit 4, creates a response history word group, and stores it in the second storage unit 172.
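The filtering in step S4 can be sketched as follows, assuming the morphological analysis has already produced (word, part of speech) pairs for each utterance segment. The part-of-speech labels, the function name, and the sample tokens are assumptions for illustration; a real morphological analyzer would supply its own tag set.

```python
# Parts of speech treated as independent words in this sketch (assumed labels).
CONTENT_POS = {"noun", "verb", "adjective", "adjectival noun", "adverb"}

def response_history_word_group(segments, solution_vocabulary):
    """Build (segment number, word) pairs in utterance order.

    segments: one list of (word, pos) tuples per utterance segment.
    solution_vocabulary: set of words appearing in any solution part.
    """
    group = []
    for seg_no, tokens in enumerate(segments, start=1):
        for word, pos in tokens:
            if pos in CONTENT_POS and word in solution_vocabulary:
                group.append((seg_no, word))
    return group

# Hypothetical pre-tagged segments.
segments = [
    [("sound panel", "noun"), ("o", "particle"), ("open", "verb")],
    [("well", "interjection"), ("volume", "noun"), ("today", "noun")],
]
vocabulary = {"sound panel", "open", "volume", "mute"}
print(response_history_word_group(segments, vocabulary))
```

Keeping the segment number with each word preserves the utterance order that the later order-matching and duplicate-removal steps rely on.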

応対履歴単語グループは，応対履歴音声データの対応するレコードに追加または関連付けされる。 The response history word group is added to or associated with the corresponding record of the response history voice data.

図１６と図１７とは，応対履歴単語グループの例を示す図である。 16 and 17 are diagrams showing examples of response history word groups.

図１６（Ａ）は応対履歴単語グループ＃１の例を，図１６（Ｂ）応対履歴単語グループ＃２の例を，図１７は応対履歴単語グループ＃３の例を，それぞれ表す。 16A shows an example of response history word group # 1, FIG. 16B shows an example of response history word group # 2, and FIG. 17 shows an example of response history word group # 3.

応対履歴単語グループ＃１，＃２，＃３では，応対履歴ＩＤに対応する音声データに対する音声認識処理結果である音声テキストデータの各発話セグメントに出現する単語（認識結果単語）が，セグメント順すなわち単語の出現順に記録される。 In the response history word groups # 1, # 2, and # 3, words (recognition result words) appearing in each utterance segment of the speech text data, which is the speech recognition processing result for the speech data corresponding to the response history ID, are in segment order, that is, Recorded in order of word appearance.

ステップＳ５：参照候補作成部１３は，検索結果が失敗である応対に対応する応対履歴単語グループの単語から，参照候補を作成する。 Step S5: The reference candidate creation unit 13 creates a reference candidate from words in the reception history word group corresponding to the reception whose search result is failure.

図１８は，ステップＳ５の処理のより詳細な処理フローを示す図である。 FIG. 18 is a diagram showing a more detailed processing flow of the processing in step S5.

ステップＳ５１：参照候補作成部１３は，各応対履歴単語グループについて，所定の発話セグメント間隔以下（５以下）で出現する同一単語を除去する。 Step S51: The reference candidate creation unit 13 removes the same word that appears within a predetermined utterance segment interval (5 or less) for each response history word group.

具体的には，参照候補作成部１３は，第２記憶部に格納されている全ての応対履歴単語グループの各々について，指定されたセグメント距離（セグメント数）内に同一の単語が出現する関係を抽出し，抽出した関係にある後続の単語に除去フラグ(＃で単語を囲むことで本実施例では表す)を設定する。 Specifically, the reference candidate creation unit 13 determines the relationship in which the same word appears within the specified segment distance (number of segments) for each of all the response history word groups stored in the second storage unit. Extraction is performed, and a removal flag (represented in the present embodiment by enclosing the word with #) is set for subsequent words in the extracted relationship.

図１９，図２０は，同一単語の除去例を示す図である。 19 and 20 are diagrams illustrating examples of removing the same word.

図１９（Ａ）は応対履歴単語グループ＃１の例を，図１９（Ｂ）応対履歴単語グループ＃２の例を，図２０は応対履歴単語グループ＃３の例を，それぞれ表す。 19A shows an example of response history word group # 1, FIG. 19B shows an example of response history word group # 2, and FIG. 20 shows an example of response history word group # 3.

図１９（Ａ）の応対履歴単語グループ＃１では，セグメントＮｏ＝１７，１９に「インストール」があるため，後に出現する単語に除去フラグ（＃）が設定されていることを示す。図１９（Ｂ）の応対履歴単語グループ＃２では，セグメントＮｏ＝３，５と１９，２２とに「しっかり」があるため，後に出現するセグメントＮｏ＝５，２２の「しっかり」に除去フラグが設定されている。図２０の応対履歴単語グループ＃３では，セグメントＮｏ＝１，３に「しっかり」が，セグメントＮｏ＝１０，１３に「インストール」があるため，それぞれ，後に出現する単語に除去フラグが設定されている。 In the response history word group #1 of FIG. 19A, “install” appears in segment Nos. 17 and 19, so the removal flag (#) is set on the later occurrence. In the response history word group #2 of FIG. 19B, “firmly” appears in segment Nos. 3 and 5 and in segment Nos. 19 and 22, so the removal flag is set on “firmly” in the later segment Nos. 5 and 22. In the response history word group #3 of FIG. 20, “firmly” appears in segment Nos. 1 and 3 and “install” appears in segment Nos. 10 and 13, so the removal flag is set on the later occurrence of each.

参照候補作成部１３は，応対履歴単語グループから除去フラグが設定された単語を除去する。 The reference candidate creation unit 13 removes the word for which the removal flag is set from the response history word group.
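Step S51 can be sketched as a single pass that remembers the segment in which each word was last seen. This sketch assumes that a later occurrence is compared against the most recent earlier occurrence, whether or not that one was itself removed; the text does not settle this point, and the function name is hypothetical.

```python
def remove_near_duplicates(group, max_distance=5):
    """Drop a word if the same word occurred within max_distance segments before it.

    group: list of (segment number, word) pairs in utterance order.
    """
    last_seen = {}
    kept = []
    for seg_no, word in group:
        prev = last_seen.get(word)
        if prev is None or seg_no - prev > max_distance:
            kept.append((seg_no, word))
        last_seen[word] = seg_no  # assumption: track every occurrence, kept or not
    return kept

# Segment numbers taken from the response history word group #2 example.
group2 = [(3, "しっかり"), (5, "しっかり"), (19, "しっかり"), (22, "しっかり")]
print(remove_near_duplicates(group2))
```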

ステップＳ５２：参照候補作成部１３は，応対履歴音声データの検索結果が「ヒットせず」のレコード（応対）について，対応する応対履歴単語グループでの単語の出現順序と，質問集記憶部４の各質問解決データの解決部の単語の出現順序との一致度を計算して，一致度が所定の閾値以上に高い質問解決データを選択して暫定候補を作成する。 Step S52: For the record (response) in which the search result of the response history voice data is “no hit”, the reference candidate creation unit 13 determines the appearance order of the words in the corresponding response history word group, the question collection storage unit 4 The degree of coincidence with the appearance order of the words in the solution part of each question solving data is calculated, and the question solving data having a coincidence degree higher than a predetermined threshold is selected to create a provisional candidate.

参照候補作成部１３は，応対履歴音声データの検索結果が「ヒットせず」の応対に対応する応対履歴単語グループを１つずつ取り出す。参照候補作成部１３は，質問集記憶部４から質問解決データを１つずつ取り出して，取り出した応対履歴単語グループの出現順に並ぶ単語と，取り出した質問解決データの解決部に記述された順序の単語との一致度を計算する。 The reference candidate creation unit 13 takes out one response history word group corresponding to the response whose search result of the response history voice data is “no hit”. The reference candidate creation unit 13 extracts the question solution data one by one from the question collection storage unit 4, and arranges the words arranged in the appearance order of the extracted response history word groups and the order described in the solution unit of the extracted question solution data. Calculate the degree of matching with a word.

図２１（Ａ）は，応対履歴単語グループ＃１の単語群のうち，質問解決データ＃３０，に対して単語の出現順序の一致の計算に使用される単語の並びを示す図であり，図２１（Ｂ）は，応対履歴単語グループ＃１の単語群のうち，質問解決データ＃３１に対して単語の出現順序の一致の計算に使用される単語の並びを示す図である。 FIG. 21A is a diagram showing a sequence of words used for calculation of matching of appearance order of words with respect to question solution data # 30 in the word group of response history word group # 1. 21 (B) is a diagram showing a word sequence used for calculation of matching of appearance order of words with respect to the question solution data # 31 in the word group of the response history word group # 1.

単語の出現順序の一致度の計算は，例えば，編集距離（レーベンシュタイン距離）を用いて計算する。編集距離とは，２つの文字列がどの程度異なっているかを，編集回数すなわち，文字単位の挿入，削除，置換等の回数を文字間の距離として求める手法である。 The degree of coincidence of the word appearance order is calculated using, for example, the edit distance (Levenshtein distance). The edit distance is a technique that expresses how different two character strings are as a distance given by the number of edits, that is, the number of character-level insertions, deletions, substitutions, and so on.

文字列ａｂｃｄと，文字列ｂｃｄ，文字列ｂｚｄについて距離を求める場合を例とする。文字列ａｂｃｄと文字列ｂｃｄとの場合は，文字列ａｂｃｄを文字列ｂｃｄとするために「文字ａを削除」という１回の編集が必要である。また，文字列ａｂｃｄと文字列ｂｚｄとの場合は，文字列ａｂｃｄを文字列ｂｚｄとするために「文字ａを削除」「文字ｃを文字ｚに置換」という２回の編集が必要である。したがって，文字列ａｂｃｄと文字列ｂｃｄとの組は，文字列ａｂｃｄと文字列ｂｚｄとの組に比べて，距離が短く，一致度が高いこととなる。なお，編集距離の処理の詳細は後述する。 As an example, the distance is calculated for the character string abcd, the character string bcd, and the character string bzd. In the case of the character string abcd and the character string bcd, one-time editing of “deleting the character a” is necessary to make the character string abcd the character string bcd. In the case of the character string abcd and the character string bzd, two edits of “delete character a” and “replace character c with character z” are required to make the character string abcd a character string bzd. Therefore, the pair of the character string abcd and the character string bcd has a shorter distance and a higher matching degree than the pair of the character string abcd and the character string bzd. Details of the edit distance processing will be described later.

参照候補作成部１３は，文字の連続の代わりに，単語の連続を入力として，応対履歴単語グループと質問解決データ各々との２つの単語群の距離を求めて一致度を計算し，質問解決データ全てとの一致度を求める。 The reference candidate creation unit 13 receives the word continuation instead of the character continuation, obtains the distance between the two word groups of the response history word group and each of the question solving data, calculates the degree of coincidence, and obtains the question solving data. Find the degree of agreement with everything.
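The word-level edit distance can be sketched with the standard dynamic programming formulation. The sketch assumes unit cost for every insertion, deletion, and substitution; the device's own cost settings are described separately and may differ. Because the function only compares elements for equality, the same code handles character strings and word lists.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]

# Character-level examples from the text.
print(edit_distance("abcd", "bcd"))  # 1
print(edit_distance("abcd", "bzd"))  # 2

# The same function applied to word sequences.
history = ["sound panel", "open", "volume", "mute"]
solution = ["sound panel", "open", "mute"]
print(edit_distance(history, solution))  # 1
```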

ここで，選択される質問解決データは１つであってもよいが，複数であることが望ましい。したがって，参照候補作成部１３は，計算した一致度の上位数個を選択するようにしてもよい。本実施形態では，参照候補作成部１３は，計算した一致度の上位２つの質問解決データを選択する。選択した質問解決データを暫定候補とする。 Here, although one question solution data may be selected, it is desirable that there be a plurality. Therefore, the reference candidate creation unit 13 may select the top several of the calculated degrees of coincidence. In the present embodiment, the reference candidate creation unit 13 selects the top two question solution data with the calculated degree of coincidence. The selected question solution data is set as a provisional candidate.

For example, suppose that, for the word group of response history word group #1 shown in FIG. 21, the calculated distances to question solution data #30 and #31 are 79.0 and 87.0, respectively, and that these two items rank highest. The reference candidate creation unit 13 then takes question solution data #30 and #31 as the provisional candidates.
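The selection of provisional candidates can be sketched as follows. This is a minimal illustration, not the device's implementation: it assumes unit costs for insertion, deletion, and replacement, and it ranks question solution data by ascending word-level distance, treating a shorter distance as a higher degree of matching. The function names, identifiers such as `q30`, and the word lists are hypothetical.

```python
from typing import Dict, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost edit distance between two sequences (words or characters)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # replace x with y (free if equal)
            ))
        prev = curr
    return prev[-1]


def select_provisional_candidates(
    history_words: List[str],
    solution_data: Dict[str, List[str]],
    top_n: int = 2,
) -> List[str]:
    """Return the IDs of the top_n items closest to the response history word group."""
    ranked = sorted(
        solution_data,
        key=lambda qid: edit_distance(history_words, solution_data[qid]),
    )
    return ranked[:top_n]


# Hypothetical data: one response history word group and three solution items.
history = ["pc", "hot", "fan", "vent"]
solutions = {
    "q30": ["pc", "hot", "vent"],
    "q31": ["pc", "fan"],
    "q32": ["printer", "paper", "jam"],
}
provisional = select_provisional_candidates(history, solutions)
print(provisional)
```

Because the same function accepts character strings, it also covers the abcd, bcd, bzd example given earlier.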

Step S53: The reference candidate creation unit 13 prepares as many copies of the word group of the response history word group as there are selected provisional candidates. From each copy it removes the words that do not appear in the solution part of the corresponding provisional candidate (question solution data #30 or #31), takes the remaining word group as a reference candidate, and stores it in the third storage unit 173.

FIG. 22 is a diagram showing an example of removing words from the word group of a response history word group on the basis of the provisional candidates.

FIG. 22(A) shows that, among the words of response history word group #1, the removal flag (#) is set for the word "install" (インストール), which does not appear in question solution data #30, a provisional candidate. FIG. 22(B) shows that the removal flag (#) is set for the words "install" (インストール), "hot" (熱く), "exhaust" (排気), "block" (ふさい), "opening" (口), and "outlet" (でるとこ), which do not appear in question solution data #31, the other provisional candidate.

The reference candidate creation unit 13 removes the flagged words from the response history word group and takes the resulting word groups as reference candidate C1 (FIG. 22(A)) and reference candidate C2 (FIG. 22(B)).
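The removal in step S53 amounts to an order-preserving filter, sketched below under stated assumptions. The function name, the IDs, and the word lists are hypothetical and do not reproduce the contents of FIG. 22. Removal flags are modeled implicitly: a word absent from the solution part is simply dropped.

```python
from typing import Dict, List


def make_reference_candidate(history_words: List[str],
                             solution_words: List[str]) -> List[str]:
    """Keep, in utterance order, only the words that also occur in the solution part."""
    allowed = set(solution_words)
    return [w for w in history_words if w in allowed]


# Hypothetical response history word group and two provisional candidates.
history_group = ["install", "pc", "hot", "exhaust", "vent", "blocked"]
provisional = {
    "q30": ["pc", "hot", "exhaust", "vent", "blocked", "clean"],
    "q31": ["pc", "restart"],
}

# One filtered copy of the word group per provisional candidate.
reference_candidates: Dict[str, List[str]] = {
    qid: make_reference_candidate(history_group, words)
    for qid, words in provisional.items()
}
print(reference_candidates)
```

The list comprehension builds a new list each time, which plays the role of the per-candidate copy of the word group.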

Step S6: The successful response detection unit 14 refers to the response history voice data, extracts every response whose search result is "hit", and identifies the question solution data recorded in the browsing data of each such response. It then extracts, from the response history word groups in the second storage unit, those belonging to successful responses, removes from each extracted group the words that do not appear in the identified question solution data, creates a success case word group from the words that remain, and stores it in the fourth storage unit 174.

Specifically, for response history word group #2, which corresponds to the voice data of a response whose search result is "hit", the successful response detection unit 14 prepares as many copies of the word group as there are items of question solution data, and removes from each copy the words that do not appear in the solution part of the corresponding question solution data.

FIG. 23 is a diagram illustrating an example of a success case word group.

FIG. 23(A) shows that, among the words of response history word group #2, the removal flag is set for the word "firmly" (しっかり), which does not appear in the solution part of question solution data #30. FIG. 23(B) shows that the removal flag is set for the words "firmly" (しっかり) and "install" (インストール), which do not appear in the solution part of question solution data #31.

The successful response detection unit 14 removes the flagged words from the words of response history word group #2 and takes the remaining word group as success case word group G2.
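Step S6 can be sketched as a pass over the search history that keeps only "hit" records. The record layout (`result`, `browsed`), the failure label `"miss"`, and all IDs and words are assumptions made for illustration. The source specifies only that a successful search result is "hit" and that the browsing data identifies the question solution data viewed.

```python
from typing import Dict, List, Optional


def make_success_case_groups(
    search_history: Dict[str, Dict[str, Optional[str]]],
    history_word_groups: Dict[str, List[str]],
    solution_parts: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Build one success case word group per response whose search result is 'hit'."""
    groups: Dict[str, List[str]] = {}
    for response_id, record in search_history.items():
        if record["result"] != "hit":
            continue  # failed searches are handled by the reference candidate path
        allowed = set(solution_parts[record["browsed"]])
        groups[response_id] = [
            w for w in history_word_groups[response_id] if w in allowed
        ]
    return groups


# Hypothetical search history, response history word groups, and solution parts.
search_history = {
    "incident001": {"result": "miss", "browsed": None},
    "incident002": {"result": "hit", "browsed": "q30"},
}
history_word_groups = {
    "incident001": ["pc", "hot", "vent"],
    "incident002": ["firmly", "pc", "power", "off"],
}
solution_parts = {"q30": ["pc", "power", "off", "adapter"]}

success_groups = make_success_case_groups(
    search_history, history_word_groups, solution_parts
)
print(success_groups)
```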

Step S7: For each reference candidate, the reference candidate collation unit 15 compares the word group of the reference candidate with the word group of each success case word group and calculates the similarity of their constituent words. It then identifies the combination of reference candidate and success case word group with the highest calculated similarity.

Specifically, the reference candidate collation unit 15 treats the word sequences of the reference candidate and of the success case word group as document vectors and obtains the similarity by applying Equation 1.

Suppose that word similarities are calculated between reference candidates C1 and C2, created from a certain response (response history ID = incident001), and success case word groups G1 and G2, and that the combination of reference candidate C1 and success case word group G1 yields the highest similarity.

The reference candidate collation unit 15 then outputs the combination of reference candidate C1 and success case word group G1.
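Equation 1 is not reproduced in this text, so the following sketch assumes one common choice for comparing word groups treated as document vectors: cosine similarity over word-count vectors. That choice, the function names, and the sample word lists are assumptions and may differ from the formula actually used in the patent.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def cosine_similarity(words_a: List[str], words_b: List[str]) -> float:
    """Cosine similarity between the word-count vectors of two word groups."""
    va, vb = Counter(words_a), Counter(words_b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty word group is similar to nothing
    return dot / (norm_a * norm_b)


def best_pair(
    reference_candidates: Dict[str, List[str]],
    success_groups: Dict[str, List[str]],
) -> Tuple[str, str]:
    """Return the (reference candidate, success case word group) pair with maximum similarity."""
    pairs = [(c, g) for c in reference_candidates for g in success_groups]
    return max(
        pairs,
        key=lambda p: cosine_similarity(
            reference_candidates[p[0]], success_groups[p[1]]
        ),
    )


# Hypothetical word groups.
candidates = {"C1": ["pc", "power", "off"], "C2": ["pc"]}
groups = {"G1": ["pc", "power", "off", "suddenly"], "G2": ["printer", "paper"]}
print(best_pair(candidates, groups))
```

With these sample lists the pair C1 and G1 scores highest, mirroring the outcome assumed in the example above.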

Step S8: The dictionary information creation unit 16 creates and outputs dictionary information and similar response information from the response history information of the reference candidate and of the success case word group in the combination with the highest similarity.

The dictionary information creation unit 16 retrieves the response history voice data for reference candidate C1 and success case word group G1 (response history IDs incident001 and incident002) and extracts their search keywords, "PC, goes down" (パソコン，落ちる) and "power, cuts off" (電源，切れる) (see FIG. 13). It then outputs the pair formed by these two sets of search keywords as dictionary information.

FIG. 24 is a diagram illustrating an example of dictionary information. The dictionary information shown in FIG. 24 is accumulated in the dictionary storage unit 7.

Furthermore, on the basis of the combination of reference candidate C1 and success case word group G1, the dictionary information creation unit 16 creates and outputs similar response information. This information identifies the response underlying success case word group G1, which did hit question solution data, as a response similar to the one underlying reference candidate C1.

Based on this similar response information, the response history management device 5 adds the data item "similar response history" to the search history information in the search history storage unit 3 and records there the response history ID of the successful response associated by the similar response information.

FIG. 25 is a diagram illustrating an example data configuration of search history information to which the similar response history data item has been added. In FIG. 25, the response history ID (incident002) of the response that hit question solution data is recorded in the "similar response history" field of the record with response history ID = incident001.
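Step S8 and the subsequent update of the search history can be sketched as two small operations on a record store. The dictionary layout, the field names (`keywords`, `result`, `similar_response_history`), and the failure label `"miss"` are hypothetical. The response history IDs and search keywords are those of the example above.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def make_dictionary_entry(
    history: Dict[str, Record], failed_id: str, success_id: str
) -> Tuple[Tuple[str, ...], Tuple[str, ...]]:
    """Pair the keywords of the failed search with those of the successful one."""
    failed_keywords: List[str] = history[failed_id]["keywords"]  # type: ignore[assignment]
    success_keywords: List[str] = history[success_id]["keywords"]  # type: ignore[assignment]
    return tuple(failed_keywords), tuple(success_keywords)


def link_similar_response(
    history: Dict[str, Record], failed_id: str, success_id: str
) -> None:
    """Record the successful response as similar to the failed one."""
    history[failed_id]["similar_response_history"] = success_id


# Search history records for the two responses in the example.
search_history: Dict[str, Record] = {
    "incident001": {"keywords": ["パソコン", "落ちる"], "result": "miss"},
    "incident002": {"keywords": ["電源", "切れる"], "result": "hit"},
}

entry = make_dictionary_entry(search_history, "incident001", "incident002")
link_similar_response(search_history, "incident001", "incident002")
print(entry)
print(search_history["incident001"]["similar_response_history"])
```

The entry treats the two keyword sets as interchangeable, which is what lets a later search entered with the first set be retried with the second.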

FIG. 26 is a diagram for explaining the calculation of the word appearance distance.

Let the character strings to be compared be string A (abcd) and string B (bzd). As shown in FIG. 26(A), a two-dimensional distance table is prepared whose number of columns is the number of elements in string A plus one and whose number of rows is the number of elements in string B plus one.

Then, as shown in FIG. 26(B), the first row of the distance table is filled in order with the number of elements counted from the beginning of string A (0, 1, 2, 3, 4). Next, the first column of the distance table is filled in order with the number of elements counted from the beginning of string B (0, 1, 2, 3).

Next, as shown in FIG. 26(C), a distance is set for the cell with the smallest row and column indices among the cells whose distance has not yet been calculated. The value set is the lowest cost obtainable through insertion, deletion, or replacement. For the cell comparing character a with character b, insertion gives distance 2, deletion gives distance 2, and replacement gives distance 1, so the minimum, 1, is adopted. As shown in FIG. 26(D), distances are set in this way for every cell of the distance table. Finally, as shown in FIG. 26(E), the value (2) in the last cell (last row, last column) is taken as the distance between string A and string B.
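The table-filling procedure of FIG. 26 corresponds to the standard dynamic-programming computation of edit distance, sketched below with unit costs for all three operations. The function name `build_distance_table` is hypothetical. The layout follows the text: columns index string A and rows index string B.

```python
from typing import List, Sequence


def build_distance_table(a: Sequence[str], b: Sequence[str]) -> List[List[int]]:
    """Fill the (len(b)+1) x (len(a)+1) edit distance table row by row."""
    rows, cols = len(b) + 1, len(a) + 1
    table = [[0] * cols for _ in range(rows)]
    for j in range(cols):
        table[0][j] = j  # first row: 0, 1, ..., len(a)
    for i in range(rows):
        table[i][0] = i  # first column: 0, 1, ..., len(b)
    for i in range(1, rows):
        for j in range(1, cols):
            replace_cost = 0 if a[j - 1] == b[i - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,                 # insertion (cell above)
                table[i][j - 1] + 1,                 # deletion (cell to the left)
                table[i - 1][j - 1] + replace_cost,  # replacement (diagonal cell)
            )
    return table


table = build_distance_table("abcd", "bzd")
for row in table:
    print(row)
distance = table[-1][-1]  # last row, last column
```

For abcd and bzd the first computed cell (a against b) takes the value 1 and the last cell takes the value 2, matching the figures described above. Passing lists of words instead of strings gives the word-level distance used by the reference candidate creation unit 13.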

FIG. 27 shows an example hardware configuration of the dictionary creation device 1.

As shown in FIG. 27, the dictionary creation device 1 can be implemented by a computer 100 comprising a CPU 101, a main storage unit (memory) 103, an input/output interface 105, an external storage device 110, an input device (such as a keyboard) 120, and an output device (such as a display) 130.

The dictionary creation device 1 can also be implemented by a program executable by the computer 100. In this case, a program describing the processing of the functions that the dictionary creation device 1 should have is provided. When the computer 100 executes the provided program, the processing functions of the dictionary creation device 1 described above are realized on the computer 100.

That is, the speech recognition unit 11, morphological analysis unit 12, reference candidate creation unit 13, successful response detection unit 14, reference candidate collation unit 15, dictionary information creation unit 16, and so on of the dictionary creation device 1 can be implemented as programs, and the response voice storage unit 2, search history storage unit 3, question collection storage unit 4, and first to fourth storage units 171 to 174 can be implemented in the external storage device 110.

The computer 100 can also read the program directly from a portable recording medium and execute processing in accordance with it. Alternatively, each time a program is transferred from a server computer, the computer 100 can execute processing in accordance with the received program.

Furthermore, this program can be recorded on a recording medium readable by the computer 100.

As shown in the embodiment above, applying the dictionary creation device 1 to the creation of paraphrase dictionary information that supports the search processing of call center operators yields the following effects.

(1) When an operator draws on accumulated question solution data to present a solution to a customer's question or problem, the dictionary creation device 1 makes it easy to create a dictionary that maps the words entered in a search to the words used in searches that hit question solution data. Words entered by an inexperienced operator can thus be replaced with more appropriate words using the dictionary, which can improve the hit rate for question solution data.

(2) The dictionary creation device 1 creates the replacement dictionary automatically, using the history information not only of responses whose search succeeded but also of responses whose search failed, so dictionary information can be accumulated efficiently. A non-negligible number of the responses handled at a call center involve failed searches, and the response history information of such cases can be put to effective use.

(3) The dictionary creation device 1 can output information that associates the response history information of a failed response with the response history information of a similar response whose search succeeded. This makes the response history information more useful and broadens the range of its possible uses.

The features of the embodiments of the present invention are listed below.

(Supplementary Note 1) A dictionary creation device that generates dictionary information indicating replaceable word groups, comprising:
a question collection storage unit that stores question solution data including solution information for inquiries;
a response voice storage unit that stores, for each response to an inquiry, voice data uttered by an operator in dialogue with a customer;
a search history storage unit that stores, for each response, search history information including a search keyword used in search processing, a search result indicating success or failure of the search processing, and browsing data indicating the question solution data browsed when the search succeeded;
a first storage unit that stores voice text data;
a second storage unit that stores response history word groups;
a third storage unit that stores reference candidates;
a fourth storage unit that stores success case word groups;
a speech recognition unit that performs speech recognition processing on each item of voice data stored in the response voice storage unit, creates voice text data corresponding to the voice data, and stores the voice text data in the first storage unit;
a morphological analysis unit that performs morphological analysis processing on each item of voice text data in the first storage unit, extracts, in order of utterance, the words that appear in the question solution data of the question collection storage unit from among the words obtained by the morphological analysis processing, creates a response history word group, and stores the response history word group in the second storage unit;
a reference candidate creation unit that performs a process of retrieving from the second storage unit, based on the search history information stored in the search history storage unit, the response history word group of a response whose search result is failure,
a process of calculating a degree of matching between the order of appearance of words in the retrieved response history word group and the order of appearance of words described in each item of question solution data in the question collection storage unit, and selecting question solution data whose degree of matching is equal to or greater than a predetermined value, and
a process of creating a reference candidate by removing, from the response history word group of the response whose search result is failure, words not described in the question solution data selected based on the degree of matching with that response history word group, and storing the reference candidate in the third storage unit;
a successful response detection unit that performs a process of detecting, based on the search history information stored in the search history storage unit, all responses whose search result is success, and identifying the question solution data that is the browsing data of each such response, and
a process of retrieving from the second storage unit the response history word group corresponding to each response whose search result is success, creating a success case word group by removing from the retrieved response history word group words not described in the identified question solution data, and storing the success case word group in the fourth storage unit;
a reference candidate collation unit that compares one of the reference candidates in the third storage unit with each of the success case word groups in the fourth storage unit, calculates the similarity of the words they contain, and identifies the combination of reference candidate and success case word group with the highest similarity; and
a dictionary information creation unit that extracts search keywords from the response history information of each of the reference candidate and the success case word group in the identified combination, and outputs the extracted search keyword group as dictionary information.

(Supplementary Note 2) The dictionary creation device according to Supplementary Note 1, wherein, before calculating the degree of matching of the order of appearance of words, the reference candidate creation unit refers to the question solution data in the question collection storage unit and removes, from the words included in the response history word group, words that are not described in any of the question solution data.

(Supplementary Note 3) The dictionary creation device according to Supplementary Note 1 or 2, wherein, before calculating the degree of matching of the order of appearance of words, when the response history word group contains a plurality of occurrences of the same word that appeared within a predetermined time, the reference candidate creation unit retains only one of those occurrences and removes all the others.

(Supplementary Note 4) The dictionary creation device according to any one of Supplementary Notes 1 to 3, wherein, based on the combination identified by the reference candidate collation unit, the dictionary information creation unit creates and outputs similar response information that associates the response underlying the success case word group of the combination with the response underlying the reference candidate of the combination.

(Supplementary Note 5) A dictionary creation program for causing a computer to execute processing that generates dictionary information indicating replaceable word groups, the computer comprising:
a question collection storage unit that stores question solution data including solution information for inquiries;
a response voice storage unit that stores, for each response to an inquiry, voice data uttered by an operator;
a search history storage unit that stores, for each response, search history information including a search keyword used in search processing, a search result indicating success or failure of the search processing, and browsing data indicating the question solution data browsed when the search succeeded;
a first storage unit that stores voice text data;
a second storage unit that stores response history word groups;
a third storage unit that stores reference candidates; and
a fourth storage unit that stores success case word groups,
the program causing the computer to execute:
a process of performing speech recognition processing on each item of voice data stored in the response voice storage unit, creating voice text data corresponding to the voice data, and storing the voice text data in the first storage unit;
a process of performing morphological analysis processing on each item of voice text data in the first storage unit, extracting, in order of utterance, the words that appear in the question solution data of the question collection storage unit from among the words obtained by the morphological analysis processing, creating a response history word group, and storing the response history word group in the second storage unit;
a process of retrieving from the second storage unit, based on the search history information stored in the search history storage unit, the response history word group of a response whose search result is failure;
a process of calculating a degree of matching between the order of appearance of words in the retrieved response history word group and the order of appearance of words described in each item of question solution data in the question collection storage unit, and selecting question solution data whose degree of matching is equal to or greater than a predetermined value;
a process of creating a reference candidate by removing, from the response history word group of the response whose search result is failure, words not described in the question solution data selected based on the degree of matching with that response history word group, and storing the reference candidate in the third storage unit;
a process of detecting, based on the search history information stored in the search history storage unit, all responses whose search result is success, and identifying the question solution data that is the browsing data of each such response;
a process of retrieving from the second storage unit the response history word group corresponding to each response whose search result is success, creating a success case word group by removing from the retrieved response history word group words not described in the identified question solution data, and storing the success case word group in the fourth storage unit;
a process of comparing one of the reference candidates in the third storage unit with each of the success case word groups in the fourth storage unit, calculating the similarity of the words they contain, and identifying the combination of reference candidate and success case word group with the highest similarity; and
a process of extracting search keywords from the response history information of each of the reference candidate and the success case word group in the identified combination, and outputting the extracted search keyword group as dictionary information.

(Supplementary Note 6) The dictionary creation program according to Supplementary Note 5, further causing the computer to execute, before calculating the degree of matching of the order of appearance of words, a process of referring to the question solution data in the question collection storage unit and removing, from the words included in the response history word group, words that are not described in any of the question solution data.

(Supplementary Note 7) The dictionary creation program according to Supplementary Note 5 or 6, further causing the computer to execute, before calculating the degree of matching of the order of appearance of words, a process of retaining only one occurrence and removing all the others when the response history word group contains a plurality of occurrences of the same word that appeared within a predetermined time.

(Supplementary Note 8) The dictionary creation program according to any one of Supplementary Notes 5 to 7, further causing the computer to execute a process of creating and outputting, based on the combination identified by the reference candidate collation unit, similar response information that associates the response underlying the success case word group of the combination with the response underlying the reference candidate of the combination.

（付記９）コンピュータが，置き換え可能な単語群を示す辞書情報を生成する辞書作成方法であって，
問い合わせに対する解決情報を含む質問解決データを記憶する質問集記憶部と，
問い合わせに対する応対ごとに，オペレータが発話した音声データを記憶する応対音声記憶部と，
応対ごとに，検索処理で使用された検索キーワードと，該検索処理の成功または失敗を示す検索結果と，検索が成功した場合に閲覧された質問解決データを示す閲覧データとを含む検索履歴情報を記憶する検索履歴記憶部と，
音声テキストデータを記憶する第１記憶部と，
応対履歴単語グループを記憶する第２記憶部と，
参照候補を記憶する第３記憶部と，
成功事例単語グループを記憶する第４記憶部とを備えるコンピュータが実行する，
前記応対音声記憶部に記憶された音声データ各々に音声認識処理を行って，該音声データに対応する音声テキストデータを作成して前記第１記憶部に格納する処理過程と，
前記第１記憶部の各音声テキストデータに形態素解析処理を行って，該形態素解析処理で得た単語の中から前記質問集記憶部の前記質問解決データに出現する単語を発話順に抽出して応対履歴単語グループを作成して前記第２記憶部に格納する処理過程と，
前記検索履歴記憶部に記憶された検索履歴情報をもとに，前記第２記憶部から，検索結果が失敗である応対の応対履歴単語グループを取り出す処理過程と，
前記取り出した応対履歴単語グループの単語の出現順序と前記質問集記憶部の各質問解決データに記述されている単語の出現順序との一致度を計算して，前記一致度が所定の値以上である質問解決データを選択する処理過程と，
前記検索結果が失敗である応対の応対履歴単語グループから，該応対履歴単語グループとの一致度にもとづいて選択された質問解決データに記述されていない単語を除去して参照候補を作成して前記第３記憶部に格納する処理過程と，
前記検索履歴記憶部に記憶された検索履歴情報をもとに，検索結果が成功である応対を全て検出して，該検索結果が成功である応対の閲覧データである質問解決データを特定する処理過程と，
前記第２記憶部から，前記検索結果が成功である応対各々に対応する応対履歴単語グループを取り出して，該取り出した応対履歴単語グループから前記特定された質問解決データに記述されていない単語を除去して成功事例単語グループを作成して前記第４記憶部に格納する処理過程と，
前記第３記憶部の参照候補の１つを前記第４記憶部の成功事例単語グループ各々と比較して，それぞれが含む単語の類似度を計算して，前記類似度が最大である参照候補と成功事例単語グループとの組み合わせを特定する処理過程と，
前記特定された組み合わせの参照候補と成功事例単語グループのそれぞれの応対履歴情報から検索キーワードを抽出して，該抽出した検索キーワード群を辞書情報として出力する処理過程とを備える
ことを特徴とする辞書作成方法。 (Supplementary note 9) A dictionary creation method in which a computer generates dictionary information indicating replaceable word groups,
the computer comprising:
a question collection storage unit that stores question solution data including solution information for inquiries;
a response voice storage unit that stores, for each response to an inquiry, voice data of the operator's utterances;
a search history storage unit that stores, for each response, search history information including the search keywords used in search processing, a search result indicating success or failure of the search processing, and browsing data indicating the question solution data browsed when the search succeeded;
a first storage unit that stores voice text data;
a second storage unit that stores response history word groups;
a third storage unit that stores reference candidates; and
a fourth storage unit that stores success case word groups,
the method comprising:
a process of performing speech recognition on each piece of voice data stored in the response voice storage unit, creating voice text data corresponding to the voice data, and storing the voice text data in the first storage unit;
a process of performing morphological analysis on each piece of voice text data in the first storage unit, extracting, in utterance order, the words that appear in the question solution data of the question collection storage unit from the words obtained by the morphological analysis, creating a response history word group, and storing it in the second storage unit;
a process of retrieving from the second storage unit, based on the search history information stored in the search history storage unit, the response history word group of a response whose search result is failure;
a process of calculating the degree of coincidence between the order of appearance of words in the retrieved response history word group and the order of appearance of words described in each piece of question solution data in the question collection storage unit, and selecting the question solution data whose degree of coincidence is equal to or greater than a predetermined value;
a process of creating a reference candidate by removing, from the response history word group of the response whose search result is failure, the words not described in the question solution data selected based on the degree of coincidence with that response history word group, and storing the reference candidate in the third storage unit;
a process of detecting, based on the search history information stored in the search history storage unit, all responses whose search result is success, and identifying the question solution data that is the browsing data of each such response;
a process of retrieving from the second storage unit the response history word group corresponding to each response whose search result is success, creating a success case word group by removing from the retrieved response history word group the words not described in the identified question solution data, and storing the success case word group in the fourth storage unit;
a process of comparing one of the reference candidates in the third storage unit with each success case word group in the fourth storage unit, calculating the similarity of the words each contains, and identifying the combination of reference candidate and success case word group having the maximum similarity; and
a process of extracting search keywords from the response history information of each of the reference candidate and the success case word group of the identified combination, and outputting the extracted search keyword group as dictionary information.
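The selection of question solution data by word-order coincidence, and the creation of reference candidates from a failed response, can be illustrated with a minimal sketch. The text does not fix a formula for the degree of coincidence; the sketch assumes the length of the longest common subsequence divided by the length of the response history word group. All function names, the data layout, and the threshold are hypothetical, as is the choice to emit one candidate per selected question solution data.

```python
from typing import Dict, List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two word lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def order_coincidence(history: List[str], document: List[str]) -> float:
    """Assumed measure: share of history words matched in the same order."""
    return lcs_length(history, document) / len(history) if history else 0.0


def make_reference_candidates(
    history: List[str], qa_data: Dict[str, List[str]], threshold: float
) -> Dict[str, List[str]]:
    """For each question solution data at or above the threshold, keep only
    the history words that the data also contains (order preserved)."""
    candidates = {}
    for doc_id, doc_words in qa_data.items():
        if order_coincidence(history, doc_words) >= threshold:
            vocabulary = set(doc_words)
            candidates[doc_id] = [w for w in history if w in vocabulary]
    return candidates
```

For example, with the history `["printer", "power", "paper", "jam"]` and a threshold of 0.5, a question solution data containing `printer, paper, jam, remove` is selected and yields the candidate `["printer", "paper", "jam"]`.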

（付記１０）前記一致度が所定の値以上である質問解決データを選択する処理過程において，前記一致度を計算する前に，前記質問集記憶部の質問解決データを参照して，前記応対履歴単語グループに含まれる単語から前記質問解決データのいずれにも記述されていない単語を除去する
ことを特徴とする前記付記９に記載の辞書作成方法。 (Supplementary note 10) The dictionary creation method according to supplementary note 9, wherein, in the process of selecting the question solution data whose degree of coincidence is equal to or greater than the predetermined value, before the degree of coincidence is calculated, the question solution data in the question collection storage unit is referenced and words not described in any of the question solution data are removed from the words included in the response history word group.
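The pre-filtering step of supplementary note 10 might look like the following sketch. The function name and the representation of question solution data as lists of words keyed by an identifier are assumptions made for illustration only.

```python
from typing import Dict, List


def drop_unknown_words(history: List[str], qa_data: Dict[str, List[str]]) -> List[str]:
    """Remove history words that appear in none of the question solution data."""
    # Union of the vocabularies of all question solution data.
    vocabulary = set().union(*qa_data.values())
    # Utterance order of the surviving words is preserved.
    return [w for w in history if w in vocabulary]
```

Filtering before the coincidence calculation keeps filler words from the call from lowering the score of every question solution data alike.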

（付記１１）前記一致度が所定の値以上である質問解決データを選択する処理過程において，前記一致度を計算する前に，前記応対履歴単語グループに，所定時間内に出現した同一の単語が複数存在する場合に，該当する単語の１つのみを保持し他の単語を全て除去する
ことを特徴とする前記付記９または前記付記１０に記載の辞書作成方法。 (Supplementary note 11) The dictionary creation method according to supplementary note 9 or 10, wherein, in the process of selecting the question solution data whose degree of coincidence is equal to or greater than the predetermined value, before the degree of coincidence is calculated, when the response history word group contains a plurality of identical words that appeared within a predetermined time, only one of those words is retained and all the others are removed.
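The de-duplication of supplementary note 11 can be sketched as below, assuming each word carries an utterance timestamp in seconds and that the earliest occurrence is the one retained. Neither assumption is stated in the text, and the function name and window parameter are hypothetical.

```python
from typing import Dict, List, Tuple


def dedupe_within_window(
    timed_words: List[Tuple[float, str]], window: float
) -> List[Tuple[float, str]]:
    """Drop repeats of a word that occur within `window` seconds of the
    last retained occurrence of the same word."""
    last_kept: Dict[str, float] = {}
    result = []
    for t, word in sorted(timed_words, key=lambda tw: tw[0]):
        if word in last_kept and t - last_kept[word] <= window:
            continue  # same word repeated inside the window: remove it
        last_kept[word] = t
        result.append((t, word))
    return result
```

A repeat that falls outside the window is kept, so a word the operator returns to much later in the call still contributes to the order of appearance.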

（付記１２）前記辞書情報を出力する処理過程において，前記参照候補照合部が特定した組み合わせをもとに，該組み合わせの参照候補の元となる応対に，該組み合わせられた成功事例単語グループの元となる応対を関連づける類似応対情報を作成して出力する
ことを特徴とする前記付記９ないし前記付記１１のいずれか一項に記載の辞書作成方法。 (Supplementary note 12) The dictionary creation method according to any one of supplementary notes 9 to 11, wherein, in the process of outputting the dictionary information, similar response information is created and output that, based on the combination identified by the reference candidate collation unit, associates the response underlying the reference candidate of the combination with the response underlying the success case word group of the combination.
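One way to picture the similar response information of supplementary note 12 is a mapping from each failed response to the successful response whose success case word group is most similar to its reference candidate. The sketch below assumes Jaccard similarity over word sets, which the text does not specify; the response identifiers and function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def jaccard(a: List[str], b: List[str]) -> float:
    """Assumed similarity: shared words divided by all distinct words."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def best_success_match(
    candidate: List[str], success_groups: Dict[str, List[str]]
) -> Optional[Tuple[str, float]]:
    """Return (response id, similarity) of the most similar success case."""
    best = None
    for response_id, words in success_groups.items():
        score = jaccard(candidate, words)
        if best is None or score > best[1]:
            best = (response_id, score)
    return best


def similar_response_info(
    candidates: Dict[str, List[str]], success_groups: Dict[str, List[str]]
) -> Dict[str, str]:
    """Associate each failed response with its most similar successful one."""
    info = {}
    for failed_id, candidate in candidates.items():
        match = best_success_match(candidate, success_groups)
        if match is not None:
            info[failed_id] = match[0]
    return info
```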

１辞書作成装置
１１音声認識部
１２形態素解析部
１３参照候補作成部
１４成功応対検出部
１５参照候補照合部
１６辞書情報作成部
１７１第１記憶部
１７２第２記憶部
１７３第３記憶部
１７４第４記憶部
２応対音声記憶部
３検索履歴記憶部
４質問集記憶部
５応対履歴管理装置
６談話構造解析装置
７辞書記憶部
８応対履歴記憶部
DESCRIPTION OF SYMBOLS
1 Dictionary creation device
11 Speech recognition unit
12 Morphological analysis unit
13 Reference candidate creation unit
14 Successful response detection unit
15 Reference candidate collation unit
16 Dictionary information creation unit
171 First storage unit
172 Second storage unit
173 Third storage unit
174 Fourth storage unit
2 Response voice storage unit
3 Search history storage unit
4 Question collection storage unit
5 Response history management device
6 Discourse structure analysis device
7 Dictionary storage unit
8 Response history storage unit

Claims

A dictionary creation device for generating dictionary information indicating replaceable word groups, the device comprising:
a question collection storage unit that stores question solution data including solution information for inquiries;
a response voice storage unit that stores, for each response to an inquiry, voice data of the operator's utterances;
a search history storage unit that stores, for each response, search history information including the search keywords used in search processing, a search result indicating success or failure of the search processing, and browsing data indicating the question solution data browsed when the search succeeded;
a first storage unit that stores voice text data;
a second storage unit that stores response history word groups;
a third storage unit that stores reference candidates;
a fourth storage unit that stores success case word groups;
a speech recognition unit that performs speech recognition on each piece of voice data stored in the response voice storage unit, creates voice text data corresponding to the voice data, and stores the voice text data in the first storage unit;
a morphological analysis unit that performs morphological analysis on each piece of voice text data in the first storage unit, extracts, in utterance order, the words that appear in the question solution data of the question collection storage unit from the words obtained by the morphological analysis, creates a response history word group, and stores it in the second storage unit;
a reference candidate creation unit that performs: processing of retrieving from the second storage unit, based on the search history information stored in the search history storage unit, the response history word group of a response whose search result is failure;
processing of calculating the degree of coincidence between the order of appearance of words in the retrieved response history word group and the order of appearance of words described in each piece of question solution data in the question collection storage unit, and selecting the question solution data whose degree of coincidence is equal to or greater than a predetermined value; and
processing of creating a reference candidate by removing, from the response history word group of the response whose search result is failure, the words not described in the question solution data selected based on the degree of coincidence with that response history word group, and storing the reference candidate in the third storage unit;
a successful response detection unit that performs: processing of detecting, based on the search history information stored in the search history storage unit, all responses whose search result is success, and identifying the question solution data that is the browsing data of each such response; and
processing of retrieving from the second storage unit the response history word group corresponding to each response whose search result is success, creating a success case word group by removing from the retrieved response history word group the words not described in the identified question solution data, and storing the success case word group in the fourth storage unit;
a reference candidate collation unit that compares one of the reference candidates in the third storage unit with each success case word group in the fourth storage unit, calculates the similarity of the words each contains, and identifies the combination of reference candidate and success case word group having the maximum similarity; and
a dictionary information creation unit that extracts search keywords from the response history information of each of the reference candidate and the success case word group of the identified combination, and outputs the extracted search keyword group as dictionary information.
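As an illustration of the successful response detection and the final dictionary output, the following sketch builds success case word groups from the search history and then merges the search keywords of a matched pair of responses into one replaceable word group. The `SearchHistory` record, the function names, and the use of plain word lists are assumptions for illustration, not the structures used by the device.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SearchHistory:
    keywords: List[str]               # search keywords used during the response
    success: bool                     # search result: success or failure
    browsed_id: Optional[str] = None  # question solution data browsed on success


def success_case_word_groups(
    histories: Dict[str, SearchHistory],
    word_groups: Dict[str, List[str]],
    qa_data: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """For each successful response, keep only the response history words
    that the browsed question solution data also contains."""
    groups = {}
    for response_id, history in histories.items():
        if history.success and history.browsed_id in qa_data:
            vocabulary = set(qa_data[history.browsed_id])
            groups[response_id] = [
                w for w in word_groups.get(response_id, []) if w in vocabulary
            ]
    return groups


def dictionary_entry(
    failed_id: str, success_id: str, histories: Dict[str, SearchHistory]
) -> List[str]:
    """Merge the search keywords of a matched pair, dropping duplicates
    while preserving order (dict keys keep insertion order)."""
    merged = histories[failed_id].keywords + histories[success_id].keywords
    return list(dict.fromkeys(merged))
```

If a failed response searched for "sheet stuck" and its matched successful response searched for "paper jam", the resulting entry groups the two as interchangeable search keywords.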

The dictionary creation device according to claim 1, wherein the reference candidate creation unit, before calculating the degree of coincidence of the order of appearance of words, refers to the question solution data in the question collection storage unit and removes, from the words included in the response history word group, words not described in any of the question solution data.

The dictionary creation device according to claim 1 or 2, wherein the reference candidate creation unit, before calculating the degree of coincidence of the order of appearance of words, when the response history word group contains a plurality of identical words that appeared within a predetermined time, retains only one of those words and removes all the others.

The dictionary creation device according to any one of claims 1 to 3, wherein the dictionary information creation unit creates and outputs, based on the combination identified by the reference candidate collation unit, similar response information that associates the response underlying the reference candidate of the combination with the response underlying the success case word group of the combination.

A dictionary creation program for causing a computer to execute processing for generating dictionary information indicating replaceable word groups,
the computer comprising:
a question collection storage unit that stores question solution data including solution information for inquiries;
a response voice storage unit that stores, for each response to an inquiry, voice data of the operator's utterances;
a search history storage unit that stores, for each response, search history information including the search keywords used in search processing, a search result indicating success or failure of the search processing, and browsing data indicating the question solution data browsed when the search succeeded;
a first storage unit that stores voice text data;
a second storage unit that stores response history word groups;
a third storage unit that stores reference candidates; and
a fourth storage unit that stores success case word groups,
the program causing the computer to execute:
processing of performing speech recognition on each piece of voice data stored in the response voice storage unit, creating voice text data corresponding to the voice data, and storing the voice text data in the first storage unit;
processing of performing morphological analysis on each piece of voice text data in the first storage unit, extracting, in utterance order, the words that appear in the question solution data of the question collection storage unit from the words obtained by the morphological analysis, creating a response history word group, and storing it in the second storage unit;
processing of retrieving from the second storage unit, based on the search history information stored in the search history storage unit, the response history word group of a response whose search result is failure;
processing of calculating the degree of coincidence between the order of appearance of words in the retrieved response history word group and the order of appearance of words described in each piece of question solution data in the question collection storage unit, and selecting the question solution data whose degree of coincidence is equal to or greater than a predetermined value;
processing of creating a reference candidate by removing, from the response history word group of the response whose search result is failure, the words not described in the question solution data selected based on the degree of coincidence with that response history word group, and storing the reference candidate in the third storage unit;
processing of detecting, based on the search history information stored in the search history storage unit, all responses whose search result is success, and identifying the question solution data that is the browsing data of each such response;
processing of retrieving from the second storage unit the response history word group corresponding to each response whose search result is success, creating a success case word group by removing from the retrieved response history word group the words not described in the identified question solution data, and storing the success case word group in the fourth storage unit;
processing of comparing one of the reference candidates in the third storage unit with each success case word group in the fourth storage unit, calculating the similarity of the words each contains, and identifying the combination of reference candidate and success case word group having the maximum similarity; and
processing of extracting search keywords from the response history information of each of the reference candidate and the success case word group of the identified combination, and outputting the extracted search keyword group as dictionary information.

A dictionary creation method in which a computer generates dictionary information indicating replaceable word groups,
the computer comprising:
a question collection storage unit that stores question solution data including solution information for inquiries;
a response voice storage unit that stores, for each response to an inquiry, voice data of the operator's utterances;
a search history storage unit that stores, for each response, search history information including the search keywords used in search processing, a search result indicating success or failure of the search processing, and browsing data indicating the question solution data browsed when the search succeeded;
a first storage unit that stores voice text data;
a second storage unit that stores response history word groups;
a third storage unit that stores reference candidates; and
a fourth storage unit that stores success case word groups,
the method comprising:
a process of performing speech recognition on each piece of voice data stored in the response voice storage unit, creating voice text data corresponding to the voice data, and storing the voice text data in the first storage unit;
a process of performing morphological analysis on each piece of voice text data in the first storage unit, extracting, in utterance order, the words that appear in the question solution data of the question collection storage unit from the words obtained by the morphological analysis, creating a response history word group, and storing it in the second storage unit;
a process of retrieving from the second storage unit, based on the search history information stored in the search history storage unit, the response history word group of a response whose search result is failure;
a process of calculating the degree of coincidence between the order of appearance of words in the retrieved response history word group and the order of appearance of words described in each piece of question solution data in the question collection storage unit, and selecting the question solution data whose degree of coincidence is equal to or greater than a predetermined value;
a process of creating a reference candidate by removing, from the response history word group of the response whose search result is failure, the words not described in the question solution data selected based on the degree of coincidence with that response history word group, and storing the reference candidate in the third storage unit;
a process of detecting, based on the search history information stored in the search history storage unit, all responses whose search result is success, and identifying the question solution data that is the browsing data of each such response;
a process of retrieving from the second storage unit the response history word group corresponding to each response whose search result is success, creating a success case word group by removing from the retrieved response history word group the words not described in the identified question solution data, and storing the success case word group in the fourth storage unit;
a process of comparing one of the reference candidates in the third storage unit with each success case word group in the fourth storage unit, calculating the similarity of the words each contains, and identifying the combination of reference candidate and success case word group having the maximum similarity; and
a process of extracting search keywords from the response history information of each of the reference candidate and the success case word group of the identified combination, and outputting the extracted search keyword group as dictionary information.