JP2006133296A

JP2006133296A - Voice interactive device

Info

Publication number: JP2006133296A
Application number: JP2004319269A
Authority: JP
Inventors: Kazuya Nomura; 和也野村; Hitoshi Araki; 均荒木; Hiroteru Kawasaki; 剛照川崎; Tadashi Yoshida; 直史吉田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2004-11-02
Filing date: 2004-11-02
Publication date: 2006-05-25

Abstract

PROBLEM TO BE SOLVED: To provide an easy-to-use voice interactive device that the user does not find bothersome.

SOLUTION: An interactive control part 12 has a dictionary control part 14 create a voice recognition dictionary that includes phrases, such as "I do not understand", that a user may utter when unable to find a reply to a question. A voice recognition part 11 performs voice recognition according to this voice recognition dictionary and outputs the result to the interactive control part 12. The interactive control part 12 has an unknown expression word deciding part 18 decide whether the voice recognition result agrees with an unknown expression word registered in an unknown expression word dictionary 17. If it does, a counter that counts how often an unknown expression word was given in reply to that question is incremented by one, and when the counter value exceeds a preset value, the question is deleted from the scenario so that it is not asked from the next time onward.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a voice interaction apparatus that recognizes a user's voice, synthesizes and outputs speech in response to the recognized voice, and processes the user's request by conducting a spoken dialogue with the user.

Conventionally, some voice interaction apparatuses achieve the user's purpose by asking the user a question and then repeatedly asking further questions that depend on the answers.

In such a voice interaction apparatus, when the user cannot answer a question or gives an ambiguous answer, the dialogue cannot continue and the user's purpose cannot be achieved.

To solve this problem, it has been proposed that when the user gives an answer meaning that something is unknown, such as "I don't know.", the question is treated as having received no answer and the dialogue is continued by asking the next question (see, for example, Patent Document 1).
Patent Document 1: Japanese Patent No. 3576511

However, in such a voice interaction apparatus, the user may repeatedly be unable to answer the same question because of the user's habits or the circumstances of use. In that case the user finds it bothersome to be asked a question that he or she cannot answer, and there is room for improvement in this respect.

The present invention has been made to solve this conventional problem, and its object is to provide an easy-to-use voice interaction apparatus that the user does not find bothersome.

The voice interaction apparatus of the present invention comprises: speech recognition means for recognizing input speech; unknown expression word determination means for determining whether the speech recognition result output by the speech recognition means is an unknown expression word, that is, a word expressing that the answer to a question is not known; and dialogue control means for ensuring that, when an unknown expression word has been input more than a set number of times for the same question, that question is not asked from the next time onward.

With this configuration, once an unknown expression word has been input repeatedly for the same question, that question is no longer asked. Consequently, the user is not asked questions to which he or she does not know the answer.
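The counting behavior described here can be illustrated with a minimal Python sketch. The class name `UnknownReplyTracker`, the question identifiers, and the threshold value of 2 are assumptions made for illustration only; the patent does not specify an implementation.

```python
from collections import Counter


class UnknownReplyTracker:
    """Counts unknown-expression replies per question (hypothetical helper)."""

    def __init__(self, threshold):
        self.threshold = threshold  # the "set number of times"
        self.counts = Counter()     # one counter per question

    def record_unknown(self, question_id):
        # Called when the reply to this question was an unknown expression word.
        self.counts[question_id] += 1

    def should_ask(self, question_id):
        # The question is dropped once its count exceeds the threshold.
        return self.counts[question_id] <= self.threshold


tracker = UnknownReplyTracker(threshold=2)
for _ in range(3):
    tracker.record_unknown("facility_type")
```

After three unknown replies with a threshold of 2, `tracker.should_ask("facility_type")` is false, while a question that has never received an unknown reply is still asked.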

The voice interaction apparatus of the present invention may also comprise: speech recognition means for recognizing input speech; unknown expression word determination means for determining whether the speech recognition result output by the speech recognition means is an unknown expression word expressing that the answer to a question is not known; and dialogue control means for ensuring that, when an unknown expression word has been input more than a set number of times for the same question, a different question is asked in place of that question from the next time onward.

With this configuration, once an unknown expression word has been input repeatedly for the same question, that question is no longer asked and a different question is asked instead. Consequently, the user is not asked questions to which he or she does not know the answer.

According to the present invention, the unknown expression word determination means determines whether the user has input an unknown expression word expressing that the answer to a question is not known, and when unknown expression words have been input for the same question more than a set number of times, that question is not asked from the next time onward. This makes it possible to avoid asking questions that the user probably cannot answer, so that the user is not bothered and usability is improved.

Embodiments of the present invention will be described below with reference to the drawings.

(First Embodiment)
FIG. 1 is a diagram showing a voice interaction apparatus according to a first embodiment of the present invention.

As shown in FIG. 1, the voice interaction apparatus of this embodiment comprises: a speech recognition unit 11 that recognizes speech input by the user; a dialogue control unit 12 that controls the spoken dialogue with the user; a dictionary storage unit 13 that stores, for all dialogue layers, the speech recognition dictionaries required for each dialogue layer (type, progress, and so on); a dictionary control unit 14 that, in response to a command from the dialogue control unit 12, creates the speech recognition dictionary used by the speech recognition unit 11 by selecting and combining one or more of the speech recognition dictionaries stored in the dictionary storage unit 13; a response voice output unit 15 that, in response to a command from the dialogue control unit 12, outputs a question voice or response voice prompting the user to speak; a response voice storage unit 16 that stores the plurality of voices used by the response voice output unit 15; an unknown expression word dictionary 17 in which words meaning that something is unknown (unknown expression words) are registered as entries; and an unknown expression word determination unit 18 that, in response to an inquiry from the dialogue control unit 12, refers to the unknown expression word dictionary 17 and determines whether a speech recognition result expresses that something is unknown. The apparatus is mounted on, for example, a navigation device and assists operations such as various searches and destination setting through voice input.

In this voice interaction apparatus, the dialogue control unit 12 controls the dialogue based on a scenario stored in advance in a built-in storage unit (not shown). The scenario is written in, for example, VoiceXML (the Voice eXtensible Markup Language; see http://radiofly.to/nishi/voicexml-sdoc/voicexml.html).

FIG. 2 shows the facility search scenario of this embodiment written in VoiceXML. The operation according to this scenario is described with reference to the dialogue flow diagram shown in FIG. 3.

First, when a voice dialogue is started at the user's instruction, the dialogue control unit 12 instructs the dictionary control unit 14 to create a dictionary containing words that represent types of operation. In response, the dictionary control unit 14 creates, from the dictionary storage unit 13, a speech recognition dictionary containing words that represent types of operation, as shown in FIG. 4.

The dialogue control unit 12 then instructs the response voice output unit 15 to output the message "How may I help you?" to the user. In response, the response voice output unit 15 selects the message "How may I help you?" from the response voice storage unit 16 and outputs it.

The dialogue control unit 12 also instructs the speech recognition unit 11 to perform speech recognition using the dictionary created by the dictionary control unit 14.

When the user, having heard the message "How may I help you?", utters "Facility search." to the voice interaction apparatus in order to search for a facility, the speech recognition unit 11 recognizes the utterance and outputs "Facility search." to the dialogue control unit 12 as the recognition result.

On receiving "Facility search." as the recognition result, the dialogue control unit 12 selects the scenario shown in FIG. 2 as the facility search scenario and instructs the dictionary control unit 14 to create a dictionary that contains words representing types of facility together with words, such as "I don't know.", that the user may utter when he or she does not know the type of facility. In response, the dictionary control unit 14 creates, from the dictionary storage unit 13 and the unknown expression word dictionary 17, a speech recognition dictionary containing words representing types of facility and words such as "I don't know.", as shown in FIG. 5.
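As a rough illustration of how the dictionary control unit 14 might combine stored dictionaries with the unknown expression words, the following Python sketch merges word lists. The dictionary contents, the layer names, and the function `build_recognition_dictionary` are hypothetical and are not taken from the figures.

```python
# Hypothetical per-layer dictionaries held by the dictionary storage unit 13.
DICTIONARY_STORE = {
    "facility_type": ["golf course", "restaurant", "hotel"],
    "prefecture": ["Tokyo", "Osaka", "Hokkaido"],
}
# Hypothetical entries of the unknown expression word dictionary 17.
UNKNOWN_WORDS = ["I don't know", "not sure", "no idea"]


def build_recognition_dictionary(layers, include_unknown=True):
    """Select one or more stored dictionaries and combine them into one word list."""
    words = []
    for layer in layers:
        words.extend(DICTIONARY_STORE[layer])
    if include_unknown:
        words.extend(UNKNOWN_WORDS)
    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(words))


facility_dictionary = build_recognition_dictionary(["facility_type"])
```

Including the unknown expression words in the active vocabulary is what allows a reply such as "I don't know." to be recognized at all, rather than being rejected as out of vocabulary.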

Then, in accordance with the scenario, the dialogue control unit 12 instructs the response voice output unit 15 to output the message "Please tell me the type of facility." to the user. In response, the response voice output unit 15 selects the message "Please tell me the type of facility." from the response voice storage unit 16 and outputs it (S11).

When the user, having heard the message "Please tell me the type of facility.", utters "Golf course." as the type of facility to search for, the speech recognition unit 11 recognizes the utterance and outputs "Golf course." to the dialogue control unit 12 as the recognition result.

On receiving "Golf course." as the recognition result, the dialogue control unit 12 instructs the unknown expression word determination unit 18 to determine whether "Golf course." is an unknown expression word. In response, the unknown expression word determination unit 18 refers to the unknown expression word dictionary 17, whose entries are words expressing that something is unknown (unknown expression words) as shown in FIG. 6, determines whether the result is an unknown expression word, and returns the determination. Because "Golf course." is not registered as an unknown expression word, the unit returns that it is not an unknown expression word.
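The determination step amounts to a dictionary lookup. A minimal Python sketch follows, assuming the recognition result arrives as text and that trailing punctuation and letter case should be ignored; the word set and the normalization are illustrative assumptions, not the contents of FIG. 6.

```python
# Hypothetical entries of the unknown expression word dictionary 17.
UNKNOWN_EXPRESSION_WORDS = {"i don't know", "not sure", "no idea"}


def is_unknown_expression(recognition_result):
    """Return True if the recognition result is a registered unknown expression word."""
    # Normalize: drop surrounding whitespace, a trailing period, and letter case.
    normalized = recognition_result.strip().rstrip(".。").lower()
    return normalized in UNKNOWN_EXPRESSION_WORDS
```

With these entries, `is_unknown_expression("I don't know.")` is true and `is_unknown_expression("Golf course.")` is false.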

When the determination result indicates that the reply is not an unknown expression word, the dialogue control unit 12, in order to narrow down the golf courses by location, instructs the dictionary control unit 14 to create a dictionary that contains prefecture names together with words, such as "I don't know.", that the user may utter when he or she does not know the prefecture in which the golf course is located. In response, the dictionary control unit 14 creates, from the dictionary storage unit 13 and the unknown expression word dictionary 17, a speech recognition dictionary containing prefecture names and words such as "I don't know.", as shown in FIG. 7.

Then, in accordance with the scenario, the dialogue control unit 12 instructs the response voice output unit 15 to output to the user a message that combines "golf course", as recognized by the speech recognition unit 11, with the fixed phrase asking for the prefecture in which it is located. In response, the response voice output unit 15 selects the message "Please tell me the prefecture where the golf course is located." from the response voice storage unit 16 and outputs it (S12).
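The combination of a recognized word with a fixed phrase can be sketched as simple template filling. The template text and the function `compose_prompt` below are assumptions for illustration; the Japanese original concatenates the recognized word with a suffix phrase, whereas an English rendering fits a template with a placeholder more naturally.

```python
# Hypothetical English template for the prefecture question.
PREFECTURE_TEMPLATE = "Please tell me the prefecture where the {facility} is located."


def compose_prompt(template, recognized):
    """Insert the recognized facility type into a prompt template."""
    # Drop the trailing period and capitalization of the recognition result.
    facility = recognized.strip().rstrip(".").lower()
    return template.format(facility=facility)


prompt = compose_prompt(PREFECTURE_TEMPLATE, "Golf course.")
```

Here `prompt` is "Please tell me the prefecture where the golf course is located."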

When the user, having heard the message "Please tell me the prefecture where the golf course is located.", utters "Tokyo." as the prefecture, the speech recognition unit 11 recognizes the utterance and outputs "Tokyo." to the dialogue control unit 12 as the recognition result.

On receiving "Tokyo." as the recognition result, the dialogue control unit 12 instructs the unknown expression word determination unit 18 to determine whether "Tokyo." is an unknown expression word. In response, the unknown expression word determination unit 18 refers to the unknown expression word dictionary 17 in the same way as described above, determines whether the result is an unknown expression word, and returns the determination. Because "Tokyo." is not registered as an unknown expression word, the unit returns that it is not an unknown expression word.

When the determination result indicates that the reply is not an unknown expression word, the dialogue control unit 12 instructs the dictionary control unit 14 to create a dictionary containing the names of all golf courses in Tokyo. In response, the dictionary control unit 14 creates, from the dictionary storage unit 13, a speech recognition dictionary containing the names of all golf courses in Tokyo.

Then, in accordance with the scenario, the dialogue control unit 12 instructs the response voice output unit 15 to output to the user a message that combines "golf course", which was input as the type of facility, with the fixed phrase asking for its name. In response, the response voice output unit 15 selects the message "Please tell me the name of the golf course." from the response voice storage unit 16 and outputs it (S13).

When the user, having heard the message "Please tell me the name of the golf course.", utters "XX Golf Course." as the name of the golf course, the speech recognition unit 11 recognizes the utterance and outputs "XX Golf Course." to the dialogue control unit 12 as the recognition result.

On receiving "XX Golf Course." as the recognition result, the dialogue control unit 12 treats the search target as determined and, in accordance with the scenario, instructs the response voice output unit 15 to output to the user a message that combines the determined search target "XX Golf Course" with the fixed phrase announcing that its map will be displayed. In response, the response voice output unit 15 selects the message "Displaying a map of XX Golf Course." from the response voice storage unit 16 and outputs it (S14).

The dialogue control unit 12 also instructs the navigation device to display a map of "XX Golf Course".

In this way, a map of the area around the searched facility, for example, can be displayed on the display screen of the navigation device.

In this voice interaction apparatus, as shown in FIG. 8, when the user utters "I don't know." in response to, for example, the message "Please tell me the type of facility.", the speech recognition unit 11 recognizes the utterance and outputs "I don't know." to the dialogue control unit 12 as the recognition result.

On receiving "I don't know." as the recognition result, the dialogue control unit 12 instructs the unknown expression word determination unit 18 to determine whether "I don't know." is an unknown expression word. In response, the unknown expression word determination unit 18 refers to the unknown expression word dictionary 17 in the same way as described above, determines whether the result is an unknown expression word, and returns the determination. Because "I don't know." is registered as an unknown expression word, the unit returns that it is an unknown expression word.

When the determination result indicates an unknown expression word, the dialogue control unit 12 increments by one the counter, provided for each question, that counts the number of times an unknown expression word has been returned, and proceeds to the next step of the scenario. To narrow down the facilities by location, it instructs the dictionary control unit 14 to create a dictionary that contains prefecture names together with words, such as "I don't know.", that the user may utter when he or she does not know the prefecture in which the golf course is located. In response, the dictionary control unit 14 creates, from the dictionary storage unit 13 and the unknown expression word dictionary 17, a speech recognition dictionary containing prefecture names and words such as "I don't know.", as shown in FIG. 7.

Then, in accordance with the scenario, the dialogue control unit 12 instructs the response voice output unit 15 to output to the user a message that combines the generic word "facility" with the fixed phrase asking for the prefecture in which it is located. In response, the response voice output unit 15 selects the message "Please tell me the prefecture where the facility is located." from the response voice storage unit 16 and outputs it (S12).
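The handling of an unknown reply in this flow (count it, move on, and use the generic word "facility" in later prompts) can be sketched in Python as follows. The function `handle_reply`, the dictionaries used for state, and the prompt wording are hypothetical.

```python
def handle_reply(question_id, reply, slots, unknown_counts, unknown_words, generic_value):
    """Store the reply, or count it and fall back to a generic value if it is an unknown word."""
    if reply in unknown_words:
        # Count the unknown reply for this question and continue the dialogue.
        unknown_counts[question_id] = unknown_counts.get(question_id, 0) + 1
        slots[question_id] = generic_value
    else:
        slots[question_id] = reply
    return slots[question_id]


slots, unknown_counts = {}, {}
value = handle_reply(
    "facility_type", "I don't know", slots, unknown_counts, {"I don't know"}, "facility"
)
next_prompt = f"Please tell me the prefecture where the {value} is located."
```

The dialogue is not interrupted: the counter for the facility type question becomes 1 and the next prompt refers to "the facility" instead of a specific type.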

When the user, having heard the message "Please tell me the prefecture where the facility is located.", utters "Tokyo." as the prefecture, the speech recognition unit 11 recognizes the utterance and outputs "Tokyo." to the dialogue control unit 12 as the recognition result.

On receiving "Tokyo." as the recognition result, the dialogue control unit 12 instructs the unknown expression word determination unit 18 to determine whether "Tokyo." is an unknown expression word. In response, the unknown expression word determination unit 18 refers to the unknown expression word dictionary 17 in the same way as described above, determines whether the result is a word expressing that something is unknown, and returns the determination. Because "Tokyo." is not registered as such a word, the unit returns that it is not an unknown expression word.

When the determination result indicates that the reply is not an unknown expression word, the dialogue control unit 12 instructs the dictionary control unit 14 to create a dictionary containing the names of all facilities in Tokyo. In response, the dictionary control unit 14 creates, from the dictionary storage unit 13, a speech recognition dictionary containing the names of all facilities in Tokyo.

Then, in accordance with the scenario, the dialogue control unit 12 instructs the response voice output unit 15 to output to the user a message that combines the generic word "facility" with the fixed phrase asking for its name. In response, the response voice output unit 15 selects the message "Please tell me the name of the facility." from the response voice storage unit 16 and outputs it (S13).

When the user, having heard the message "Please tell me the name of the facility.", utters "XX Golf Course." as the name of the golf course, the speech recognition unit 11 recognizes the utterance and outputs "XX Golf Course." to the dialogue control unit 12 as the recognition result.

In this way, even when an unknown expression word such as "I don't know." is input partway through the dialogue, the dialogue continues and a map of the area around the searched facility, for example, can be displayed on the display screen of the navigation device.

After a complete dialogue has finished (for example, after the facility search dialogue has ended and the search target has been determined), the dialogue control unit 12 also compares, for each question used in that dialogue, the value of the per-question counter of unknown expression word replies with a preset value. If there is a question whose counter value is greater than the preset value, the dialogue control unit 12 modifies the scenario by deleting that question.

For example, in the facility search scenario of FIG. 2, when the number of times an unknown expression word has been returned in reply to the question asking for the type of facility becomes greater than the preset value, as in FIG. 8, the dialogue control unit 12 modifies the scenario as shown in FIG. 9: it deletes the question asking for the type of facility and changes each part that uses the answer to that question to the representative word "facility". From the next facility search onward, processing follows the modified scenario.
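The scenario modification can be sketched in Python with the scenario modeled as a list of questions whose prompts may reference earlier answers through placeholders. This data model, the function `prune_scenario`, and the prompt texts are assumptions for illustration; the actual scenario in FIG. 2 and FIG. 9 is written in VoiceXML.

```python
def prune_scenario(scenario, unknown_counts, threshold, generic_values):
    """Return a scenario without over-threshold questions, with their uses made generic."""
    removed = {q["id"] for q in scenario if unknown_counts.get(q["id"], 0) > threshold}
    pruned = []
    for question in scenario:
        if question["id"] in removed:
            continue  # the question itself is deleted
        prompt = question["prompt"]
        for removed_id in removed:
            # Replace references to the deleted answer with a representative word.
            placeholder = "{" + removed_id + "}"
            prompt = prompt.replace(placeholder, generic_values.get(removed_id, removed_id))
        pruned.append({"id": question["id"], "prompt": prompt})
    return pruned


scenario = [
    {"id": "facility_type", "prompt": "Please tell me the type of facility."},
    {"id": "prefecture", "prompt": "Please tell me the prefecture where the {facility_type} is located."},
    {"id": "name", "prompt": "Please tell me the name of the {facility_type}."},
]
revised = prune_scenario(scenario, {"facility_type": 3}, 2, {"facility_type": "facility"})
```

With a counter value of 3 and a preset value of 2, `revised` starts directly with the prefecture question, and both remaining prompts refer to "the facility".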

Specifically, as in the dialogue flow diagram shown in FIG. 10, when a voice dialogue is first started at the user's instruction, the dialogue control unit 12 instructs the dictionary control unit 14 to create a dictionary containing words that represent types of operation. In response, the dictionary control unit 14 creates, from the dictionary storage unit 13, a speech recognition dictionary containing words that represent types of operation, as shown in FIG. 4.

On receiving "Facility search." as the recognition result, the dialogue control unit 12 selects the scenario shown in FIG. 9 as the facility search scenario and, in order to narrow down the facilities by location, instructs the dictionary control unit 14 to create a dictionary that contains prefecture names together with words, such as "I don't know.", that the user may utter when he or she does not know the prefecture in which the facility is located. In response, the dictionary control unit 14 creates, from the dictionary storage unit 13 and the unknown expression word dictionary 17, a speech recognition dictionary containing prefecture names and words such as "I don't know.", as shown in FIG. 7.

Then, in accordance with the scenario, the dialogue control unit 12 instructs the response voice output unit 15 to output the message "Please tell me the prefecture where the facility is located." to the user. In response, the response voice output unit 15 selects the message "Please tell me the prefecture where the facility is located." from the response voice storage unit 16 and outputs it (S21).

Then, in accordance with the scenario, the dialogue control unit 12 instructs the response voice output unit 15 to output the message "Please tell me the name of the facility." to the user. In response, the response voice output unit 15 selects the message "Please tell me the name of the facility." from the response voice storage unit 16 and outputs it (S22).

When the user, having heard the message "Please tell me the name of the facility.", utters "XX Golf Course." as the name of the facility, the speech recognition unit 11 recognizes the utterance and outputs "XX Golf Course." to the dialogue control unit 12 as the recognition result.

On receiving "XX Golf Course." as the recognition result, the dialogue control unit 12 treats the search target as determined and, in accordance with the scenario, instructs the response voice output unit 15 to output to the user a message that combines the determined search target "XX Golf Course" with the fixed phrase announcing that its map will be displayed. In response, the response voice output unit 15 selects the message "Displaying a map of XX Golf Course." from the response voice storage unit 16 and outputs it (S23).

As described above, in this embodiment, words expressing that something is unknown are registered in the unknown expression word dictionary 17. When the user replies to a question from the apparatus with a word registered in the unknown expression word dictionary 17, the counter provided for that question is incremented, and when the counter value becomes greater than a preset value, the question is deleted from the scenario. The question is therefore not asked from the next time onward, which spares the user the annoyance and improves usability.

This embodiment describes the case where the apparatus is mounted on a navigation device whose users are limited, so users are not identified. In an apparatus used by multiple users, however, users may be identified and a per-question counter may be provided for each user so that questions can be deleted on a per-user basis. Specifically, authentication by voice, fingerprint, face image, IC card, or the like is used.
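A per-user variant of the counters might look like the following Python sketch. It assumes that some authentication step has already produced a user identifier; the class `PerUserUnknownCounters` and the identifiers are hypothetical.

```python
from collections import defaultdict


class PerUserUnknownCounters:
    """Keeps one unknown-reply counter per (user, question) pair (hypothetical helper)."""

    def __init__(self, threshold):
        self.threshold = threshold
        # user_id -> question_id -> count
        self._counts = defaultdict(lambda: defaultdict(int))

    def record_unknown(self, user_id, question_id):
        self._counts[user_id][question_id] += 1

    def is_suppressed(self, user_id, question_id):
        # A question is suppressed only for the user whose count exceeds the threshold.
        return self._counts[user_id][question_id] > self.threshold


counters = PerUserUnknownCounters(threshold=1)
counters.record_unknown("user_a", "facility_type")
counters.record_unknown("user_a", "facility_type")
```

Here the facility type question is suppressed for `user_a` but is still asked of any other user.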

In this embodiment, the counter is incremented when the user replies with a word registered in the unknown expression word dictionary 17. Alternatively, a timer may be set when a question is asked, and the per-question counter may also be incremented when the timer times out (that is, when the user cannot answer and remains silent).

The per-question counter may also be incremented when the user's reply is not in the speech recognition dictionary and therefore cannot be recognized.
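These two variations treat silence and unrecognized speech the same way as an unknown expression word. A minimal Python sketch, with the outcome names and the function `update_counter` chosen purely for illustration:

```python
from enum import Enum, auto


class ReplyOutcome(Enum):
    ANSWERED = auto()        # a usable answer was recognized
    UNKNOWN_WORD = auto()    # the reply matched the unknown expression word dictionary
    TIMEOUT = auto()         # the timer expired while the user stayed silent
    NOT_RECOGNIZED = auto()  # the reply was not in the speech recognition dictionary


# Outcomes that increment the per-question counter in this variation.
COUNTED_OUTCOMES = {
    ReplyOutcome.UNKNOWN_WORD,
    ReplyOutcome.TIMEOUT,
    ReplyOutcome.NOT_RECOGNIZED,
}


def update_counter(counts, question_id, outcome):
    """Increment the question's counter for counted outcomes and return its value."""
    if outcome in COUNTED_OUTCOMES:
        counts[question_id] = counts.get(question_id, 0) + 1
    return counts.get(question_id, 0)
```

How the timeout itself is detected (for example, the timer length) is not specified in the text and is left out of the sketch.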

Also, in the present embodiment, a question is deleted from the scenario when words expressing that the answer is unknown are given in response to the same question more than a set number of times. The question may instead be revived at a frequency low enough not to annoy the user: if it is then answered correctly, it is kept, and if it is again answered with a word expressing that the answer is unknown, it is deleted again.
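The specification does not say how the revival frequency is chosen. As a minimal sketch, assume a deleted question is re-asked on every fifth dialogue; the constant `REVIVE_EVERY` and both function names are hypothetical.

```python
REVIVE_EVERY = 5  # assumed: re-ask a deleted question on every 5th dialogue


def ask_this_time(deleted, dialogue_number):
    """Active questions are always asked; deleted ones only on revival turns."""
    return (not deleted) or dialogue_number % REVIVE_EVERY == 0


def deleted_after_revival(answer_was_unknown):
    """New 'deleted' flag after a revived question has been asked once.

    A proper answer keeps the question; an unknown expression deletes it again.
    """
    return answer_was_unknown
```

With these assumptions a deleted question is skipped on dialogues 1 to 4, asked on dialogue 5, and stays in the scenario only if the user answers it properly.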

Also, in the present embodiment, a question is deleted from the scenario when words expressing that the answer is unknown are given in response to the same question more than a set number of times. Alternatively, a scenario from which the question has already been deleted may be prepared in advance and substituted for the current scenario.

(Second Embodiment)
Next, FIG. 11 is a diagram showing a voice interaction device according to a second embodiment of the present invention. Since the present embodiment is configured in the same manner as the first embodiment described above, the drawings are reused, the same reference numerals are given to the same components, and only the characteristic portions are described.

The dialogue control unit 21 of the voice interaction device of the present embodiment holds a scenario, as shown in FIG. 12, in which another preset question is asked when the user answers with a word expressing that the answer is unknown. It is characterized in that, when the number of times a question has been answered with such a word exceeds a preset number, the scenario is rewritten so that the question is deleted and the other question written in the scenario is asked instead.

Specifically, as shown in the dialogue flow diagram of FIG. 13, when a voice dialogue is started by a user instruction, the dialogue control unit 21 first instructs the dictionary control unit 14 to create a dictionary containing words representing the types of operation. In response to this instruction, the dictionary control unit 14 creates, from the dictionary storage unit 13, a speech recognition dictionary containing words representing the types of operation as shown in FIG. 4.

The dialogue control unit 21 then instructs the response voice output unit 15 to output the message "How may I help you?" to the user. In response to this instruction, the response voice output unit 15 selects the message "How may I help you?" from the response voice storage unit 16 and outputs it.

The dialogue control unit 21 also instructs the speech recognition unit 11 to perform speech recognition using the dictionary created by the dictionary control unit 14.

When the user, having heard the message "How may I help you?", utters "Facility search." to the voice interaction device in order to search for a facility, the speech recognition unit 11 recognizes the utterance and outputs "Facility search." to the dialogue control unit 21 as the recognition result.

On receiving "Facility search." as the recognition result, the dialogue control unit 21 selects the scenario shown in FIG. 12 as the facility search scenario. To narrow down the facilities by location, it instructs the dictionary control unit 14 to create a dictionary containing the prefecture names together with words, such as "I don't know.", that the user may utter when the prefecture in which the facility is located is not known. In response to this instruction, the dictionary control unit 14 creates, from the dictionary storage unit 13 and the unknown expression word dictionary 17, a speech recognition dictionary containing the prefecture names as shown in FIG. 7 and words such as "I don't know.".
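The dictionary creation described above amounts to merging the answer vocabulary for the current question with the unknown expression words. A minimal sketch, assuming both sources are plain lists of phrases (the function name and the sample entries other than "Tokyo." and "I don't know." are hypothetical):

```python
def build_recognition_dictionary(vocabulary, unknown_expressions):
    """Merge both word lists, preserving order and dropping duplicates."""
    return list(dict.fromkeys([*vocabulary, *unknown_expressions]))


# Illustrative subset of the prefecture names held in the dictionary storage.
prefecture_names = ["Tokyo.", "Osaka."]
# Stand-in for the unknown expression word dictionary 17.
unknown_expression_words = ["I don't know."]

recognition_dictionary = build_recognition_dictionary(
    prefecture_names, unknown_expression_words
)
print(recognition_dictionary)  # ['Tokyo.', 'Osaka.', "I don't know."]
```

Because the unknown expression words are part of the active dictionary, the recognizer can return "I don't know." as an ordinary recognition result, and the later determination step only has to look the result up.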

In accordance with the scenario, the dialogue control unit 21 then instructs the response voice output unit 15 to output the message "Please say the name of the prefecture where the facility is located." to the user. In response to this instruction, the response voice output unit 15 selects the message "Please say the name of the prefecture where the facility is located." from the response voice storage unit 16 and outputs it (S31).

When the user, having heard the message "Please say the name of the prefecture where the facility is located.", utters "Tokyo." to the voice interaction device as the prefecture, the speech recognition unit 11 recognizes the utterance and outputs "Tokyo." to the dialogue control unit 21 as the recognition result.

On receiving "Tokyo." as the recognition result, the dialogue control unit 21 instructs the unknown expression word determination unit 18 to determine whether "Tokyo." is an unknown expression word. In response to this instruction, the unknown expression word determination unit 18, as in the embodiment described above, refers to the unknown expression word dictionary 17, determines whether the word is an unknown expression word, and returns the result. Since "Tokyo." is not registered as an unknown expression word, the result returned is that it is not an unknown expression word.

In accordance with the scenario, the dialogue control unit 21 evaluates the result of the unknown expression word determination (S32). On determining that the answer is not an unknown expression word, it instructs the dictionary control unit 14 to create a dictionary containing the names of all facilities in Tokyo. In response to this instruction, the dictionary control unit 14 creates, from the dictionary storage unit 13, a speech recognition dictionary containing the names of all facilities in Tokyo.

In accordance with the scenario, the dialogue control unit 21 then instructs the response voice output unit 15 to output the message "Please say the name of the facility." to the user. In response to this instruction, the response voice output unit 15 selects the message "Please say the name of the facility." from the response voice storage unit 16 and outputs it (S35).

When the user, having heard the message "Please say the name of the facility.", utters "XX Golf Course." to the voice interaction device as the facility name, the speech recognition unit 11 recognizes the utterance and outputs "XX Golf Course." to the dialogue control unit 21 as the recognition result.

On receiving "XX Golf Course." as the recognition result, the dialogue control unit 21 treats the search target as confirmed in accordance with the scenario and instructs the response voice output unit 15 to output to the user a message that combines the confirmed search target "XX Golf Course." with the phrase "Displaying the map of". In response to this instruction, the response voice output unit 15 selects the message "Displaying the map of XX Golf Course." from the response voice storage unit 16 and outputs it (S36).

The dialogue control unit 21 also instructs the navigation device to display the map of "XX Golf Course.".

In this way, in a facility search, the location of the facility is narrowed down first, and then, using the facility name, a map of the area around the target facility or the like can be displayed on the display screen of the navigation device.

In such a voice interaction device, as shown in FIG. 14, when the user utters "I don't know." to the voice interaction device in response to the message "Please say the name of the prefecture where the facility is located.", the speech recognition unit 11 recognizes the utterance and outputs "I don't know." to the dialogue control unit 21 as the recognition result.

On receiving "I don't know." as the recognition result, the dialogue control unit 21 instructs the unknown expression word determination unit 18 to determine whether "I don't know." is an unknown expression word. In response to this instruction, the unknown expression word determination unit 18, in the same manner as described above, refers to the unknown expression word dictionary 17, determines whether the word is an unknown expression word, and returns the result. Since "I don't know." is registered as an unknown expression word, the result returned is that it is an unknown expression word.

In accordance with the scenario, the dialogue control unit 21 evaluates the result of the unknown expression word determination (S32). On determining that the answer is an unknown expression word, it increments by one the counter, provided for each question, that counts the number of times an unknown expression word has been returned, and proceeds to the next step of the scenario. To narrow down the facilities by type, it instructs the dictionary control unit 14 to create a dictionary containing words representing facility types together with words, such as "I don't know.", that the user may utter when the facility type is not known. In response to this instruction, the dictionary control unit 14 creates, from the dictionary storage unit 13 and the unknown expression word dictionary 17, a speech recognition dictionary containing words representing facility types as shown in FIG. 5 and words such as "I don't know.".

In accordance with the scenario, the dialogue control unit 21 then instructs the response voice output unit 15 to output the message "Please say the type of facility." to the user. In response to this instruction, the response voice output unit 15 selects the message "Please say the type of facility." from the response voice storage unit 16 and outputs it (S33).

When the user, having heard the message "Please say the type of facility.", utters "Golf course." to the voice interaction device as the type of facility to search for, the speech recognition unit 11 recognizes the utterance and outputs "Golf course." to the dialogue control unit 21 as the recognition result.

On receiving "Golf course." as the recognition result, the dialogue control unit 21 instructs the unknown expression word determination unit 18 to determine whether "Golf course." is an unknown expression word. In response to this instruction, the unknown expression word determination unit 18, in the same manner as described above, refers to the unknown expression word dictionary 17, determines whether the word is an unknown expression word, and returns the result. Since "Golf course." is not registered as an unknown expression word, the result returned is that it is not an unknown expression word.

When the determination result returned indicates that the answer is not an unknown expression word, the dialogue control unit 21 instructs the dictionary control unit 14 to create a dictionary containing the names of all golf courses nationwide. In response to this instruction, the dictionary control unit 14 creates, from the dictionary storage unit 13, a speech recognition dictionary containing the names of all golf courses nationwide.

In accordance with the scenario, the dialogue control unit 21 then instructs the response voice output unit 15 to output to the user a message that combines "Golf course.", which was input as the facility type, with the phrase "Please say the name of the". In response to this instruction, the response voice output unit 15 selects the message "Please say the name of the golf course." from the response voice storage unit 16 and outputs it (S34).

When the user, having heard the message "Please say the name of the golf course.", utters "XX Golf Course." to the voice interaction device as the name of the golf course, the speech recognition unit 11 recognizes the utterance and outputs "XX Golf Course." to the dialogue control unit 21 as the recognition result.

In this way, when an unknown expression word such as "I don't know." is input in response to the question asking for the prefecture where the facility is located, the device asks for the facility type instead and continues the dialogue, so that a map of the area around the target facility or the like can be displayed on the display screen of the navigation device.
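The two paths through the facility search dialogue (steps S31 to S36) can be sketched as a single function that takes the user's replies in order and returns the prompts the device would issue. This is an illustration only: the function name, the English prompt wording, and the assumption that the alternative path also ends at S36 are not taken from the flow diagrams, which are not reproduced here.

```python
UNKNOWN_EXPRESSIONS = {"I don't know."}  # stand-in for dictionary 17


def run_facility_search(answers):
    """Walk the facility search scenario; `answers` holds the replies in order.

    Returns the list of prompts issued and the unknown-answer count for the
    prefecture question.
    """
    answers = iter(answers)
    prompts = []

    def ask(prompt):
        prompts.append(prompt)
        return next(answers)

    unknown_count = 0
    reply = ask("Please say the name of the prefecture where the facility is located.")  # S31
    if reply in UNKNOWN_EXPRESSIONS:  # S32: unknown expression word
        unknown_count += 1
        kind = ask("Please say the type of facility.")  # S33
        target = ask(f"Please say the name of the {kind.rstrip('.').lower()}.")  # S34
    else:  # S32: ordinary answer
        target = ask("Please say the name of the facility.")  # S35
    prompts.append(f"Displaying the map of {target.rstrip('.')}.")  # S36
    return prompts, unknown_count


prompts, count = run_facility_search(["I don't know.", "Golf course.", "XX Golf Course."])
print(prompts[-1])  # Displaying the map of XX Golf Course.
print(count)        # 1
```

Returning the counter alongside the prompts mirrors the description: the dialogue itself continues with the alternative question, and the count is only evaluated after the dialogue has finished.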

After a dialogue has been completed (for example, after the facility search dialogue has ended and the search target has been confirmed), the dialogue control unit 21 compares, for each question used in that dialogue, the value of the counter that counts the number of times an unknown expression word was returned with a preset value. If there is a question whose counter value is larger than the preset value, the scenario is modified so that the question is deleted and replaced with another question.

For example, in the facility search scenario of FIG. 12, when the number of times an unknown expression word has been returned in response to the question asking for the prefecture, as in FIG. 14, becomes larger than the preset value, the dialogue control unit 21 modifies the scenario as shown in FIG. 15: the question asking for the prefecture where the facility is located is deleted, and the question asking for the facility type, which is written in the scenario as its alternative, is asked instead. From the next facility search onward, processing follows the modified scenario.
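The scenario modification can be sketched as a pass over the scenario steps that swaps each over-limit question for the alternative written in the scenario. The step labels, the `alternatives` mapping, and the limit of 2 are hypothetical; the specification does not define a data format for the scenario.

```python
PRESET_LIMIT = 2  # assumed value


def rewrite_scenario(steps, alternatives, unknown_counts):
    """Replace each over-limit question with its alternative from the scenario."""
    rewritten = []
    for step in steps:
        over_limit = unknown_counts.get(step, 0) > PRESET_LIMIT
        if over_limit and step in alternatives:
            rewritten.append(alternatives[step])  # ask the alternative instead
        else:
            rewritten.append(step)  # keep the question unchanged
    return rewritten


steps = ["ask_prefecture", "ask_facility_name"]
alternatives = {"ask_prefecture": "ask_facility_type"}
print(rewrite_scenario(steps, alternatives, {"ask_prefecture": 3}))
# ['ask_facility_type', 'ask_facility_name']
```

The function returns a new list rather than editing `steps` in place, so the original scenario remains available if a question is later to be restored.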

Specifically, as shown in the dialogue flow diagram of FIG. 16, when a voice dialogue is started by a user instruction, the dialogue control unit 21 first instructs the dictionary control unit 14 to create a dictionary containing words representing the types of operation. In response to this instruction, the dictionary control unit 14 creates, from the dictionary storage unit 13, a speech recognition dictionary containing words representing the types of operation as shown in FIG. 4.

On receiving "Facility search." as the recognition result, the dialogue control unit 21 selects the scenario shown in FIG. 15 as the facility search scenario and instructs the dictionary control unit 14 to create a dictionary containing words representing facility types together with words, such as "I don't know.", that the user may utter when the facility type is not known. In response to this instruction, the dictionary control unit 14 creates, from the dictionary storage unit 13 and the unknown expression word dictionary 17, a speech recognition dictionary containing words representing facility types as shown in FIG. 5 and words such as "I don't know.".

In accordance with the scenario, the dialogue control unit 21 then instructs the response voice output unit 15 to output the message "Please say the type of facility." to the user. In response to this instruction, the response voice output unit 15 selects the message "Please say the type of facility." from the response voice storage unit 16 and outputs it (S41).

On receiving "Golf course." as the recognition result, the dialogue control unit 21 instructs the unknown expression word determination unit 18 to determine whether "Golf course." is an unknown expression word. In response to this instruction, the unknown expression word determination unit 18, in the same manner as described above, refers to the unknown expression word dictionary 17, determines whether the word is an unknown expression word, and returns the result. Since "Golf course." is not registered as an unknown expression word, the result returned is that it is not an unknown expression word.

When the determination result returned indicates that the answer is not an unknown expression word, the dialogue control unit 21 instructs the dictionary control unit 14 to create a dictionary containing the names of all golf courses nationwide. In response to this instruction, the dictionary control unit 14 creates, from the dictionary storage unit 13, a speech recognition dictionary containing the names of all golf courses nationwide.

In accordance with the scenario, the dialogue control unit 21 then instructs the response voice output unit 15 to output to the user a message that combines "Golf course.", which was input as the facility type, with the phrase "Please say the name of the". In response to this instruction, the response voice output unit 15 selects the message "Please say the name of the golf course." from the response voice storage unit 16 and outputs it (S42).

On receiving "XX Golf Course." as the recognition result, the dialogue control unit 21 treats the search target as confirmed in accordance with the scenario and instructs the response voice output unit 15 to output to the user a message that combines the confirmed search target "XX Golf Course." with the phrase "Displaying the map of". In response to this instruction, the response voice output unit 15 selects the message "Displaying the map of XX Golf Course." from the response voice storage unit 16 and outputs it (S43).

As described above, in the present embodiment, words expressing that something is unknown are registered in the unknown expression word dictionary 17. When the user answers a question issued by the device with a word registered in the unknown expression word dictionary 17, the counter provided for that question is incremented, and when the counter value becomes larger than a preset value, the question is deleted from the scenario and replaced with another question. The question is therefore not asked from the next time onward, which spares the user annoyance and improves usability.

Also, in the present embodiment, when words expressing that the answer is unknown are given in response to the same question more than a set number of times, the scenario is modified so that the question is deleted and replaced with another question. Alternatively, a scenario in which the question has already been deleted and replaced with another question may be prepared in advance and substituted for the current scenario.

As described above, the voice interaction device according to the present invention has the effect of improving usability without making the user feel annoyed, and is useful as a voice interaction device or the like that recognizes a user's voice, synthesizes and outputs a voice corresponding to the recognized voice, and processes the user's request by conducting a dialogue with the user by voice.

FIG. 1 is a block diagram of the voice interaction device according to the first embodiment of the present invention.
FIG. 2 is a diagram showing an example of a scenario of the voice interaction device according to the first embodiment of the present invention.
FIG. 3 is a dialogue flow diagram for explaining the operation of the voice interaction device according to the first embodiment of the present invention.
FIG. 4 is a diagram showing a speech recognition dictionary for recognizing the type of operation in the voice interaction device according to the first embodiment of the present invention.
FIG. 5 is a diagram showing a speech recognition dictionary for recognizing the type of facility in the voice interaction device according to the first embodiment of the present invention.
FIG. 6 is a diagram showing the contents of the unknown expression word dictionary of the voice interaction device according to the first embodiment of the present invention.
FIG. 7 is a diagram showing a speech recognition dictionary for recognizing prefecture names in the voice interaction device according to the first embodiment of the present invention.
FIG. 8 is a dialogue flow diagram for the case where an unknown expression word is given as the answer in the voice interaction device according to the first embodiment of the present invention.
FIG. 9 is a diagram showing an example of a modified scenario of the voice interaction device according to the first embodiment of the present invention.
FIG. 10 is a dialogue flow diagram for explaining operation under the modified scenario of the voice interaction device according to the first embodiment of the present invention.
FIG. 11 is a block diagram of the voice interaction device according to the second embodiment of the present invention.
FIG. 12 is a diagram showing an example of a scenario of the voice interaction device according to the second embodiment of the present invention.
FIG. 13 is a dialogue flow diagram for explaining the operation of the voice interaction device according to the second embodiment of the present invention.
FIG. 14 is a dialogue flow diagram for the case where an unknown expression word is given as the answer in the voice interaction device according to the second embodiment of the present invention.
FIG. 15 is a diagram showing an example of a modified scenario of the voice interaction device according to the second embodiment of the present invention.
FIG. 16 is a dialogue flow diagram for explaining operation under the modified scenario of the voice interaction device according to the second embodiment of the present invention.

Explanation of symbols

11 Speech recognition unit
12 Dialogue control unit
13 Dictionary storage unit
14 Dictionary control unit
15 Response voice output unit
16 Response voice storage unit
17 Unknown expression word dictionary
18 Unknown expression word determination unit
21 Dialogue control unit

Claims

A voice interaction device comprising: speech recognition means for recognizing input speech; unknown expression word determination means for determining whether a speech recognition result output by the speech recognition means is an unknown expression word expressing that the answer to a question is unknown; and dialogue control means for, when the unknown expression word has been input in response to the same question more than a set number of times, causing that question not to be asked from the next time onward.

A voice interaction device comprising: speech recognition means for recognizing input speech; unknown expression word determination means for determining whether a speech recognition result output by the speech recognition means is an unknown expression word expressing that the answer to a question is unknown; and dialogue control means for, when the unknown expression word has been input in response to the same question more than a set number of times, asking another question in place of that question.