JP2009037458A

JP2009037458A - Response system and response content control method

Info

Publication number: JP2009037458A
Application number: JP2007201828A
Authority: JP
Inventors: Ryo Murakami; 涼村上
Original assignee: Toyota Motor Corp
Current assignee: Toyota Motor Corp
Priority date: 2007-08-02
Filing date: 2007-08-02
Publication date: 2009-02-19

Abstract

PROBLEM TO BE SOLVED: To provide a response system and a response content control method capable of outputting a proper response sentence.

SOLUTION: The response system according to one embodiment outputs a response sentence to an input sentence, and has a microphone 31 for inputting the input sentence, a language understanding part 12 for understanding language information of the input sentence input into the microphone 31, a dialogue management part 13 for determining a response proposal sentence in response to the language information, a response sentence generating part 14 for generating the response sentence based on the response proposal sentence, and a speaker 32 for outputting the response sentence generated by the response sentence generating part 14. The dialogue management part 13 calculates the similarity between the response proposal sentence and a prohibited sentence by referring to a prohibited sentence database 16 storing the prohibited sentence. The dialogue management part 13 limits output of the response proposal sentence according to the result of comparing the similarity with a determination value.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a response system and a response content control method, and more particularly to a response system that responds to input language information and a response content control method that controls the content of that response.

In recent years, speech dialogue systems have been developed that perform speech recognition on speech input to a microphone and carry on a dialogue based on the recognition result (Non-Patent Document 1). Voice dialogue robots that guide visitors around exhibition facilities and similar venues have also been developed.

In such speech dialogue processing, for example, speech recognition is performed on the input speech to convert the speech data into text data. The text data is then processed by a language understanding module, a dialogue management unit performs information retrieval, and a response sentence is generated.

Non-Patent Document 1: Tatsuya Kawahara and Masahiro Araki, "Spoken Dialogue System", Ohmsha (published October 15, 2006)

However, when a speech dialogue system is used in an exhibition facility, it may give a response that is inappropriate for the user. For example, at a vehicle exhibition facility, the system may utter a response sentence that facility users find objectionable. This problem is described with reference to FIG. 10, which schematically shows a voice dialogue robot responding to a question spoken by a user.

Suppose that, at a vehicle exhibition facility, a user 105 with time to spare during vehicle maintenance asks the voice dialogue robot 104 for the latest news. Here, the user 105 says to the voice dialogue robot 104, "Tell me the latest news." In response, the voice dialogue robot 104 downloads news via the Internet and reads the latest news aloud.

Here, the voice dialogue robot 104 utters the following response sentence to the user 105: "There was an incident in Nagano in which someone killed a neighbor's pet dog, saying that its barking was too loud." When cruel news of this kind is spoken, the user 105 is left feeling uncomfortable.

Thus, the conventional voice dialogue robot 104 may give an inappropriate response that makes the user uncomfortable. This problem is not limited to voice dialogue robots that speak aloud; it also arises in response systems that output response sentences on a monitor or the like.

The present invention has been made to solve this problem, and its object is to provide a response system and a response content control method capable of outputting an appropriate response sentence.

A response system according to a first aspect of the present invention is a response system that outputs a response sentence to an input sentence, and includes: an input unit that inputs the input sentence; a language understanding unit that understands language information of the input sentence input to the input unit; a management unit that determines a response candidate sentence according to the language information; a response sentence generation unit that generates a response sentence based on the response candidate sentence; and an output unit that outputs the response sentence generated by the response sentence generation unit. The management unit calculates a similarity between the response candidate sentence and a prohibited sentence by referring to a prohibited sentence database in which the prohibited sentence is stored, and restricts output of the response candidate sentence based on the similarity. Because output of response sentences with inappropriate content is restricted, an appropriate response sentence can be output.

A response system according to a second aspect of the present invention is the response system described above, wherein the management unit calculates a determination value for the response candidate sentence based on the similarity, and determines whether to restrict output of the response candidate sentence based on the result of comparing the determination value with a threshold value. This makes it possible to reliably restrict output of response sentences with inappropriate content.

A response system according to a third aspect of the present invention is the response system described above, wherein the response candidate sentence is determined based on information that is updated on a network. Even when a response candidate sentence whose content is not fixed in advance is determined, an appropriate response sentence can be generated.

A response system according to a fourth aspect of the present invention is the response system described above, wherein the prohibited sentence stored in the prohibited sentence database includes a plurality of morphemes. This makes it possible to restrict output of inappropriate response candidate sentences simply and reliably.

A response system according to a fifth aspect of the present invention is the response system described above, wherein the response candidate sentence and the utterance prohibition sentence are morphologically analyzed, the morphemes included in the response candidate sentence are compared with the morphemes included in the utterance prohibition sentence, and the similarity is calculated according to the number of matching morphemes. This makes it possible to restrict output of inappropriate response candidate sentences simply and reliably.

A response system according to a sixth aspect of the present invention is the response system described above, wherein the number of matching morphemes is the number of morphemes that match within a specific part of speech. This makes it possible to restrict output of inappropriate response candidate sentences simply and reliably.

A response system according to a seventh aspect of the present invention is a response system that utters a response sentence to input voice data, and includes: a speech recognition unit that performs speech recognition processing on the input voice data; a language understanding unit that understands language information of the voice data that has undergone the speech recognition processing; and a dialogue management unit that determines a response candidate sentence according to the language information and, by referring to a prohibited sentence database in which a prohibited sentence is stored, restricts utterance of the response candidate sentence according to the content of the response candidate sentence. Because output of response sentences with inappropriate content is restricted, an appropriate response sentence can be output.

A response content control method according to an eighth aspect of the present invention is the response content control method described above, and is a method for controlling the content of a response sentence that is output in response to an input sentence. The method includes: a step of understanding language information of the input sentence; a step of determining a response candidate sentence according to the language information; and a step of comparing a determination value determined from the similarity with a threshold value. In this method, the similarity between the response candidate sentence and a prohibited sentence is calculated by referring to a prohibited sentence database in which the prohibited sentence is stored, and output of the response candidate sentence is restricted based on the similarity. Because output of response sentences with inappropriate content is restricted, an appropriate response sentence can be output.

A response content control method according to a ninth aspect of the present invention is the response content control method described above, wherein a determination value for the response candidate sentence is calculated based on the similarity, and whether to restrict output of the response candidate sentence is determined based on the result of comparing the determination value with a threshold value. This makes it possible to reliably restrict output of response sentences with inappropriate content.

A response content control method according to a tenth aspect of the present invention is the response content control method described above, wherein the response candidate sentence is determined based on information that is updated on a network. Even when a response candidate sentence whose content is not fixed in advance is determined, an appropriate response sentence can be generated.

A response content control method according to an eleventh aspect of the present invention is the response content control method described above, wherein the prohibited sentence stored in the prohibited sentence database includes a plurality of morphemes. This makes it possible to restrict output of inappropriate response candidate sentences simply and reliably.

A response content control method according to a twelfth aspect of the present invention is the response content control method described above, wherein the response candidate sentence and the prohibited sentence are morphologically analyzed, the morphemes included in the response candidate sentence are compared with the morphemes included in the prohibited sentence, and the similarity is calculated according to the number of matching morphemes. This makes it possible to restrict output of inappropriate response candidate sentences simply and reliably.

A response content control method according to a thirteenth aspect of the present invention is the response content control method described above, wherein the number of matching morphemes is the number of morphemes that match within a specific part of speech. This makes it possible to restrict output of inappropriate response candidate sentences simply and reliably.

A response content control method according to a fourteenth aspect of the present invention is the response content control method described above, and is a method for controlling the content of a response sentence uttered in response to input voice data. The method includes: a step of performing speech recognition processing on the input voice data; a step of understanding language information of the data that has undergone the speech recognition processing; a step of determining a response candidate sentence according to the language information; and a step of restricting utterance of the response candidate sentence according to the content of the response candidate sentence by referring to a prohibited sentence database in which a prohibited sentence is stored. Because output of response sentences with inappropriate content is restricted, an appropriate response sentence can be output.

According to the present invention, it is possible to provide a response system and a response content control method capable of outputting an appropriate response sentence.

The response system according to the present embodiment outputs a response sentence to an input sentence. The response system has an input unit that inputs the input sentence, a language understanding unit that understands language information of the input sentence input to the input unit, a management unit that determines a response candidate sentence according to the language information, a response sentence generation unit that generates a response sentence based on a response candidate sentence whose determination value is lower than a threshold value, and an output unit that outputs the response sentence generated by the response sentence generation unit. The management unit calculates the similarity between the response candidate sentence and a prohibited sentence by referring to a prohibited sentence database in which the prohibited sentence is stored. The management unit then restricts output of the response candidate sentence based on the similarity.

A response robot 100 that uses the response system according to the present embodiment is described with reference to FIG. 1. FIG. 1 is a block diagram showing the configuration of the response robot 100. The response robot 100 has a processing computer 10, a microphone 31, a speaker 32, and a motor 33. In the following description, the response robot 100 is assumed to be a humanoid voice dialogue robot used, for example, in a vehicle exhibition facility. When the speaker 102 speaks, the response robot 100 carries on a dialogue autonomously.

The microphone 31 acquires the speech uttered by the speaker 102. That is, the uttered sentence is picked up by the microphone 31. The uttered sentence contains language information, for example language information expressing a question or a request directed at the response robot 100. The microphone 31 therefore serves as an input unit for inputting an input sentence that contains language information. A microphone array having a plurality of microphones may be used as the microphone 31, for example.

The voice data of the input sentence input to the microphone 31 is processed by the processing computer 10. The processing computer 10 has a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), a communication interface, and the like, and controls the various operations of the response robot 100. The processing computer 10 executes various kinds of control according to a control program stored, for example, in the ROM. That is, the processing computer 10 serves as a control unit that controls the response robot 100, and executes a voice dialogue program. As a result, a response sentence that answers the question or request in the input language information is generated; in other words, output data corresponding to the response sentence is generated. The configuration of the processing computer 10 is described later. The processing computer 10 is not limited to a physically single unit.

The voice data corresponding to the response sentence generated by the processing computer 10 is output from the speaker 32, so that the response robot 100 reads the response sentence aloud. That is, the response sentence is output to the outside in a predetermined direction and at a predetermined volume. The speaker 32 thus serves as an output unit that outputs the output data corresponding to the response sentence generated by the processing computer 10. The processing computer 10 is also connected to the Internet 103 and downloads information data from the Internet 103.

The motor 33 moves each part of the robot when a response sentence is output. For example, the motor 33 moves the lips, neck, arms, legs, joints, wheels, and the like of the humanoid robot. As a result, predetermined motions such as hand and body gestures are performed when the robot responds.

The processing computer 10 has a speech recognition unit 11, a language understanding unit 12, a dialogue management unit 13, a response sentence generation unit 14, a speech synthesis unit 15, an utterance prohibition sentence database 16, a question response database 17, and a motor control unit 20.

The speech recognition unit 11 performs speech recognition processing. For example, the speech recognition unit 11 converts the voice data of the input sentence into text data, producing input data in text format. That is, the voice data is converted into a word string or character string. For example, the speech recognition unit 11 decodes the speech using a language model and an acoustic model. The speech recognition unit 11 also performs morphological analysis: it divides the text of the speech recognition result into morphemes and attaches part-of-speech information to each morpheme. Noise removal or similar processing may be performed before the speech recognition processing. The speech recognition processing can be carried out using a given speech recognition engine.

The language understanding unit 12 performs language understanding processing on the speech-recognized data, so that the language information of the input sentence is understood. That is, the language understanding unit 12 understands the content of the question, request, or the like in the input sentence. For example, it performs dependency analysis on the text data, analyzing which morpheme each morpheme depends on. For example, it determines which verb a noun (such as a subject) or an adjectival noun depends on, or which noun an adjective modifies. A predetermined conceptual model may also be used in the language understanding processing. Known processing methods can be used for the language understanding processing.

The dialogue management unit 13 performs dialogue management. For example, for a question whose language has been understood, it determines an utterance candidate sentence by referring to the question response database 17. The question response database 17 stores answer sentences that respond to questions. For example, it stores (1) the answer sentence "It is diagonally ahead to your right." for question sentence A, "Where is the restroom?", and (2) the answer sentence "My name is AAA." for question sentence B, "What is your name?". That is, the question response database 17 stores question sentences and answer sentences in association with each other. The dialogue management unit 13 selects the question sentence in the question response database 17 that is closest to the input text, and selects the answer sentence associated with that question sentence as the utterance candidate sentence. In this way, the utterance candidate sentence is determined according to the content of the input sentence.
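The lookup described above can be sketched as follows. This is only an illustration: the text does not say how the "closest" question is measured, so the character-level ratio from Python's standard `difflib.SequenceMatcher` is an assumption, and the names `QA_DATABASE` and `choose_candidate` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for the question response database 17:
# each stored question sentence is associated with one answer sentence.
QA_DATABASE = {
    "Where is the restroom?": "It is diagonally ahead to your right.",
    "What is your name?": "My name is AAA.",
}


def choose_candidate(input_text: str, qa_database: dict = QA_DATABASE) -> str:
    """Return the answer paired with the stored question closest to input_text.

    Closeness is measured with difflib's similarity ratio, which is an
    assumption of this sketch and not a measure named in the source.
    """
    def closeness(question: str) -> float:
        return SequenceMatcher(None, input_text.lower(), question.lower()).ratio()

    best_question = max(qa_database, key=closeness)
    return qa_database[best_question]


print(choose_candidate("where is the toilet"))
```

A real system would more likely compare the output of the language understanding step than raw strings; the raw-string comparison is used here only to keep the sketch self-contained.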

The dialogue management unit 13 also performs intelligent processing for specific utterances. For example, for a question that asks for information from the Internet, it acquires text information from the Internet 103. If the speaker's question is "Tell me some of the latest news," it acquires news from the Internet. Specifically, it acquires information from the Internet 103 and extracts the article portion using the HTML tag information. The text data of the news is downloaded in this way, and the text of the news is used as the utterance candidate sentence.
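As a rough illustration of extracting the article portion from tag information, the sketch below collects the text inside one chosen tag using Python's standard `html.parser` module. The choice of the `p` tag, the sample markup, and the names `ArticleExtractor` and `extract_article` are assumptions; the source does not say which tags mark the article, and the download step is omitted.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collect the text found inside one kind of tag (assumed here to be <p>)."""

    def __init__(self, tag: str = "p"):
        super().__init__()
        self.tag = tag
        self.depth = 0      # how many target tags are currently open
        self.chunks = []    # text fragments found inside the target tag

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


def extract_article(html: str, tag: str = "p") -> str:
    """Return the text inside the given tag, joined with spaces."""
    parser = ArticleExtractor(tag)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page fragment; a real news page would need site-specific rules.
SAMPLE = (
    "<html><body><h1>Site</h1><div class='nav'>Menu</div>"
    "<p>A new model was announced today.</p><p>Sales start in May.</p>"
    "</body></html>"
)
print(extract_article(SAMPLE))
```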

Thus, when the speaker 101 says "Tell me some of the latest news," the dialogue management unit 13 downloads the latest news from the Internet 103. The latest news is downloaded, for example, from a portal site. Specifically, Yahoo News can be downloaded and used as the latest news. The latest news is then used as the utterance candidate sentence.

The dialogue management unit 13 also performs similarity calculation and utterance determination on the utterance candidate sentence. For example, the dialogue management unit 13 refers to the utterance prohibition sentence database 16 and calculates the similarity between each utterance prohibition sentence and the utterance candidate sentence. The utterance prohibition sentence database 16 stores one or more utterance prohibition sentences, and a similarity is calculated between each of them and the utterance candidate sentence. A determination value is calculated from the similarities, and utterance determination is performed. Specifically, the dialogue management unit 13 compares the determination value with a threshold value and, based on the comparison result, determines whether to restrict output of the response candidate sentence. That is, the dialogue management unit 13 determines whether the utterance candidate sentence is suitable for utterance.

If the utterance determination finds that the utterance candidate sentence is not suitable for utterance, the utterance candidate sentence is changed. For example, a sentence different from the one judged unsuitable is used as the new utterance candidate sentence. Specifically, the next most recent news item after the latest news is used as the utterance candidate sentence. The similarity and determination value are then obtained and utterance determination is performed again. The process is repeated until an utterance candidate sentence suitable for utterance is obtained, that is, until a news item suitable for utterance appears.
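The retry loop described above might look like the following sketch. Several points are assumptions rather than statements from the source: sentences are taken to be already split into morphemes, the determination value is taken to be the largest similarity over all prohibited sentences, and a candidate is accepted when that value is below the threshold. All function names are hypothetical.

```python
from typing import List, Optional, Sequence

Morphemes = Sequence[str]


def morpheme_similarity(candidate: Morphemes, prohibited: Morphemes) -> float:
    """Matching morpheme pairs divided by the prohibited sentence's length."""
    if not prohibited:
        return 0.0
    matches = sum(1 for c in prohibited for b in candidate if b == c)
    return matches / len(prohibited)


def determination_value(candidate: Morphemes,
                        prohibited_db: List[Morphemes]) -> float:
    """Assumed here to be the maximum similarity over the database."""
    return max((morpheme_similarity(candidate, p) for p in prohibited_db),
               default=0.0)


def select_candidate(candidates: List[Morphemes],
                     prohibited_db: List[Morphemes],
                     threshold: float) -> Optional[Morphemes]:
    """Try candidates in order (newest news first) until one is acceptable."""
    for candidate in candidates:
        if determination_value(candidate, prohibited_db) < threshold:
            return candidate
    return None  # no suitable candidate was found


# Hypothetical, pre-tokenized example data.
news = [
    ["someone", "killed", "a", "dog"],
    ["new", "car", "model", "announced"],
]
prohibited_db = [["killed", "dog"]]
print(select_candidate(news, prohibited_db, threshold=0.5))
```

In this example the first item scores 1.0 against the prohibited sentence and is skipped, so the second item is returned.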

In this way, the dialogue management unit 13 determines the utterance candidate sentence according to the language information of the input sentence. The dialogue management unit 13 then refers to the utterance prohibition sentence database 16 and restricts output of the utterance candidate sentence according to its content. That is, when the content of the utterance candidate sentence is judged inappropriate based on the content of the utterance prohibition sentence database 16, utterance of that sentence is prohibited. The processing in the dialogue management unit 13 is not limited to the processing described above. For example, the unit may simply repeat back what the speaker 101 said, using the sentence uttered by the speaker 101 as the utterance candidate sentence as it is. The way the utterance candidate sentence is selected is not particularly limited.

The response sentence generation unit 14 generates a response sentence based on the utterance candidate sentence. For example, it generates a response sentence by adding a fixed phrase to the latest news: a phrase such as "This is information from the Internet." may be added before the news, or a phrase such as "That was in the news." may be added after it. A different fixed phrase may of course be used, or none at all. The utterance candidate sentence may also be modified as appropriate. Depending on the utterance determination result, the response sentence generation unit 14 generates the response sentence with output of the response candidate sentence restricted.

The speech synthesis unit 15 performs speech synthesis processing on the response sentence. The response sentence data, which was in text format, is thereby converted into audio-format data (for example, WAV data). The voice data synthesized by the speech synthesis unit 15 is output by the speaker 32. Known methods can be used for the speech synthesis processing. In this way, the latest news combined with the fixed phrase is read aloud.

Next, a method of controlling the response robot 100 according to the present embodiment is described with reference to FIG. 2. FIG. 2 is a flowchart showing the method of controlling the utterance content according to the present embodiment. The description here focuses mainly on the processing in the dialogue management unit 13. First, the response robot 100 determines an utterance candidate sentence (step S101). The utterance candidate sentence is determined based on the language information understood by the language understanding unit 12. As described above, the latest news downloaded from the Internet 103 is used as the utterance candidate sentence.

Next, the similarity A_i (i = 1, 2, ..., m) between each utterance prohibition sentence stored in the utterance prohibition sentence database 16 and the utterance candidate sentence is calculated (step S102). Here, m is the number of utterance prohibition sentences stored in the utterance prohibition sentence database 16. That is, when m utterance prohibition sentences are registered, m similarities are calculated, one for every utterance prohibition sentence in the database. An utterance prohibition sentence is not limited to a complete sentence and may be a single word.

The similarity is calculated by, for example, Equation 1 in FIG. 2. In Equation 1, the function F(x, y) returns 1 when morpheme x and morpheme y match and returns 0 otherwise. B_k (k = 1, ..., L) are the morphemes of the utterance candidate sentence, and L is the number of morphemes in the utterance candidate sentence. That is, the utterance candidate sentence contains L morphemes. Further, C_ij (i = 1, 2, ..., m; j = 1, 2, ..., n_i) are the morphemes in the utterance prohibition sentence database, where n_i is the number of morphemes in the i-th utterance prohibition sentence. That is, the i-th utterance prohibition sentence contains n_i morphemes.
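Equation 1 itself appears only in FIG. 2 and is not reproduced in the text. Judging from the symbol definitions given here and from the explanation that the number of matching morphemes is divided by the number of morphemes in the utterance prohibition sentence, it can presumably be written as follows. This is a reconstruction under those assumptions, not a copy of the figure.

```latex
A_i = \frac{1}{n_i} \sum_{k=1}^{L} \sum_{j=1}^{n_i} F(B_k, C_{ij}),
\qquad
F(x, y) =
\begin{cases}
1 & \text{if } x = y \\
0 & \text{otherwise}
\end{cases}
```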

Thus, the function F(B_k, C_ij) returns 1 when a morpheme of the utterance candidate sentence is contained in the utterance prohibition sentence and returns 0 when it is not. The sum of F(B_k, C_ij) over one utterance prohibition sentence is therefore the number of matching morphemes for that sentence. In other words, Equation 1 gives the number of morpheme matches between the utterance prohibition sentence and the utterance candidate sentence divided by the number of morphemes in the utterance prohibition sentence. Here, the number of matches is the number of morphemes common to the utterance candidate sentence and the utterance prohibition sentence. The similarity of the utterance prohibition sentence to the utterance candidate sentence is calculated from this number: the number of morphemes in the utterance candidate sentence that match morphemes in the utterance prohibition sentence is obtained, and dividing it by the number of morphemes in the utterance prohibition sentence yields the similarity A_i. The similarity is calculated for all utterance prohibition sentences.
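The calculation described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation. It assumes the sentences have already been split into morphemes by some morphological analyzer (not shown), and the names `similarity` and `all_similarities` as well as the sample data are hypothetical. Each morpheme of the utterance prohibition sentence is counted at most once, which is one reading of "number of matching morphemes"; a literal double sum over F(B_k, C_ij) would count repeated morphemes more than once.

```python
def similarity(candidate, prohibited):
    """Similarity A_i of one prohibition sentence to the candidate.

    Both arguments are lists of morpheme strings (already segmented).
    """
    if not prohibited:
        raise ValueError("a prohibition sentence needs at least one morpheme")
    candidate_morphemes = set(candidate)
    # Number of prohibition-sentence morphemes that also occur in the candidate.
    matches = sum(1 for morpheme in prohibited if morpheme in candidate_morphemes)
    # Divide by n_i, the morpheme count of the prohibition sentence.
    return matches / len(prohibited)


def all_similarities(candidate, prohibited_db):
    """Return [A_1, ..., A_m] for the m sentences in the database."""
    return [similarity(candidate, prohibited) for prohibited in prohibited_db]


# Hypothetical, pre-segmented example data (romanized for readability).
candidate = ["kyou", "wa", "urusai", "hi", "da"]
prohibited_db = [["urusai"], ["urusai", "kara", "de", "te", "it", "te"]]
scores = all_similarities(candidate, prohibited_db)
```

With this data the single-word entry matches completely, while the longer entry shares only one of its six morphemes with the candidate and so receives a much lower score.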

The similarity may also be determined by focusing on specific parts of speech. That is, the similarity may be calculated based on the number of matching morphemes of specific parts of speech. For example, when only nouns, verbs, and adjectives are considered, 1 is returned when a noun, verb, or adjective morpheme matches. Other morphemes, such as particles, adnominals, and auxiliary verbs, are not counted. In this case, P_i is the number of morphemes of the specific parts of speech (nouns, verbs, adjectives) in the i-th utterance prohibition sentence and is used as the denominator; morphemes of other parts of speech are not counted. The parts of speech of interest are not limited to nouns, verbs, and adjectives. For example, only nouns and verbs may be considered. In this way, the number of morphemes of specific parts of speech common to the utterance candidate sentence and the utterance prohibition sentence may be used as the number of matches. Furthermore, each part of speech may be weighted.
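A part-of-speech-restricted variant might look like the sketch below. It assumes each morpheme arrives as a (surface, part of speech) pair from an analyzer that is not shown; the function name `pos_similarity`, the tag names, the segmentation of the sample sentence, and the candidate sentence are all hypothetical. The text says only that parts of speech may be weighted, so the weighting scheme used here (a per-tag multiplier applied to both numerator and denominator) is an assumption.

```python
CONTENT_POS = frozenset({"noun", "verb", "adjective"})


def pos_similarity(candidate, prohibited, target_pos=CONTENT_POS, weights=None):
    """Similarity restricted to selected parts of speech.

    candidate and prohibited are lists of (surface, part_of_speech) pairs.
    weights optionally maps a part of speech to a weight (default 1.0).
    """
    weights = weights or {}
    candidate_content = {(s, p) for s, p in candidate if p in target_pos}
    matched = 0.0
    total = 0.0  # weighted counterpart of P_i
    for surface, pos in prohibited:
        if pos not in target_pos:
            continue  # particles, auxiliary verbs, etc. are not counted
        weight = weights.get(pos, 1.0)
        total += weight
        if (surface, pos) in candidate_content:
            matched += weight
    return matched / total if total else 0.0


prohibited = [("うるさい", "adjective"), ("から", "particle"), ("出", "verb"),
              ("て", "particle"), ("いっ", "verb"), ("て", "particle")]
# Hypothetical candidate that shares only the adjective with the prohibition sentence.
candidate = [("うるさい", "adjective"), ("犬", "noun"), ("が", "particle"), ("いる", "verb")]

partial = pos_similarity(candidate, prohibited)  # one of three content morphemes matches
exact = pos_similarity(prohibited, prohibited)   # identical sentences
```

Passing, for instance, `weights={"verb": 2.0}` makes verb matches count double, which lowers `partial` because the two unmatched verbs then dominate the denominator.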

In this way, the similarity A_i is obtained for each utterance prohibition sentence. The similarity A_i is a measure of how closely the utterance candidate sentence resembles the utterance prohibition sentence. An utterance candidate sentence with a high similarity to an utterance prohibition sentence may therefore contain content inappropriate for utterance.

Then, a determination value is calculated based on the similarities A_i, and the determination value is compared with a threshold value (step S103). Here, the determination value is the maximum of A_i, so determination value = MAX(A_i). That is, the largest of all the similarities is used as the determination value, and it is compared with a predetermined threshold value.

If the determination value does not exceed the threshold value Z, the robot is made to utter the content of the utterance candidate sentence (step S104). That is, a response sentence is generated based on the utterance candidate sentence. For example, as described above, a sentence combining a fixed phrase with the latest news is used as the response sentence, and the response sentence is output from the speaker 32. When the determination value does not exceed the threshold value Z, the utterance candidate sentence is not similar to any of the utterance prohibition sentences, so its content is judged not to be inappropriate for utterance. A response sentence is therefore generated based on the utterance candidate sentence and output.

If the determination value exceeds the threshold value Z, the robot is not made to utter the content of the utterance candidate sentence (step S105). That is, when the determination value is equal to or greater than the threshold value Z, the utterance candidate sentence is presumed to contain content inappropriate for utterance, so it is not uttered. The utterance sentence is generated with output of that utterance candidate sentence restricted. Whether utterance of the utterance candidate sentence is restricted is determined from the result of comparing the determination value with the threshold value Z. A response sentence is then generated based on an utterance candidate sentence other than the one whose output is restricted. This prevents inappropriate news and the like from being output. Also, as described above, by using another news item as the response candidate sentence, news with appropriate content can be uttered. That is, a news item whose determination value is equal to or less than the threshold value Z can be used as the response candidate sentence.
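Steps S103 to S105 can be sketched as below. The text is not fully consistent about what happens when the determination value equals Z; this sketch assumes output is restricted when the value is equal to or greater than Z, which matches the description of the FIG. 9 flow. The function names, the sample news items, their similarity lists, and the threshold 0.5 are all hypothetical.

```python
def is_restricted(similarities, threshold):
    """Step S103: compare the determination value MAX(A_i) with the threshold Z."""
    determination_value = max(similarities, default=0.0)
    return determination_value >= threshold


def select_candidate(scored_candidates, threshold):
    """Return the first candidate whose output is not restricted, or None.

    scored_candidates is a list of (sentence, [A_1, ..., A_m]) pairs.
    """
    for sentence, similarities in scored_candidates:
        if not is_restricted(similarities, threshold):
            return sentence  # step S104: utter this candidate
    return None  # step S105 applied to every candidate


news = [
    ("news item A", [1.0, 0.33]),  # too close to a prohibition sentence
    ("news item B", [0.0, 0.33]),
]
chosen = select_candidate(news, threshold=0.5)
```

Here the first news item is skipped and the second is selected, mirroring the idea of falling back to another news item when the latest one is judged inappropriate.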

Calculation examples of the similarity will now be described in detail with reference to FIGS. 3 to 5, which are diagrams for explaining the similarity calculation. In FIGS. 3 to 5, the similarity is calculated for different combinations of utterance candidate sentence and utterance prohibition sentence. In each figure, the utterance candidate sentence is shown on the left and the utterance prohibition sentence on the right.

First, the similarity calculation example in FIG. 3 will be described. Here, "殺す" ("kill", verb) is the utterance prohibition sentence, and the utterance candidate sentence contains "殺す" (verb). The similarity is therefore A_1 = 1/1 = 1. In detail, the similarity is calculated by focusing on nouns, verbs, and adjectives. Since the utterance candidate sentence contains "殺す" (verb), this morpheme is contained in both the utterance prohibition sentence and the utterance candidate sentence, so one morpheme matches between them. That is, the number of matching morphemes is 1, and the numerator of Equation 1 is 1. Further, among the morphemes of the utterance prohibition sentence, the number of morphemes of the specific parts of speech is 1, so the denominator of Equation 1 is 1. Hence, the similarity is A_1 = 1/1 = 1.

Next, the similarity calculation example in FIG. 4 will be described. In FIG. 4, the utterance candidate sentence is the same as in FIG. 3, and the utterance prohibition sentence is different. Here, the similarity is calculated for the utterance prohibition sentence "うるさいから出て行って" ("You're noisy, so get out"), again focusing on nouns, verbs, and adjectives. The utterance candidate sentence contains "うるさい" ("noisy", adjective), so this morpheme is contained in both the utterance prohibition sentence and the utterance candidate sentence, and one morpheme matches between them. That is, the number of matching morphemes is 1, and the numerator of Equation 1 is 1. Among the morphemes of the utterance prohibition sentence, the number of morphemes of the specific parts of speech is 3, so the denominator of Equation 1 is 3. Hence, the similarity is A_2 = 1/3 ≈ 0.33.

Next, the similarity calculation example in FIG. 5 will be described. This example assumes that the dialogue management unit 13 has a function of storing what the speaker said and saying it back unchanged at some later time. For example, the speaker's past utterance "うるさいから出ていって" ("You're noisy, so get out") is the utterance candidate sentence, and the same sentence is registered in the utterance prohibition sentence database 16 as an utterance prohibition sentence. In this case, the three morphemes "うるさい" (adjective), "出" (verb), and "いっ" (verb) match, so the number of matches is 3. The total number of morphemes of the specific parts of speech is also 3. Hence, the similarity is A_2 = 3/3 = 1. The response sentence is therefore generated based on another utterance candidate sentence.

It is preferable to add to the utterance prohibition sentence database 16 any sentence that the response robot 100 has uttered in the past and that was rude to a customer. That is, the log of sentences uttered by the response robot 100 is checked, and inappropriate sentences are added to the utterance prohibition sentence database 16. The response robot 100 will then no longer utter sentences similar to the rude sentence, so the utterance content can be controlled appropriately.

Controlling the robot in this way allows sentences with appropriate content to be uttered. For example, if the news shown in FIG. 10 is the latest news, utterance of that news is prohibited, and appropriate news such as that shown in FIG. 6 can be read aloud instead. Output of inappropriate content is thus restricted, which prevents the conversation partner from being made uncomfortable.

Although the above example describes reading Internet news aloud, the present embodiment is not limited to this. For example, current traffic information may be uttered. Specifically, when a customer asks about traffic information at a vehicle exhibition facility or the like, a response sentence with content such as that shown in FIG. 7 is uttered. The control method can also be used in an interpreting robot, in which case content that should not be translated is registered as utterance prohibition sentences and translation software or the like is installed in the processing computer 10. Of course, the control method is not limited to Japanese and can also be used for output in other languages such as English.

Although the maximum of the similarities A_i is used as the determination value in the above example, other values may be used. For example, the average of the similarities A_i may be used as the determination value. That is, when there are ten utterance prohibition sentences, the average of the ten similarities A_i calculated for them is used as the determination value. Alternatively, the s-th highest similarity (s is an integer of 2 or more) may be used as the determination value, or the average of the highest through s-th highest similarities may be used. For example, the fifth highest similarity or the average of the top five similarities may be used as the determination value.
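The alternative determination values listed above can be compared side by side with a small helper. This is only a sketch: the function name `determination_value`, the mode labels, and the behavior when fewer than s similarities exist (fall back to the lowest available one) are assumptions made for illustration.

```python
from statistics import mean


def determination_value(similarities, mode="max", s=5):
    """Reduce the similarities A_1..A_m to a single determination value."""
    ranked = sorted(similarities, reverse=True)  # highest similarity first
    if not ranked:
        return 0.0
    if mode == "max":
        return ranked[0]
    if mode == "mean":
        return mean(ranked)
    if mode == "sth":
        # s-th highest similarity; with fewer than s values, use the lowest one.
        return ranked[min(s, len(ranked)) - 1]
    if mode == "top_s_mean":
        return mean(ranked[:s])  # average of the s highest similarities
    raise ValueError(f"unknown mode: {mode}")


sims = [0.2, 1.0, 0.0, 0.5]
by_max = determination_value(sims, "max")
by_mean = determination_value(sims, "mean")
by_second = determination_value(sims, "sth", s=2)
by_top2 = determination_value(sims, "top_s_mean", s=2)
```

For this sample list, a single accidental full match drives the maximum to 1.0, while the other three reductions give lower values that are less sensitive to one outlier.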

Thus, a value other than the maximum may be used as the determination value, which allows accurate utterance determination. For example, if a sentence with appropriate content is registered in the utterance prohibition sentence database 16, utterance of an utterance candidate sentence with appropriate content would be restricted. By using a value other than the maximum, utterance is not prohibited even if the similarity to an utterance prohibition sentence with appropriate content happens to be high, so appropriate content can be uttered. The determination also remains accurate when the number of registered utterance prohibition sentences is increased, which makes selecting and registering utterance prohibition sentences easier. In this way, using a determination value based on the similarities allows the utterance determination to be performed accurately.

An utterance prohibition sentence need only contain one or more morphemes, but it preferably contains two or more. That is, it is preferable to register utterance prohibition sentences containing two or more morphemes in the utterance prohibition sentence database 16. Comparing an utterance prohibition sentence containing two or more morphemes with the utterance candidate sentence allows the similarity to be calculated more accurately. This will be described with reference to FIG. 8, which shows an example in which every utterance prohibition sentence consists of a single morpheme. Because each utterance prohibition sentence in FIG. 8 contains only one morpheme, the utterance prohibition sentence database is shown as an NG word database. For example, "バカ" ("idiot"), "アホ" ("fool"), "殺す" ("kill"), and "出" ("go out") are registered in the NG word database. In FIG. 8, the NG word database is shown on the left and utterance candidate sentences 1 to 3 on the right. Utterance candidate sentences 1 and 2 are examples of sentences that should not be uttered, and utterance candidate sentence 3 is an example of a sentence that should be uttered.

For example, utterance of utterance candidate sentence 1 is prohibited because the word "バカ" matches. Likewise, utterance of utterance candidate sentence 2 is prohibited because "出" matches. However, utterance of utterance candidate sentence 3, which should in fact be uttered, is also prohibited because "出" matches. It therefore becomes difficult to distinguish sentences that should be uttered from those that should not. That is, when an utterance prohibition sentence contains only one morpheme, all that can be determined is whether a morpheme matching the NG word exists in the utterance candidate sentence, and the similarity measure described above cannot be calculated.
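The effect can be illustrated with a short sketch. The candidate sentences of FIG. 8 are not reproduced in the text, so the benign candidate below, its segmentation, and the multi-morpheme database entry are hypothetical; the similarity function counts each prohibition-sentence morpheme at most once, which is an assumed reading of the match count.

```python
def similarity(candidate, prohibited):
    """Share of the prohibition sentence's morphemes found in the candidate."""
    candidate_morphemes = set(candidate)
    return sum(m in candidate_morphemes for m in prohibited) / len(prohibited)


def max_similarity(candidate, prohibited_db):
    """Determination value when the maximum similarity is used."""
    return max(similarity(candidate, prohibited) for prohibited in prohibited_db)


ng_words = [["バカ"], ["アホ"], ["殺す"], ["出"]]              # one morpheme each
ng_sentences = [["うるさい", "から", "出", "て", "いっ", "て"]]  # multi-morpheme entry

# Hypothetical benign candidate that happens to contain the morpheme "出".
benign = ["電車", "が", "駅", "を", "出", "た"]

with_words = max_similarity(benign, ng_words)          # all-or-nothing score
with_sentences = max_similarity(benign, ng_sentences)  # graded score
```

Against the single-morpheme database the benign sentence already reaches the highest possible score, whereas against the multi-morpheme entry it scores low and can be separated from genuinely similar sentences by a threshold.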

For this reason, when each utterance prohibition sentence has only one morpheme, designing the utterance prohibition sentence database 16 becomes difficult. That is, it becomes difficult to select the words to register in the utterance prohibition sentence database 16. Depending on the registered words, adverse effects may also arise in unexpected places: utterance of sentences that should be uttered may be prohibited. Note that focusing on specific parts of speech, such as nouns, adjectives, and verbs, allows utterance of inappropriate content to be reliably restricted.

Furthermore, with the response robot 100, the utterance prohibition sentence database 16 is easy to design and maintain. For example, the response robot 100 can register utterance prohibition sentences in the utterance prohibition sentence database 16 from its past log of utterance content. That is, the past log is consulted, and utterance sentences with inappropriate content among those actually uttered are registered as utterance prohibition sentences. The utterance prohibition sentence database 16 can thus be created easily.

The threshold value can also be adjusted using a test set of sentences close to the task performed by the response robot 100, which makes the adjustment easy. For example, a test set of appropriate and inappropriate response candidate sentences is prepared in advance; specifically, 50 inappropriate response candidate sentences and 50 appropriate ones. The similarity is calculated for each response candidate sentence to obtain its determination value. The threshold value is then set so that appropriate response sentences are output and output of inappropriate response sentences is restricted. The threshold value can thus be set easily. Adjusting the threshold value ensures that utterance of inappropriate content is restricted and that appropriate content is uttered.

For example, suppose that the response robot is a robot in an automobile showroom that answers questions about the automobiles on display. Response sentences to the questions expected in the showroom are prepared in advance as a test set. Each response sentence is then compared with the utterance prohibition sentences registered in the utterance prohibition sentence database 16. The threshold value is set so that the determination value is equal to or greater than the threshold value for inappropriate response sentences and smaller than the threshold value for appropriate ones. In this way, whether to restrict output of an utterance candidate sentence can be determined reliably and easily.
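One simple way to pick such a threshold from a labeled test set is sketched below. The text does not prescribe a search procedure, so the exhaustive search over observed determination values, the function name `calibrate_threshold`, and the sample values are assumptions. The sketch follows the rule stated above: inappropriate sentences should score at or above Z and appropriate ones below it.

```python
def calibrate_threshold(appropriate, inappropriate):
    """Pick a threshold Z from determination values of a labeled test set.

    Returns (Z, errors), where errors counts appropriate sentences that
    would be restricted plus inappropriate sentences that would be output.
    """
    best_z, best_errors = None, None
    # Every observed determination value is tried as a possible threshold.
    for z in sorted(set(appropriate) | set(inappropriate)):
        wrongly_restricted = sum(v >= z for v in appropriate)
        wrongly_output = sum(v < z for v in inappropriate)
        errors = wrongly_restricted + wrongly_output
        if best_errors is None or errors < best_errors:
            best_z, best_errors = z, errors
    return best_z, best_errors


# Hypothetical determination values for a small test set.
appropriate_values = [0.0, 0.2, 0.33]
inappropriate_values = [0.67, 1.0, 0.5]
threshold, errors = calibrate_threshold(appropriate_values, inappropriate_values)
```

When the two groups do not separate cleanly, the returned error count shows how many test sentences the best available threshold still misclassifies, which can signal that the database or the determination value needs revisiting.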

Next, another method of controlling the response robot 100 according to the present embodiment will be described with reference to FIG. 9. FIG. 9 is a flowchart showing another aspect of the utterance content control method according to the present embodiment. First, it is determined whether there is voice input (step S201). For example, the presence or absence of voice input can be determined from the input to the microphone 31. Then, speech recognition processing is performed (step S202). As described above, the speech recognition unit 11 performs the speech recognition processing and converts the audio data into text data.

Then, the language understanding unit 12 performs language understanding processing based on the speech recognition result (step S203). Dialogue management processing is performed based on the result of the language understanding processing (step S204), and an utterance candidate sentence is then determined based on the result of the dialogue management processing (step S205). The utterance candidate sentence is compared with the utterance prohibition sentences to calculate the similarities (step S206). The maximum similarity is used as the determination value and compared with the threshold value Z (step S207). If the determination value is smaller than the threshold value Z, the content of the utterance candidate sentence is adopted as the response sentence (step S208). These processing steps are the same as the processing by the processing computer 10 described above, so a detailed description is omitted.

If the determination value is equal to or greater than the threshold value Z, the response sentence is changed (step S209). Here, the response robot 100 generates the response sentence "Sorry, but please ask another question." for the speaker. Then, as described above, speech synthesis processing is performed on the text data of the response sentence (step S210), and the speech is output. The response robot 100 repeats this processing, so appropriate content can be output easily. Also, when the threshold value is smaller than the determination value, the response robot 100 responds to the speaker's utterance by asking a question in return, so the utterance content can be determined easily.
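The text-level part of the FIG. 9 flow (steps S203 to S209) can be sketched as a single function. Speech recognition (S201, S202) and speech synthesis (S210) are left out, and the dialogue manager and morphological analyzer are passed in as callables. The function name `respond`, the stand-in callables, the English sample data, and the threshold 0.5 are hypothetical.

```python
FALLBACK_RESPONSE = "Sorry, but please ask another question."


def respond(input_text, decide_candidate, tokenize, prohibited_db, threshold):
    """Return the response sentence for one recognized input text."""
    candidate = decide_candidate(input_text)              # S203 to S205
    candidate_morphemes = set(tokenize(candidate))
    similarities = [
        sum(m in candidate_morphemes for m in prohibited) / len(prohibited)
        for prohibited in prohibited_db if prohibited
    ]                                                     # S206
    determination_value = max(similarities, default=0.0)  # S207
    if determination_value < threshold:
        return candidate                                  # S208
    return FALLBACK_RESPONSE                              # S209


# Hypothetical stand-ins for the dialogue manager and morphological analyzer.
canned = {"news": "the weather is fine today", "echo": "get out you are noisy"}
prohibited_db = [["get", "out", "noisy"]]
ok = respond("news", canned.get, str.split, prohibited_db, threshold=0.5)
blocked = respond("echo", canned.get, str.split, prohibited_db, threshold=0.5)
```

In this toy run the weather sentence passes unchanged, while the echoed rude sentence matches the database entry completely and is replaced by the request for another question.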

The response sentence need not be output from the speaker 32 as audio data. For example, it may be presented to the speaker 101 by displaying it on a display screen or the like. Input is likewise not limited to voice input. For example, the invention is also applicable to a response system that responds to text entered with a keyboard, pointing device, touch panel, or the like. Thus, data input and output are not limited to audio data. Of course, application of the response system according to the present embodiment is not limited to robots. Also, the response candidate sentences, utterance prohibition sentences, and the like need not be complete sentences having a subject and predicate, and may be phrases or words.

As described above, the response robot 100 downloads information updated on a network such as the Internet 103 (for example, the latest news) and uses it as the response candidate sentence. In such a case, the content of the response candidate sentence is not fixed in advance. That is, unlike a car navigation system or the like, the response robot 100 does not utter predetermined content; it performs intelligent processing to utter content that is not determined in advance. With a conventional control method, inappropriate content may therefore be output in some situations. In the control method according to the present embodiment, output is restricted when the content of the response sentence to be output is inappropriate. In that case, the response robot 100 generates a response sentence from, for example, another response candidate sentence. A response sentence with appropriate content can thus be output easily.

FIG. 1 is a block diagram showing the configuration of a response robot having a response system according to an embodiment of the present invention.
FIG. 2 is a flowchart showing a response content control method according to the embodiment of the present invention.
FIG. 3 is a diagram showing an example of processing for controlling response content in the present embodiment.
FIG. 4 is a diagram showing an example of processing for controlling response content in the present embodiment.
FIG. 5 is a diagram showing an example of processing for controlling response content in the present embodiment.
FIG. 6 is a diagram schematically showing a robot having the response system according to the embodiment of the present invention responding.
FIG. 7 is a diagram schematically showing a robot having the response system according to the embodiment of the present invention responding.
FIG. 8 is a diagram for explaining examples of suitable utterance prohibition sentences.
FIG. 9 is a flowchart showing another response content control method according to the embodiment of the present invention.
FIG. 10 is a diagram schematically showing a robot having a conventional response system responding.

Explanation of symbols

10 processing computer, 11 speech recognition unit, 12 language understanding unit, 13 dialogue management unit,
14 response sentence generation unit, 15 speech synthesis unit, 16 utterance prohibition sentence database,
17 question answering database, 20 motor control unit,
31 microphone, 32 speaker, 33 motors,
100 response robot, 101 speaker (person speaking), 103 Internet,
104 spoken dialogue robot, 105 facility user

Claims

1. A response system that outputs a response sentence to an input sentence, comprising:
an input unit for inputting the input sentence;
a language understanding unit that understands language information of the input sentence input to the input unit;
a management unit that determines a response candidate sentence according to the language information;
a response sentence generation unit that generates a response sentence based on the response candidate sentence; and
an output unit that outputs the response sentence generated by the response sentence generation unit,
wherein the management unit refers to a prohibited sentence database in which prohibited sentences are stored, calculates a similarity between the response candidate sentence and a prohibited sentence, and limits output of the response candidate sentence based on the similarity.

2. The response system according to claim 1, wherein the management unit calculates a determination value of the response candidate sentence based on the similarity, and determines whether to limit output of the response candidate sentence based on a result of comparing the determination value with a threshold value.

3. The response system according to claim 2, wherein the response candidate sentence is determined based on information updated on a network.

4. The response system according to any one of claims 1 to 3, wherein the prohibited sentence stored in the prohibited sentence database includes a plurality of morphemes.

5. The response system according to claim 1, wherein morphological analysis is performed on the response candidate sentence and the prohibited sentence, the morphemes included in the response candidate sentence are compared with the morphemes included in the prohibited sentence, and the similarity is calculated according to the number of matching morphemes.

6. The response system according to claim 5, wherein the number of matching morphemes is a number of matching morphemes in a specific part of speech.

7. A response system that utters a response sentence in response to input voice data, comprising:
a voice recognition unit that performs voice recognition processing on the input voice data;
a language understanding unit for understanding language information of the voice data subjected to the voice recognition processing; and
a dialogue management unit that determines a response candidate sentence according to the language information and, by referring to a prohibited sentence database in which a prohibited sentence is stored, restricts utterance of the response candidate sentence according to the content of the response candidate sentence.

8. A response content control method for controlling the content of a response sentence output in response to an input sentence, comprising:
understanding language information of the input sentence;
determining a response candidate sentence according to the language information;
comparing a determination value, determined from a degree of similarity, with a threshold value; and
calculating the similarity between the response candidate sentence and a prohibited sentence by referring to a prohibited sentence database in which the prohibited sentence is stored, and restricting output of the response candidate sentence based on the similarity.

9. The response content control method according to claim 8, wherein a determination value of the response candidate sentence is calculated based on the similarity, and whether to limit output of the response candidate sentence is determined based on a result of comparing the determination value with a threshold value.

10. The response content control method according to claim 8, wherein the response candidate sentence is determined based on information updated on a network.

11. The response content control method according to any one of claims 8 to 10, wherein the prohibited sentence stored in the prohibited sentence database includes a plurality of morphemes.

12. The response content control method according to any one of claims 8 to 11, wherein morphological analysis is performed on the response candidate sentence and the prohibited sentence, the morphemes included in the response candidate sentence are compared with the morphemes included in the prohibited sentence, and the similarity is calculated according to the number of matching morphemes.

13. The response content control method according to claim 12, wherein the number of morpheme matches is the number of morpheme matches in a specific part of speech.

14. A response content control method for controlling the content of a response sentence uttered in response to input voice data, comprising:
performing voice recognition processing on the input voice data;
understanding language information of the speech-recognized data;
determining a response candidate sentence according to the language information; and
referring to a prohibited sentence database in which a prohibited sentence is stored, and restricting utterance of the response candidate sentence according to the content of the response candidate sentence.
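
The morpheme-matching similarity and threshold comparison recited in the claims can be illustrated with a minimal Python sketch. It is not taken from the patent and is only one possible reading. The names `similarity` and `should_restrict`, the `(surface, part_of_speech)` tuple representation, and the choice of the maximum similarity over all prohibited sentences as the determination value are assumptions made for illustration. Morphological analysis is assumed to have been done beforehand by an external analyzer, so the sentences arrive already split into morphemes.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical representation of one morpheme: (surface form, part of speech).
Morpheme = Tuple[str, str]


def similarity(candidate: List[Morpheme],
               prohibited: List[Morpheme],
               pos_filter: Optional[Set[str]] = None) -> int:
    """Count distinct morphemes shared by the two sentences.

    If pos_filter is given, only morphemes whose part of speech is in
    the filter are counted (matches in a specific part of speech).
    """
    def keep(m: Morpheme) -> bool:
        return pos_filter is None or m[1] in pos_filter

    cand = {m for m in candidate if keep(m)}
    proh = {m for m in prohibited if keep(m)}
    return len(cand & proh)


def should_restrict(candidate: List[Morpheme],
                    prohibited_db: List[List[Morpheme]],
                    threshold: int,
                    pos_filter: Optional[Set[str]] = None) -> bool:
    """Return True if output of the candidate should be restricted.

    Assumption: the determination value is the highest similarity found
    against any sentence in the prohibited sentence database.
    """
    determination_value = max(
        (similarity(candidate, p, pos_filter) for p in prohibited_db),
        default=0,
    )
    return determination_value >= threshold


# Illustrative data only; not from the patent.
candidate = [("this", "determiner"), ("product", "noun"),
             ("is", "verb"), ("dangerous", "adjective")]
prohibited_db = [[("that", "determiner"), ("product", "noun"),
                  ("is", "verb"), ("dangerous", "adjective")]]

print(similarity(candidate, prohibited_db[0]))               # 3
print(similarity(candidate, prohibited_db[0], {"noun"}))     # 1
print(should_restrict(candidate, prohibited_db, threshold=3))  # True
```

Restricting the count to selected parts of speech, such as nouns, keeps frequent function words from inflating the similarity, at the cost of requiring a lower threshold.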