JP4204042B2 - Game machine, game execution method, and program

Info

Publication number: JP4204042B2
Application number: JP2003358072A
Authority: JP
Inventors: 信勝平野谷
Original assignee: Aruze Corp
Current assignee: Universal Entertainment Corp
Priority date: 2003-10-17
Filing date: 2003-10-17
Publication date: 2009-01-07
Anticipated expiration: 2023-10-17
Also published as: JP2005118369A

Abstract

PROBLEM TO BE SOLVED: To make a character respond according to the game situation and the like, rather than repeat a fixed line, even when the user's utterance is not one registered in advance in the database for conversation control.

SOLUTION: A game system GS comprises a conversation control device 1, which outputs a reply sentence in response to the user's utterance, and a game device 2, which controls characters and executes the game. The game device 2 outputs information describing the game situation to the conversation control device 1. If the conversation control device 1 cannot find topic specifying information suited to the utterance information among the plural pieces of topic specifying information stored in the conversation database 500, it selects a reply sentence based on the game-situation information reported by the game device 2 and outputs that reply sentence.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

The present invention relates to a game machine, a game execution method, and a program that allow a user to play a game while conversing with a character having a virtual personality. More specifically, it relates to a game machine, a game execution method, and a program that, on receiving an utterance outside the set of utterances for which answers were prepared in advance, select an answer sentence according to the game state and the like, thereby enabling natural conversation.

With the recent development of speech analysis and speech recognition technology, conversation systems using a man-machine interface that accepts a user's natural speech as input and makes a predetermined response to it are becoming widespread. As an example, a conversation system has been proposed that goes beyond simple keyword matching and establishes, between human and machine, a dialogue close to natural conversation by taking into account the conversation history, topic transitions, and so on (see, for example, Patent Document 1).

Entertainment (for example, home and commercial-use games) is one conceivable business or applied use of such a conversation system. For example, a conversation system may be incorporated as a secondary component into a main game: game software in which the user converses with a character (virtual personality) displayed on a monitor and the game advances based on the conversation, or game software in which the user converses with a character while playing (where the content of the conversation does not affect the outcome or progress of the game).

Patent Document 1: JP 2002-358304 A

In conventional conversation systems such as the above, the dialogue with the character is determined by the content of the user's utterance. That is, the system refers to a database storing a plurality of predetermined answers, retrieves the answer corresponding to the utterance, and returns it to the user. With this simple database lookup, the conversation system reacts by uttering a single fixed line stored in the database. If the user's utterance is the same, the answer is the same, so as long as the user repeats the same utterance, the system basically repeats the same line.

If no answer to the user's utterance is available in the conversation system, for example because no corresponding answer is stored in the database, the system performs its unanswerable-utterance handling and gives a reply such as "Could you say that again?". As long as the user repeats the same utterance, the system repeats this handling and keeps giving the same kind of reply.

When such unanswerable-utterance handling is repeated, the user becomes irritated and finds the exchange unnatural, which has been a major obstacle to realizing natural dialogue between human and machine.

An object of the present invention is to provide a technique for giving an answer based on the current game state and the like when the user's utterance is unregistered content.

As means for solving the above problems, the present invention has the following features.
A first aspect of the present invention is proposed as a game machine (game system) having conversation processing means (a conversation control device) that outputs an answer sentence in response to an utterance from a user, and game control means (a game device) that controls a character and executes a game.

In this game machine, the conversation processing means has: receiving means (a speech recognition unit) that accepts the user's utterance information; conversation database means (a conversation database) that stores a plurality of pieces of topic specifying information and the answer sentences associated with each; and conversation control means (a conversation control unit) that refers to a discourse history determined by the preceding conversation and preceding answer sentences, compares the topic specifying information determined by that history with the utterance information, selects the topic specifying information suited to the utterance information, and outputs the answer sentence associated with it. The game control means outputs information indicating the game state to the conversation control means. When the conversation control means cannot find topic specifying information suited to the utterance information among the plural pieces stored in the conversation database means, it selects an answer sentence based on the information indicating the game state.

According to this game machine, even when an utterance not prepared in advance in the conversation database means is received, the machine does not repeat a fixed line but can output an answer sentence according to the game state, such as whether the initiative (turn) in the current game lies with the user or with the game machine (CPU).
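The fallback described in the first aspect can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `GameState` class, the substring-based `find_topic` matcher, and the sample sentences do not come from the patent, and the discourse-history lookup is deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class GameState:
    # Hypothetical shape: the text only says the state can tell whose turn it is.
    turn: str  # "user" or "cpu"


# Hypothetical conversation database: topic specifying information -> answer sentence.
CONVERSATION_DB: Dict[str, str] = {
    "mahjong": "I love mahjong too.",
    "name": "My name is a secret.",
}

# Hypothetical fallback answers, one per game state.
FALLBACK_BY_TURN: Dict[str, str] = {
    "user": "Never mind that, it's your turn.",
    "cpu": "Hold on, I'm thinking about my move.",
}


def find_topic(utterance: str) -> Optional[str]:
    """Return the first registered topic found in the utterance, or None."""
    lowered = utterance.lower()
    for topic in CONVERSATION_DB:
        if topic in lowered:
            return topic
    return None


def select_answer(utterance: str, state: GameState) -> str:
    """Use the registered answer if a topic matches; otherwise fall back on the game state."""
    topic = find_topic(utterance)
    if topic is not None:
        return CONVERSATION_DB[topic]
    return FALLBACK_BY_TURN[state.turn]
```

With this sketch, the same unregistered utterance yields different replies depending on whose turn it is, which is the behavior the first aspect aims for.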

A second aspect of the present invention is proposed as a game machine (game system) having conversation processing means (a conversation control device) that outputs an answer sentence in response to an utterance from a user, game control means (a game device) that controls a character and executes a game, and emotion state information management means that stores emotion state information indicating the character's emotion and updates it according to the utterance information.

In this game machine, the conversation processing means has: receiving means (a speech recognition unit) that accepts the user's utterance information; conversation database means (a conversation database) that stores a plurality of pieces of topic specifying information and the answer sentences associated with each; and conversation control means (a conversation control unit) that refers to a discourse history determined by the preceding conversation and preceding answer sentences, compares the topic specifying information determined by that history with the utterance information, selects the topic specifying information suited to the utterance information, and outputs the answer sentence associated with it. When the conversation control means cannot find topic specifying information suited to the utterance information among the plural pieces stored in the conversation database means, it selects an answer sentence according to the emotion state information.

According to this game machine, even when an utterance not prepared in advance in the conversation database means is received, the machine does not repeat a fixed line but can output an answer sentence according to the character's emotion.
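As a rough illustration of the second aspect, the sketch below picks the fallback sentence from the character's current emotion. The emotion labels, the sample sentences, and the function name `select_answer_by_emotion` are assumptions for illustration only; topic matching is represented by an already-computed `matched_answer` argument.

```python
from typing import Dict, Optional

# Hypothetical fallback sentences, one per emotion the character may be in.
FALLBACK_BY_EMOTION: Dict[str, str] = {
    "normal": "Hmm, I'm not sure what you mean.",
    "joy": "Ha ha, whatever you say!",
    "anger": "Stop talking nonsense.",
    "furious": "That's enough out of you!",
}


def select_answer_by_emotion(matched_answer: Optional[str], emotion: str) -> str:
    """Return the registered answer if one was found; otherwise pick by emotion.

    Unknown emotion labels fall back to the "normal" sentence (an assumption).
    """
    if matched_answer is not None:
        return matched_answer
    return FALLBACK_BY_EMOTION.get(emotion, FALLBACK_BY_EMOTION["normal"])
```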

A third aspect of the present invention is proposed as a game machine (game system) having conversation processing means (a conversation control device) that outputs an answer sentence in response to an utterance from a user, game control means (a game device) that controls a character and executes a game, and emotion state information management means that stores and updates emotion state information indicating the character's emotion.

In this game machine, the conversation processing means has: receiving means (a speech recognition unit) that accepts the user's utterance information; conversation database means (a conversation database) that stores a plurality of pieces of topic specifying information and the answer sentences associated with each; and conversation control means (a conversation control unit) that refers to a discourse history determined by the preceding conversation and preceding answer sentences, compares the topic specifying information determined by that history with the utterance information, selects the topic specifying information suited to the utterance information, and outputs the answer sentence associated with it. The conversation control means stores an unregistered utterance count value, which is the number of times it could not find topic specifying information suited to the utterance information among the topic specifying information stored in the conversation database means. When it cannot find suitable topic specifying information among the plural pieces stored in the conversation database means, it selects an answer sentence based on the unregistered utterance count value.

According to this game machine, when an utterance not prepared in advance in the conversation database means is received, the machine does not repeat a fixed line but can output an answer sentence according to the number of unregistered utterances made so far.
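The count-based selection of the third aspect might look like the following sketch. The class name, the escalating list of fallback sentences, and the choice never to reset the counter are all assumptions; the text only states that the count of unregistered utterances is stored and used to select the answer.

```python
from typing import Optional, Sequence


class UnregisteredUtteranceResponder:
    """Chooses a fallback sentence from the number of unregistered utterances so far."""

    def __init__(self, fallbacks: Sequence[str]) -> None:
        if not fallbacks:
            raise ValueError("at least one fallback sentence is required")
        self.fallbacks = list(fallbacks)
        self.count = 0  # unregistered utterance count value

    def respond(self, matched_answer: Optional[str]) -> str:
        # A registered utterance is answered normally and leaves the count unchanged.
        if matched_answer is not None:
            return matched_answer
        self.count += 1
        # The n-th miss selects the n-th sentence; later misses reuse the last one.
        index = min(self.count, len(self.fallbacks)) - 1
        return self.fallbacks[index]
```

A caller could pass sentences of increasing impatience so that repeated misses sound different each time rather than looping on a single line.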

The invention of the first aspect also holds as a game execution method for executing a game while the user converses with a character.

This method has: a step of storing a plurality of pieces of topic specifying information and the answer sentences associated with each; a step of selecting an answer sentence, in order to respond to the user's utterance information, by referring to a discourse history determined by the preceding conversation and preceding answer sentences and collating the topic specifying information determined by that history with the utterance information; and a step of selecting an answer sentence based on the game state when, in the selecting step, topic specifying information suited to the utterance information could not be found among the topic specifying information stored in the conversation database means.

The invention of the second aspect likewise holds as a game execution method for executing a game while the user converses with a character.

This method has: a step of storing a plurality of pieces of topic specifying information and the answer sentences associated with each; a step of selecting an answer sentence, in order to respond to the user's utterance information, by referring to a discourse history determined by the preceding conversation and preceding answer sentences and collating the topic specifying information determined by that history with the utterance information; a step of storing emotion state information indicating the character's emotion and updating it; and a step of selecting an answer sentence based on the emotion state information when, in the selecting step, topic specifying information suited to the utterance information could not be found among the stored topic specifying information.

The invention of the third aspect is a game execution method for executing a game while the user converses with a character, the method having: a step of storing a plurality of pieces of topic specifying information and the answer sentences associated with each; a step of selecting an answer sentence, in order to respond to the user's utterance information, by referring to a discourse history determined by the preceding conversation and preceding answer sentences and collating the topic specifying information determined by that history with the utterance information; a step of storing an unregistered utterance count value, which is the number of times that, in the selecting step, collating the topic specifying information with the utterance information failed to find topic specifying information suited to the utterance information among the plural pieces stored; and a step of selecting an answer sentence based on the unregistered utterance count value when, in the selecting step, topic specifying information suited to the utterance information could not be found among the stored topic specifying information.

The inventions according to the first to third aspects also hold as a program for causing a computer to function as the game machine, and as a program for causing a computer to execute the game execution method.

Here, an "utterance" need not be voice; it may be character string data generated by character input means such as a keyboard. An "answer sentence" may be delivered either as voice or as a character string displayed on screen. In the present invention, "game" covers not only games for entertainment but also games with other purposes, such as education or research. "Utterance information" means a set of words uttered by the user that forms a single unit, for example one sentence (including elliptical sentences with parts omitted). "Topic specifying information" is information for grasping the theme of the conversation between the user and the character played by the game machine.

According to the present invention, even when the user's utterance is not registered in advance in the database for conversation control, the character can be made to respond according to the game state and the like instead of repeating a fixed line, so more natural conversation processing can be realized.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. The present embodiment relates to a game system that can accept a user's voice and answer it.

[Game system configuration example]
FIG. 1 is a block diagram showing a configuration example of the game system. The game system GS has a conversation control device 1 and a game device 2. The game system GS further has voice input means 3 connected to the conversation control device 1 and the game device 2, non-voice input means 4 connected to the game device 2, image output means 5 connected to the game device 2, voice output means 6 connected to the conversation control device 1 and the game device 2, and emotion state information management means 7 connected to the conversation control device 1 and the game device 2.

The conversation control device 1 has a function of returning an answer in response to the user's utterance so that a conversation is established between the user and the conversation control device 1.

In addition, according to the conversation with the user, the conversation control device 1 changes the emotion state information indicating the emotion of the character it plays, and outputs an answer sentence that matches the character's emotion.

The character's emotion is described by emotion state information. The emotion state information may take any form as long as it can store information indicating emotions cumulatively; for example, the cumulative value of emotion flags can be used as the emotion state information.

Any data may be used as an emotion flag as long as it can distinguish emotions. For example, character data "A" is assigned as the emotion flag for "normal", "B" for "furious", "C" for "anger", and "D" for "joy". The conversation control device 1 according to the present embodiment refers to the emotion state information stored on the basis of these emotion flags and controls the emotion of the character (pseudo personality, virtual personality) that it provides.
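One way to picture emotion flags being accumulated into emotion state information is the sketch below. The flag letters A to D follow the example in the text; the `EmotionState` class, the use of a per-flag tally, and the rule that the most frequent flag decides the current emotion are assumptions, since the text does not say how the cumulative value is interpreted.

```python
from collections import Counter
from typing import Dict

# Emotion flags from the example in the text.
EMOTION_FLAGS: Dict[str, str] = {
    "A": "normal",
    "B": "furious",
    "C": "anger",
    "D": "joy",
}


class EmotionState:
    """Hypothetical emotion state information: a running tally of emotion flags."""

    def __init__(self) -> None:
        self.totals: Counter = Counter()

    def apply_flag(self, flag: str) -> None:
        """Accumulate one emotion flag into the state."""
        if flag not in EMOTION_FLAGS:
            raise ValueError(f"unknown emotion flag: {flag!r}")
        self.totals[flag] += 1

    def dominant_emotion(self) -> str:
        """Assumed rule: the most accumulated flag wins; with no flags, 'normal'."""
        if not self.totals:
            return EMOTION_FLAGS["A"]
        # On a tie, max() keeps the flag that was accumulated first.
        flag, _ = max(self.totals.items(), key=lambda item: item[1])
        return EMOTION_FLAGS[flag]
```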

The game device 2 has a function of executing predetermined game processing in response to the user's input and presenting the result to the user. The game system GS may handle any type of game; the present embodiment describes a game device 2 that handles mahjong, a competitive game. The game device 2 also changes the display of the character based on the emotion state information described above. The display of the character includes not only facial expressions (for example, a smiling face, an angry face, or a crying face) but also motions (leaning back and laughing, raising a fist in anger, covering the face and breaking down in tears, and so on) and anything else that can express emotion.

Each of the conversation control device 1 and the game device 2 is an information processing device such as a computer or workstation, comprising a central processing unit (CPU), main memory (RAM), read-only memory (ROM), an input/output device (I/O), and an external storage device such as a hard disk drive. A program for causing the information processing device to function as the conversation control device 1 and/or the game device 2, or a program for causing a computer to execute the conversation control method and/or the game execution method, is stored in the ROM or the external storage device; the conversation control device 1 and/or the game device 2 is realized by loading the program into main memory and having the CPU execute it. The program need not be stored in a storage device within the device itself; it may be supplied from a computer-readable program recording medium such as a magnetic disk, optical disk, magneto-optical disk, CD (Compact Disc), or DVD (Digital Video Disc), or from an external device (for example, the server of an ASP (application service provider)), and then loaded into main memory.

Although FIG. 1 shows the conversation control device 1 and the game device 2 as mutually independent devices, they need not be independent. The two may be realized on the same hardware, sharing the central processing unit (CPU), main memory (RAM), read-only memory (ROM), input/output device (I/O), and external storage device such as a hard disk drive.

Alternatively, the conversation control device 1 and the game device 2 may be physically independent information processing devices that are connected via a network, exchange information with each other, and together constitute the game system GS.

The voice input means 3 has a function of converting the user's voice into a predetermined signal, such as an electric or optical signal, and supplying it to the conversation control device 1 and the game device 2; it is, for example, a microphone for voice input. The voice input means 3 accepts the user's utterances addressed to the pseudo personality (character) that the conversation control device 1 realizes by a program or the like, and also accepts game-related input (for example, calling pon, declaring riichi, or declaring ron).

The non-voice input means 4 has a function of providing the game device 2 with user input other than voice; it is, for example, a keyboard, joystick, controller, or pointing device. Through it the user can request the game device 2 to perform the operations needed for the game, such as drawing a tile, discarding a tile, calling pon, declaring riichi, or declaring ron.

The image output means 5 has a function of displaying to the user the game screen (including the display of the character) that the game device 2 generates as the game progresses; it is, for example, a liquid crystal display device.

The voice output means 6 has a function of outputting, as an audio signal, the content that the conversation control device 1 outputs as the answer sentence of the character (pseudo personality, virtual personality), and of outputting game-related voice and sound (sound effects, background music, and so on); it is, for example, a sound board and speakers.

The emotion state information management means 7 receives the emotion flags output from the conversation control device 1 and reflects them in the emotion state information. It also returns the current emotion state information to the conversation control device 1 so that the device outputs an answer sentence that matches it.

In the configuration example shown in FIG. 1, the emotion state information management means 7 is shown as a component independent of the conversation control device 1 and the game device 2. It may instead be mounted on the conversation control device 1 or the game device 2, or incorporated as part of either device.

[Game device]
Next, a configuration example of the game device 2 will be described with reference to FIG. 2.
As described above, the game device 2 is an information processing device comprising a central processing unit (CPU), main memory (RAM), read-only memory (ROM), an input/output device (I/O), and an external storage device such as a hard disk drive. Executing a predetermined program on the information processing device realizes the game device 2 and the following components that constitute it.
FIG. 2 is a block diagram showing a configuration example of the game device 2. The game device 2 has a character string/command conversion unit 201, a game progress control unit 202, an image processing unit 203, and a sound processing unit 204.

The character string/command conversion unit 201 has a function of converting the character string information of the user's utterance, sent from the conversation control device 1, into a predetermined command. For example, when the user utters "Pon!", the unit converts the character string information "pon" sent from the conversation control device 1 into a "pon" execution command and passes it to the game progress control unit 202. For utterances unrelated to the progress of the game (such as "Hello" or "What's your name?"), the unit outputs nothing.
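The conversion performed by the character string/command conversion unit can be sketched as a simple lookup. The command identifiers (`"PON"`, `"RIICHI"`, `"RON"`), the English spellings of the calls, and the punctuation stripping are assumptions for illustration; the text only says that game-related strings become commands and that other utterances produce no output.

```python
from typing import Dict, Optional

# Hypothetical mapping from recognized call words to game commands.
COMMANDS: Dict[str, str] = {
    "pon": "PON",
    "riichi": "RIICHI",
    "ron": "RON",
}


def to_command(text: str) -> Optional[str]:
    """Return the game command for a recognized call, or None for ordinary talk."""
    # Drop surrounding whitespace and trailing exclamation marks or periods.
    cleaned = text.strip().rstrip("!！。.").lower()
    return COMMANDS.get(cleaned)
```

Returning `None` models the unit producing no output, so the caller can simply skip forwarding anything to the game progress control unit.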

The game progress control unit 202 has a function of advancing the game according to the user's input and, as the game progresses, designating the data to be used for screen display and for audio output.

The game progress control unit 202 also receives emotion state information from the emotion state information management means 7 and controls the character's image accordingly. For example, if the emotion state information indicates that the character is pleased, the character is controlled so as to show a pleased expression or action.

The game progress control unit 202 also passes game state information to the conversation control device 1. In this specification, "game state information" is information on the progress of the game that the game device 2 can grasp. As one example, in a game that adopts a so-called turn system, it is information indicating whether it is the user's turn or the turn of the game device 2 (CPU).

The image processing unit 203 reads the screen display data designated by the game progress control unit 202 from a storage unit (not shown) that stores in advance the image data necessary for the game, and provides it to the image output means 5.
The sound processing unit 204 reads the audio output data designated by the game progress control unit 202 from a storage unit (not shown) that stores in advance the voice data and acoustic data necessary for the game, and provides it to the voice output means 6.

[Configuration example of conversation control device]
[Overall configuration]
FIG. 3 is a functional block diagram illustrating a configuration example of the conversation control device 1 according to the present embodiment.
As described above, the conversation control device 1 is an information processing apparatus equipped with an arithmetic processing unit (CPU), a main memory (RAM), a read-only memory (ROM), an input/output device (I/O), and an external storage device such as a hard disk device. By executing a predetermined program on this information processing apparatus, the conversation control device 1 and the following components constituting it are realized.
As shown in FIG. 3, the conversation control device 1 includes a speech recognition unit 200, a conversation control unit 300, a sentence analysis unit 400, a conversation database 500, and a speech recognition dictionary storage unit 600.

[Speech recognition unit]
The speech recognition unit 200 identifies the character string corresponding to the utterance content on the basis of the signal, corresponding to the utterance, provided from the voice input means 3. Specifically, upon receiving a voice signal from the voice input means 3, the speech recognition unit 200 collates this voice signal with the dictionary stored in the speech recognition dictionary storage unit 600 and with the conversation database 500, and outputs the speech recognition result inferred from the voice signal. In the configuration example shown in FIG. 3, the speech recognition unit 200 requests the conversation control unit 300 to acquire the stored contents of the conversation database 500 and receives the contents that the conversation control unit 300 acquires in response. However, the speech recognition unit 200 may instead acquire the stored contents of the conversation database 500 directly and compare them with the voice signal.

[Configuration example of speech recognition unit]
FIG. 4 is a functional block diagram illustrating a configuration example of the speech recognition unit 200. The speech recognition unit 200 includes a feature extraction unit 200A, a buffer memory (BM) 200B, a word matching unit 200C, a buffer memory (BM) 200D, a candidate determination unit 200E, and a word hypothesis narrowing unit 200F. The word matching unit 200C and the word hypothesis narrowing unit 200F are connected to the speech recognition dictionary storage unit 600, and the candidate determination unit 200E is connected to the conversation control unit 300.

The speech recognition dictionary storage unit 600 connected to the word matching unit 200C stores phoneme hidden Markov models (hereinafter, a hidden Markov model is referred to as an HMM). A phoneme HMM is expressed as a set of states, and each state holds the following information: (a) a state number, (b) the acceptable context class, (c) lists of preceding states and succeeding states, (d) the parameters of the output probability density distribution, and (e) the self-transition probability and the transition probabilities to succeeding states. Because it is necessary to identify which speaker each distribution derives from, the phoneme HMM used in the present embodiment is generated by converting a predetermined speaker-mixture HMM. Here, the output probability density function is a Gaussian mixture distribution with 34-dimensional diagonal covariance matrices. The speech recognition dictionary storage unit 600 connected to the word matching unit 200C also stores a word dictionary. For each word, the word dictionary stores a symbol string that expresses the word's reading in terms of the symbols of the phoneme HMM.

The speaker's voice is input to a microphone or the like, converted into a voice signal, and then input to the feature extraction unit 200A. The feature extraction unit 200A performs A/D conversion on the input voice signal, then extracts and outputs feature parameters. Various methods of extracting and outputting feature parameters are conceivable. One example is to perform LPC analysis and extract 34-dimensional feature parameters consisting of the log power, the 16th-order cepstrum coefficients, the Δ log power, and the 16th-order Δ cepstrum coefficients. The time series of extracted feature parameters is input to the word matching unit 200C via the buffer memory (BM) 200B.
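As a check on the dimension count, the 34 values per frame are 1 (log power) + 16 (cepstrum) + 1 (Δ log power) + 16 (Δ cepstrum). The following minimal sketch only assembles such a vector from values assumed to be already computed; it does not perform LPC analysis. Computing the Δ terms as a simple difference from the previous frame is an assumption made for illustration, and the function name `frame_features` is hypothetical.

```python
from typing import List, Optional, Sequence

CEPSTRUM_ORDER = 16  # order stated in the text


def frame_features(
    log_power: float,
    cepstrum: Sequence[float],
    prev_log_power: Optional[float] = None,
    prev_cepstrum: Optional[Sequence[float]] = None,
) -> List[float]:
    """Assemble one 34-dimensional feature vector for a frame.

    Delta terms are a plain first difference against the previous frame
    (an assumption); for the first frame they are set to zero.
    """
    if len(cepstrum) != CEPSTRUM_ORDER:
        raise ValueError("expected 16 cepstrum coefficients")
    if prev_log_power is None or prev_cepstrum is None:
        delta_power = 0.0
        delta_cepstrum = [0.0] * CEPSTRUM_ORDER
    else:
        delta_power = log_power - prev_log_power
        delta_cepstrum = [c - p for c, p in zip(cepstrum, prev_cepstrum)]
    # Layout: [log power, c1..c16, delta log power, delta c1..delta c16]
    return [log_power, *cepstrum, delta_power, *delta_cepstrum]
```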

Using the one-pass Viterbi decoding method, the word matching unit 200C detects word hypotheses from the feature parameter data input via the buffer memory 200B, using the phoneme HMMs and the word dictionary stored in the speech recognition dictionary storage unit 600, and calculates and outputs their likelihoods. Here, for each state of each HMM at each time, the word matching unit 200C calculates the likelihood within the word and the likelihood from the start of the utterance. A separate likelihood is held for each combination of word identification number, word start time, and preceding word. To reduce the amount of computation, grid hypotheses whose likelihood is low among the total likelihoods calculated from the phoneme HMMs and the word dictionary may be pruned. The word matching unit 200C outputs the detected word hypotheses and their likelihoods, together with time information from the utterance start time (specifically, for example, a frame number), to the candidate determination unit 200E and the word hypothesis narrowing unit 200F via the buffer memory 200D.

The candidate determination unit 200E refers to the conversation control unit 300, compares the detected word hypotheses with the topic specifying information in a predetermined discourse space, and determines whether any of the detected word hypotheses matches the topic specifying information in that discourse space. If there is a match, it outputs the matching word hypothesis as the recognition result. If there is no match, it requests the word hypothesis narrowing unit 200F to narrow down the word hypotheses.

An operation example of the candidate determination unit 200E will now be described. Suppose that the word matching unit 200C outputs the word hypotheses "kantaku", "kataku", and "kantoku" together with their likelihoods (recognition rates), that the predetermined discourse space concerns "movies", and that its topic specifying information includes "kantoku" (director) but includes neither "kantaku" (land reclamation) nor "kataku" (pretext). Suppose further that the likelihood (recognition rate) is highest for "kantaku", lowest for "kantoku", and intermediate for "kataku".

In this situation, the candidate determination unit 200E compares the detected word hypotheses with the topic specifying information in the predetermined discourse space, determines that the word hypothesis "kantoku" matches the topic specifying information, outputs "kantoku" as the recognition result, and passes it to the conversation control unit 300. With this processing, "kantoku" (director), which is related to the current topic "movies", is selected in preference to the word hypotheses "kantaku" and "kataku", which have higher likelihoods (recognition rates). As a result, a speech recognition result that fits the context of the conversation can be output.
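The selection in this example can be sketched as follows. This is a minimal illustration, not the specification's implementation: the function name `pick_by_topic` and the likelihood values are hypothetical, and choosing the highest-likelihood hypothesis when several match the topic is an assumption, since the text does not state how ties among matches are resolved.

```python
from typing import List, Optional, Set, Tuple


def pick_by_topic(
    hypotheses: List[Tuple[str, float]],  # (word hypothesis, likelihood)
    topic_info: Set[str],                 # topic specifying information
) -> Optional[str]:
    """Return a hypothesis that matches the topic specifying information.

    Returns None when nothing matches, which corresponds to handing the
    decision over to the word hypothesis narrowing unit 200F.
    """
    matches = [(word, score) for word, score in hypotheses if word in topic_info]
    if not matches:
        return None
    # Assumption: among several matches, prefer the most likely one.
    return max(matches, key=lambda pair: pair[1])[0]


# Hypothetical likelihood values that follow the ordering in the text.
hypotheses = [("kantaku", 0.9), ("kataku", 0.6), ("kantoku", 0.3)]
result = pick_by_topic(hypotheses, {"kantoku", "movie"})
```

Here `result` is `"kantoku"` even though it has the lowest likelihood, because it is the only hypothesis found in the topic specifying information.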

If, on the other hand, there is no match, the word hypothesis narrowing unit 200F operates to output the recognition result in response to the narrowing request from the candidate determination unit 200E. Based on the plurality of word hypotheses output from the word matching unit 200C via the buffer memory 200D, and referring to the statistical language model stored in the speech recognition dictionary storage unit 600, the word hypothesis narrowing unit 200F narrows down the word hypotheses as follows: among word hypotheses for the same word that have the same end time but different start times, it keeps, for each head phoneme environment of the word, only the one word hypothesis with the highest total likelihood calculated from the utterance start time to the end time of the word. It then outputs, as the recognition result, the word string of the hypothesis that has the maximum total likelihood among the word strings of all word hypotheses remaining after narrowing. In the present embodiment, the head phoneme environment of the word to be processed preferably means a sequence of three phonemes consisting of the final phoneme of the word hypothesis preceding the word and the first two phonemes of the word hypothesis of the word itself.

An example of the word narrowing process performed by the word hypothesis narrowing unit 200F will be described with reference to FIG. 5. FIG. 5 is a timing chart showing an example of the processing of the word hypothesis narrowing unit 200F.
For example, suppose that the i-th word W_i, consisting of the phoneme sequence a_1, a_2, ..., a_n, follows the (i-1)-th word W_{i-1}, and that six word hypotheses Wa, Wb, Wc, Wd, We, and Wf exist for the word W_{i-1}. Suppose that the final phoneme of the first three word hypotheses Wa, Wb, and Wc is /x/, and that the final phoneme of the last three word hypotheses Wd, We, and Wf is /y/. If, at the end time t_e, three hypotheses premised on the word hypotheses Wa, Wb, and Wc and one hypothesis premised on the word hypotheses Wd, We, and Wf remain, then of the first three hypotheses, which share the same head phoneme environment, the one with the highest total likelihood is kept and the others are deleted.

The hypothesis premised on the word hypotheses Wd, We, and Wf has a head phoneme environment different from that of the other three hypotheses, because the final phoneme of its preceding word hypothesis is /y/ rather than /x/. This hypothesis is therefore not deleted. In other words, exactly one hypothesis is kept for each final phoneme of the preceding word hypothesis.
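The narrowing rule in this example can be sketched as a grouping step. The sketch assumes that all hypotheses concern the same word W_i with the same end time, so the first two phonemes of the head phoneme environment are identical and only the final phoneme of the preceding word hypothesis distinguishes the groups. The class name, field names, and likelihood values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    preceding_word: str            # label of the preceding word hypothesis
    preceding_final_phoneme: str   # e.g. "x" or "y"
    total_likelihood: float        # from utterance start to the word's end


def narrow(hypotheses: List[Hypothesis]) -> List[Hypothesis]:
    """Keep one hypothesis per preceding final phoneme: the one with the
    highest total likelihood."""
    best: Dict[str, Hypothesis] = {}
    for hyp in hypotheses:
        current = best.get(hyp.preceding_final_phoneme)
        if current is None or hyp.total_likelihood > current.total_likelihood:
            best[hyp.preceding_final_phoneme] = hyp
    return list(best.values())


# Three hypotheses after /x/ and one after /y/ (log-likelihoods are made up).
remaining = narrow([
    Hypothesis("Wa", "x", -10.0),
    Hypothesis("Wb", "x", -8.0),
    Hypothesis("Wc", "x", -12.0),
    Hypothesis("Wd", "y", -20.0),
])
kept = sorted(h.preceding_word for h in remaining)
```

With these made-up values, `kept` contains `"Wb"` (the best of the /x/ group) and `"Wd"` (the only member of the /y/ group, retained despite its lower likelihood).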

In the above embodiment, the head phoneme environment of a word is defined as a sequence of three phonemes consisting of the final phoneme of the word hypothesis preceding the word and the first two phonemes of the word hypothesis of the word itself. However, the present invention is not limited to this. The head phoneme environment may instead be a phoneme sequence consisting of a phoneme string of the preceding word hypothesis, which includes its final phoneme and at least one phoneme of the preceding word hypothesis contiguous with that final phoneme, and a phoneme string that includes the first phoneme of the word hypothesis of the word itself.
In the above embodiment, the feature extraction unit 200A, the word matching unit 200C, the candidate determination unit 200E, and the word hypothesis narrowing unit 200F are implemented by a computer such as a digital electronic computer, and the buffer memories 200B and 200D and the speech recognition dictionary storage unit 600 are implemented by a storage device such as a hard disk memory.

In the above embodiment, speech recognition is performed using the word matching unit 200C and the word hypothesis narrowing unit 200F. However, the present invention is not limited to this. For example, the configuration may consist of a phoneme matching unit that refers to the phoneme HMMs and a speech recognition unit that recognizes words by referring to a statistical language model using, for example, the One Pass DP algorithm.
In the present embodiment, the speech recognition unit 200 is described as a part of the conversation control device 1. However, it is also possible to form an independent speech recognition device composed of the speech recognition unit 200, the speech recognition dictionary storage unit 600, and the conversation database 500.

[Operation example of speech recognition unit]
Next, the operation of the speech recognition unit 200 will be described with reference to FIG. 6. FIG. 6 is a flowchart showing an operation example of the speech recognition unit 200. Upon receiving a voice signal from the voice input means 3, the speech recognition unit 200 performs feature analysis on the input voice and generates feature parameters (step S601). Next, it compares these feature parameters with the phoneme HMMs and the language model stored in the speech recognition dictionary storage unit 600 and obtains a predetermined number of word hypotheses and their likelihoods (step S602). Next, the speech recognition unit 200 compares the obtained word hypotheses with the topic specifying information in the predetermined discourse space and determines whether any of the word hypotheses matches the topic specifying information in that discourse space (steps S603 and S604). If there is a match, the speech recognition unit 200 outputs the matching word hypothesis as the recognition result (step S605). If there is no match, the speech recognition unit 200 outputs, as the recognition result, the word hypothesis with the maximum likelihood among the obtained word hypotheses (step S606).

[Speech recognition dictionary storage unit]
Returning to FIG. 3, the description of the configuration example of the conversation control device 1 continues.
The speech recognition dictionary storage unit 600 stores character strings corresponding to standard voice signals. Having performed the collation against these, the speech recognition unit 200 identifies the character string corresponding to the word hypothesis for the voice signal and outputs the identified character string to the conversation control unit 300 as a character string signal.

[Sentence analysis unit]
Next, a configuration example of the sentence analysis unit 400 will be described with reference to FIGS. 1 and 7. FIG. 7 is a partially enlarged block diagram of the conversation control device 1, showing a specific configuration example of the conversation control unit 300 and the sentence analysis unit 400. FIG. 7 shows only the conversation control unit 300, the sentence analysis unit 400, and the conversation database 500; the other components are omitted.

The sentence analysis unit 400 analyzes the character string identified by the speech recognition unit 200. In the present embodiment, as shown in FIG. 7, the sentence analysis unit 400 includes a character string specifying unit 410, a morpheme extraction unit 420, a morpheme database 430, an input type determination unit 440, and an utterance type database 450. The character string specifying unit 410 divides the series of character strings identified by the voice input means 3 and the speech recognition unit 200 into individual phrases. Here, a phrase means one segment obtained by dividing the character string as finely as possible without destroying its grammatical meaning. Specifically, when a series of character strings contains a time interval of a certain length or more, the character string specifying unit 410 divides the character string at that point. The character string specifying unit 410 outputs each of the divided character strings to the morpheme extraction unit 420 and the input type determination unit 440. In the following description, "character string" means the character string of a single phrase.
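The time-interval rule described above can be sketched as follows. This is an illustrative sketch only: the representation of recognized fragments as (text, start time, end time) tuples, the 0.5-second threshold, and the function name `split_by_pause` are all assumptions, since the specification says only that a split occurs at an interval of a certain length or more.

```python
from typing import List, Optional, Tuple


def split_by_pause(
    fragments: List[Tuple[str, float, float]],  # (text, start_sec, end_sec)
    min_gap: float = 0.5,                       # assumed threshold in seconds
) -> List[str]:
    """Join consecutive fragments and start a new character string whenever
    the silence between two fragments is at least min_gap."""
    segments: List[str] = []
    current: List[str] = []
    prev_end: Optional[float] = None
    for text, start, end in fragments:
        if prev_end is not None and start - prev_end >= min_gap:
            segments.append("".join(current))
            current = []
        current.append(text)
        prev_end = end
    if current:
        segments.append("".join(current))
    return segments
```

For instance, two fragments separated by 0.8 seconds of silence are returned as two separate character strings, while fragments separated by 0.1 seconds are joined into one.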

[Morpheme extraction unit]
Based on the character string of one phrase divided by the character string specifying unit 410, the morpheme extraction unit 420 extracts from that character string, as first morpheme information, each morpheme constituting a minimum unit of the character string. In the present embodiment, a morpheme means the minimum unit of word structure appearing in a character string. Examples of such minimum units include parts of speech such as nouns, adjectives, and verbs.

FIG. 8 is a diagram illustrating the relationship between a character string and the morphemes extracted from it. As shown in FIG. 8, in the present embodiment the morphemes can be expressed as m1, m2, m3, and so on. As shown in FIG. 8, the morpheme extraction unit 420, having received a character string from the character string specifying unit 410, collates the input character string with the morpheme group stored in advance in the morpheme database 430. This morpheme group is prepared as a morpheme dictionary that describes, for each morpheme belonging to each part-of-speech class, its headword, reading, part of speech, conjugated form, and the like. After this collation, the morpheme extraction unit 420 extracts from the character string each morpheme (m1, m2, ...) that matches any member of the stored morpheme group. The elements that remain after the extracted morphemes are removed (n1, n2, n3, ...) are, for example, auxiliary verbs.
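The dictionary collation described above can be sketched as follows. The specification does not state the matching strategy; greedy longest-match scanning is an assumption chosen for illustration, and the tiny dictionary and the function name `extract_morphemes` are hypothetical. The sample input is a Japanese sentence meaning "I like Sato".

```python
from typing import List, Set


def extract_morphemes(text: str, dictionary: Set[str]) -> List[str]:
    """Scan the text left to right and collect the longest dictionary entry
    found at each position (assumed strategy). Characters that match no
    entry, such as particles and auxiliary verbs, are skipped."""
    morphemes: List[str] = []
    max_len = max((len(entry) for entry in dictionary), default=0)
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                morphemes.append(candidate)
                i += length
                break
        else:
            i += 1  # no entry starts here; treat as a non-morpheme element
    return morphemes


# Hypothetical morpheme dictionary: "I", "Sato", "like".
first_morpheme_info = extract_morphemes("私は佐藤が好きです", {"私", "佐藤", "好き"})
```

Under these assumptions, `first_morpheme_info` holds the three dictionary morphemes in order of appearance, and the particles and the sentence-final auxiliary are left out.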

The morpheme extraction unit 420 outputs the extracted morphemes to the topic specifying information search unit 320 as first morpheme information. The first morpheme information need not be structured. Here, "structuring" means classifying and arranging the morphemes contained in a character string on the basis of their parts of speech and the like, for example converting a character string that is an utterance sentence into data in which the morphemes are arranged in a predetermined order such as "subject + object + predicate". Of course, using structured first morpheme information does not prevent the present embodiment from being realized.

[Input type determination unit]
The input type determination unit 440 has a function of determining the type of the utterance content (utterance type) from the character string specified by the character string specifying unit 410 and outputting information indicating the utterance type according to the determination result. The utterance type is information that specifies the type of the utterance content, and in the present embodiment it means, for example, the "utterance sentence type" shown in FIG. 9. FIG. 9 is a diagram showing examples of "utterance sentence types", the two-letter alphabetic code representing each type, and an utterance sentence corresponding to each type.

In the present embodiment, as shown in FIG. 9, the "utterance sentence types" consist of declaration sentences (D; Declaration), time sentences (T; Time), location sentences (L; Location), negation sentences (N; Negation), and the like, and are used to determine the reply sentence according to the type of the utterance sentence. A sentence of each type is either an affirmative sentence or a question sentence. A "declaration sentence" is a sentence expressing the user's opinion or thought; in the present embodiment, as shown in FIG. 9, an example is "I like Sato". A "location sentence" is a sentence involving a concept of place. A "time sentence" is a sentence involving a concept of time. A "negation sentence" is a sentence used to deny a declaration sentence. Example sentences for each "utterance sentence type" are shown in FIG. 9.

To determine the "utterance sentence type", the input type determination unit 440 in the present embodiment uses a group of dictionaries that describe the correspondence between expressions and utterance sentence types, such as a definition expression dictionary for determining that a sentence is a declaration sentence and a negation expression dictionary for determining that a sentence is a negation sentence. FIG. 10 shows the dictionaries used and the type of determination made when a character string contains an expression found in each dictionary. As a specific example, the input type determination unit 440, having received a character string from the character string specifying unit 410, collates that character string with each dictionary stored in the utterance type database 450. After this collation, the input type determination unit 440 extracts from the character string the elements related to each dictionary (data indicating the type of determination, such as D or N).

The input type determination unit 440 determines the "utterance sentence type" on the basis of the extracted elements. For example, when a character string contains an element that makes a statement about some event, the input type determination unit 440 determines that the character string containing that element is a declaration sentence. The input type determination unit 440 outputs the determined "utterance sentence type" to the answer acquisition unit 350.
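The dictionary-based determination described above can be sketched as follows. The dictionary contents here are invented for illustration and are not those of FIG. 10, the single-letter codes D, T, L, and N follow the abbreviations in the text rather than the two-letter codes of FIG. 9, and simple substring matching is an assumption.

```python
from typing import Dict, List

# Hypothetical expression dictionaries, keyed by determination type.
TYPE_DICTIONARIES: Dict[str, List[str]] = {
    "D": ["like", "think"],          # declaration (assumed expressions)
    "T": ["when", "yesterday"],      # time (assumed expressions)
    "L": ["where", "place"],         # location (assumed expressions)
    "N": ["not", "never"],           # negation (assumed expressions)
}


def determine_types(text: str) -> List[str]:
    """Return the codes of all dictionaries that contain an expression
    found in the character string (whole-word match on lowercase text)."""
    words = text.lower().replace("?", " ").replace(".", " ").split()
    return [
        code
        for code, expressions in TYPE_DICTIONARIES.items()
        if any(expression in words for expression in expressions)
    ]
```

With these invented dictionaries, "I like Sato" is classified as `["D"]`, and a greeting that matches no dictionary yields an empty list.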

[Conversation database]
Next, an example of the structure of the data stored in the conversation database 500 will be described with reference to FIG. 11. FIG. 11 is a conceptual diagram illustrating an example of the data structure of the data stored in the conversation database 500.

As shown in FIG. 11, the conversation database 500 stores in advance a plurality of pieces of topic specifying information 810 for specifying topics. Each piece of topic specifying information 810 may be associated with other pieces of topic specifying information. In the example shown in FIG. 11, the data is stored so that, when topic specifying information C is specified, the other pieces of topic specifying information A, B, and D associated with it are also determined.

Specifically, in the present embodiment, the topic specifying information 810 means a "keyword" that is related to the input content expected from the user or to a reply sentence to the user.

One or more topic titles 820 are stored in association with each piece of topic specifying information 810. A topic title 820 is composed of morphemes, each consisting of a single character, a plurality of character strings, or a combination of these. A reply sentence 830 to the user is stored in association with each topic title 820. In addition, a plurality of answer types indicating the type of the reply sentence 830 are associated with the reply sentence.

An emotion condition parameter 840 is associated with each reply sentence 830. The emotion condition parameter 840 is information indicating a condition on the emotion state information. For example, suppose the emotion state information indicates a cumulative value of 10 for the "anger" emotion flag. If the emotion condition parameter 840 of a reply sentence A specifies a cumulative "anger" value of 5 or less, while the emotion condition parameter 840 of another reply sentence B specifies a cumulative "anger" value of 8 or more, then reply sentence A is not selected and the conversation control device 1 selects reply sentence B as the reply to the user.
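The numerical example above can be sketched as a filter over candidate reply sentences. The data layout is an assumption: the specification states only that each reply sentence carries a condition on the emotion state information, so the class `Reply`, its fields, and the function `select_replies` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Reply:
    text: str
    emotion: str                     # emotion flag the condition refers to
    min_value: Optional[int] = None  # "cumulative value N or more"
    max_value: Optional[int] = None  # "cumulative value N or less"

    def allows(self, state: Dict[str, int]) -> bool:
        """Check this reply's emotion condition against the current state."""
        value = state.get(self.emotion, 0)
        if self.min_value is not None and value < self.min_value:
            return False
        if self.max_value is not None and value > self.max_value:
            return False
        return True


def select_replies(replies: List[Reply], state: Dict[str, int]) -> List[Reply]:
    """Return the replies whose emotion condition is satisfied."""
    return [reply for reply in replies if reply.allows(state)]


reply_a = Reply("reply sentence A", "anger", max_value=5)
reply_b = Reply("reply sentence B", "anger", min_value=8)
chosen = select_replies([reply_a, reply_b], {"anger": 10})
```

With a cumulative "anger" value of 10, `chosen` contains only reply sentence B, matching the example in the text.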

Furthermore, an emotion flag 850, which is data indicating the emotion of the pseudo personality provided by the conversation control device 1, is stored in association with each answer sentence 830. Any data may be used for the emotion flag 850 as long as it can distinguish emotions. For example, character data "A" may be assigned as the emotion flag indicating a "normal" emotion, character data "B" as the emotion flag indicating "fury", character data "C" as the emotion flag 850 indicating "anger", and character data "D" as the emotion flag indicating "joy". The conversation control device 1 according to the present embodiment refers to this emotion flag 850 to control the emotion of the character (pseudo personality) that it provides.

Next, the association between one piece of topic specifying information and other topic specifying information will be described. FIG. 12 is a diagram showing the association between topic specifying information 810A and other topic specifying information 810B, 810C₁ to 810C₄, 810D₁ to 810D₃, and so on. In the following description, "stored in association" means that reading information X makes it possible to read information Y associated with X. For example, a state in which the data of information X contains information for reading out information Y (such as a pointer indicating the storage address of information Y, the physical memory address where information Y is stored, or a logical address) is described as "information Y is stored in association with information X".

In the example shown in FIG. 12, topic specifying information can be stored in association with other topic specifying information as a superordinate concept, a subordinate concept, a synonym, or an antonym (antonyms are omitted in this figure). In this example, topic specifying information 810B (= "entertainment") is stored in association with topic specifying information 810A (= "movie") as its superordinate concept, for example in the layer above the topic specifying information "movie".

In addition, as subordinate concepts of topic specifying information 810A (= "movie"), topic specifying information 810C₁ (= "director"), 810C₂ (= "starring"), 810C₃ (= "distributor"), and 810C₄ (= "running time"), as well as topic specifying information 810D₁ (= "Seven Samurai"), 810D₂ (= "Ran"), 810D₃ (= "Yojimbo"), and so on, are stored in association with topic specifying information 810A.

Synonyms 900 are also associated with topic specifying information 810A. In this example, "work", "content", and "cinema" are stored as synonyms of the keyword "movie", which is topic specifying information 810A. Defining such synonyms makes it possible to treat an utterance as containing topic specifying information 810A when the utterance does not contain the keyword "movie" but does contain "work", "content", or "cinema".
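A minimal Python sketch of the associations in FIG. 12 and of the synonym lookup just described is given below. The patent only requires that associated information be reachable (for example via a pointer or address); the `TopicInfo` class, its field names, and `find_topic` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class TopicInfo:
    keyword: str
    broader: List[str] = field(default_factory=list)    # superordinate concepts
    narrower: List[str] = field(default_factory=list)   # subordinate concepts
    synonyms: List[str] = field(default_factory=list)


# Entry corresponding to topic specifying information 810A ("movie").
movie = TopicInfo(
    keyword="movie",
    broader=["entertainment"],
    narrower=["director", "starring", "distributor", "running time",
              "Seven Samurai", "Ran", "Yojimbo"],
    synonyms=["work", "content", "cinema"],
)
topics: Dict[str, TopicInfo] = {movie.keyword: movie}


def find_topic(morphemes: Iterable[str],
               known: Dict[str, TopicInfo]) -> Optional[TopicInfo]:
    """Return a topic whose keyword or one of its synonyms occurs in the utterance."""
    present = set(morphemes)
    for topic in known.values():
        if present & {topic.keyword, *topic.synonyms}:
            return topic
    return None


hit = find_topic(["cinema", "like"], topics)
print(hit.keyword if hit else None)  # movie
```

Storing keywords as plain strings rather than direct object references keeps the sketch short; an index keyed by synonym would avoid the linear scan.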

By referring to the stored contents of the conversation database 500, the conversation control device 1 according to the present embodiment can, once a piece of topic specifying information has been specified, quickly search for and extract the other topic specifying information stored in association with it, as well as the topic titles, answer sentences, and other data belonging to that topic specifying information.

[Topic title]
Next, an example of the data structure of the topic title 820 (also referred to as "second morpheme information") will be described with reference to FIG. 13. FIG. 13 is a diagram showing an example of the data structure of topic titles.

The pieces of topic specifying information 810D₁, 810D₂, 810D₃, and so on each have a plurality of different topic titles: 820₁, 820₂, ...; 820₃, 820₄, ...; and 820₅, 820₆, ..., respectively.

In the present embodiment, as shown in FIG. 13, each of the topic titles 820₁ to 820₆, and so on, is information composed of first specific information 1301, second specific information 1302, and third specific information 1303. Here, the first specific information 1301 means, in the present embodiment, the main morpheme constituting the topic; one example is the subject of a sentence. The second specific information 1302 means, in the present embodiment, a morpheme closely related to the first specific information 1301; one example is an object. The third specific information 1303 means, in the present embodiment, a morpheme indicating the action of some subject, or a morpheme modifying a noun or the like; examples include verbs, adverbs, and adjectives. The meanings of the first specific information 1301, the second specific information 1302, and the third specific information 1303 need not be limited to those described above. The present embodiment holds even if other meanings are given to them, as long as the content of the sentence can be grasped from them.

For example, when the subject is "Seven Samurai" and the adjective is "interesting", the topic title (second morpheme information) 820₂ consists, as shown in FIG. 13, of the morpheme "Seven Samurai" as the first specific information 1301 and the morpheme "interesting" as the third specific information 1303. This topic title 820₂ contains no morpheme for the second specific information 1302, so the symbol "*", which indicates that there is no corresponding morpheme, is stored as the second specific information 1302.

This topic title 820₂ (Seven Samurai; *; interesting) has the meaning "Seven Samurai is interesting".
In this specification, the contents of the parentheses representing a topic title 820 are, from left to right, the first specific information 1301, the second specific information 1302, and the third specific information 1303. Where a topic title 820 has no morpheme for one of the first to third specific information, that position is shown as "*".
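The three-part notation can be modeled as a small record type. The following Python sketch is an assumption-laden illustration, not the patent's data layout: `TopicTitle`, `NO_MORPHEME`, `morphemes`, and `notation` are hypothetical names.

```python
from dataclasses import astuple, dataclass
from typing import Set

NO_MORPHEME = "*"  # symbol stored when a position has no corresponding morpheme


@dataclass(frozen=True)
class TopicTitle:
    first: str                   # first specific information 1301 (e.g. subject)
    second: str = NO_MORPHEME    # second specific information 1302 (e.g. object)
    third: str = NO_MORPHEME     # third specific information 1303 (e.g. verb, adjective)

    def morphemes(self) -> Set[str]:
        """Morphemes actually present, ignoring '*' placeholders."""
        return {m for m in astuple(self) if m != NO_MORPHEME}

    def notation(self) -> str:
        """Render in the parenthesized form used in the specification."""
        return "(" + "; ".join(astuple(self)) + ")"


title = TopicTitle(first="Seven Samurai", third="interesting")
print(title.notation())  # (Seven Samurai; *; interesting)
```

A frozen dataclass is used so that titles are hashable and can serve as dictionary keys when answer sentences are attached to them.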

The specific information constituting the topic title 820 is not limited to three items (the first to third specific information described above). A topic title may also have further specific information (fourth specific information and beyond).

[Answer sentence]
Next, the answer sentence 830 will be described. As shown in FIG. 14, in the present embodiment the answer sentences 830 are classified into types (answer types) such as declaration (D; Declaration), time (T; Time), location (L; Location), and negation (N; Negation), so that a reply can be made that corresponds to the type of the sentence uttered by the user. An affirmative sentence is denoted "A" and a question sentence is denoted "Q".
For example, when topic title (820) 1-1 is (Sato; *; like), which is the set of morphemes extracted from "I like Sato", the answer sentences (830) 1-1 corresponding to topic title (820) 1-1 include (DA; declarative affirmative sentence "I like Sato too") and (TA; time affirmative sentence "I like Sato when he is at bat"). The answer acquisition unit 350, described later, acquires one answer sentence 830 associated with the topic title 820.

[Conversation control unit]
Returning to FIG. 7, a configuration example of the conversation control unit 300 will now be described.
The conversation control unit 300 controls the transfer of data between the components of the conversation control device 1 (the speech recognition unit 200, the sentence analysis unit 400, the conversation database 500, and the speech recognition dictionary storage unit 600), and has the function of determining and outputting an answer sentence in response to an utterance.

In the present embodiment, as shown in FIG. 7, the conversation control unit 300 has a management unit 310, a topic specifying information search unit 320, an abbreviated sentence complementing unit 330, a topic search unit 340, and an answer acquisition unit 350. The management unit 310 controls the conversation control unit 300 as a whole and passes the answer sentence resulting from the conversation control processing to the voice output means 6.

The management unit 310 also has the function of storing the discourse history and updating it as necessary. In response to requests from the topic specifying information search unit 320, the abbreviated sentence complementing unit 330, the topic search unit 340, and the answer acquisition unit 350, the management unit 310 passes all or part of the stored discourse history to these units.
The "discourse history" is information specifying the topic or subject of the conversation between the user and the conversation control device 1. It contains at least one of the following, each described later: "focused topic specifying information", "focused topic title", "user input sentence topic specifying information", and "answer sentence topic specifying information". The "focused topic specifying information", "focused topic title", and "answer sentence topic specifying information" contained in the discourse history are not limited to those determined by the immediately preceding exchange. They may be items that became "focused topic specifying information", "focused topic title", or "answer sentence topic specifying information" during a predetermined past period, or a cumulative record of such items.

The units constituting the conversation control unit 300 are described below.
[Topic specifying information search unit]
The topic specifying information search unit 320 collates the first morpheme information extracted by the morpheme extraction unit 420 with each piece of topic specifying information contained in the discourse history, and searches that topic specifying information for a piece that matches a morpheme constituting the first morpheme information. Specifically, when the first morpheme information input from the morpheme extraction unit 420 consists of the two morphemes "Sato" and "like", the topic specifying information search unit 320 collates the input first morpheme information with the group of topic specifying information.

After this collation, if the focused topic title 820focus (the topic title found up to the previous search) contains a morpheme constituting the first morpheme information (for example, "Sato"), the topic specifying information search unit 320 outputs that focused topic title 820focus to the answer acquisition unit 350. If the focused topic title 820focus contains no morpheme constituting the first morpheme information, the topic specifying information search unit 320 determines the user input sentence topic specifying information on the basis of the first morpheme information, and outputs the input first morpheme information and the user input sentence topic specifying information to the abbreviated sentence complementing unit 330. Here, "user input sentence topic specifying information" means the topic specifying information corresponding to those morphemes in the first morpheme information that represent what the user is talking about, or that may represent what the user is talking about.
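The branching performed by the topic specifying information search unit 320 can be sketched as a small routing function. This Python sketch rests on assumptions: topic titles are modeled as 3-tuples, and a morpheme is treated as user input sentence topic specifying information simply when it appears in a set of known topic keywords. The patent does not state the exact decision rule, and `route` and the destination labels are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Title = Tuple[str, str, str]


def route(first_morphemes: List[str],
          focus_title: Optional[Title],
          known_topics: Set[str]) -> Tuple[str, object]:
    """Decide where the search unit sends its result.

    Returns (destination, payload), where destination names the receiving unit.
    """
    # Case 1: the focused topic title already contains one of the morphemes.
    if focus_title is not None and any(m in focus_title for m in first_morphemes):
        return ("answer_acquisition_unit", focus_title)
    # Case 2: derive user input sentence topic specifying information instead.
    user_topics = [m for m in first_morphemes if m in known_topics]
    return ("abbreviated_sentence_complementing_unit",
            (first_morphemes, user_topics))


print(route(["Sato", "like"], ("Sato", "*", "like"), {"Sato", "movie"}))
# ('answer_acquisition_unit', ('Sato', '*', 'like'))
```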

[Abbreviated sentence complementing unit]
The abbreviated sentence complementing unit 330 generates a plurality of kinds of complemented first morpheme information by complementing the first morpheme information with the topic specifying information 810 found up to the previous search (hereinafter "focused topic specifying information") and the topic specifying information 810 contained in the previous answer sentence (hereinafter "answer sentence topic specifying information"). For example, when the utterance is the sentence "I like", the abbreviated sentence complementing unit 330 adds the focused topic specifying information "Sato" to the first morpheme information "like" and generates the complemented first morpheme information "Sato, like".

That is, if the first morpheme information is denoted "W" and the set of focused topic specifying information and answer sentence topic specifying information is denoted "D", the abbreviated sentence complementing unit 330 adds an element of the set "D" to the first morpheme information "W" to generate complemented first morpheme information.

As a result, when the sentence formed from the first morpheme information is an abbreviated sentence that is not clear as Japanese, the abbreviated sentence complementing unit 330 can use the set "D" to add an element of that set (for example, "Sato") to the first morpheme information "W". In this way, the abbreviated sentence complementing unit 330 can turn the first morpheme information "like" into the complemented first morpheme information "Sato, like". The complemented first morpheme information "Sato, like" corresponds to the utterance "I like Sato".

In other words, even when the user's utterance is an abbreviated sentence, the abbreviated sentence complementing unit 330 can complement it using the set "D". As a result, even if the sentence formed from the first morpheme information is abbreviated, the abbreviated sentence complementing unit 330 can turn it into proper Japanese.
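The complementing step, adding elements of the set "D" to the first morpheme information "W", can be illustrated as follows. This is a sketch under assumptions: the patent says that several kinds of complemented information are generated but not how, so this version produces one candidate per element of D. The function name `complement` is hypothetical.

```python
from typing import List


def complement(first_morphemes: List[str],
               discourse_topics: List[str]) -> List[List[str]]:
    """Generate complemented first morpheme information.

    first_morphemes  -- W, the morphemes extracted from the utterance
    discourse_topics -- D, focused / answer sentence topic specifying information
    One candidate is produced for each element of D not already present in W.
    """
    candidates = []
    for topic in discourse_topics:
        if topic not in first_morphemes:
            candidates.append([topic] + list(first_morphemes))
    return candidates


# The utterance "I like" yields W = ["like"]; the discourse history supplies "Sato".
print(complement(["like"], ["Sato"]))  # [['Sato', 'like']]
```

Each candidate would then be compared against the stored topic titles; a candidate that matches none of them is simply discarded.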

The abbreviated sentence complementing unit 330 also searches, on the basis of the set "D", for a topic title 820 that matches the complemented first morpheme information. When it finds such a topic title 820, the abbreviated sentence complementing unit 330 outputs it to the answer acquisition unit 350. On the basis of the appropriate topic title 820 found by the abbreviated sentence complementing unit 330, the answer acquisition unit 350 can output the answer sentence 1030 best suited to the content of the user's utterance.

The abbreviated sentence complementing unit 330 is not limited to adding elements of the set "D" to the first morpheme information. On the basis of the focused topic title, it may also add to the extracted first morpheme information a morpheme contained in any of the first specific information, the second specific information, or the third specific information constituting that topic title.

[Topic search unit]
When the abbreviated sentence complementing unit 330 has not determined a topic title 820, the topic search unit 340 collates the first morpheme information with each topic title 820 corresponding to the user input sentence topic specifying information, and searches those topic titles 820 for the one best suited to the first morpheme information.

Specifically, on receiving a search command signal from the abbreviated sentence complementing unit 330, the topic search unit 340 uses the user input sentence topic specifying information and the first morpheme information contained in that signal to search the topic titles 820 associated with the user input sentence topic specifying information for the topic title 820 best suited to the first morpheme information. The topic search unit 340 outputs the topic title 820 it has found to the answer acquisition unit 350 as a search result signal.

FIG. 15 is a diagram showing a specific example of the topic titles 820, answer sentences 830, emotion condition parameters 840, and emotion flags 850 associated with a piece of topic specifying information 810 (= "Sato"). As shown in FIG. 15, for example, because the input first morpheme information "Sato, like" contains the topic specifying information 810 (= "Sato"), the topic search unit 340 specifies that topic specifying information 810 (= "Sato") and then collates each of the topic titles (820) 1-1, 1-2, ... associated with it against the input first morpheme information "Sato, like".

On the basis of the collation result, the topic search unit 340 specifies, from among the topic titles (820) 1-1 to 1-2, the topic title (820) 1-1 (Sato; *; like) that matches the input first morpheme information "Sato, like". The topic search unit 340 outputs the topic title (820) 1-1 (Sato; *; like) it has found to the answer acquisition unit 350 as a search result signal. If no topic title matching the input first morpheme information "Sato, like" can be found, the topic search unit 340 notifies the answer acquisition unit 350 that the user's utterance is an unregistered utterance (one that cannot be answered).
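The collation performed by the topic search unit 340 can be sketched as below. The matching rule used here (the non-"*" morphemes of a title must equal the set of input morphemes) is an assumption, since the patent speaks only of the "best suited" or "matching" title. The second title in the example list is hypothetical, because the content of topic title 1-2 is not given in this passage.

```python
from typing import List, Optional, Tuple

Title = Tuple[str, str, str]


def search_topic_title(first_morphemes: List[str],
                       titles: List[Title]) -> Optional[Title]:
    """Return the title whose non-'*' morphemes equal the input, else None.

    A None result stands for an unregistered utterance (no answer available).
    """
    wanted = set(first_morphemes)
    for title in titles:
        if {m for m in title if m != "*"} == wanted:
            return title
    return None


sato_titles = [
    ("Sato", "*", "like"),       # topic title 1-1 from FIG. 15
    ("Sato", "*", "dislike"),    # hypothetical stand-in for topic title 1-2
]

print(search_topic_title(["Sato", "like"], sato_titles))   # ('Sato', '*', 'like')
print(search_topic_title(["Sato", "tall"], sato_titles))   # None
```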

[Answer acquisition unit]
On the basis of the topic title 820 found by the topic search unit 340, the answer acquisition unit 350 acquires the answer sentence 830, the emotion condition parameter 840, and the emotion flag 850 associated with that topic title 820. The answer acquisition unit 350 also collates each answer type associated with the topic title found by the topic search unit 340 against the utterance type determined by the input type determination unit 440, and searches the answer types for one that matches the determined utterance type.

As shown in FIG. 15, for example, when the topic title found by the topic search unit 340 is topic title 1-1 (Sato; *; like), the answer acquisition unit 350 specifies, from among the answer sentences 1-1 (DA, TA, and so on) associated with topic title 1-1, the answer type (DA) that matches the "utterance sentence type" (for example, DA) determined by the input type determination unit 440.

Furthermore, the answer acquisition unit 350 refers to the emotion state information currently stored by the emotion state information management means 7. From the plurality of candidate answer sentences corresponding to the specified utterance sentence type (for example, DA), it specifies the emotion condition parameter 840 whose condition is satisfied by the emotion state information (for example, "all 2 or less"), and acquires the content corresponding to that emotion condition parameter 840 (for example, "I like Sato too").
In "DA", "TA", and the like, "A" means the affirmative form. Accordingly, when the utterance type or the answer type contains "A", it indicates that some matter is being affirmed. The utterance types and answer types can also include types such as "DQ" and "TQ". In "DQ", "TQ", and the like, "Q" means a question about some matter.
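The two-stage selection by the answer acquisition unit 350 (first by answer type, then by emotion condition) can be sketched as follows. This is an illustration under assumptions: only the DA entry with the condition "all 2 or less" and the TA sentence come from the text; the second DA entry, every emotion flag value, and the TA condition are hypothetical, as are the names `Candidate` and `acquire_answer`.

```python
from typing import Callable, Dict, List, NamedTuple, Optional

EmotionState = Dict[str, int]


class Candidate(NamedTuple):
    answer_type: str                            # e.g. "DA", "TA"
    condition: Callable[[EmotionState], bool]   # emotion condition parameter 840
    text: str                                   # answer sentence 830
    emotion_flag: str                           # emotion flag 850


def acquire_answer(candidates: List[Candidate], utterance_type: str,
                   state: EmotionState) -> Optional[Candidate]:
    """Pick the candidate matching the utterance type and the emotion state."""
    for c in candidates:
        if c.answer_type == utterance_type and c.condition(state):
            return c
    return None


# Candidates attached to topic title 1-1 (Sato; *; like).
title_1_1 = [
    Candidate("DA", lambda s: all(v <= 2 for v in s.values()),
              "I like Sato too", "A"),
    # Hypothetical entry for a state in which some cumulative value exceeds 2.
    Candidate("DA", lambda s: any(v > 2 for v in s.values()),
              "(hypothetical alternative reply)", "C"),
    Candidate("TA", lambda s: True,
              "I like Sato when he is at bat", "A"),
]

reply = acquire_answer(title_1_1, "DA", {"anger": 1, "joy": 2})
print(reply.text)  # I like Sato too
```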

When the answer type is the question form (Q) described above, the answer sentence 830 associated with that answer type is written in the affirmative form (A). An answer sentence 830 written in the affirmative form (A) is, for example, a sentence that answers the question asked. For example, when the utterance is "Have you ever operated a slot machine?", the utterance type of this utterance is the question form (Q). An answer sentence 830 associated with this question form (Q) is, for example, "I have operated a slot machine" (affirmative form (A)).

Conversely, when the utterance type is the affirmative form (A), the answer sentence associated with that answer type is written in the question form (Q). An answer sentence written in the question form (Q) is, for example, a question that asks back about the content of the utterance, or a question that draws out a specific matter. For example, when the utterance is "My hobby is playing slot machines", the utterance type of this utterance is the affirmative form (A). An answer sentence associated with this affirmative form (A) is, for example, "Isn't playing pachinko your hobby?" (a question (Q) that draws out a specific matter).

The answer acquisition unit 350 outputs the content of the acquired answer sentence 830 to the management unit 310 as an answer sentence signal, and outputs the emotion flag 850 corresponding to the answer sentence 830 to the emotion state information management unit 600. On receiving the answer sentence signal from the answer acquisition unit 350, the management unit 310 outputs the input answer sentence signal.

If, on the other hand, the user's utterance is an unregistered utterance, the answer acquisition unit 350 requests the unregistered utterance response unit 360 to respond to the unregistered utterance.

[Unregistered utterance response unit]
When the answer acquisition unit 350 requests a reply to an unregistered utterance (unregistered utterance information), the unregistered utterance response unit 360 receives game state information from the game device 2, or emotion state information from the emotion state information management means 7. On the basis of the game state information or the emotion state information, it selects one of a plurality of unregistered utterance answer sentences prepared as replies to unregistered utterances, and passes it to the management unit 310.

The unregistered utterance response unit 360 may select the unregistered utterance answer sentence on the basis of either the game state information or the emotion state information, or both.

The unregistered utterance response unit 360 may also be configured to count the number of times it has received an unregistered utterance (hereinafter the "unregistered utterance count value") and to select the unregistered utterance answer sentence according to this count value.

FIG. 16 is a functional block diagram showing a configuration example of the unregistered utterance response unit 360. In the example shown in this figure, the unregistered utterance response unit 360 has an unregistered utterance answer sentence selection unit 360A, an unregistered utterance answer sentence storage unit 360B connected to the selection unit 360A, and an unregistered utterance counter unit 360C connected to the selection unit 360A.

The unregistered utterance answer sentence selection unit 360A has the function of selecting an unregistered utterance answer sentence on the basis of at least one of the game state information, the emotion state information, and the number of times an unregistered utterance has been received.

The unregistered utterance answer sentence storage unit 360B has the function of storing a plurality of unregistered utterance answer sentences in advance. FIGS. 17(A) to 17(C) each show an unregistered utterance answer sentence table stored in the unregistered utterance answer sentence storage unit 360B. An unregistered utterance answer sentence table stores the conditions for selecting an unregistered utterance answer sentence together with the content of the answer sentence corresponding to each condition.

FIG. 17(A) shows an example of an unregistered utterance correspondence answer sentence table used when the answer sentence is selected based on the game state information. With this table, when the game state information indicates "CPU side initiative" (that is, when it is the game device 2's turn) and the user makes an unregistered utterance, the unregistered utterance correspondence answer sentence "Be quiet for a moment" is selected and output from the conversation control device 1 as the answer sentence. On the other hand, when the game state information indicates "user side initiative" (when it is the user's turn) and the user makes an unregistered utterance, the unregistered utterance correspondence answer sentence "Sorry! I wasn't listening. Anyway, it's your turn..." is selected and output from the conversation control device 1 as the answer sentence. Because this table selects an answer sentence whose content depends on whose turn it is, more natural conversation can be realized.

FIG. 17(B) shows an example of an unregistered utterance correspondence answer sentence table used when the answer sentence is selected based on the emotion state information. With this table, when the emotion state information indicating the emotion of the character indicates "good" (when it is the game device 2's turn) and the user makes an unregistered utterance, the unregistered utterance correspondence answer sentence "Anyway, it's your turn", which shows goodwill toward the user, is selected and output from the conversation control device 1 as the answer sentence. On the other hand, when the emotion state information indicates "disgust" (when it is the user's turn) and the user makes an unregistered utterance, the unregistered utterance correspondence answer sentence "It's your turn, so hurry up!!", which is aggressive toward the user, is selected and output from the conversation control device 1 as the answer sentence. Because this table selects an answer sentence that reflects the emotion state of the character, more natural conversation can be realized.

In the above example, the emotion state information is classified into two types, "good" and "disgust", for selecting the unregistered utterance correspondence answer sentence. The present embodiment is not limited to this: emotion states other than "good" and "disgust" may be used as conditions, and three or more emotion states, rather than two, may be used as the conditions for selecting the unregistered utterance correspondence answer sentence.

FIG. 17(C) shows an example of an unregistered utterance correspondence answer sentence table used when the answer sentence is selected based on the unregistered utterance count value. With this table, when the unregistered utterance count value is equal to or greater than a predetermined number N and the user makes an unregistered utterance, the unregistered utterance correspondence answer sentence "You're so persistent. It's your turn!", which shows irritation toward the user, is selected and output from the conversation control device 1 as the answer sentence. On the other hand, when the unregistered utterance count value is less than the predetermined number N and the user makes an unregistered utterance, the unregistered utterance correspondence answer sentence "Sorry, it's your turn.", which takes a neutral attitude toward the user, is selected and output from the conversation control device 1 as the answer sentence. With this table, once unregistered utterances have been repeated the predetermined number of times or more, an answer sentence is selected as if the character were reacting to the repetition, so more natural conversation can be realized.
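The three tables of FIGS. 17(A) to 17(C) can be sketched as plain lookup structures. The following is only an illustrative sketch: the key names (`cpu_initiative`, `user_initiative`, `good`, `disgust`), the function names, and the value chosen for the threshold N are assumptions made for the example, and the reply strings are English renderings of the example sentences in the text.

```python
# Hypothetical value; the text only says "a predetermined number N".
N_THRESHOLD = 3

# FIG. 17(A): replies keyed by game state information (whose turn it is).
REPLIES_BY_GAME_STATE = {
    "cpu_initiative": "Be quiet for a moment",
    "user_initiative": "Sorry! I wasn't listening. Anyway, it's your turn...",
}

# FIG. 17(B): replies keyed by the character's emotion state.
REPLIES_BY_EMOTION = {
    "good": "Anyway, it's your turn",
    "disgust": "It's your turn, so hurry up!!",
}


def reply_by_count(count: int, n: int = N_THRESHOLD) -> str:
    """FIG. 17(C): irritated reply once the count reaches N, neutral otherwise."""
    if count >= n:
        return "You're so persistent. It's your turn!"
    return "Sorry, it's your turn."


def select_reply(criterion: str, value) -> str:
    """Pick a reply using exactly one of the three criteria named in the text."""
    if criterion == "game_state":
        return REPLIES_BY_GAME_STATE[value]
    if criterion == "emotion":
        return REPLIES_BY_EMOTION[value]
    if criterion == "count":
        return reply_by_count(value)
    raise ValueError(f"unknown criterion: {criterion}")
```

The text allows selection on "at least one" of the three criteria but does not say how several criteria would be combined, so this sketch deliberately handles one criterion per call.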

Returning to FIG. 16, the description of the unregistered utterance response unit 360 continues.
The unregistered utterance counter unit 360C has a function of incrementing and storing the unregistered utterance count value each time the unregistered utterance response unit 360 accepts an unregistered utterance. When the unregistered utterance correspondence answer sentence table shown in FIG. 17(C) is used, the selection unit 360A refers to the content stored in the unregistered utterance counter unit 360C to select the unregistered utterance correspondence answer sentence. The unregistered utterance counter unit 360C may reset the stored unregistered utterance count value when triggered by the elapse of a predetermined period or by the satisfaction of a predetermined condition. An example of the "predetermined period" is 10 minutes, and an example of the predetermined condition is the end of a game (for example, the end of a hanchan, a half-game of mahjong).
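A counter with both reset triggers might look like the sketch below. The class and method names are hypothetical, and the text does not say from which moment the "predetermined period" is measured; this sketch assumes it runs from the last reset. The clock is injectable so the time-based reset can be exercised without waiting.

```python
import time


class UnregisteredUtteranceCounter:
    """Counts unregistered utterances; resets on timeout or on request."""

    def __init__(self, reset_after_seconds: float = 600.0, clock=time.monotonic):
        # 600 s mirrors the "10 minutes" example given in the text.
        self._reset_after = reset_after_seconds
        self._clock = clock
        self._count = 0
        self._window_start = clock()

    def _expire_if_needed(self) -> None:
        # Assumption: the period is measured from the last reset.
        if self._clock() - self._window_start >= self._reset_after:
            self.reset()

    def record(self) -> int:
        """Call once per accepted unregistered utterance; returns the new count."""
        self._expire_if_needed()
        self._count += 1
        return self._count

    @property
    def value(self) -> int:
        self._expire_if_needed()
        return self._count

    def reset(self) -> None:
        """Condition-based trigger, e.g. called when the game ends."""
        self._count = 0
        self._window_start = self._clock()
```

A game loop would call `record()` whenever the fallback path is taken and `reset()` when a game ends, with the time-based reset happening implicitly on the next access.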

[Emotion state information management means]
Returning to FIG. 1, the emotion state information management means 7, a component of the game system GS, will be described.
First, a configuration example of the emotion state information management means 7 will be described with reference to FIG. 18. FIG. 18 is a functional block diagram showing a configuration example of the emotion state information management means 7.

The emotion state information management means 7 has an emotion state information update unit 7A that receives, from the conversation control device 1, an emotion flag 850 that produces a change in the character's emotion and rewrites the emotion state information; an emotion state information storage unit 7B that stores the emotion state information; and an emotion state information notification unit 7C that notifies the conversation control device 1 and the game device 2 of all or part of the emotion state information.

On receiving an emotion flag 850, the emotion state information update unit 7A rewrites the emotion state information so that the value corresponding to that emotion flag 850 (for example, the count value of that emotion flag) changes by a predetermined amount (for example, +1). On receiving game state information (for example, information indicating that the user has won), the emotion state information update unit 7A rewrites the emotion state information with predetermined content (for example, increasing the value corresponding to the "anger" emotion flag by 10, or doubling the current values corresponding to the "joy" and "anger" emotion flags).

FIG. 19 shows a data configuration example of the emotion state information stored in the emotion state information storage unit 7B. The emotion state information 1900 has a cumulative value 1902 for each emotion flag type 1901. The cumulative value 1902 can be increased or decreased separately for each emotion flag. For example, each time the emotion state information management means 7 receives an emotion flag indicating "joy", the cumulative value 1902 for that flag is incremented. How the values are increased or decreased is stored in advance as a program in the emotion state information update unit 7A, and the predetermined increase or decrease processing is executed according to that program.
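The per-flag cumulative values and the two kinds of update can be sketched as follows. This is a minimal sketch under assumptions: the class and method names and the event key `user_win` are hypothetical, and the only game-state rule implemented is the "+10 to anger" example from the text.

```python
from collections import defaultdict


class EmotionStateInfo:
    """Cumulative value per emotion flag type, in the spirit of FIG. 19."""

    def __init__(self):
        self.cumulative = defaultdict(int)

    def apply_flag(self, flag: str, step: int = 1) -> None:
        # Receiving an emotion flag changes its value by a fixed amount.
        self.cumulative[flag] += step

    def apply_game_state(self, game_state: str) -> None:
        # Example rule from the text: a user win raises "anger" by 10.
        if game_state == "user_win":
            self.cumulative["anger"] += 10

    def snapshot(self, flags=None) -> dict:
        """Return all values, or only the requested part of them."""
        names = list(self.cumulative) if flags is None else flags
        return {name: self.cumulative.get(name, 0) for name in names}
```

`snapshot` stands in for the "all or part" notification performed by the notification unit 7C; a fuller version would push the result to the conversation control device and the game device.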

The emotion state information notification unit 7C notifies the conversation control device 1 and the game device 2 of all or part of the emotion state information 1900 stored in the emotion state information storage unit 7B. The conversation control device 1 refers to the received emotion state information 1900, acquires the content of an answer sentence 830 that satisfies the emotion condition parameter 840 stored in the conversation database 500, and outputs it as the answer to the user. The game system GS can therefore output an answer whose content reflects the emotion of the character, which makes natural conversation possible. In addition, on receiving the emotion state information 1900, the game device 2 executes character display processing so that the character's appearance, motion, and the like reflect the emotion state information 1900. The game system GS can thereby display the character in a way that follows changes in the character's emotion.

[Game system operation example]
An operation example of the game system GS according to the present embodiment, configured as described above, will be described with reference to FIG. 20. FIG. 20 is a flowchart showing the main processing relating to conversation control in the operation example of the game system GS according to the present embodiment. Processing relating to game progress is not shown in this flowchart.

First, the conversation control device 1 performs user utterance acceptance processing, in which the voice recognition unit 200 accepts an utterance from the user and converts it into data that the conversation control unit 300 can process (step S2001).

Next, the conversation control device 1 performs conversation control processing in order to output an answer sentence corresponding to the utterance from the user (step S2002). The conversation control processing (step S2002) can be carried out by the following procedure. FIG. 21 is a flowchart showing the procedure of the conversation control method according to the present embodiment.
First, the voice input means 3 performs a step of acquiring the utterance content from the user (step S2101). Specifically, the voice input means 3 acquires the voice that constitutes the user's utterance content and outputs the acquired voice to the voice recognition unit 200 as a voice signal.

Next, the voice recognition unit 200 performs a step of specifying the character string corresponding to the utterance content acquired by the voice input means 3 (step S2102). Specifically, on receiving the voice signal from the voice input means 3, the voice recognition unit 200 specifies the word hypotheses (candidates) corresponding to that voice signal. The voice recognition unit 200 then acquires the character string associated with the specified word hypotheses (candidates) and outputs the acquired character string to the conversation control unit 300 as a character string signal.

Then, the character string specifying unit 410 performs a step of dividing the series of character strings specified by the voice recognition unit 200 into individual sentences (step S2103). Specifically, on receiving a character string signal (or morpheme signal) from the management unit 310, the character string specifying unit 410 divides the input series of character strings wherever it contains a time interval of a certain length or more. The character string specifying unit 410 outputs each of the divided character strings to the morpheme extraction unit 420 and the input type determination unit 440. When the input character string was entered from a keyboard, the character string specifying unit 410 preferably divides it at punctuation marks, spaces, or the like.
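The two splitting rules in step S2103 can be sketched as below. The function names, the `(text, start, end)` tuple layout for recognized chunks, and the exact set of delimiter characters are assumptions for illustration; the text only specifies "a time interval of a certain length or more" for speech and "punctuation marks, spaces, or the like" for keyboard input.

```python
import re


def split_by_pause(timed_chunks, min_gap: float):
    """Split recognized speech where the silence between chunks is >= min_gap.

    timed_chunks: ordered list of (text, start_time, end_time) tuples.
    """
    sentences, current, prev_end = [], [], None
    for text, start, end in timed_chunks:
        if current and prev_end is not None and start - prev_end >= min_gap:
            sentences.append(" ".join(current))
            current = []
        current.append(text)
        prev_end = end
    if current:
        sentences.append(" ".join(current))
    return sentences


def split_typed(text: str):
    """Split keyboard input at punctuation marks or whitespace."""
    parts = re.split(r"[。．.!?！？、,\s]+", text)
    return [part for part in parts if part]
```

Splitting at every space suits unsegmented Japanese input, which is the case the text has in mind; for space-delimited languages one would presumably split at sentence-ending punctuation only.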

After that, the morpheme extraction unit 420 performs a step of extracting, from the character string specified by the character string specifying unit 410, each morpheme constituting a minimum unit of the character string as first morpheme information (step S2104). Specifically, on receiving the character string from the character string specifying unit 410, the morpheme extraction unit 420 collates the input character string with the morpheme group stored in advance in the morpheme database 430. In the present embodiment, the morpheme group is prepared as a morpheme dictionary that describes, for each morpheme belonging to each part-of-speech class, its headword, reading, part of speech, conjugated form, and the like.
Having performed this collation, the morpheme extraction unit 420 extracts from the input character string each morpheme (m1, m2, ...) that matches a morpheme included in the morpheme group stored in advance. The morpheme extraction unit 420 outputs the extracted morphemes to the topic specifying information search unit 320 as the first morpheme information.
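Dictionary collation of this kind can be illustrated with a greedy longest-match scan. Note that the text does not state which matching strategy the morpheme extraction unit 420 uses; longest-match is a simple stand-in chosen for this sketch, and the function name and the toy dictionary are assumptions.

```python
def extract_morphemes(text: str, dictionary: set) -> list:
    """Return dictionary entries found in text, scanning left to right.

    At each position the longest dictionary entry that fits is taken;
    characters that start no entry are skipped.
    """
    max_len = max((len(entry) for entry in dictionary), default=0)
    found, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                found.append(candidate)
                i += length
                break
        else:
            i += 1  # no entry starts here; move on one character
    return found
```

A real morpheme dictionary would also carry the reading, part of speech, and conjugated form mentioned above; here the dictionary is reduced to a set of headwords.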

Next, the input type determination unit 440 performs a step of determining the "utterance sentence type" based on the morphemes constituting the sentence specified by the character string specifying unit 410 (step S2105). Specifically, on receiving the character string from the character string specifying unit 410, the input type determination unit 440 collates that character string with each dictionary stored in the utterance type database 450 and extracts from the character string the elements related to each dictionary. Based on the extracted elements, the input type determination unit 440 determines which "utterance sentence type" they belong to, and outputs the determined "utterance sentence type" (utterance type) to the answer acquisition unit 350.

Then, the topic specifying information search unit 320 performs a step of comparing the first morpheme information extracted by the morpheme extraction unit 420 with the focused topic title 820focus (step S2106).
When the morphemes constituting the first morpheme information match the focused topic title 820focus, the topic specifying information search unit 320 outputs that topic title 820 to the answer acquisition unit 350. On the other hand, when the morphemes constituting the first morpheme information do not match the topic title 820, the topic specifying information search unit 320 outputs the input first morpheme information and the user input sentence topic specifying information 810 to the abbreviated sentence complementing unit 330 as a search command signal.

After that, the abbreviated sentence complementing unit 330 performs a step of adding the focused topic specifying information and the answer sentence topic specifying information to the first morpheme information input from the topic specifying information search unit 320 (step S2107). Specifically, let the first morpheme information be "W" and the set of the focused topic specifying information and the answer sentence topic specifying information be "D". The abbreviated sentence complementing unit 330 adds the elements of "D" to the first morpheme information "W" to generate complemented first morpheme information, collates this complemented first morpheme information with all the topic titles 820 associated with the set "D", and searches for a topic title 820 that matches the complemented first morpheme information. When such a topic title 820 exists, the abbreviated sentence complementing unit 330 outputs it to the answer acquisition unit 350. When no topic title 820 matching the complemented first morpheme information is found, the abbreviated sentence complementing unit 330 passes the first morpheme information and the user input sentence topic specifying information to the topic search unit 340.
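The complementing step can be sketched with sets. This is a sketch under stated assumptions: topic titles are modeled as tuples of morphemes, each element of D is added to W one at a time, and "match" is taken to mean that the complemented set equals the title's morphemes. The text does not define the matching rule at this level of detail, and the function and parameter names are hypothetical.

```python
def complement_and_match(w, d, titles_by_topic):
    """Try to recover an omitted topic word and find a matching topic title.

    w: morphemes extracted from the user's sentence (the set W).
    d: ordered topic specifying information currently in focus (the set D).
    titles_by_topic: topic specifying information -> list of topic titles,
        each title being a tuple of morphemes.
    Returns the first matching title, or None if nothing matches.
    """
    for topic in d:
        complemented = frozenset(w) | {topic}
        for title in titles_by_topic.get(topic, ()):
            if frozenset(title) == complemented:
                return title
    return None
```

For example, if the conversation is about "horse" and the user only says "like", adding "horse" to W yields a set that matches the title ("horse", "like"). Returning `None` corresponds to handing the search over to the topic search unit 340.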

Next, the topic search unit 340 performs a step of collating the first morpheme information with the user input sentence topic specifying information and searching the topic titles 820 for a topic title 820 suited to the first morpheme information (step S2108). Specifically, on receiving the search command signal from the abbreviated sentence complementing unit 330, the topic search unit 340 uses the user input sentence topic specifying information and the first morpheme information contained in that signal to search, among the topic titles 820 associated with the user input sentence topic specifying information, for a topic title 820 suited to the first morpheme information. The topic search unit 340 outputs the topic title 820 obtained by the search to the answer acquisition unit 350 as a search result signal.

Next, based on the topic title 820 found by the topic specifying information search unit 320, the abbreviated sentence complementing unit 330, or the topic search unit 340, the answer acquisition unit 350 collates the determined utterance type of the user with each answer type associated with that topic title 820. Having performed this collation, the answer acquisition unit 350 searches the answer types for the one that matches the determined utterance type (step S2109).

Specifically, on receiving the search result signal from the topic search unit 340 and the "utterance sentence type" from the input type determination unit 440, the answer acquisition unit 350 uses the "topic title" corresponding to the search result signal and the input "utterance sentence type" to specify, from the group of answer types associated with that "topic title", the answer type that matches the "utterance sentence type" (DA, for example).

The answer acquisition unit 350 acquires from the conversation database 500 the answer sentence 830 associated with the specified answer type, and outputs the acquired answer sentence 830 to the output unit 600 via the management unit 310. On receiving the answer sentence from the management unit 310, the output unit 600 outputs the input answer sentence 830 (step S2110).
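Steps S2109 and S2110 amount to a two-level lookup: topic title first, then answer type. A minimal sketch follows, assuming the conversation database can be modeled as a nested dictionary; the function name, the sample title, and the sample answer are made up for illustration, and only the type label `DA` comes from the text.

```python
def get_answer(conversation_db: dict, topic_title, utterance_type: str):
    """Return the answer sentence whose answer type equals the utterance type.

    conversation_db: topic title -> {answer type -> answer sentence}.
    Returns None when the title or the type has no entry.
    """
    answers_by_type = conversation_db.get(topic_title, {})
    return answers_by_type.get(utterance_type)


# Hypothetical sample data.
SAMPLE_DB = {
    ("horse", "like"): {"DA": "I like horses too."},
}
```

With this data, `get_answer(SAMPLE_DB, ("horse", "like"), "DA")` returns the stored sentence, while an unknown title or type yields `None`.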

Returning to FIG. 20, the description of the operation example of the game system GS continues. If, in the conversation control processing (step S2002), topic specifying information corresponding to the user's utterance is not found in the conversation database 500 (step S2003, YES), the unregistered utterance response unit 360 of the conversation control unit 300 performs conversation control by selecting and outputting an unregistered utterance correspondence answer sentence according to the game state, the emotion state information, the unregistered utterance count value, and the like (step S2004).

On the other hand, if topic specifying information corresponding to the user's utterance is found in the conversation database 500 during the conversation control processing (step S2002) (step S2003, NO), no special processing is performed, because an answer sentence has already been selected and output in step S2002.
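The branch at steps S2003 and S2004 can be sketched as a simple fallback. The function and parameter names are hypothetical, and the two callables stand in for the conversation control processing and the unregistered utterance response unit 360 respectively.

```python
def respond(utterance: str, find_answer, unregistered_reply) -> str:
    """Answer from the conversation database, or fall back when unregistered.

    find_answer: callable returning an answer sentence, or None when no
        topic specifying information matches the utterance (S2003, YES).
    unregistered_reply: callable producing the fallback reply (S2004).
    """
    answer = find_answer(utterance)  # conversation control processing, S2002
    if answer is None:
        return unregistered_reply()
    return answer
```

In a fuller version, `unregistered_reply` would consult the game state, the emotion state information, or the unregistered utterance count value before choosing its sentence.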

Next, the game system GS determines whether a game end condition is satisfied (step S2005). When the game end condition is satisfied (for example, when the play time has ended), the game system GS ends the game. When the game end condition is not satisfied, the process returns to step S2001 and waits for the next user utterance.
This completes the main processing of the game system (excluding processing relating to game control).
[Other matters, modifications, etc.]
(1) The configurations and operations of the voice recognition unit 200, the conversation control unit 300, and the sentence analysis unit 400 of the conversation control device 1 are not limited to those described in the above embodiment. Any voice recognition unit 200, conversation control unit 300, and sentence analysis unit 400 other than those described in the embodiment can be used as components of the conversation control device 1 according to the present invention, provided that they return an answer corresponding to the user's utterance content by using the conversation database means.
(2) In the above embodiment, utterances are input to the conversation control device 1 by voice, but input is not limited to voice. The present invention also holds for a configuration in which an utterance is input to the conversation control device 1 as character string data by the non-voice input means 4, such as a keyboard, touch panel, or pointing device, and the conversation control device 1 uses the conversation database 500 to output an answer sentence to the utterance input as character string data.

FIG. 1 … Block diagram showing a configuration example of the game system
FIG. 2 … Functional block diagram showing a configuration example of the game device
FIG. 3 … Functional block diagram showing a configuration example of the conversation control device
FIG. 4 … Functional block diagram showing a configuration example of the voice recognition unit
FIG. 5 … Timing chart showing an example of the processing of the word hypothesis narrowing unit
FIG. 6 … Flowchart showing an operation example of the voice recognition unit
FIG. 7 … Partially enlarged block diagram of the conversation control device
FIG. 8 … Diagram showing the relationship between a character string and the morphemes extracted from it
FIG. 9 … Diagram showing the utterance sentence types, the two-letter alphabetic codes that represent them, and examples of utterance sentences of each type
FIG. 10 … Diagram showing the relationship between sentence types and the dictionaries used to determine them
FIG. 11 … Conceptual diagram showing an example of the data structure of the data stored in the conversation database
FIG. 12 … Diagram showing the association between one piece of topic specifying information and other topic specifying information
FIG. 13 … Diagram showing a data configuration example of a topic title
FIG. 14 … Diagram explaining the types of answer sentences
FIG. 15 … Diagram showing specific examples of the topic titles, answer sentences, emotion condition parameters, and emotion flags associated with a piece of topic specifying information
FIG. 16 … Functional block diagram showing a configuration example of the unregistered utterance response unit
FIG. 17 … Diagram showing a data configuration example of the unregistered utterance correspondence answer sentence tables stored in the unregistered utterance correspondence answer sentence storage unit
FIG. 18 … Functional block diagram showing a configuration example of the emotion state information management means
FIG. 19 … Diagram showing a data configuration example of the emotion state information stored in the emotion state information storage unit
FIG. 20 … Flowchart showing the main operations of the game system
FIG. 21 … Flowchart showing an example of the conversation control processing

Explanation of symbols

GS … Game system
1 … Conversation control device
2 … Game device
3 … Voice input means
4 … Non-voice input means
5 … Image output means
6 … Voice output means
7 … Emotion state information management means
200 … Voice recognition unit
300 … Conversation control unit
360 … Unregistered utterance response unit
400 … Sentence analysis unit
500 … Conversation database
600 … Speech recognition dictionary storage unit

Claims

A game machine having an input means for receiving an utterance from a user, a conversation processing means for outputting an answer sentence in response to the utterance from the user, and a game control means for controlling a character and executing a game,
wherein the conversation processing means comprises:
a recognition unit that receives the utterance accepted by the input means and recognizes it as utterance information from the user;
means for storing a discourse history, which is information for identifying the topic and subject of the conversation between the user and the conversation processing means;
conversation database means for storing a plurality of pieces of topic specifying information, each determined by the discourse history and associated with a topic title, which is composed of morphemes each consisting of one character, a plurality of character strings, or a combination thereof, and with an answer sentence to the user;
a morpheme extraction unit that extracts, from the utterance information from the user, morphemes each consisting of one character, a plurality of character strings, or a combination thereof;
means for selecting the answer sentence to the user by collating the morpheme information extracted by the morpheme extraction unit with the topic specifying information stored in the conversation database means and selecting, from among the pieces of topic specifying information, the topic specifying information that matches a morpheme constituting the morpheme information; and
an unregistered utterance response unit that receives unregistered utterance information indicating that, because the topic specifying information corresponding to the user's utterance is not registered in the conversation database means, no topic specifying information matching a morpheme constituting the morpheme information could be found among the pieces of topic specifying information,
and wherein the unregistered utterance response unit comprises:
an unregistered utterance correspondence answer sentence storage unit for storing a plurality of unregistered utterance correspondence answer sentences; and
an unregistered utterance correspondence answer sentence selection unit that, upon receiving the unregistered utterance information, selects a predetermined unregistered utterance correspondence answer sentence from among the unregistered utterance correspondence answer sentences stored in the unregistered utterance correspondence answer sentence storage unit, based on game state information regarding the progress state of the game output from the game control means.
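
As an illustration only, and not part of the claim language, the selection flow recited above might be sketched as follows. The table contents, the game state names (`player_winning`, `player_losing`), the whitespace tokenizer standing in for the morpheme extraction unit, and all function names are assumptions made for this example; the patent text does not specify them.

```python
# Hypothetical conversation database: topic specifying information -> answer sentence.
CONVERSATION_DB = {
    "horses": "I like horses too.",
    "weather": "It looks sunny today.",
}

# Hypothetical unregistered-utterance answer table, keyed by game state.
UNREGISTERED_ANSWERS = {
    "player_winning": "You're on a roll. Shall we keep playing?",
    "player_losing": "Don't give up, the next round could be yours.",
}
DEFAULT_ANSWER = "Let's get back to the game."


def extract_morphemes(utterance):
    """Stand-in for the morpheme extraction unit: lowercase word tokens."""
    tokens = (tok.strip(".,!?").lower() for tok in utterance.split())
    return [tok for tok in tokens if tok]


def select_answer(utterance, game_state):
    """Return the registered answer if a topic matches, else a game-state fallback."""
    for morpheme in extract_morphemes(utterance):
        if morpheme in CONVERSATION_DB:
            return CONVERSATION_DB[morpheme]
    # No topic specifying information matched: the utterance is unregistered,
    # so the answer is chosen from the fallback table by game state.
    return UNREGISTERED_ANSWERS.get(game_state, DEFAULT_ANSWER)
```

In this sketch the game state is passed in as a plain string; in the claimed arrangement it would be supplied by the game control means.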

The game machine according to claim 1, wherein an emotion flag constituting emotion state information indicating the emotion of the character is associated with the topic title, and the game machine further comprises emotion state information management means that receives the emotion flag associated with the selected answer sentence and updates the emotion state information based on the emotion flag,
and wherein, upon receiving the unregistered utterance information, the unregistered utterance correspondence answer sentence selection unit selects a predetermined unregistered utterance correspondence answer sentence from among the unregistered utterance correspondence answer sentences stored in the unregistered utterance correspondence answer sentence storage unit, based on the emotion state information output from the emotion state information management means.
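
For illustration only, one way emotion flags could accumulate into emotion state information and then steer the fallback answer is sketched below. Counting flags and taking the most frequent one is an assumption of this example, as are the flag names and answer texts; the claim does not say how the emotion state is derived from the flags.

```python
from collections import Counter

# Hypothetical fallback answers keyed by the character's current emotion.
ANSWERS_BY_EMOTION = {
    "happy": "I'm not sure what you mean, but I'm having fun!",
    "angry": "I don't get it. Can we just play?",
}
NEUTRAL_ANSWER = "Sorry, I didn't understand that."


class EmotionStateManager:
    """Stand-in for the emotion state information management means."""

    def __init__(self):
        self._flags = Counter()

    def update(self, emotion_flag):
        # Called with the emotion flag attached to each selected answer sentence.
        self._flags[emotion_flag] += 1

    def current_emotion(self):
        # Assumed rule: the most frequently received flag is the current emotion.
        if not self._flags:
            return "neutral"
        return self._flags.most_common(1)[0][0]


def select_unregistered_answer(manager):
    """Pick the fallback answer that corresponds to the current emotion state."""
    return ANSWERS_BY_EMOTION.get(manager.current_emotion(), NEUTRAL_ANSWER)
```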

The game machine according to claim 1 or 2, wherein the unregistered utterance response unit further comprises an unregistered utterance counter unit that increments and stores an unregistered utterance count value each time the unregistered utterance information is received,
and wherein, upon receiving the unregistered utterance information, the unregistered utterance correspondence answer sentence selection unit selects a predetermined unregistered utterance correspondence answer sentence from among the unregistered utterance correspondence answer sentences stored in the unregistered utterance correspondence answer sentence storage unit, based on the unregistered utterance count value output from the unregistered utterance counter unit.
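
A minimal sketch of count-based selection, given for illustration only, follows. The thresholds (1, 3, 5), the answer texts, and the rule of using the highest threshold reached are assumptions of this example rather than details taken from the claim.

```python
class UnregisteredUtteranceCounter:
    """Stand-in for the unregistered utterance counter unit."""

    def __init__(self):
        self.count = 0

    def increment(self):
        # Called each time unregistered utterance information is received.
        self.count += 1
        return self.count


# Hypothetical table: (minimum count, answer), in ascending order of count.
ANSWERS_BY_COUNT = [
    (1, "Sorry, I didn't quite catch that."),
    (3, "I still don't follow. Could you put it another way?"),
    (5, "Maybe we should just get on with the game."),
]


def select_unregistered_answer(count):
    """Return the answer for the highest threshold that the count has reached."""
    selected = ANSWERS_BY_COUNT[0][1]
    for threshold, answer in ANSWERS_BY_COUNT:
        if count >= threshold:
            selected = answer
    return selected
```

Varying the answer with the count is one way to avoid repeating the same fixed line when several unregistered utterances arrive in a row.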

The game machine according to any one of claims 1 to 3, further comprising:
a character string specifying unit that recognizes the utterance information as a series of character strings and separates the character strings sentence by sentence; and
an input type determining unit that determines the type of utterance content based on the character string specified by the character string specifying unit, and outputs information indicating the utterance type to the means for selecting the answer sentence according to the determination result.

The game machine according to any one of claims 1 to 4, further comprising an abbreviated sentence complementing unit that generates a plurality of types of complemented morpheme information by complementing the morpheme information using the previous topic specifying information and the topic specifying information included in the previous answer sentence.
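
For illustration only, the complementing of an abbreviated utterance might look like the sketch below. Representing morpheme information as a list of strings, and prepending each candidate topic to produce one complemented variant per topic, are assumptions of this example; the function and parameter names are hypothetical.

```python
def complement_morphemes(morphemes, previous_topic, previous_answer_topics):
    """Generate complemented morpheme lists for an utterance that omits its topic.

    morphemes: morphemes extracted from the current (abbreviated) utterance.
    previous_topic: topic specifying information from the previous exchange.
    previous_answer_topics: topic specifying information found in the previous answer.
    """
    candidates = []
    for topic in [previous_topic, *previous_answer_topics]:
        # Skip empty topics and topics the utterance already mentions.
        if not topic or topic in morphemes:
            continue
        complemented = [topic, *morphemes]
        if complemented not in candidates:
            candidates.append(complemented)
    return candidates
```

For example, if the previous topic was "horses" and the user now says only "like", one complemented variant is `["horses", "like"]`, which can then be collated against the conversation database in place of the bare utterance.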

A method of executing a game on a game machine comprising an input means for receiving an utterance from a user, a conversation processing means for outputting an answer sentence in response to the utterance from the user, and a game control means for controlling a character and executing a game, wherein the conversation processing means has means for storing a discourse history, which is information for specifying the topic or subject of the conversation between the user and the conversation processing means, and conversation database means for storing a plurality of pieces of topic specifying information, each determined by the discourse history and associated with a topic title, which is composed of morphemes each consisting of one character, a plurality of character strings, or a combination thereof, and with an answer sentence to the user, the method comprising:
a step in which the recognition unit of the conversation processing means receives the utterance accepted by the input means and recognizes it as utterance information from the user;
a step in which the morpheme extraction unit of the conversation processing means extracts, from the utterance information from the user, morphemes each consisting of one character, a plurality of character strings, or a combination thereof;
a step in which the topic specifying information search unit of the conversation processing means collates the morpheme information extracted by the morpheme extraction unit with the topic specifying information stored in the conversation database means and searches the pieces of topic specifying information for topic specifying information that matches a morpheme constituting the morpheme information;
a step in which, when the topic specifying information corresponding to the user's utterance is registered in the conversation database means, the answer acquisition unit of the conversation processing means acquires the answer sentence associated with the topic specifying information found by the topic specifying information search unit; and
a step in which, when the topic specifying information corresponding to the user's utterance is not registered in the conversation database means, the unregistered utterance response unit of the conversation processing means receives, from the answer acquisition unit, unregistered utterance information indicating that the topic specifying information search unit could not find, among the pieces of topic specifying information, topic specifying information that matches a morpheme constituting the morpheme information,
wherein, upon receiving the unregistered utterance information, the unregistered utterance response unit selects and outputs a predetermined unregistered utterance correspondence answer sentence from among the unregistered utterance correspondence answer sentences stored in the unregistered utterance correspondence answer sentence storage unit, based on game state information regarding the progress of the game output from the game control means.

The method according to claim 6, wherein an emotion flag constituting emotion state information indicating the emotion of the character is associated with the topic title, and the game machine further comprises emotion state information management means that receives the emotion flag associated with the answer sentence acquired by the answer acquisition unit and updates the emotion state information based on the emotion flag,
and wherein, upon receiving the unregistered utterance information, the unregistered utterance correspondence answer sentence selection unit selects a predetermined unregistered utterance correspondence answer sentence from among the unregistered utterance correspondence answer sentences stored in the unregistered utterance correspondence answer sentence storage unit, based on the emotion state information output from the emotion state information management means.

The method according to claim 6 or 7, wherein the unregistered utterance response unit further comprises an unregistered utterance counter unit that increments and stores an unregistered utterance count value each time the unregistered utterance information is received,
and wherein, upon receiving the unregistered utterance information, the unregistered utterance correspondence answer sentence selection unit selects a predetermined unregistered utterance correspondence answer sentence from among the unregistered utterance correspondence answer sentences stored in the unregistered utterance correspondence answer sentence storage unit, based on the unregistered utterance count value output from the unregistered utterance counter unit.

The method according to claim 6, further comprising:
a step in which, after the step in which the morpheme extraction unit of the conversation processing means extracts the morphemes from the utterance information from the user, the character string specifying unit of the conversation processing means recognizes the utterance information as a series of character strings and separates each sentence into phrases; and
a step in which the input type determination unit of the conversation processing means determines the type of utterance content based on the character string specified by the character string specifying unit, and outputs information indicating the utterance type to the answer acquisition unit according to the determination result.

The method according to any one of claims 6 to 9, further comprising a step in which, after the step in which the topic specifying information search unit of the conversation processing means searches for the topic specifying information, the abbreviated sentence complementing unit of the conversation processing means generates a plurality of types of complemented morpheme information by complementing the morpheme information using the previous topic specifying information and the topic specifying information included in the previous answer sentence.

A program for causing a computer to function as the game machine according to any one of claims 1 to 5.

A program for causing a computer to execute the method according to any one of claims 6 to 10.