JP3194719B2

JP3194719B2 - Dialogue system

Info

Publication number: JP3194719B2
Application number: JP20310998A
Authority: JP
Inventors: 和子高橋
Original assignee: 株式会社エイ・ティ・アール音声翻訳通信研究所
Priority date: 1998-07-17
Filing date: 1998-07-17
Publication date: 2001-08-06
Anticipated expiration: 2018-07-17
Also published as: JP2000035798A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a dialogue system that conducts a dialogue with a human.

[0002]

[Prior Art] Various models of dialogue understanding have been proposed, and most of them target goal-oriented dialogue rather than casual conversation. Many of these models treat a dialogue as a plan and associate each utterance with an action for achieving a goal.

[0003] For example, the prior art document, Koichi Yamada et al., "User Utterance Model and Generation of Cooperative Responses in Question-Answering Systems", Transactions of the Information Processing Society of Japan, Vol. 35, No. 11, pp. 2265-2275, November 1994, enables a variety of cooperative responses by inferring the intention behind a user's question and generating a response according to that intention. The document introduces a user utterance model that represents the relation between a user's utterance and its intention, classifies the cooperative responses commonly seen in everyday conversation, and discloses, for each type of response, the relation between the user's intention and the response. It also discloses a method of inferring the intention from the user's utterance on the basis of the user utterance model. This method uses domain-independent intention inference rules together with knowledge about the objects under discussion, and uses the inferred intention to generate one of the cooperative responses classified earlier.

[0004]

[Problems to Be Solved by the Invention] However, in the question-answering system of the above prior art document, the model differs depending on whether a participant plays the user role or the system role, so the range of dialogues to which it can be applied is narrow. A so-called slot-filling method, in which the system prepares slots in advance and fills them from the utterances, is also conceivable. However, the prepared slots are limited, anything outside them cannot be handled, and the method therefore lacks flexibility.

[0005] An object of the present invention is to solve the above problems and to provide a dialogue system that can be applied to a wider range of dialogues than the prior art and that is also flexible.

[0006]

[Means for Solving the Problems] A dialogue system according to the present invention performs speech recognition on the speech of an uttered sentence to obtain a character string, generates, in response to the recognized character string, a character string giving the content of a response utterance in a dialogue for information collection, and then synthesizes and outputs speech for that character string. The dialogue system comprises:
an internal state description storage device that stores an internal state indicating the situation of the dialogue, which changes as the dialogue progresses, expressed using (a) a first modal operator K(α, φ) indicating that α knows φ, (b) a second modal operator B(α, φ) indicating that α believes φ, and (c) a third modal operator N(α, φ) indicating that α needs to know φ;
a knowledge description storage device that stores (A) general axioms expressing properties of standard propositional modal logic using the modal operators, (B) an inference rule expressing the rule of necessitation, and (C) task-dependent knowledge comprising background knowledge, which states that if a conjunction of several facts holds then a certain fact holds, and utterance generation rules indicating constraints on the order of utterances;
an utterance generation rule storage device that stores the utterance generation rules indicating constraints on the order of utterances;
speech recognition means that performs speech recognition on the speech of an uttered sentence and outputs a character string;
preprocessing means that, with reference to a conversion pattern model for converting a character string into an intermediate language representing the content of an information request or information provision, converts the character string recognized by the speech recognition means into that intermediate language and outputs it;
state transition processing means that, from the intermediate language output from the preprocessing means and with reference to the internal state in the internal state description storage device, makes the internal state transition, updates it, and outputs the internal state after the transition;
inference processing means that, from the internal state output from the state transition processing means and with reference to the knowledge in the knowledge description storage device and the utterance generation rules in the utterance generation rule storage device, infers the content that should be uttered in response in the dialogue in the current state, and generates and outputs its intermediate language;
data output processing means that, with reference to the internal state in the internal state description storage device, converts the intermediate language of the content to be uttered in response, output from the inference processing means, into output data in the intermediate language of an information request or information provision and outputs it;
postprocessing means that, with reference to an inverse conversion pattern model for converting the intermediate language representing the content of an information request or information provision back into a character string, converts the output data output from the data output processing means back into the character string corresponding to the intermediate language and outputs it; and
speech synthesis means that synthesizes speech from the character string output from the postprocessing means and outputs the corresponding speech.

[0007]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[0008] FIG. 1 is a block diagram showing the configuration of a dialogue system according to an embodiment of the present invention. The dialogue system of this embodiment performs speech recognition on the speech of an uttered sentence to obtain a character string, generates, in response to the recognized character string, a character string giving the content of a response utterance in a dialogue for information collection, and then synthesizes and outputs speech for that character string. It comprises:
(a) an internal state description memory 31 that stores an internal state indicating the situation of the dialogue, which changes as the dialogue progresses;
(b) a knowledge description memory 33 that stores knowledge including general axioms, inference rules, and task-dependent knowledge;
(c) an utterance generation rule memory 32 that stores utterance generation rules indicating constraints on the order of utterances;
(d) a speech recognition unit 3 that performs speech recognition on the digital speech signal of an uttered sentence and outputs a character string;
(e) a preprocessing unit 4 that, with reference to a predetermined conversion pattern model in a pattern model memory 11, converts the character string recognized by the speech recognition unit 3 into an intermediate language representing the content of an information request or information provision and outputs it;
(f) a state transition processing unit 21 that, on the basis of the intermediate language of the information request or information provision output from the preprocessing unit 4 and with reference to the internal state in the internal state description memory 31, makes the internal state transition, updates it, and outputs the internal state after the transition;
(g) an inference processing unit 22 that, on the basis of the internal state output from the state transition processing unit 21 and with reference to the knowledge in the knowledge description memory 33 and the utterance generation rules in the utterance generation rule memory 32, infers the content that should be uttered in response in the dialogue in view of the current state, and generates and outputs its intermediate language;
(h) a data output processing unit 23 that, on the basis of the intermediate language of the content to be uttered in response output from the inference processing unit 22 and with reference to the internal state in the internal state description memory 31, converts it into output data in the intermediate language of an information request or information provision and outputs it;
(i) a postprocessing unit 6 that, on the basis of the output data output from the data output processing unit 23 and with reference to a predetermined inverse conversion pattern model in a pattern model memory 12, converts the output data back into the character string corresponding to the intermediate language and outputs it; and
(j) a speech synthesis unit 7 that synthesizes speech from the character string output from the postprocessing unit 6 and outputs the corresponding speech.
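The data flow among units (d) to (j) can be pictured as a chain of function calls. The following Python sketch is illustrative only: `run_turn`, its parameter names, and the toy `stage` helper are hypothetical, and every unit is passed in as a callable because the text does not specify how any unit is implemented.

```python
def run_turn(audio, recognize, preprocess, transition, infer,
             to_output, postprocess, synthesize):
    """Pass one user utterance through the units of FIG. 1, in order."""
    text = recognize(audio)          # speech recognition unit 3
    acts = preprocess(text)          # preprocessing unit 4 (pattern model memory 11)
    state = transition(acts)         # state transition processing unit 21
    reply = infer(state)             # inference processing unit 22
    out = to_output(reply)           # data output processing unit 23
    reply_text = postprocess(out)    # postprocessing unit 6 (pattern model memory 12)
    return synthesize(reply_text)    # speech synthesis unit 7


# Toy stand-ins that only record the order in which the stages run.
trace = []

def stage(name):
    def run(value):
        trace.append(name)
        return value
    return run

names = ["recognize", "preprocess", "transition", "infer",
         "to_output", "postprocess", "synthesize"]
result = run_turn("audio", *[stage(n) for n in names])
```

In a real system the state transition and data output stages would both read the internal state description memory 31; here that shared state is left to whatever closures the caller supplies.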

[0009] In this embodiment, a dialogue model focusing on the exchange of information is disclosed for dialogues whose purpose is information collection, and the dialogue system is built by focusing on what information is conveyed by the dialogue, what influences an utterance, and how an utterance arises. To explain how utterances arise, a modal operator meaning "must know" is introduced and used to write the utterance generation rules. An utterance arises either as information provided in response to the other party's request or through an utterance generation rule on the basis of the information obtained. This framework gives a uniform account of utterances by both the information-requesting side and the information-providing side.

[0010] In this embodiment, the following three modal operators are introduced to represent the internal state.
(a) K(α, φ): α knows φ
(b) B(α, φ): α believes φ
(c) N(α, φ): α needs to know φ
K and B are operators commonly used to represent mental states and are known to have the modal logic axiom systems S4 and KD45, respectively. In addition to these, this embodiment introduces the operator N. N describes what causes an utterance; its axiom system corresponds to KD, as detailed below. In this embodiment, a formula that contains no modal operator is called a fact, to distinguish it from formulas that contain modal operators and represent mental states.
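One possible way to hold such formulas in a program is as nested tuples, with facts as plain strings. This is a minimal sketch under that assumption; the helper names `K`, `B`, `N`, `is_fact`, and `show` are chosen here for illustration and do not come from the patent.

```python
def K(agent, phi):
    """K(agent, phi): agent knows phi."""
    return ("K", agent, phi)

def B(agent, phi):
    """B(agent, phi): agent believes phi."""
    return ("B", agent, phi)

def N(agent, phi):
    """N(agent, phi): agent needs to know phi."""
    return ("N", agent, phi)

def is_fact(phi):
    """A fact is a formula without any modal operator (here: a plain string)."""
    return isinstance(phi, str)

def show(phi):
    """Render a formula in the notation used in the text."""
    if is_fact(phi):
        return phi
    op, agent, body = phi
    return f"{op}({agent}, {show(body)})"

# "c believes that c needs to know the dates"
belief = B("c", N("c", "dates"))
print(show(belief))
```

Because tuples are hashable, formulas built this way can be stored directly in a Python `set`, which is convenient for representing an internal state as a set of knowledge and beliefs.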

[0011] The properties of standard propositional modal logic include the following, and the system differs depending on which of them are taken as axioms. Here O denotes a modal operator.

[Equation 1] Kc: O(α, φ→ψ) → (O(α, φ) → O(α, ψ))

[Equation 2] Tc: O(α, φ) → φ

[Equation 3] Dc: O(α, φ) → ¬O(α, ¬φ)

[Equation 4] 4: O(α, φ) → O(α, O(α, φ))

[Equation 5] 5: ¬O(α, ¬φ) → O(α, ¬O(α, ¬φ))

[0012] Here ¬ denotes negation, and likewise below. Property Kc states that O is closed under implication (→). Property Tc expresses reflexivity, property Dc expresses seriality, property 4 expresses transitivity, and property 5 expresses the Euclidean property. The system having properties Kc, Tc, and 4 as axioms is called S4; the system having Kc, Dc, 4, and 5 is called KD45 or weak-S5; and the system having Kc and Dc is called KD. In addition, the inference rule can be expressed as follows.

[0013]

[Equation 6] NEC: If ⊨ φ then O(α, φ)

[0014] The inference rule (NEC) expresses the rule of necessitation and is common to all of these systems.

[0015] Next, the description language of the dialogue model is explained. Utterances are described as one of two types, request and inform. It is first assumed that every utterance is conveyed correctly to the other party (at least at the literal level). In what follows, the dialogue is assumed to take place between two dialogue participants, α and β. In the dialogue system of this embodiment, however, one of the participants is a machine such as a digital computer.

[0016] (1) Information request: request(α, β, P)
(a) Precondition: B(α, N(α, P)) ∧ ¬K(α, P)
(b) Postcondition: K(β, N(α, P))
The information request request(α, β, P) means that if dialogue participant α needs to know fact P but does not currently know it, α requests information about P from the other participant β, and as a result β comes to know that α needs to know P.

[0017] (2) Information provision: inform(α, β, P)
(a) Precondition: B(α, N(β, P)) ∧ K(α, P)
(b) Postcondition: K(β, K(α, P))
The information provision inform(α, β, P) means that if dialogue participant α thinks that dialogue participant β needs to know fact P, and α itself knows that information, α informs β of the information about P, and as a result β comes to know that α knows P. Note that, by the axioms, K(β, P) then also holds.
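The two definitions can be checked mechanically against a set of formulas. The sketch below is a simplified illustration: formulas are nested tuples, the state is a Python set, and "holds" is approximated by plain set membership with no logical inference. These representation choices and all function names are assumptions made here.

```python
def K(agent, phi): return ("K", agent, phi)
def B(agent, phi): return ("B", agent, phi)
def N(agent, phi): return ("N", agent, phi)

def can_request(state, a, p):
    """Precondition of request(a, b, p): B(a, N(a, p)) holds and K(a, p) does not."""
    return B(a, N(a, p)) in state and K(a, p) not in state

def can_inform(state, a, b, p):
    """Precondition of inform(a, b, p): B(a, N(b, p)) and K(a, p) both hold."""
    return B(a, N(b, p)) in state and K(a, p) in state

def request_postcondition(a, b, p):
    """After request(a, b, p), b knows that a needs to know p."""
    return K(b, N(a, p))

def inform_postcondition(a, b, p):
    """After inform(a, b, p), b knows that a knows p."""
    return K(b, K(a, p))

# c (hotel) wants the number of guests; g (guest) can confirm the booking.
state = {B("c", N("c", "nbr")), B("g", N("c", "fix")), K("g", "fix")}
print(can_request(state, "c", "nbr"), can_inform(state, "g", "c", "fix"))
```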

[0018] Next, task-dependent knowledge is described. In addition to the general axioms and inference rules, each dialogue participant has background knowledge and utterance generation rules as task-dependent knowledge.

[0019] Background knowledge is expressed by the following formula.

[Equation 7] P_1 ∧ … ∧ P_n → Q_1 ∧ … ∧ Q_m

Here P_1, …, P_n, Q_1, …, Q_m are facts. For example, in a hotel reservation, the background knowledge that a reservation is achieved once the number of guests, the dates, and the room type are decided and a room satisfying the conditions is available is expressed as follows.

Example 1: Background knowledge

[Equation 8] number ∧ dates ∧ roomtype ∧ available → reserved
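Background knowledge of this shape can be applied by simple forward chaining over a set of facts. The following is a minimal sketch, assuming each rule is stored as a pair of fact sets (premises, conclusions); the names `close_under` and `RULES` are hypothetical.

```python
# Example 1: number ∧ dates ∧ roomtype ∧ available → reserved
RULES = [
    (frozenset({"number", "dates", "roomtype", "available"}),
     frozenset({"reserved"})),
]

def close_under(rules, facts):
    """Add the conclusions of every rule whose premises all hold, until nothing changes."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusions in rules:
            if premises <= facts and not conclusions <= facts:
                facts |= conclusions
                changed = True
    return facts

complete = close_under(RULES, {"number", "dates", "roomtype", "available"})
partial = close_under(RULES, {"number", "dates"})
print("reserved" in complete, "reserved" in partial)
```

The loop repeats until a full pass adds nothing, so rules whose premises are produced by other rules are also handled.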

[0020] Next, the utterance generation rules are described. In general, even if a participant has held the information to give to the other party from the start, that information cannot be given at just any time. For example, in a hotel reservation, the guest would first state the wish to make a reservation and only then give information about the room type and dates, and the hotel would normally ask about the payment method and arrival time only after the desired room has been secured. Such constraints on the order of utterances are written as utterance generation rules, as follows.

[0021]

[Equation 9] K(α, P_1) ∧ … ∧ K(α, P_n) → B(α, N(α_1, Q_1)) ∧ … ∧ B(α, N(α_m, Q_m))

Here P_1, …, P_n, Q_1, …, Q_m are facts, and α_i (i = 1, …, m) is α or β.

[0022] This formula states that once dialogue participant α knows P_1, …, P_n, dialogue participant α_i (i = 1, …, m) needs to know Q_i. Each utterance generation rule is applied at most once per dialogue.

[0023] For example, the knowledge that the hotel front desk (denoted c), on learning of a request for accommodation (DesRes), needs to know the desired number of guests, dates, and room type is expressed as follows.

[0024] Example 2: Utterance generation rule

[Equation 10] K(c, DesRes) → B(c, N(c, number)) ∧ B(c, N(c, dates)) ∧ B(c, N(c, roomType))
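The "at most once per dialogue" constraint can be enforced by remembering which rules have already fired. The sketch below assumes formulas are nested tuples, a rule is a pair (premises, conclusions), and premises are checked by set membership; `apply_rules`, `RULES`, and `used` are names introduced here for illustration.

```python
def K(agent, phi): return ("K", agent, phi)
def B(agent, phi): return ("B", agent, phi)
def N(agent, phi): return ("N", agent, phi)

# Example 2: K(c, DesRes) → B(c, N(c, number)) ∧ B(c, N(c, dates)) ∧ B(c, N(c, roomType))
RULES = [
    ([K("c", "DesRes")],
     [B("c", N("c", "number")), B("c", N("c", "dates")), B("c", N("c", "roomType"))]),
]

def apply_rules(rules, state, used):
    """Fire every unused rule whose premises are all in the state.

    `used` holds the indices of rules already applied in this dialogue,
    so that each rule fires at most once.
    """
    new_state = set(state)
    for index, (premises, conclusions) in enumerate(rules):
        if index not in used and all(p in state for p in premises):
            new_state.update(conclusions)
            used.add(index)
    return new_state

used = set()
state = apply_rules(RULES, {K("c", "DesRes")}, used)
print(len(state), used)
```

Calling `apply_rules` again with the same `used` set leaves the state unchanged, because rule 0 is already marked as applied.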

[0025] Next, the internal state in state transitions is described. Each dialogue participant has an internal state that changes as the dialogue progresses. The internal state is the set of knowledge and beliefs at that point, and each element is expressed in the form K(α, φ) or B(α, φ), where α is a dialogue participant and φ is a fact or a formula containing modal operators. An utterance u_i (i = 1, …, n) causes the internal state to change from S_{i-1} to S_i. A dialogue therefore corresponds to a finite sequence of internal states, as follows.

[0026]

[Equation 11] S_0 ⇒(u_1) S_1 ⇒(u_2) … ⇒(u_n) S_n

[0027] The initial state is expressed by the following formula.

[Equation 12] S_0 = {K(a, P_1), …, K(a, P_n), K(b, Q_1), …, K(b, Q_m), K(a, N(α, P))}

[0028] Here, a is the dialogue participant who has the intention of collecting information, b is the other participant, and P_1, …, P_n, Q_1, …, Q_m, P are facts. α is a or b. That is, the dialogue starts from an internal state that describes all the knowledge of a and b and additionally records that a knows that α needs to know P. The first utterance is inform(a, b, P) or request(a, b, P).

[0029] The goal of the dialogue (that is, the formula that should hold in the final state as a result of the information-collecting intention) is expressed as K(a, P), where a is the dialogue participant with the information-collecting intention and P is the information to be obtained. In this embodiment, the dialogue is defined as successful if, for some α_1, …, α_m ∈ S_n, both of the following hold:
1. ¬(α_1 ∧ … ∧ α_m → false)
2. α_1 ∧ … ∧ α_m → K(a, P)

[0030] Next, the mechanism of state transition is described. An utterance has the form inform(α, β, P) or request(α, β, P). When each dialogue participant receives an utterance as input in state S_{i-1}, a transition to state S_i occurs according to the following operations.

(a) Transition by request(α, β, P)

[Equation 13] (formula not reproduced in this text)

(b) Transition by inform(α, β, P)

[Equation 14] if B(α, N(β, P)), K(α, P) ∈ S_{i-1} then S_i = S_{i-1} − {B(α, N(β, P))} ∪ {K(β, K(α, P))} ∪ {K(β, P)}
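The inform transition of Equation 14 can be written as a small set operation. The sketch below makes several assumptions that are not stated in the text: formulas are nested tuples, the state is a Python set, the set difference is applied before the unions, and the state is returned unchanged when the condition does not hold. The function name `inform_transition` is hypothetical.

```python
def K(agent, phi): return ("K", agent, phi)
def B(agent, phi): return ("B", agent, phi)
def N(agent, phi): return ("N", agent, phi)

def inform_transition(state, a, b, p):
    """Apply inform(a, b, p) to the state as in Equation 14.

    If B(a, N(b, p)) and K(a, p) are both in the state, remove B(a, N(b, p))
    and add K(b, K(a, p)) and K(b, p). Otherwise return a copy of the state
    (an assumption; the text gives no else-branch).
    """
    need = B(a, N(b, p))
    if need in state and K(a, p) in state:
        return (state - {need}) | {K(b, K(a, p)), K(b, p)}
    return set(state)

# Example 4: the guest g confirms the booking (fix) to the hotel c.
s_prev = {B("g", N("c", "fix")), K("g", "fix")}
s_next = inform_transition(s_prev, "g", "c", "fix")
print(sorted(map(str, s_next)))
```

The resulting state contains K(c, K(g, fix)), K(c, fix), and K(g, fix), which corresponds to the non-rule-derived entries of row (e) in Table 2.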

[0031] In S_i, inference based on the background knowledge and inference by the utterance generation rules are then carried out, and the result is output as the next utterance. The above is the framework of the process for generating a response utterance in this embodiment.

[0032] Next, the configuration and operation of the dialogue system of this embodiment are described with reference to FIG. 1. In FIG. 1, speech uttered by the dialogue participant (human) 100 is input to a microphone 1 and converted into an analog speech signal (electrical signal), which is then converted into a digital speech signal by an A/D converter 2 and input to the speech recognition unit 3. The speech recognition unit 3 analyzes the input digital speech signal, for example by the LPC method, to extract acoustic feature parameters. Then, on the basis of the extracted acoustic feature parameters and with reference to hidden Markov models (HMMs) and a statistical language model, it recognizes the speech using a known speech recognition method, converts it into text data of a character string, and outputs the text data to the preprocessing unit 4.

[0033] The preprocessing unit 4 is connected to the pattern model memory 11, which stores in advance pattern models for converting character strings into the intermediate language of information requests or information provision. On the basis of the input character string and with reference to the pattern models stored in the pattern model memory 11, the preprocessing unit 4 converts the character string into the intermediate language of an information request or information provision written in the description language above, and outputs it to the utterance generation processing unit 5. The following table shows example pattern models in the pattern model memory 11 for the preprocessing unit 4. Each example gives the source character string, the intermediate language after conversion, and the number of the example described in this embodiment (where one exists).

[0034]

[Table 1] Example pattern models in the pattern model memory 11 for the preprocessing unit 4
----------------------------------------------------------------------
Source character string (English gloss) ; Intermediate language ; Example
----------------------------------------------------------------------
(a) はい、何名様でございますでしょうか。 ("Yes, how many guests will there be?") ; request(c, g, nbr) ; Example 3
(b) かしこまりました。では、お名前をちょうだいできますか。 ("Certainly. May I have your name, please?") ; request(c, g, name)
(c) 九月の十一日の日曜日、一泊お願いします。 ("Sunday, September 11th, for one night, please.") ; inform(g, c, dates)
(d) はい、それで結構です。 ("Yes, that will be fine.") ; inform(g, c, fix) ; Example 4
(e) 八月の十三日の土曜日から、十五日までの三日間でお願いします。 ("Three days, from Saturday, August 13th to the 15th, please.") ; inform(g, c, dates) ; Example 5
(f) 滞在先はホテルニューオータニロサンゼルス６０２号室。電話番号は、２１３，４４３，１７００。 ("I am staying in room 602 at the Hotel New Otani Los Angeles. The telephone number is 213-443-1700.") ; inform(g, c, ad), inform(g, c, tel) ; Example 6
----------------------------------------------------------------------
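A pattern model of this kind can be pictured as a lookup from recognized character strings to intermediate-language terms. The sketch below assumes exact string matching, which the text does not specify; `PATTERNS` and `to_intermediate` are hypothetical names, and only rows (a), (d), and (f) of Table 1 are included.

```python
# Recognized character string -> list of intermediate-language terms,
# each written as (utterance type, speaker, hearer, content).
PATTERNS = {
    # (a) "Yes, how many guests will there be?"
    "はい、何名様でございますでしょうか。": [("request", "c", "g", "nbr")],
    # (d) "Yes, that will be fine."
    "はい、それで結構です。": [("inform", "g", "c", "fix")],
    # (f) one utterance may yield two terms (address and telephone number)
    "滞在先はホテルニューオータニロサンゼルス６０２号室。電話番号は、２１３，４４３，１７００。": [
        ("inform", "g", "c", "ad"),
        ("inform", "g", "c", "tel"),
    ],
}

def to_intermediate(text):
    """Return the intermediate-language terms for a recognized string, or [] if unknown."""
    return PATTERNS.get(text.strip(), [])

acts = to_intermediate("はい、それで結構です。")
print(acts)
```

The inverse conversion in the postprocessing unit 6 could use the same table read in the opposite direction, although the text stores that model separately in the pattern model memory 12.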

[0035] The utterance generation processing unit 5 consists of the state transition processing unit 21, the inference processing unit 22, and the data output processing unit 23. The internal state description memory 31 is connected to the state transition processing unit 21 and the data output processing unit 23, and the utterance generation rule memory 32 and the knowledge description memory 33 are connected to the inference processing unit 22.

[0036] The internal state description memory 31 stores the set of knowledge and beliefs at the current point in time; for each dialogue participant, it stores the internal state indicating the situation of the dialogue, which changes as the dialogue progresses. The following table shows example descriptions of internal states in the internal state description memory 31. The examples are written using the operators K and B described above, and each ends with the number of the example described in this embodiment.

[0037]

[Table 2] Example descriptions of internal states in the internal state description memory 31
----------------------------------------------------------------------
(a) B(c, N(c, nbr)) (Example 3)
(b) K(g, N(c, nbr)) (Example 3)
(c) K(c, K(g, nbr)) (Example 3)
(d) B(g, N(c, fix)), K(g, fix) (Example 4)
(e) K(c, K(g, fix)), K(c, fix), B(c, N(c, name)), B(c, N(c, pay)), K(g, fix) (Example 4)
(f) B(g, N(c, ad)), B(g, ad), B(g, N(c, tel)), B(g, tel) (Example 6)
(g) B(c, B(g, ad)), B(c, ad), B(c, B(g, tel')), B(c, tel'), B(c, N(g, ad)), B(g, ad), B(c, N(g, tel')), B(g, tel) (Example 6)
(h) B(g, B(c, ad)), B(g, ad), B(c, ad), B(g, B(c, tel")), B(g, tel), B(g, tel") (Example 6)
----------------------------------------------------------------------

[0038] The knowledge description memory 33 stores task-dependent knowledge in addition to the general axioms and inference rules. The following table shows example knowledge descriptions in the knowledge description memory 33, with an explanation for each.

[0039]

[Table 3] Example knowledge descriptions in the knowledge description memory 33
----------------------------------------------------------------------
(a) B(α, φ→ψ) → (B(α, φ) → B(α, ψ)) (axiom of B)
    Explanation: B is closed under implication.
(b) N(α, φ→ψ) → (N(α, φ) → N(α, ψ)) (axiom of N)
    Explanation: N is closed under implication.
(c) K(α, φ) → φ (axiom of K)
    Explanation: α knows only what is true.
(d) number∧dates∧roomtype∧available → reserved (Example 1)
    Explanation: See the background knowledge above. In a hotel reservation, the reservation is achieved once the number of guests, the dates, and the room type are decided and a room satisfying the conditions is available.
(e) three∧cheep→extraBed
    Explanation: If three guests take a cheap room, an extra bed is added.
(f) single∧double→false
    Explanation: A room cannot be both a single and a double.
----------------------------------------------------------------------

[0040] The utterance generation rule memory 32 stores in advance utterance generation rules that express constraints on the order of utterances. Each rule is written in one of two forms: "if I come to know certain fact(s), I must know certain other fact(s)", or "if I come to know certain fact(s), the other party must know certain other fact(s)". In the weak model, "know" is replaced by "believe"; rule (f) below is an example for the weak model. Here, the weak model is the model for dialogues in which an utterance may not be conveyed correctly. The following table shows examples of utterance generation rules in the utterance generation rule memory 32, with an explanation for each.

【００４１】[0041]

[Table 4] Examples of utterance generation rules in the utterance generation rule memory 32
――――――――――――――――――――――――――――――――――
(a) K(c, DesRes) → B(c, N(c, number)) ∧ B(c, N(c, dates)) ∧ B(c, N(c, roomType)) (Example 2)
    Explanation: Once the hotel front desk (denoted c) knows of an accommodation request (DesRes), it needs to know the desired number of guests, dates, and room type.
――――――――――――――――――――――――――――――――――
(b) K(c, fix) → B(c, N(c, name)) ∧ B(c, N(c, pay)) (Example 4)
    Explanation: Once the reservation is fixed, the hotel asks for the guest's name and payment method.
――――――――――――――――――――――――――――――――――
(c) K(g, breakfast) → B(g, N(c, bfType))
    Explanation: Once the guest (denoted g) knows that breakfast is included, the guest believes that the hotel needs to know the desired type of breakfast.
――――――――――――――――――――――――――――――――――
(d) K(c, roomType) ∧ K(c, notAvailable) → B(c, N(g, notAvailable)) ∧ B(c, N(g, roomType'))
    Explanation: If the hotel knows the desired room type and knows that no room of that type is available, it believes that the guest needs to know this and also needs to know that another room type exists.
――――――――――――――――――――――――――――――――――
(e) K(c, dates) → B(c, N(g, dates)) (Example 4)
    Explanation: Once the hotel knows the dates, it believes that the guest should know them (for confirmation).
――――――――――――――――――――――――――――――――――
(f) B(c, ad) → B(c, N(g, ad)) (Example 6)
    Explanation: Once the hotel knows (believes) the contact address, it believes that the guest should know it (for confirmation).
――――――――――――――――――――――――――――――――――
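One way to picture the rules of Table 4 is as data: a list of premise formulas paired with the belief formulas they generate. The sketch below assumes formulas are encoded as nested tuples; the helpers `K`, `B`, `N`, the list `GENERATION_RULES`, and `apply_generation_rules` are hypothetical names chosen for illustration, and only rules (a), (b), and (e) are included.

```python
def K(agent, p):
    """K(agent, p): agent knows p."""
    return ("K", agent, p)


def B(agent, p):
    """B(agent, p): agent believes p."""
    return ("B", agent, p)


def N(agent, p):
    """N(agent, p): agent needs to know p."""
    return ("N", agent, p)


# (premises, conclusions) pairs for rules (a), (b) and (e) of Table 4.
GENERATION_RULES = [
    ([K("c", "DesRes")],
     [B("c", N("c", "number")), B("c", N("c", "dates")), B("c", N("c", "roomType"))]),
    ([K("c", "fix")],
     [B("c", N("c", "name")), B("c", N("c", "pay"))]),
    ([K("c", "dates")],
     [B("c", N("g", "dates"))]),
]


def apply_generation_rules(state):
    """Add the conclusions of every rule whose premises all hold in `state`."""
    new_state = set(state)
    for premises, conclusions in GENERATION_RULES:
        if all(p in state for p in premises):
            new_state.update(conclusions)
    return new_state
```

Starting from `{K("c", "fix")}`, the result contains `B("c", N("c", "name"))` and `B("c", N("c", "pay"))`, which is the situation rule (b) describes.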

[0042] Accordingly, in the utterance generation processing unit 5, the state transition processing unit 21 first refers to the internal state in the internal state description memory 31 on the basis of the information-request or information-provision intermediate language input from the preprocessing unit 4, makes the internal state transition, updates the internal state in the internal state description memory 31, and outputs the post-transition internal state to the inference processing unit 22. Next, on the basis of the input internal state, the inference processing unit 22 refers to the knowledge in the knowledge description memory 33 and the utterance generation rules in the utterance generation rule memory 32, infers the content that should be uttered in response in the dialogue given the current state, generates its intermediate language, and outputs it to the data output processing unit 23. In response, the data output processing unit 23 refers to the internal state in the internal state description memory 31 on the basis of the input intermediate language of the content to be uttered, converts it into output data in the information-request or information-provision intermediate language, and outputs it to the post-processing unit 6.

[0043] The post-processing unit 6 is connected to the pattern model memory 12, which stores in advance pattern models for converting the information-request or information-provision intermediate language input from the data output processing unit 23 of the utterance generation processing unit 5 back into character-string text data. On the basis of the input intermediate language, the post-processing unit 6 refers to the pattern models stored in the pattern model memory 12, converts the information-request or information-provision intermediate language written in the description language above back into the corresponding character-string text data, and outputs it to the speech synthesis unit 7. The following table shows examples of pattern models in the pattern model memory 12 for the post-processing unit 6. Each example shows the character string after conversion, the source intermediate language, and, where applicable, the number of the example described in this embodiment.

【００４４】[0044]

[Table 5] Examples of pattern models in the pattern model memory 12 for the post-processing unit 6
――――――――――――――――――――――――――――――――――
(a) "Two adults, please." ; inform(g, c, nbr) ; (Example 3)
――――――――――――――――――――――――――――――――――
(b) "Yes, my name is Amy Harris." ; inform(g, c, name)
――――――――――――――――――――――――――――――――――
(c) "Yes, how many guests will there be?" ; request(c, g, nbr)
――――――――――――――――――――――――――――――――――
(d) "Certainly. May I have your name, please?" ; request(c, g, name) ; (Example 4)
――――――――――――――――――――――――――――――――――
(e) "Yes, so you will check in on August 13 and stay for two nights." ; inform(c, g, dates) ; (Example 5)
――――――――――――――――――――――――――――――――――
(f) "Yes, Mr. Suzuki, you are staying in room 602 of the New Otani Hotel, and your telephone number is 714, 443, 1700." ; inform(c, g, ad'), inform(c, g, tel') ; (Example 6)
――――――――――――――――――――――――――――――――――
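The reverse conversion performed by the post-processing unit 6 can be pictured as a table lookup keyed by the intermediate-language expression. The sketch below is an illustration only: the dictionary `PATTERN_MODEL`, the functions `parse_act` and `to_text`, the English strings, and the regular expression are assumptions, since the specification does not describe the storage format of the pattern model memory 12.

```python
import re

# Hypothetical in-memory stand-in for pattern model memory 12:
# (act, speaker, hearer, topic) -> surface string.
PATTERN_MODEL = {
    ("inform", "g", "c", "nbr"): "Two adults, please.",
    ("inform", "g", "c", "name"): "Yes, my name is Amy Harris.",
    ("request", "c", "g", "nbr"): "Yes, how many guests will there be?",
    ("request", "c", "g", "name"): "Certainly. May I have your name, please?",
}

# Matches expressions such as "request(c, g, nbr)" or "inform(c,g,tel')".
_ACT = re.compile(r"\s*(\w+)\(\s*(\w+)\s*,\s*(\w+)\s*,\s*([\w']+)\s*\)\s*")


def parse_act(text):
    """Split an intermediate-language expression into a 4-tuple key."""
    match = _ACT.fullmatch(text)
    if match is None:
        raise ValueError(f"not an intermediate-language expression: {text!r}")
    return match.groups()


def to_text(act_text):
    """Look up the surface string for one expression (KeyError if unknown)."""
    return PATTERN_MODEL[parse_act(act_text)]
```

A dictionary keeps the mapping one-to-one, as in the table; a real pattern model would presumably also fill slot values such as the actual name or number.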

[0045] The speech synthesis unit 7 comprises a pulse generator, a noise generator, a variable-gain amplifier, and a filter. It converts the text data of the input character string into a digital speech signal using a known speech synthesis method: it switches between the pulse signal from the pulse generator and the noise from the noise generator according to whether the sound is voiced or unvoiced, varies the gain of the amplifier according to the amplitude of the synthesized speech, and varies the filter coefficients according to the synthesized speech. It then outputs the digital speech signal to the D/A converter 8. In response, the D/A converter 8 converts the input digital speech signal into an analog speech signal and outputs it as synthesized speech through the speaker 9.
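The signal path described above (excitation switch, variable gain, time-varying filter) can be sketched in a few lines. This is a toy illustration, not the synthesizer of the embodiment: the function `synthesize`, the per-frame parameter tuple, the 8000 Hz sampling rate, and the one-pole filter are all assumptions, since the specification does not give the filter structure or how frame parameters are derived from text.

```python
import random


def synthesize(frames, sample_rate=8000, frame_len=80, seed=0):
    """Generate samples from per-frame (voiced, pitch_hz, gain, coeff) tuples.

    voiced   -> selects the pulse train (True) or the noise source (False)
    pitch_hz -> pulse rate for voiced frames (ignored when unvoiced)
    gain     -> amplifier gain applied to the excitation
    coeff    -> coefficient a of the one-pole filter y[n] = x[n] + a * y[n-1]
    """
    rng = random.Random(seed)
    samples = []
    prev = 0.0   # filter memory y[n-1]
    phase = 0.0  # samples elapsed since the last pulse
    for voiced, pitch_hz, gain, coeff in frames:
        period = sample_rate / pitch_hz if voiced else None
        for _ in range(frame_len):
            if voiced:
                phase += 1.0
                if phase >= period:
                    phase -= period
                    excitation = 1.0  # one pulse per pitch period
                else:
                    excitation = 0.0
            else:
                excitation = rng.uniform(-1.0, 1.0)  # noise source
            x = gain * excitation
            prev = x + coeff * prev
            samples.append(prev)
    return samples
```

With `frames = [(True, 100.0, 1.0, 0.5), (False, 0.0, 0.5, 0.5)]` the call produces one voiced frame followed by one unvoiced frame, 160 samples in total.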

[0046] In the dialogue system configured as described above, the A/D converter 2, the speech recognition unit 3, the preprocessing unit 4, the utterance generation processing unit 5 (comprising the state transition processing unit 21, the inference processing unit 22, and the data output processing unit 23), the post-processing unit 6, the speech synthesis unit 7, and the D/A converter 8 are implemented by, for example, a digital computer. The pattern model memories 11 and 12, the internal state description memory 31, the utterance generation rule memory 32, and the knowledge description memory 33 are implemented by, for example, a storage device such as a hard disk memory.

[0047] The following table shows examples of system operation used in this embodiment. Case (f) is an operation example for the weak model. The input and output in each case correspond to the preprocessing and post-processing examples given above.

【００４８】[0048]

[Table 6] Examples of system operation
――――――――――――――――――――――――――――――――――
(a) Question and answer (Example 3)
    request(c, g, nbr)
    ↓ state transition: K(g, N(c, nbr))
    ↓ inference: B(g, N(c, nbr)) ∧ K(g, nbr)
    ↓ output: inform(g, c, nbr)
――――――――――――――――――――――――――――――――――
(b) Question and answer (the exchange immediately after Example 4)
    request(c, g, name)
    ↓ state transition: K(g, N(c, name))
    ↓ inference: B(g, N(c, name)) ∧ K(g, name)
    ↓ output: inform(g, c, name)
――――――――――――――――――――――――――――――――――
(c) Topic transition (the exchange immediately before Example 3)
    inform(g, c, dates)
    ↓ state transition: K(c, K(g, dates))
    ↓ utterance generation: B(c, N(c, nbr))
    ↓ inference: B(c, N(c, nbr)) ∧ ¬K(c, nbr)
    ↓ output: request(c, g, nbr)
――――――――――――――――――――――――――――――――――
(d) Topic transition (Example 4)
    inform(g, c, fix)
    ↓ state transition: K(c, K(g, fix))
    ↓ inference: K(c, fix)
    ↓ utterance generation: B(c, N(c, name)) ∧ B(c, N(c, pay))
    ↓ inference: B(c, N(c, name)) ∧ B(c, N(c, pay)) ∧ ¬K(c, name) ∧ ¬K(c, pay)
    ↓ output: request(c, g, name)
――――――――――――――――――――――――――――――――――
(e) Confirmation (Example 5)
    inform(g, c, dates)
    ↓ state transition: K(c, K(g, dates))
    ↓ inference: K(g, dates) ∧ K(c, dates)
    ↓ utterance generation: B(c, N(g, dates)) ∧ K(c, dates)
    ↓ output: inform(c, g, dates)
――――――――――――――――――――――――――――――――――
(f) Confirmation (Example 6)
    inform(g, c, ad), inform(g, c, tel)
    ↓ state transition: B(c, B(g, ad)) ∧ B(c, ad) ∧ B(c, B(g, tel')) ∧ B(c, tel')
    ↓ utterance generation: B(c, N(g, ad)) ∧ B(c, ad) ∧ B(c, N(g, tel')) ∧ B(c, tel')
    ↓ output: inform(c, g, ad), inform(c, g, tel')
――――――――――――――――――――――――――――――――――

[0049] An example of the operation of the utterance generation processing unit 5 of the dialogue system configured as described above is now described in detail. On the basis of the input information-request or information-provision intermediate language, the utterance generation processing unit 5 executes three processes in sequence along its processing flow, namely state transition, inference, and data output, and generates the information-request or information-provision intermediate language corresponding to the words to be uttered in the dialogue.

[0050] First, a typical question-and-answer exchange is described. request(α, β, P) gives rise to K(β, N(α, P)), and inference yields B(β, N(α, P)). If the internal state contains K(β, P) at that point, the precondition for information provision is satisfied and inform(β, α, P) is triggered. The following example is a dialogue between a hotel front desk and a guest in a hotel reservation scene (all examples used in this embodiment are taken from a corpus of travel dialogues owned by the present applicant); the guest is denoted g and the hotel front desk (clerk) is denoted c. In a hotel reservation dialogue, the guest normally holds the stay dates, the desired room type, the price, and so on as knowledge, conveys them one by one in the dialogue with the front desk, and finally comes to know that "the reservation has been made under the desired conditions".

[0051] Example 3: Question and answer
c: Yes, how many guests will there be?
g: Two adults, please.
The first utterance is written request(c, g, nbr) and the second inform(g, c, nbr). Let S₁ be the internal state immediately before this exchange, and S₂ and S₃ the states reached after each utterance in turn. The internal state then changes as follows.

【００５２】[0052]

[Equation 15]
S₁: B(c, N(c, nbr)) ∧ ¬K(c, nbr)
    ↓ request(c, g, nbr)
S₂: K(g, N(c, nbr)) → B(g, N(c, nbr)) ∧ K(g, nbr)
    ↓ inform(g, c, nbr)
S₃: K(c, K(g, nbr)) → K(c, nbr)

[0053] The guest knows the number of guests from the start, so the initial state contains K(g, nbr). Assume further that, as a result of the dialogue so far, state S₁ contains B(c, N(c, nbr)) and does not contain K(c, nbr). The precondition for an information request is then satisfied and request(c, g, nbr) is issued. This causes a transition to state S₂, in which K(g, N(c, nbr)) arises. Next, an inference rule derives B(g, N(c, nbr)) from K(g, N(c, nbr)). Since K(g, nbr) is carried over into state S₂, the precondition for information provision holds and inform(g, c, nbr) is issued. As a result, a transition to S₃ occurs and K(c, K(g, nbr)) arises. From the axiom of K, K(c, nbr) holds. The hotel front desk thus finally obtains the information about the number of guests.
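The walk-through of Example 3 can be replayed mechanically. The sketch below is one possible encoding, not the implementation of the embodiment: formulas are nested tuples, the internal state is a single set of formulas, and `infer` applies only the two derivation steps used in the text (knowledge implies belief, and the axiom of K applied to nested knowledge). All function names are hypothetical.

```python
def K(a, p):
    return ("K", a, p)


def B(a, p):
    return ("B", a, p)


def N(a, p):
    return ("N", a, p)


def infer(state):
    """Close `state` under: K(x, p) => B(x, p); K(x, K(y, p)) => K(y, p), K(x, p)."""
    derived = set(state)
    changed = True
    while changed:
        changed = False
        for f in list(derived):
            new = set()
            if f[0] == "K":
                _, x, phi = f
                new.add(B(x, phi))            # knowing implies believing
                if isinstance(phi, tuple) and phi[0] == "K":
                    new.add(phi)              # axiom of K: K(x, phi) -> phi
                    new.add(K(x, phi[2]))     # x therefore also knows the fact
            if not new <= derived:
                derived |= new
                changed = True
    return derived


def can_request(state, a, p):
    """Precondition of request(a, _, p): B(a, N(a, p)) and not K(a, p)."""
    return B(a, N(a, p)) in state and K(a, p) not in state


def can_inform(state, a, b, p):
    """Precondition of inform(a, b, p): B(a, N(b, p)) and K(a, p)."""
    return B(a, N(b, p)) in state and K(a, p) in state


def request(state, a, b, p):
    if not can_request(state, a, p):
        raise ValueError("precondition of request not satisfied")
    return infer(state | {K(b, N(a, p))})     # postcondition K(b, N(a, p))


def inform(state, a, b, p):
    if not can_inform(state, a, b, p):
        raise ValueError("precondition of inform not satisfied")
    return infer(state | {K(b, K(a, p))})     # postcondition K(b, K(a, p))


# Replay of Example 3.
s1 = infer({B("c", N("c", "nbr")), K("g", "nbr")})
s2 = request(s1, "c", "g", "nbr")
s3 = inform(s2, "g", "c", "nbr")
```

After the last step `K("c", "nbr")` is in `s3`, matching the final line of Equation 15, and `can_request(s3, "c", "nbr")` is false because the front desk no longer lacks the information.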

[0054] Next, using the order of utterances as an example, the process by which a request or notification concerning new information arises after a piece of information has been obtained is described. inform(α, β, P) gives rise to K(β, K(α, P)), and inference yields K(β, P). Suppose that an utterance generation rule is applicable at this point and, as a result, B(β, N(α, Q)) arises. If the internal state does not contain K(β, Q) at that point, the precondition for an information request is satisfied and request(β, α, Q) is triggered. Depending on the utterance generation rule applied, the result may instead be inform(β, α, Q).

[0055] Example 4: Order of utterances
g: Yes, that will be fine.
c: Certainly. May I have your name, please?

[0056] The first utterance conveys the guest's intention to make the hotel reservation and is written inform(g, c, fix); the next utterance is written request(c, g, name). Let S₁ be the internal state immediately before this exchange, and S₂ and S₃ the states reached after each utterance in turn. The internal state then changes as follows.

【００５７】[0057]

[Equation 16]
S₁: B(g, N(c, fix)) ∧ K(g, fix)
    ↓ inform(g, c, fix)
S₂: K(c, K(g, fix)) → K(c, fix) → B(c, N(c, name)) ∧ B(c, N(c, pay)) ∧ ¬K(c, name) ∧ ¬K(c, pay)
    ↓ request(c, g, name)
S₃: K(g, N(c, name))

[0058] Once the reservation is fixed, the hotel asks for the guest's name and payment method. This ordering is written as the following utterance generation rule.

【００５９】[0059]

[Equation 17] K(c, fix) → B(c, N(c, name)) ∧ B(c, N(c, pay))

[0060] Assume that, as a result of the dialogue so far, S₁ contains B(g, N(c, fix)) and K(g, fix). The precondition for information provision is then satisfied and inform(g, c, fix) is issued. This causes a transition to state S₂, in which K(c, K(g, fix)) arises. Next, an inference rule derives K(c, fix). The utterance generation rule above then gives rise to

[Equation 18] B(c, N(c, name)) ∧ B(c, N(c, pay))

Since state S₂ contains neither K(c, name) nor K(c, pay), the precondition for an information request holds, and either request(c, g, name) or request(c, g, pay) is issued. Which of the two occurs first is non-deterministic. As a result of this utterance, a transition to state S₃ occurs and K(g, N(c, name)) arises. If inference is continued from here, inform(g, c, name) is triggered and the information is provided. It follows that both information requests and information provision can be explained within this framework.
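The non-determinism mentioned above can be made concrete by enumerating every request whose precondition holds in state S₂ and leaving the choice to the caller. The sketch assumes the same nested-tuple encoding of formulas as a set; `enabled_requests` and the state contents are hypothetical illustrations of Example 4, not part of the specification.

```python
def K(a, p):
    return ("K", a, p)


def B(a, p):
    return ("B", a, p)


def N(a, p):
    return ("N", a, p)


def enabled_requests(state, speaker, hearer):
    """List every request(speaker, hearer, P) whose precondition holds.

    Precondition: B(speaker, N(speaker, P)) is in the state and
    K(speaker, P) is not.
    """
    acts = []
    for f in state:
        needs = f[2] if f[0] == "B" and f[1] == speaker else None
        if isinstance(needs, tuple) and needs[0] == "N" and needs[1] == speaker:
            topic = needs[2]
            if K(speaker, topic) not in state:
                acts.append(("request", speaker, hearer, topic))
    return sorted(acts)


# State S2 of Example 4 after the utterance generation rule has fired.
s2 = {K("c", "fix"), B("c", N("c", "name")), B("c", N("c", "pay"))}
candidates = enabled_requests(s2, "c", "g")
```

Here `candidates` holds both `request(c, g, name)` and `request(c, g, pay)`; the framework itself does not say which to utter first, so a selection policy (first in the list, random, or task-specific) would have to be supplied separately.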

[0061] Next, an example of the confirmation process is described. When inform(α, β, P) is followed by inform(β, α, P), the latter is not a mere provision of information but a confirmation. This process is described below. A rule that gives rise to a confirmation utterance can be written as one of the utterance generation rules, as follows.

【００６２】[0062]

[Equation 19] K(α, P₁) ∧ … ∧ K(α, Pₙ) → B(α, N(β, P))

[0063] This rule expresses that, when α has come to know the specific facts (P₁, …, Pₙ), α needs to confirm the fact P with β.

【００６４】[0064]

[Equation 20] B(α, N(β, P)) ∧ K(α, P) ∧ B(α, K(β, P))

If inform(α, β, P) occurs in the state given by Equation 20, it is a confirmation. This condition expresses that α believes β needs to know P, that α itself knows P, and that α believes β knows P. In the internal state reached as a result of the confirmation,

[Equation 21] K(α, K(β, P)) ∧ K(β, K(α, P))

holds, and P becomes mutual knowledge between the dialogue participants α and β.

[0065] Example 5: Confirmation
g: Three days, please, from Saturday the 13th of August to the 15th.
c: Yes, so you will check in on August 13 and stay for two nights.

[0066] The first utterance is written inform(g, c, dates) and the next inform(c, g, dates). Assume that K(g, dates) is given in the initial state and that B(g, N(c, dates)) arose from the immediately preceding utterance. For simplicity, the utterance generation rule is taken to be the following.

【００６７】[0067]

[Equation 22] K(c, dates) → B(c, N(g, dates))

[0068] The state transitions in this case are as follows.

[Equation 23]
B(g, N(c, dates)) ∧ K(g, dates)
    ↓ inform(g, c, dates)
K(c, K(g, dates)) → K(g, dates) ∧ K(c, dates) → B(c, N(g, dates)) ∧ K(c, dates)
    ↓ inform(c, g, dates)
K(g, K(c, dates))

[0069] The two formulas underlined in the original, K(c, K(g, dates)) and K(g, K(c, dates)), show that mutual knowledge about the dates has been established between the hotel and the guest.
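The two conditions used in this passage, the confirmation condition of Equation 20 and the mutual-knowledge condition of Equation 21, are simple membership tests once the state is written down. In the sketch below the states of Example 5 are entered by hand; the predicate names are hypothetical, and `B(c, K(g, dates))` is added on the assumption that knowledge implies belief, which the listed transitions do not spell out.

```python
def K(a, p):
    return ("K", a, p)


def B(a, p):
    return ("B", a, p)


def N(a, p):
    return ("N", a, p)


def is_confirmation(state, a, b, p):
    """Equation 20: B(a, N(b, p)) and K(a, p) and B(a, K(b, p))."""
    return (B(a, N(b, p)) in state
            and K(a, p) in state
            and B(a, K(b, p)) in state)


def is_mutual_knowledge(state, a, b, p):
    """Equation 21: K(a, K(b, p)) and K(b, K(a, p))."""
    return K(a, K(b, p)) in state and K(b, K(a, p)) in state


# State after inform(g, c, dates) and the rule K(c, dates) -> B(c, N(g, dates)).
before_confirmation = {
    K("c", K("g", "dates")),
    K("g", "dates"),
    K("c", "dates"),
    B("c", N("g", "dates")),
    B("c", K("g", "dates")),  # assumption: derived from K(c, K(g, dates))
}

# inform(c, g, dates) adds its postcondition K(g, K(c, dates)).
after_confirmation = before_confirmation | {K("g", K("c", "dates"))}
```

`is_confirmation(before_confirmation, "c", "g", "dates")` holds, so the clerk's reply counts as a confirmation, and only `after_confirmation` satisfies `is_mutual_knowledge` for `dates`.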

[0070] Next, an example involving a mishearing is described. So far it has been assumed that information is conveyed correctly, so a confirmation could never fail. In actual dialogues, however, information is sometimes not conveyed correctly because of a mishearing, and confirmation and correction are performed frequently. To describe this process, the conditions on request and inform are weakened.

[0071] (1) Information request: request(α, β, P)
(a) Precondition: B(α, N(α, P)) ∧ ¬B(α, P)
(b) Postcondition: B(β, N(α, P′))
P′ is a fact but does not necessarily coincide with P.

[0072] (2) Information provision: inform(α, β, P)
(a) Precondition: B(α, N(β, P)) ∧ B(α, P)
(b) Postcondition: B(β, B(α, P′)) ∧ B(β, P′)

[0073] In this model, the hearer does not necessarily obtain accurate information as a result of an utterance; the hearer obtains only a belief about the speaker's internal state. If inform(α, β, P) occurs in the state

[Equation 24] B(α, N(β, P)) ∧ B(α, P) ∧ B(α, B(β, P))

it is a confirmation.

[0074] The utterance generation rules likewise take a form in which every K is replaced by B:

[Equation 25] B(α, P₁) ∧ … ∧ B(α, Pₙ) → B(α, N(α₁, Q₁)) ∧ … ∧ B(α, N(αₘ, Qₘ))

In the internal state reached as a result of a confirmation,

[Equation 26] B(α, P) ∧ B(β, P′) ∧ B(α, B(β, P)) ∧ B(β, B(α, P′))

holds. Here, if

[Equation 27] P = P′

then P corresponds to a mutual belief between α and β, whereas if

[Equation 28] P ≠ P′

a mishearing has occurred.

[0075] Example 6: Discovery of a mishearing
g: I am staying at the Hotel New Otani Los Angeles, room 602. The telephone number is 213, 443, 1700.
c: Yes, Mr. Suzuki, you are staying in room 602 of the New Otani Hotel, and your telephone number is 714, 443, 1700.

[0076] The first utterance of this dialogue is written inform(g, c, ad), inform(g, c, tel), and the next utterance inform(c, g, ad), inform(c, g, tel′). Assume that B(g, ad) and B(g, tel) are given in the initial state and that B(g, N(c, ad)) and B(g, N(c, tel)) arose from the immediately preceding utterance. For simplicity, the utterance generation rules are taken to be

[Equation 29] B(c, ad) → B(c, N(g, ad))

[Equation 30] B(c, tel) → B(c, N(g, tel))

The internal state then changes as follows.

【００７７】[0077]

[Equation 31]
B(g, N(c, ad)) ∧ B(g, ad) ∧ B(g, N(c, tel)) ∧ [B(g, tel)]
    ↓ inform(g, c, ad), inform(g, c, tel)
B(c, B(g, ad)) ∧ B(c, ad) ∧ B(c, B(g, tel′)) ∧ B(c, tel′)
    → B(c, N(g, ad)) ∧ B(c, ad) ∧ B(c, N(g, tel′)) ∧ B(c, tel′)
    ↓ inform(c, g, ad′), inform(c, g, tel′)
B(g, B(c, ad)) ∧ B(g, ad) ∧ B(g, B(c, tel″)) ∧ [B(g, tel″)]

[0078] In this example, as indicated by the parts underlined in the original, c believes ad and believes that g believes ad, and the same holds for g's beliefs. Mutual belief therefore holds for the address. On the other hand, as indicated by the terms enclosed in [ ], g believes tel and also believes tel″. Since ¬(tel ∧ tel″), no mutual belief is formed for the telephone number, and the mishearing is discovered.
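The detection step can be sketched by applying the weak-model postcondition of inform with the value the hearer actually perceived and then looking for two different believed values of the same item. The encoding below is an assumption made for illustration: a fact is a `(slot, value)` pair, the state is one shared set, and `weak_inform` and `conflicting_beliefs` are hypothetical names. The constraint ¬(tel ∧ tel″) is approximated by requiring a single value per slot.

```python
def B(agent, p):
    return ("B", agent, p)


def weak_inform(state, speaker, hearer, heard):
    """Weak-model postcondition of inform: B(hearer, B(speaker, P')) and B(hearer, P').

    `heard` is P', the fact as perceived by the hearer, which may differ
    from what the speaker actually said.
    """
    return state | {B(hearer, B(speaker, heard)), B(hearer, heard)}


def conflicting_beliefs(state, agent, slot):
    """Return the believed values of `slot` if the agent holds more than one."""
    values = {
        f[2][1]
        for f in state
        if f[0] == "B" and f[1] == agent
        and isinstance(f[2], tuple) and f[2][0] == slot
    }
    return values if len(values) > 1 else set()


AD = ("ad", "Hotel New Otani Los Angeles, room 602")
TEL_SAID = ("tel", "213 443 1700")
TEL_HEARD = ("tel", "714 443 1700")   # the clerk mishears the area code

s0 = {B("g", AD), B("g", TEL_SAID)}
# g -> c: the address arrives intact, the telephone number does not.
s1 = weak_inform(weak_inform(s0, "g", "c", AD), "g", "c", TEL_HEARD)
# c -> g: the clerk confirms what it believes; the guest hears it correctly.
s2 = weak_inform(weak_inform(s1, "c", "g", AD), "c", "g", TEL_HEARD)
```

`conflicting_beliefs(s2, "g", "tel")` returns both numbers, which is the signal that a correction is needed, while the same call for `"ad"` returns an empty set.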

[0079] This embodiment has disclosed a dialogue system that uses a dialogue model based on the exchange of information, for dialogues whose purpose is information collection. By introducing the modal operator "needs to know", this model gives a unified account of both requests by the information-collecting side and notifications by the information-providing side. Phenomena that arise dynamically in actual dialogues, such as confirmations and the discovery of mishearings, can also be explained within the same framework.

[0080] As described above, the embodiment according to the present invention has the following effects.
(a) It can provide a task-independent dialogue system based on the exchange of information, for dialogues whose purpose is information collection.
(b) By adding utterance generation rules and knowledge, the response functions can easily be extended and a learning function can easily be incorporated, so that various non-standard cases can be handled. Compared with the prior art, the range of dialogues to which the system applies can therefore be widened, and a flexible, cooperative dialogue system can be provided.

【００８１】[0081]

【発明の効果】以上詳述したように本発明によれば、発
声される発声音声文の音声を文字列に音声認識して、音
声認識された文字列に応答して情報収集のための対話に
おける応答の発語内容の文字列を生成した後、発語内容
の文字列を音声合成して出力する対話システムであっ
て、対話の進行とともに変化する対話の状況を示す内部
状態を、（ａ）αがφを知っていることを表す第１の様相演算子
Ｋ（α，φ）と、（ｂ）αがφを信じていることを表す第２の様相演算子
Ｂ（α，φ）と、（ｃ）αがφを知る必要があることを表す第３の様相演
算子Ｎ（α，φ）とを用いて表して格納する内部状態記
述記憶装置と、（Ａ）標準命題様相論理の性質を様相演算子を用いて表
した一般の公理と、（Ｂ）必然性の規則を表す推論規則と、（Ｃ）複数の事実が論理積で成立するならば、ある事実
が成立するということを表す背景知識と、発話の順序に
関する制約を示す発話生成規則とを有するタスク依存の
知識とを格納する知識記述記憶装置と、発話の順序に関
する制約を示す発話生成規則を格納する発話生成規則記
憶装置と、発声される発声音声文の音声を文字列に音声
認識して出力する音声認識手段と、文字列を情報要求又
は情報提供の内容を表す中間言語に変換するための変換
パターンモデルを参照して、上記音声認識手段によって
音声認識された文字列を、情報要求又は情報提供の内容
を表す中間言語に変換して出力する前処理手段と、上記
前処理手段から出力される情報要求又は情報提供の内容
を表す中間言語から、上記内部状態記述記憶装置内の内
部状態を参照して、内部状態を遷移させてその内部状態
を更新するとともに、遷移後の内部状態を出力する状態
遷移処理手段と、上記状態遷移処理手段から出力される
内部状態から、上記知識記述記憶装置内の知識と、上記
発話生成規則記憶装置内の発話生成規則とを参照して、
現在の状態における当該対話において応答して発話すべ
き内容を推論してその中間言語を生成して出力する推論
処理手段と、上記内部状態記述記憶装置内の内部状態を
参照して、上記推論処理手段から出力される応答して発
話すべき内容の中間言語を、情報要求又は情報提供の中
間言語の出力データに変換して出力するデータ出力処理
手段と、情報要求又は情報提供の内容を表す中間言語を
文字列に逆変換するための逆変換パターンモデルを参照
して、上記データ出力処理手段から出力される情報要求
又は情報提供の内容を表す中間言語の出力データを、当
該中間言語に対応する文字列に逆変換して出力する後処
理手段と、上記後処理手段から出力される文字列を音声
合成してそれに対応する音声を出力する音声合成手段と
を備える。As described above in detail, the present invention provides a dialogue system that recognizes the speech of an uttered sentence as a character string, generates, in response to the recognized character string, a character string of the utterance content of a response in a dialogue for information collection, and then speech-synthesizes and outputs that character string. The dialogue system comprises:

an internal state description storage device that stores an internal state, indicating the status of the dialogue as it changes with the progress of the dialogue, expressed using (a) a first modal operator K(α, φ) indicating that α knows φ, (b) a second modal operator B(α, φ) indicating that α believes φ, and (c) a third modal operator N(α, φ) indicating that α needs to know φ;

a knowledge description storage device that stores (A) general axioms expressing the properties of standard propositional modal logic using the modal operators, (B) an inference rule expressing the rule of necessity (necessitation), and (C) task-dependent knowledge comprising background knowledge, which states that a certain fact holds if a conjunction of a plurality of facts holds, and utterance generation rules indicating constraints on the order of utterances;

an utterance generation rule storage device that stores utterance generation rules indicating constraints on the order of utterances;

speech recognition means for recognizing the speech of an uttered sentence as a character string and outputting the character string;

pre-processing means for converting the character string recognized by the speech recognition means into an intermediate language representing the content of an information request or information provision, with reference to a conversion pattern model for converting a character string into such an intermediate language, and outputting the intermediate language;

state transition processing means for, based on the intermediate language output from the pre-processing means, referring to the internal state in the internal state description storage device, causing the internal state to transition and updating it, and outputting the internal state after the transition;

inference processing means for, based on the internal state output from the state transition processing means, referring to the knowledge in the knowledge description storage device and the utterance generation rules in the utterance generation rule storage device, inferring the content to be uttered in response in the dialogue in the current state, and generating and outputting an intermediate language of that content;

data output processing means for, referring to the internal state in the internal state description storage device, converting the intermediate language of the content to be uttered in response, output from the inference processing means, into output data of an intermediate language of an information request or information provision, and outputting the output data;

post-processing means for, referring to an inverse conversion pattern model for inversely converting an intermediate language representing the content of an information request or information provision into a character string, inversely converting the output data output from the data output processing means into the character string corresponding to the intermediate language, and outputting the character string; and

speech synthesis means for speech-synthesizing the character string output from the post-processing means and outputting the corresponding speech.
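The flow from recognized character string to response character string summarized above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the tuple encoding of K, B and N, the two regular-expression patterns, the agent names "user" and "system", the state update choices, and all function names are assumptions made for illustration. Speech recognition and speech synthesis are omitted, and the data output processing step is treated as a pass-through.

```python
import re
from dataclasses import dataclass, field


@dataclass
class InternalState:
    """Set of modal formulas encoded as (operator, agent, fact) tuples."""
    formulas: set = field(default_factory=set)

    def add(self, op: str, agent: str, fact: str) -> None:
        self.formulas.add((op, agent, fact))

    def facts(self, op: str, agent: str) -> list:
        return sorted(f for (o, a, f) in self.formulas if o == op and a == agent)


# Hypothetical conversion pattern model: regex -> dialogue act.
PATTERNS = [
    (re.compile(r"^what is the (?P<topic>\w+)\?$", re.I), "request"),
    (re.compile(r"^the (?P<topic>\w+) is (?P<value>\w+)\.$", re.I), "provide"),
]


def preprocess(text: str):
    """Character string -> intermediate language (act, topic, value)."""
    for pattern, act in PATTERNS:
        m = pattern.match(text.strip())
        if m:
            return (act, m.group("topic").lower(), m.groupdict().get("value"))
    return None


def state_transition(state: InternalState, speaker: str, il) -> InternalState:
    """Update the internal state from one intermediate-language input."""
    act, topic, value = il
    if act == "request":
        state.add("N", speaker, topic)                # speaker needs to know topic
    elif act == "provide":
        state.add("K", speaker, topic)                # assumption: speaker knows it
        state.add("B", "system", f"{topic}={value}")  # system believes the report
    return state


def infer(state: InternalState, system_knowledge: dict):
    """Pick a response: answer the first need the system can satisfy."""
    for topic in state.facts("N", "user"):
        if topic in system_knowledge:
            return ("provide", topic, system_knowledge[topic])
    return None


def postprocess(il) -> str:
    """Intermediate language -> character string (inverse pattern)."""
    act, topic, value = il
    return f"The {topic} is {value}."


def respond(text: str, state: InternalState, system_knowledge: dict) -> str:
    il = preprocess(text)
    if il is None:
        return "Sorry, I did not understand."
    state_transition(state, "user", il)
    out = infer(state, system_knowledge)  # data output step: pass-through here
    return postprocess(out) if out else "I see."
```

For example, calling `respond("What is the fee?", InternalState(), {"fee": "5000"})` records N(user, fee) in the internal state and returns the character string for an information-provision response.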

【００８２】従って、本発明によれば、以下の効果を有
する。（ａ）タスクに依存しない情報収集を目的とする対話に
対する情報の授受に基づく対話システムを提供すること
ができる。（ｂ）発話生成規則や知識を追加することで、応答機能
の拡張や学習機能の組み込みが容易であり、種々の規格
外の対応が可能である。従って、従来技術に比較して適
用する対話の範囲を広くすることができ、しかも柔軟性
がある協調型の対話システムを提供することができる。[0082] Accordingly, the present invention has the following effects. (a) It can provide a dialogue system, based on the exchange of information, for dialogues whose purpose is information collection, independent of the task. (b) By adding utterance generation rules and knowledge, the response function can easily be extended and a learning function can easily be incorporated, so that a variety of non-standard cases can be handled. Consequently, the range of dialogues to which the system applies is wider than in the prior art, and a flexible, cooperative dialogue system can be provided.
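Effect (b) can be illustrated with a small Python sketch of background knowledge of the form "if a conjunction of facts holds, then a certain fact holds". The rule encoding, the forward-chaining loop, and the fact names (has_name, has_date, can_reserve, and so on) are hypothetical and are not taken from the embodiment. The sketch only shows that appending one rule extends what can be derived without changing the inference procedure.

```python
def forward_chain(facts, rules):
    """Derive all facts reachable from `facts` using conjunctive rules.

    Each rule is (premises, conclusion): if every premise holds,
    the conclusion holds. `premises` is a frozenset of fact names.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


# Hypothetical task-dependent background knowledge.
rules = [
    (frozenset({"has_name", "has_date"}), "can_reserve"),
]
facts = {"has_name", "has_date", "has_payment"}

before = forward_chain(facts, rules)

# Extending the system: one more rule, same inference procedure.
rules.append((frozenset({"can_reserve", "has_payment"}), "can_confirm"))
after = forward_chain(facts, rules)
```

Here `before` contains can_reserve but not can_confirm, while `after` contains both, since the added rule chains on the conclusion of the first.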

[Brief description of the drawings]

【図１】本発明に係る一実施形態である対話システム
の構成を示すブロック図である。FIG. 1 is a block diagram showing a configuration of a dialogue system according to an embodiment of the present invention.

[Explanation of symbols]

１…マイクロホン、２…Ａ／Ｄ変換器、３…音声認識部、４…前処理部、５…発話生成処理部、６…後処理部、７…音声合成部、８…Ｄ／Ａ変換器、９…スピーカ、１１…パターンモデルメモリ、１２…パターンモデルメモリ、２１…状態遷移処理部、２２…推論処理部、２３…データ出力処理部、３１…内部状態記述メモリ、３２…発話生成規則メモリ、３３…知識記述メモリ、１００…対話参加者。 1: microphone, 2: A/D converter, 3: speech recognition unit, 4: pre-processing unit, 5: utterance generation processing unit, 6: post-processing unit, 7: speech synthesis unit, 8: D/A converter, 9: speaker, 11: pattern model memory, 12: pattern model memory, 21: state transition processing unit, 22: inference processing unit, 23: data output processing unit, 31: internal state description memory, 32: utterance generation rule memory, 33: knowledge description memory, 100: dialogue participant.

フロントページの続き (56)参考文献 特開平７−239694（ＪＰ，Ａ) 特開平７−210391（ＪＰ，Ａ) ”Ｂｅｌｉｅｆ，Ａｗａｒｅｎｅｓｓ，ａｎｄＬｉｍｉｔｅｄＲｅａｓｏｎｉｎｇ”ＡｒｔｉｆｉｃａｌＩｎｔｅｌｌｉｇｅｎｃｅ，Ｖｏｌ．34，Ｎｏ．１，Ｄｅｃｅｍｂｅｒ 1987，ｐ. 39 (58)調査した分野(Int.Cl.⁷，ＤＢ名) G06F 17/27 - 17/30 G10L 3/00 - 9/20 G10L 13/00 - 15/00 ＪＩＣＳＴファイル（ＪＯＩＳ)

Continuation of the front page

(56) References: JP-A-7-239694 (JP, A); JP-A-7-210391 (JP, A); "Belief, Awareness, and Limited Reasoning", Artificial Intelligence, Vol. 34, No. 1, December 1987, p. 39

(58) Fields investigated (Int. Cl.⁷, DB name): G06F 17/27 - 17/30; G10L 3/00 - 9/20; G10L 13/00 - 15/00; JICST file (JOIS)

Claims

(57) [Claims]

1. A dialogue system that recognizes the speech of an uttered sentence as a character string, generates, in response to the recognized character string, a character string of the utterance content of a response in a dialogue for information collection, and then speech-synthesizes and outputs the character string of the utterance content, the dialogue system comprising:

an internal state description storage device that stores an internal state, indicating the status of the dialogue as it changes with the progress of the dialogue, expressed using (a) a first modal operator K(α, φ) indicating that α knows φ, (b) a second modal operator B(α, φ) indicating that α believes φ, and (c) a third modal operator N(α, φ) indicating that α needs to know φ;

a knowledge description storage device that stores (A) general axioms expressing the properties of standard propositional modal logic using the modal operators, (B) an inference rule expressing the rule of necessity (necessitation), and (C) task-dependent knowledge comprising background knowledge, which states that a certain fact holds if a conjunction of a plurality of facts holds, and utterance generation rules indicating constraints on the order of utterances;

an utterance generation rule storage device that stores utterance generation rules indicating constraints on the order of utterances;

speech recognition means for recognizing the speech of an uttered sentence as a character string and outputting the character string;

pre-processing means for converting the character string recognized by the speech recognition means into an intermediate language representing the content of an information request or information provision, with reference to a conversion pattern model for converting a character string into such an intermediate language, and outputting the intermediate language;

state transition processing means for, based on the intermediate language output from the pre-processing means, referring to the internal state in the internal state description storage device, causing the internal state to transition and updating it, and outputting the internal state after the transition;

inference processing means for, based on the internal state output from the state transition processing means, referring to the knowledge in the knowledge description storage device and the utterance generation rules in the utterance generation rule storage device, inferring the content to be uttered in response in the dialogue in the current state, and generating and outputting an intermediate language of that content;

data output processing means for, referring to the internal state in the internal state description storage device, converting the intermediate language of the content to be uttered in response, output from the inference processing means, into output data of an intermediate language of an information request or information provision, and outputting the output data;

post-processing means for, referring to an inverse conversion pattern model for inversely converting an intermediate language representing the content of an information request or information provision into a character string, inversely converting the output data output from the data output processing means into the character string corresponding to the intermediate language, and outputting the character string; and

speech synthesis means for speech-synthesizing the character string output from the post-processing means and outputting the corresponding speech.