JPH11102196A

JPH11102196A - Speech interactive system, method of speech interaction, and storage medium

Info

Publication number: JPH11102196A
Application number: JP9263229A
Authority: JP
Inventors: Masako Hirose; 雅子広瀬
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1997-09-29
Filing date: 1997-09-29
Publication date: 1999-04-13

Abstract

PROBLEM TO BE SOLVED: To provide a speech interactive system, a speech interaction method, and a storage medium that guide the user's utterances. SOLUTION: When an interaction starts, a language processing part 3 generates output candidate sentences and recognition candidate sentences from preset expressions and sentence patterns by referring to a sentence template dictionary 4, task knowledge data 5, and topic data 6. A voice synthesis part 2 outputs one of the generated output candidate sentences. When speech is input in response, a speech recognition part 1 preferentially recognizes the recognition candidate sentences generated from the same expressions and sentence patterns, and sends the recognized words to the language processing part 3. The language processing part 3 generates the next output candidate sentences and recognition candidate sentences based on this recognition result. When no speech is input or no recognition result is obtained, the next output candidate sentences and recognition candidate sentences are generated accordingly. The interaction proceeds by repeating these steps.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a spoken dialogue system, a spoken dialogue method, and a recording medium, and more particularly to a spoken dialogue system, a spoken dialogue method, and a recording medium that guide the user's utterances.

[0002]

[Prior Art] In recent years, research on and prototyping of spoken dialogue systems, which exchange information stored in a machine by voice, have been active. Because speech is the most natural means of input and output for humans, it is also attracting attention as a means of information input and output. However, when a human exchanges information with a machine by voice, the speed and timing of the human's utterances and the words used affect how smoothly the dialogue proceeds. Techniques have therefore been developed that change the machine's response according to the human's speaking rate. For example, the technique disclosed in Japanese Patent Application Laid-Open No. Hei 5-173589 prepares a plurality of response sentences that have the same meaning but different numbers of characters, and changes the response sentence according to the speed of the input utterance.

[0003]

[Problems to Be Solved by the Invention] However, this technique is a strategy that merely adapts to the user by changing the response according to the user's speaking rate. When the user speaks very quickly or very slowly, the required adjustment may exceed the limit to which the system can adapt.

[0004] In dialogue between humans, it is known that a speaker tends to adapt to the tempo and wording of the other party. To carry on a dialogue smoothly, it is therefore necessary not only for the machine to adapt to the user but also to guide the user's utterances in a natural way. Doing so allows the dialogue to proceed smoothly even when the machine's capabilities are limited.

[0005] Considering speech recognition, which is a component of a dialogue system, recognizing not only the task-related keywords but also the surrounding words, such as particles and auxiliary verbs, leads to a higher recognition rate. For this reason as well, guiding and predicting the user's utterances is effective.

[0006] The present invention has been made in view of the circumstances described above. It provides a spoken dialogue system, a spoken dialogue method, and a recording medium that produce speech output taking into account ease of utterance and word order for the user, and that recognize the user's utterance on the basis of the expressions the system has output.

[0007]

[Means for Solving the Problems] The invention of claim 1 is a spoken dialogue scheme that outputs sentences or words and recognizes input speech, characterized by comprising: means for generating output candidate sentences and recognition candidate sentences having preset expressions and sentence patterns; means for outputting one of the output candidate sentences as speech; and means for setting, according to the expression and sentence pattern of the output sentence that was spoken, the recognition candidate words or recognition candidate expressions used when recognizing the user's input speech. The user's utterance is thereby guided, and the same expression and sentence pattern as the one output by speech is recognized preferentially in the user's utterance, so that the spoken dialogue proceeds smoothly.

[0008] The invention of claim 2 is characterized in that, in the invention of claim 1, the output candidate sentences and recognition candidate sentences having preset expressions and sentence patterns are output candidate sentences and recognition candidate sentences whose expressions and sentence patterns satisfy a preset specific condition. By using expressions and sentence patterns that are particularly easy to utter, the utterance is guided more strongly and the recognition rate is improved.

[0009] The invention of claim 3 is characterized in that, in the invention of claim 2, the specific condition is that the phrase representing the current topic is placed at the beginning of the sentence. By outputting a sentence in which the matter under discussion comes first in the word order, the user is made strongly aware of the topic and a more efficient dialogue becomes possible.

[0010] The invention of claim 4 is characterized in that, in the invention of claim 2, the specific condition is the use of an assertive auxiliary verb. This increases the likelihood that the user uses the same expression and sentence pattern, and enables a more efficient dialogue.

[0011] The invention of claim 5 is a spoken dialogue scheme that outputs sentences or words and recognizes input speech, characterized by comprising: means for generating output candidate sentences and recognition candidate sentences whose expressions and sentence patterns omit the latter half or the end of the sentence; means for outputting one of the output candidate sentences as speech; and means for setting, according to the expression and sentence pattern of the spoken output sentence whose latter half or end was omitted, phrases that can be connected as the omitted latter half or end of the sentence as the recognition candidate words or recognition candidate expressions used when recognizing the user's input speech. The user's utterance is thereby strongly guided, enabling a more efficient dialogue.

[0012] The invention of claim 6 provides a spoken dialogue method that outputs sentences or words and recognizes input speech to conduct a spoken dialogue. The method generates output candidate sentences and recognition candidate sentences having preset expressions and sentence patterns, outputs one of the generated output candidate sentences as speech, preferentially recognizes the recognition candidate words or recognition candidate expressions determined by the expression and sentence pattern of the spoken output sentence when recognizing the user's input speech, and generates the next output candidate sentences and recognition candidate sentences based on the recognition result. The spoken dialogue is conducted by repeating these steps.

[0013] The invention of claim 7 provides a computer-readable recording medium on which are recorded the data and programs for implementing the spoken dialogue system or dialogue method according to any one of claims 1 to 6.

[0014]

[Embodiments of the Invention] FIG. 1(A) is a block diagram showing the overall configuration of the spoken dialogue system of the present invention. In the figure, 1 is a speech recognition unit, 2 is a speech synthesis unit, 3 is a language processing unit, 4 is a sentence template dictionary, 5 is task knowledge data, and 6 is topic data. The speech recognition unit 1 performs feature analysis on the input speech, matches the recognition candidates created by the language processing unit 3 against the speech input in order, and passes the recognition result to the language processing unit 3. The speech synthesis unit 2 outputs as speech the utterance sentence created by the language processing unit 3.

[0015] Based on the utterance type, topic, and next utterance type written in the topic data 6, the language processing unit 3 takes from the sentence template dictionary 4 the template (sentence pattern) that the utterance type selects for the current dialogue scene, and embeds words from the task knowledge data 5 in it. In this way it creates recognition candidate sentences (recognition candidate word strings) and output candidate sentences (output candidate words), and passes them to the speech recognition unit 1 and the speech synthesis unit 2.

[0016] FIG. 1(B) is a processing flow that outlines the processing of the spoken dialogue system of the present invention.

[0017] When the dialogue starts, the language processing unit 3 generates output candidate sentences and recognition candidate sentences (S1). The speech synthesis unit 2 then outputs one of the output candidate sentences to the user (S2). The system checks whether the user gives speech input in response to this output sentence (S3). If there is no speech input (NO in S3), processing returns to the generation of output candidate sentences and the next output candidate sentence is generated. If there is speech input (YES in S3), the recognition candidate sentences with the same template number as the output candidate sentence are recognized preferentially in the speech input (S4). If recognition succeeds (YES in S5), the speech recognition unit 1, which matched the recognition candidate sentences against the speech input, passes the recognized words, which are the recognition result, to the language processing unit 3. Based on this recognition result, the language processing unit 3 determines whether the task has finished (S7); if it has not (NO in S7), it generates the next output candidate sentences and recognition candidate sentences. If no recognition result is obtained (NO in S5), the other recognition candidate sentences are recognized (S6), and, depending on whether the task has finished (S7), either the next output candidate sentences and recognition candidate sentences are generated or the task ends. In general, these steps are repeated, and the dialogue ends when the task ends. The spoken dialogue system of the present invention is characterized in that the recognition candidate words for speech recognition are set using a template with the same template number as the template the system used for its speech output. The unit of recognition and synthesis here may be either a word string or a sentence.
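
The flow of steps S1 to S7 can be sketched in code. The following is a minimal illustration and not the patent's implementation: the function name `run_dialogue`, its callback parameters, and the `max_turns` guard are all assumptions introduced here, and the recognizers are passed in as plain functions so that only the control flow is shown.

```python
from typing import Callable, Optional, Tuple


def run_dialogue(
    generate: Callable[[Optional[str]], Tuple[str, int]],
    speak: Callable[[str], None],
    listen: Callable[[], Optional[str]],
    recognize_preferred: Callable[[str, int], Optional[str]],
    recognize_others: Callable[[str, int], Optional[str]],
    task_done: Callable[[Optional[str]], bool],
    max_turns: int = 10,  # hypothetical guard so the sketch always terminates
) -> Optional[str]:
    """Repeat S1-S7 until the task ends; all callbacks are illustrative stubs."""
    result: Optional[str] = None
    for _ in range(max_turns):
        output, template_no = generate(result)             # S1
        speak(output)                                      # S2
        heard = listen()                                   # S3
        if heard is None:                                  # NO at S3
            result = None
            continue
        result = recognize_preferred(heard, template_no)   # S4
        if result is None:                                 # NO at S5
            result = recognize_others(heard, template_no)  # S6
        if task_done(result):                              # S7
            return result
    return None


# Demo with stub callbacks: the user stays silent once, then answers.
spoken = []
answers = iter([None, "来週の月曜日です"])
result = run_dialogue(
    generate=lambda previous: ("会議はいつですか", 1),
    speak=spoken.append,
    listen=lambda: next(answers),
    recognize_preferred=lambda heard, no: heard if heard.endswith("です") else None,
    recognize_others=lambda heard, no: None,
    task_done=lambda r: r is not None,
)
```

Keeping S4 and S6 as separate callbacks mirrors the two recognition passes in the flow: the candidates that share the output's template number are tried first, and the remaining candidates are tried only when that pass fails.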

[0018] (Invention of claim 1) FIG. 2(A) is an example of a sentence template dictionary for explaining the invention of claim 1. Each entry consists of an utterance type representing the scene of the dialogue, a template describing a sentence pattern, and a template number. The utterance type corresponds to a scene in the task in which the dialogue takes place, or to a scene in the dialogue itself. A template describes the meanings of the phrases uttered in a given scene together with their word order. Content that changes from scene to scene is expressed as a variable in parentheses (), in which a semantic category is written.

[0019] FIG. 2(B) is an example of task knowledge data for explaining the invention of claim 1. It consists of semantic categories related to the task and the words that represent them. In the sentence template dictionary example above, in the question scene a word belonging to the semantic category "event" (イベント) in the task knowledge data occupies the "(event)" slot, followed by "wa" (は), "itsu" (いつ, "when"), and "arimasu ka" (ありますか, "is there"). A template number is a number assigned to the templates of each utterance type after they have been grouped by expression and word order. Templates with the same number have the same word order and expression. The template at the top of the sentence template dictionary has template number "1", and, for example, a template of another utterance type with the same template number is used for creating recognition candidates. When several templates correspond to a given template, several template numbers can be written.

[0020] FIG. 2(C) is an example of topic data for explaining the invention of claim 1. It consists of the current utterance type, the topic, and the next utterance type. A template is selected from the sentence template dictionary according to the utterance type, and the topic is embedded in the template. The next utterance type represents the scene that follows the current utterance type. In the present invention it is used to select the templates from which recognition candidate words are set when recognizing the user's utterance after the system output. The topic describes the matters under discussion in the current utterance scene as zero or more pairs of an item and a word. The topic is embedded in a template to generate a sentence.

[0021] The processing flow is described using as an example a spoken dialogue system that manages and registers schedules. First, the language processing unit 3 generates output candidate sentences and recognition candidate sentences. Because the utterance type in the topic data is "question", output candidates are generated from the templates whose utterance type is "question" in the sentence template dictionary. For templates that contain the item "(event)" listed in the topic column of the topic data, the topic-data word "meeting" (会議) is embedded in the "(event)" slot, generating a sentence such as "When is there a meeting?" (会議はいつありますか).

[0022] At the same time, because the next utterance type in the topic data is "response", the (date) values of the task knowledge data are embedded in the "response" templates of the sentence template dictionary. Word strings or sentences such as "The meeting is on July 24" (会議は７月２４日にあります), "It is on July 24" (７月２４日にあります), "The meeting is on Monday of next week" (会議は来週の月曜日にあります), and "It is on Monday of next week" (来週の月曜日にあります) are generated as recognition candidates.
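
The generation step described above, in which slots in a template are filled from the topic data and the task knowledge data, can be illustrated with a short sketch. The figures are not reproduced in this text, so the data layout, the English slot names `event` and `date`, and the function names `fill` and `generate` are assumptions made for illustration; only the example sentences are taken from the description.

```python
import re
from itertools import product

# Illustrative data only: the field layout and slot names are assumptions.
TEMPLATES = [
    # (utterance type, template, template number)
    ("question", "(event)はいつありますか", 1),
    ("response", "(event)は(date)にあります", 1),
    ("response", "(date)にあります", 1),
]
TASK_KNOWLEDGE = {"event": ["会議"], "date": ["７月２４日", "来週の月曜日"]}
TOPIC = {
    "utterance_type": "question",
    "topic": {"event": "会議"},
    "next_utterance_type": "response",
}

SLOT = re.compile(r"\((\w+)\)")


def fill(template, choices):
    """Return every sentence obtained by filling each (category) slot."""
    slots = SLOT.findall(template)
    sentences = []
    for words in product(*(choices[s] for s in slots)):
        remaining = iter(words)
        sentences.append(SLOT.sub(lambda m: next(remaining), template))
    return sentences


def generate(topic, templates, knowledge):
    """Build (template number, sentence) lists for output and recognition."""
    # Words fixed by the current topic take precedence over the category list.
    choices = {**knowledge, **{k: [v] for k, v in topic["topic"].items()}}

    def expand(utterance_type):
        return [(no, s) for typ, tpl, no in templates
                if typ == utterance_type for s in fill(tpl, choices)]

    return expand(topic["utterance_type"]), expand(topic["next_utterance_type"])


outputs, candidates = generate(TOPIC, TEMPLATES, TASK_KNOWLEDGE)
```

With this data, `outputs` holds the single question sentence and `candidates` holds the four response sentences listed in the description, each tagged with its template number so that a later matching step can prioritize by number.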

[0023] Next, one of the output candidate sentences, for example "When is there a meeting?" (会議はいつありますか) from template number 1, is output as speech. When the user then gives speech input, the speech recognition unit 1 matches the recognition candidate strings created by the language processing unit 3 against the input speech. Among the recognition candidates, the recognition candidate sentences or recognition candidate strings created from templates with template number 1, the same template number as the spoken sentence, are matched preferentially.

[0024] For example, the system can output a sentence such as "When is there a meeting?" (いつ会議がありますか) and handle a user utterance such as "It is on Monday of next week" (来週の月曜日にあります). If the user says something different and no word is recognized, the other recognition candidate sentences are matched. These steps are repeated until the task ends.
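
The preferential matching by template number can be sketched as a stable sort followed by a scan. This is only an illustration under stated assumptions: substring comparison stands in for acoustic matching, the names `prioritize` and `match` are hypothetical, and the template-2 candidate is invented here to show the ordering.

```python
def prioritize(candidates, spoken_template_no):
    """Put candidates that share the spoken sentence's template number first.

    sorted() is stable, so the original order is kept within each group.
    """
    return sorted(candidates, key=lambda c: c[0] != spoken_template_no)


def match(heard, candidates, spoken_template_no):
    """Return the first (template number, sentence) found in the input, or None."""
    for template_no, sentence in prioritize(candidates, spoken_template_no):
        if sentence in heard:  # stand-in for matching against input speech
            return template_no, sentence
    return None


CANDIDATES = [
    (2, "来週の月曜日に会議があります"),  # hypothetical template-2 candidate
    (1, "会議は来週の月曜日にあります"),
    (1, "来週の月曜日にあります"),
]

ordered = prioritize(CANDIDATES, 1)
hit = match("来週の月曜日にあります", CANDIDATES, 1)
miss = match("わかりません", CANDIDATES, 1)
```

A `None` result corresponds to the case in which the user says something different, after which the system would go on to its next turn.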

[0025] (Invention of claim 2) FIG. 3 is an example of a sentence template dictionary for explaining the invention of claim 2. Its format is the same as that of the sentence template dictionary of the invention of claim 1.

[0026] The invention of claim 2 guides the user's utterance by outputting as speech an expression that is easy to utter or that readily guides the utterance, and also sets that expression as a recognition candidate. For example, lower template numbers are assigned to expressions that are easier to utter or that guide the utterance more readily. In the example of this invention, the entry with template number "1" and utterance type "question" indicates that a word belonging to the semantic category "event" in the task knowledge data occupies the "(event)" slot, followed by "wa" (は), then "itsu" (いつ, "when"), and finally "desu ka" (ですか). Here the sentence pattern "~ wa ~ desu" (〜は〜です, "~ is ~") is treated as an expression that is easy to utter and is given template number "1". The task knowledge data and topic data used in the invention of claim 2 are the same as those of claim 1.
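
Under the numbering convention described here, where a lower template number marks an easier expression, selecting the output template reduces to taking the minimum number within an utterance type. The sketch below assumes a tuple layout and a function name (`easiest_template`) that do not appear in the source, and the template numbered 2 is a made-up contrast case.

```python
TEMPLATES = [
    # (utterance type, template, template number)
    ("question", "(event)がいつあるか教えてください", 2),  # hypothetical entry
    ("question", "(event)はいつですか", 1),
    ("response", "(event)は(date)です", 1),
]


def easiest_template(templates, utterance_type):
    """Pick the template with the lowest number for the given utterance type."""
    matching = [t for t in templates if t[0] == utterance_type]
    return min(matching, key=lambda t: t[2]) if matching else None


chosen = easiest_template(TEMPLATES, "question")
```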

[0027] The processing flow is described using as an example a spoken dialogue system that manages and registers schedules. The language processing unit 3 generates output candidate sentences and recognition candidate sentences. Because the utterance type in the topic data is "question", the template with template number 1 is used from among the templates whose utterance type is "question" in the sentence template dictionary. The information written in the task knowledge data is embedded in it, generating a sentence such as "When is the meeting?" (会議はいつですか). At the same time, because the next utterance type in the topic data is "response", the task knowledge data is embedded in the templates whose utterance type is "response" in the sentence template dictionary. Word strings or sentences such as "The meeting is July 24" (会議は７月２４日です), "It is July 24" (７月２４日です), "The meeting is Monday of next week" (会議は来週の月曜日です), and "It is Monday of next week" (来週の月曜日です) are generated as recognition candidates. Next, the generated sentence "When is the meeting?" is output as speech. When the user then gives speech input, the speech recognition unit 1 matches the recognition candidate sentences created by the language processing unit 3 against the input speech. The recognition candidate sentences or recognition candidate strings created from the template with template number 1, the same template number as the spoken sentence, are matched preferentially.

[0028] For example, for the system output "When is the meeting?" (会議はいつですか), user utterances such as "It is Monday of next week" (来週の月曜日です) and "The meeting is Monday of next week" (会議は来週の月曜日です) can be handled. If the user says something different and no word is recognized, the other recognition candidate sentences are matched. These steps are repeated until the task ends.

[0029] (Invention of claim 3) FIGS. 4(A), 4(B), and 4(C) show examples of the sentence template dictionary, the task knowledge data, and the topic data, respectively, for explaining the invention of claim 3. The data formats are the same as those of claim 1. In the topic data of FIG. 4(C), when the topic column contains several entries, entries further to the left are regarded as more important as topics.

[0030] The invention of claim 3 is characterized in that the system's output sentence places the current topic first in the word order, thereby arranging for the user to speak about a specific topic. In setting the recognition candidate words as well, sentence templates that place the current topic first in the word order are matched preferentially.

[0031] The processing flow is described using as an example a spoken dialogue system that manages and registers schedules. The language processing unit 3 generates output candidate sentences and recognition candidate sentences. Because the utterance type in the topic data is "question", the system selects, from the templates whose utterance type is "question" in the sentence template dictionary, those that begin with "(event)", which is the leftmost entry in the topic column of the topic data. The contents of the topic data are embedded in the template with template number 1, in which "(event)" comes first in the word order, generating a sentence such as "Is the meeting tomorrow?" (会議は明日ですか). At the same time, because the next utterance type in the topic data is "response", the templates whose utterance type is "response" are selected from the sentence template dictionary. The values of the task knowledge data are embedded in these templates to create sentences. Word strings or sentences such as "The meeting is July 24" (会議は７月２４日です), "The meeting is Monday of next week" (会議は来週の月曜日です), "July 24 is the meeting" (７月２４日は会議です), and "Monday of next week is the meeting" (来週の月曜日は会議です) are generated as recognition candidates.
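
Selecting the templates that put the leftmost topic item first can be sketched as a prefix test on the template string. The data below is illustrative: FIG. 4 is not reproduced in this text, so the template entries, the ordered list `TOPIC_ITEMS`, and the function name `topic_first` are assumptions.

```python
TEMPLATES = [
    # (utterance type, template, template number)
    ("question", "(date)は(event)ですか", 2),
    ("question", "(event)は(date)ですか", 1),
    ("response", "(event)は(date)です", 1),
    ("response", "(date)は(event)です", 2),
]

# Topic column as ordered (item, word) pairs; the leftmost pair matters most.
TOPIC_ITEMS = [("event", "会議"), ("date", "明日")]


def topic_first(templates, utterance_type, topic_items):
    """Keep templates of the given type that start with the leading topic slot."""
    if not topic_items:
        return []
    lead_slot = "(" + topic_items[0][0] + ")"
    return [t for t in templates
            if t[0] == utterance_type and t[1].startswith(lead_slot)]


selected_questions = topic_first(TEMPLATES, "question", TOPIC_ITEMS)
preferred_responses = topic_first(TEMPLATES, "response", TOPIC_ITEMS)
```

Applying the same filter to the "response" templates gives the candidates to match first, which corresponds to preferring sentences of the form "The meeting is ..." over "... is the meeting".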

[0032] Next, the output sentence "Is the meeting tomorrow?" (会議は明日ですか) is output as speech. When the user then gives speech input, the speech recognition unit 1 matches the recognition candidate strings created by the language processing unit 3 against the input speech. The recognition candidate sentences "The meeting is July 24" (会議は７月２４日です) and "The meeting is Monday of next week" (会議は来週の月曜日です), which were created from the template with template number 1, the same template number as the spoken sentence, are matched preferentially.

[0033] For example, when the system outputs "Is the meeting tomorrow?" (会議は明日ですか), a user utterance such as "The meeting is Monday of next week" (会議は来週の月曜日です) can be handled. If the user says something different and no word is recognized, the other recognition candidate sentences are matched. These steps are repeated until the task ends.

[0034] "The meeting is tomorrow" (会議は明日です) and "Tomorrow is the meeting" (明日は会議です), which differs only in word order, are very similar expressions. However, the word placed at the beginning of the sentence tells the user that it is the topic, which reduces the likelihood that the dialogue strays to other topics.

[0035] (Invention of claim 4) FIGS. 5(A), 5(B), and 5(C) show examples of the sentence template dictionary, the task knowledge data, and the topic data, respectively, for explaining the invention of claim 4. The data format of the sentence template dictionary is the same as that of claim 1, and the data formats of the task knowledge data and the topic data are the same as those of claim 3.

[0036] The invention of claim 4 is characterized in that sentence patterns using the auxiliary verb "desu" (です) are used for the system's output sentences, so that the user is led to speak in the same sentence pattern. Templates that use the auxiliary verb "desu" are given template number "1".

[0037] The processing flow is described using as an example a spoken dialogue system that manages and registers schedules. The language processing unit 3 generates output candidate sentences and recognition candidate sentences. Because the utterance type in the topic data is "question", the templates whose utterance type is "question" are selected from the sentence template dictionary. Among them, the template with template number 1, whose sentence pattern uses the auxiliary verb "desu" (です), is used. The matters listed as items in the topic data are embedded in this template to create the output sentence, generating a sentence such as "When is the meeting?" (会議はいつですか). At the same time, because the next utterance type in the topic data is "response", the task knowledge data is embedded in the templates whose utterance type is "response" to create recognition candidate sentences. Word strings or sentences such as "Is the meeting tomorrow?" (会議は明日ですか) and "Is there a meeting tomorrow?" (明日会議がありますか) are generated as recognition candidates.
Create an output statement. Generate a sentence such as "When is the meeting?" At the same time, in the topic data, since the next utterance type is “response”, the task knowledge data is embedded in a template whose utterance type is “response” to create a recognition candidate sentence. A word string or a sentence such as “Is the meeting tomorrow?” Or “Is there a meeting tomorrow?” Is generated as a recognition candidate.

【００３８】次に、出力文「会議は明日ですか」を音声
出力する。このあと、使用者の音声入力があると、音声
認識部１は、言語処理部３が作成した認識候補列を入力
音声中に照合する。認識候補列は、音声出力した文のテ
ンプレート番号と同じ、１のテンプレートで作成した認
識候補文、もしくは認識候補列を優先的に照合する。例
えば、使用者が異なる語句を話して音声出力中に単語が
認識されなかった場合は、他の認識候補文を照合する。
以上をタスクの場面に応じた発話タイプ，話題データ，
タスク知識データをもとにタスクが終了するまで繰り返
す。Next, the output sentence "Is the meeting tomorrow?" Thereafter, when there is a user's voice input, the voice recognition unit 1 checks the recognition candidate sequence created by the language processing unit 3 in the input voice. The recognition candidate sequence preferentially matches a recognition candidate sentence or a recognition candidate sequence created with one template, which is the same as the template number of the sentence output as speech. For example, when the user speaks a different phrase and the word is not recognized during the voice output, another recognition candidate sentence is collated.
The above describes the utterance type, topic data,
Repeat until the task is completed based on the task knowledge data.

[0039] Sentence patterns such as "A wa B desu" (ＡはＢです, "A is B") and "B desu" (Ｂです, "It is B") can be used regardless of the semantic compatibility of A and B and place little burden on the speaker, so they are readily uttered. When the system uses this sentence pattern in its output, the user is likely to respond with the same sentence pattern. Using this sentence pattern therefore makes it possible to guide and predict the user's utterances and wording.

[0040] (Invention of claim 5) FIGS. 6(A), 6(B), and 6(C) show examples of a sentence template dictionary, task knowledge data, and topic data, respectively, for explaining the invention of claim 5. The data format of the sentence template dictionary is the same as that of claim 1, and the data formats of the task knowledge data and the topic data are the same as those of claim 3.

[0041] In the invention of claim 5, templates are set for each pair of utterance type and next utterance type in the same row of the topic data, for example "question" and "response", under the same template number. The template for the utterance type "question" consists of the item being asked about followed by a particle, and the template for the utterance type "response" places the item that is the answer at the beginning of the sentence. In other words, the templates are set so that joining the templates of the paired utterance types that share a template number yields a single sentence.
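
The pairing rule can be illustrated with a short sketch. The dictionary layout, the slot names `item` and `answer`, and the helper `fill` are assumptions made for illustration. Only the example strings "外出は" and "明日です" come from the description.

```python
# Hypothetical template dictionary keyed by (template number, utterance type).
TEMPLATES = {
    (1, "question"): "{item}は",      # item being asked about, followed by a particle
    (1, "response"): "{answer}です",  # answer item placed at the start of the sentence
}


def fill(template_no: int, utterance_type: str, **slots: str) -> str:
    """Embed slot values into the template selected by number and utterance type."""
    return TEMPLATES[(template_no, utterance_type)].format(**slots)


question = fill(1, "question", item="外出")   # system output, sentence left unfinished
response = fill(1, "response", answer="明日")  # expected user continuation

# Joining the paired templates with the same template number gives one sentence.
joined = question + response
```

Here `joined` is "外出は明日です", a single complete sentence formed from the system's open-ended prompt and the user's expected continuation.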

[0042] The processing flow is described using a spoken dialogue system for schedule management and registration as an example. The language processing unit 3 generates the output candidate sentences and the recognition candidate sentences. Because the utterance type in the topic data is "question", a template whose utterance type is "question" is selected from the template dictionary, and the word "外出" ("going out") is embedded in it to generate a sentence. Recognition candidate sentences are also generated. Because the next utterance type in the topic data is "response", they are generated by embedding the values of the task knowledge data in the template whose utterance type is "response" and whose template number is the same, "1". Sentences such as "明日です" ("It is tomorrow"), "明後日です" ("It is the day after tomorrow"), and "１０時からです" ("It is from 10 o'clock") are generated as recognition candidate sentences.
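
A minimal sketch of this generation step follows. The data layouts for the template dictionary, task knowledge data, and topic data are assumptions (the actual formats are those of FIG. 6, which is not reproduced here), and `generate_candidates` is a hypothetical name.

```python
from typing import Dict, List, Tuple

# Assumed layouts; field names are illustrative only.
TEMPLATE_DICT: List[Dict[str, object]] = [
    {"no": 1, "type": "question", "pattern": "{topic}は"},
    {"no": 1, "type": "response", "pattern": "{value}です"},
]
TASK_KNOWLEDGE: Dict[str, List[str]] = {"外出": ["明日", "明後日", "10時から"]}
TOPIC_ROW = {"topic": "外出", "type": "question", "next_type": "response"}


def generate_candidates(topic_row: Dict[str, str],
                        template_no: int = 1) -> Tuple[List[str], List[str]]:
    """Return (output candidate sentences, recognition candidate sentences)."""
    topic = topic_row["topic"]
    # Output candidates: templates of the current utterance type, filled with the topic word.
    outputs = [
        str(t["pattern"]).format(topic=topic)
        for t in TEMPLATE_DICT
        if t["no"] == template_no and t["type"] == topic_row["type"]
    ]
    # Recognition candidates: templates of the next utterance type with the same
    # template number, filled with each task knowledge value for the topic.
    recognition = [
        str(t["pattern"]).format(value=value)
        for t in TEMPLATE_DICT
        if t["no"] == template_no and t["type"] == topic_row["next_type"]
        for value in TASK_KNOWLEDGE[topic]
    ]
    return outputs, recognition


outputs, recognition = generate_candidates(TOPIC_ROW)
```

Keeping the template number fixed for both lists is what ties the recognition candidates to the sentence pattern of the prompt.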

[0043] Next, the output sentence "外出は" ("Going out is...") is output as speech. When the user then provides input, the speech recognition unit 1 matches the recognition candidate sentences created by the language processing unit 3 against the input speech. For example, in response to the system's sentence "外出は", a user utterance such as "明日です" ("It is tomorrow") can be recognized. This is repeated until the task is complete.

[0044] This embodiment exploits the tendency, in everyday human conversation, for a listener to complete what the other party has started to say. It is designed so that the user utters the item the system wants answered at the beginning of the sentence, which makes it possible to guide and accurately predict the user's utterances and wording. In particular, having the user utter the word that is key to the task at the beginning of the sentence makes dialogue management and task completion more reliable.

[0045] (Invention of claim 6) In a spoken dialogue method that outputs a sentence or word and recognizes input speech to carry out a spoken dialogue, the system first generates output candidate sentences and recognition candidate sentences according to preset expressions and sentence patterns, and outputs one of the generated output candidate sentences as speech. The user provides speech input in response to this output. When recognizing the input speech, the system preferentially recognizes the recognition candidate words or recognition candidate expressions determined by the expression and sentence pattern of the output sentence just spoken, generates the next output candidate sentences and recognition candidate sentences based on the recognized words or expressions, and again outputs one of the generated output candidate sentences as speech. The method repeats these steps until the task is complete.
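
The loop of this method can be sketched in a few lines. Everything below is an illustrative assumption: text strings stand in for speech, `listen` and `speak` are caller-supplied callbacks, the `STEPS` table (including the "時間は" prompt and its candidates) is invented for the example, and repeating the same prompt after a failed recognition is only one possible policy.

```python
from typing import Callable, Dict, List

# Hypothetical task script: each step has a prompt and the recognition
# candidates (sentence -> recognized value) generated for that prompt.
STEPS: List[Dict[str, object]] = [
    {"slot": "day", "output": "外出は",
     "candidates": {"明日です": "明日", "明後日です": "明後日"}},
    {"slot": "time", "output": "時間は",
     "candidates": {"10時からです": "10時", "13時からです": "13時"}},
]


def run_dialogue(listen: Callable[[], str],
                 speak: Callable[[str], None],
                 max_turns: int = 10) -> Dict[str, str]:
    """Alternate output and recognition until every slot is filled."""
    filled: Dict[str, str] = {}
    step = 0
    for _ in range(max_turns):
        if step >= len(STEPS):
            break  # task complete
        current = STEPS[step]
        speak(str(current["output"]))        # output one candidate sentence
        heard = listen()                     # user's reply ("" means no input)
        candidates: Dict[str, str] = current["candidates"]  # type: ignore[assignment]
        value = next((v for s, v in candidates.items() if s in heard), None)
        if value is None:
            continue                         # assumed policy: repeat the same prompt
        filled[str(current["slot"])] = value
        step += 1                            # next output and candidates follow the result
    return filled
```

A caller can drive the loop with scripted replies, for example `run_dialogue(lambda: next(replies), print)` where `replies` is an iterator of strings.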

[0046] (Invention of claim 7) In the embodiments of the spoken dialogue system and dialogue method described above, the control programs and related data that make each part of the system function and execute its processing, including the expression and sentence pattern dictionaries that must be stored in order to carry out a spoken dialogue, are recorded (held) as data on a recording medium. The recorded data is read out and used to operate the system and to execute the method. By holding the data that implements the spoken dialogue system and dialogue method of the present invention on a recording medium in this way, the functions of the system and method can also be realized by a general-purpose computer on which the recording medium is installed.

【００４７】[0047]

[Effects of the Invention] According to the invention of claim 1, a spoken dialogue system that outputs a sentence or word and recognizes input speech has means for generating output candidate sentences and recognition candidate sentences of preset expressions and sentence patterns, means for outputting one of the output candidate sentences as speech, and means for setting, based on the expression and sentence pattern of the spoken output sentence, the recognition candidate words or recognition candidate expressions used when recognizing the user's input speech. The system output therefore guides the user's utterance, and by preferentially recognizing in the user's utterance the same expression and sentence pattern as the one that was output, speech can be recognized efficiently and accurately, so the dialogue proceeds smoothly even when the system's capabilities are limited.

[0048] According to the invention of claim 2, in addition to the effects of the invention of claim 1, the output candidate sentences and recognition candidate sentences of preset expressions and sentence patterns are those of expressions and sentence patterns that satisfy a preset specific condition. By choosing expressions and sentence patterns that are particularly easy to utter, the user's phrasing can be guided more strongly and the recognition rate can be improved.

[0049] According to the invention of claim 3, in addition to the effects of the invention of claim 2, the specific condition is that the phrase representing the current topic is placed at the beginning of the sentence. Outputting a sentence whose word order puts the topic first makes the user strongly aware of the topic and prompts an utterance about that topic. Because the recognition candidate words are set accordingly, the dialogue can be carried out more efficiently.

[0050] According to the invention of claim 4, in addition to the effects of the invention of claim 2, the specific condition is the use of an assertive auxiliary verb. Presenting a sentence pattern built on the auxiliary verb "です" or "だ" makes it more likely that the user will use the same expression and sentence pattern, so the recognition candidate words can be set accurately and the dialogue can be carried out more efficiently.

[0051] According to the invention of claim 5, a spoken dialogue system that outputs a sentence or word and recognizes input speech has means for generating output candidate sentences and recognition candidate sentences of expressions and sentence patterns in which the latter half or the end of the sentence is omitted, means for outputting one of the output candidate sentences as speech, and means for setting, based on the expression and sentence pattern of the output sentence whose latter half or end is omitted, phrases that can be connected as the omitted latter half or end of the sentence as the recognition candidate words or recognition candidate expressions used when recognizing the user's input speech. The user's utterance can therefore be guided, the recognition candidate words can be set accurately, and the dialogue can be carried out more efficiently.

[0052] According to the invention of claim 6, the system predicts and guides the user's utterances, so the dialogue can proceed smoothly even when the system's capabilities are limited.

[0053] According to the invention of claim 7, a computer-readable recording medium is obtained on which are recorded the related data and programs for implementing the spoken dialogue system or dialogue method according to any one of claims 1 to 6.

[Brief description of the drawings]

FIG. 1 is a diagram showing the overall configuration and processing flow of the spoken dialogue system of the present invention.

FIG. 2 is a diagram showing an example of a sentence template dictionary and related data for explaining one embodiment of the present invention.

FIG. 3 is a diagram showing an example of a sentence template dictionary for explaining another embodiment of the present invention.

FIG. 4 is a diagram showing an example of a sentence template dictionary and related data for explaining still another embodiment of the present invention.

FIG. 5 is a diagram showing an example of a sentence template dictionary and related data for explaining still another embodiment of the present invention.

FIG. 6 is a diagram showing an example of a sentence template dictionary and related data for explaining still another embodiment of the present invention.

[Description of Reference Numerals] 1: speech recognition unit, 2: speech synthesis unit, 3: language processing unit, 4: sentence template dictionary, 5: task knowledge data, 6: topic data.

Claims

1. A spoken dialogue system that outputs a sentence or word and recognizes input speech, comprising: means for generating output candidate sentences and recognition candidate sentences of preset expressions and sentence patterns; means for outputting one of the output candidate sentences as speech; and means for setting, based on the expression and sentence pattern of the spoken output sentence, recognition candidate words or recognition candidate expressions for use in recognizing a user's input speech.

2. The spoken dialogue system according to claim 1, wherein the output candidate sentences and recognition candidate sentences of preset expressions and sentence patterns are output candidate sentences and recognition candidate sentences of expressions and sentence patterns that satisfy a preset specific condition.

3. The spoken dialogue system according to claim 2, wherein the specific condition is that a phrase representing the current topic is placed at the beginning of the sentence.

4. The spoken dialogue system according to claim 2, wherein the specific condition is that an assertive auxiliary verb is used.

5. A spoken dialogue system that outputs a sentence or word and recognizes input speech, comprising: means for generating output candidate sentences and recognition candidate sentences of expressions and sentence patterns in which the latter half or the end of the sentence is omitted; means for outputting one of the output candidate sentences as speech; and means for setting, based on the expression and sentence pattern of the output sentence whose latter half or end is omitted, phrases connectable to the omitted latter half or end of the sentence as recognition candidate words or recognition candidate expressions for use in recognizing a user's input speech.

6. A spoken dialogue method that outputs a sentence or word and recognizes input speech to carry out a spoken dialogue, comprising: generating output candidate sentences and recognition candidate sentences of preset expressions and sentence patterns; outputting one of the generated output candidate sentences as speech; preferentially recognizing, based on the expression and sentence pattern of the spoken output sentence, recognition candidate words or recognition candidate expressions when recognizing a user's input speech; and generating the next output candidate sentences and recognition candidate sentences based on the recognition result, the spoken dialogue being carried out by repeating these steps.

7. A computer-readable recording medium on which are recorded the related data and programs for implementing the spoken dialogue system or dialogue method according to any one of claims 1 to 6.