JP2002073080A

JP2002073080A - Voice interactive system

Info

Publication number: JP2002073080A
Application number: JP2000266027A
Authority: JP
Inventors: Shinichi Iwamoto; 真一岩本; Toshitaka Yamato; 俊孝大和
Original assignee: Denso Ten Ltd
Current assignee: Denso Ten Ltd
Priority date: 2000-09-01
Filing date: 2000-09-01
Publication date: 2002-03-12

Abstract

PROBLEM TO BE SOLVED: To provide an audio response system that reduces the data size of dialog script data and improves comfort for the user. SOLUTION: An interactive processing part is provided with an assistant mechanism 7, and the assistant mechanism 7 is provided with a global transition condition table 8 and a means 9 for generating an assistant interactive mode. A plurality of conditions that are used in common under any interactive state are stored in the table 8. The means 9 provides an assistant interactive mode that helps the user when the usual basic interactive mode does not run smoothly.

Description

DETAILED DESCRIPTION OF THE INVENTION

TECHNICAL FIELD OF THE INVENTION

[0001] The present invention relates to a voice interactive system. In recent years, voice interactive interfaces have begun to spread rapidly as one form of man-machine interface for exchanging information between a human and a machine. A typical example is the voice interactive interface applied to in-vehicle information equipment such as audio devices and navigation devices, which dramatically improves the ease of operating such equipment for a user who is driving.

[0002] The present invention describes a voice interactive system for realizing the voice interactive interface described above.

DESCRIPTION OF THE RELATED ART

[0003] FIG. 18 shows the archetype of a general voice interactive system. In the figure, the voice interactive system 1 is broadly divided into a speech recognition unit 2, a dialog processing unit 3, and a speech synthesis unit 4. The dialog processing unit 3 can be further divided by function into a "semantic understanding system" and a "dialog management system". The main component of the former is an utterance database 5 containing utterance model data, and the main component of the latter is a dialog database 6 containing dialog script data. In some configurations, the function of the semantic understanding system is instead assigned to the speech recognition unit 2.

[0004] To make the archetype of FIG. 18 easy to understand, consider a fast food restaurant. The speech recognition unit 2 faces the customer, and the speech synthesis unit 4 functions as a substitute for a store clerk. Suppose a customer says "Burger, two". The speech recognition unit 2 converts this speech into text data, in this case [burger] and [futatsu] ("two").

[0005] Taking the speech recognition result (character codes) as input, the dialog processing unit 3 first refers to the utterance database 5 and assigns a meaning to the character codes. In this case the meanings are [hamburger = order] and [futatsu = order quantity]. Based on these meanings, the dialog processing unit 3 then refers to the dialog database 6 and determines the flow of the dialog, that is, what to say to the customer next. Here this would be, for example, [Will you be eating here?] and [What would you like to drink?]. The dialog determined in this way is turned into electronically synthesized speech by the speech synthesis unit 4 and returned to the customer. The customer answers in turn, and the dialog transitions from one state to the next.

PROBLEMS TO BE SOLVED BY THE INVENTION

[0006] The general voice interactive system above was described using a fast food restaurant as an example. In the navigation device mentioned earlier, however, the content of the dialog is considerably more complex and varied, and the dialog flow transitions through many levels. The voice interactive system therefore needs further improvement. There are many points to improve; the present invention focuses on the dialog script data.

[0007] First, dialog script data has conventionally been prepared individually for every possible pattern of the transitioning dialog flow. Second, when a normal recognition result cannot be obtained from the speech recognition unit 2, the standard dialog script data alone cannot keep the normal dialog flow going.

[0008] The first point makes the dialog script data extremely large, so that the memory capacity required for the dialog database 6 becomes enormous (first problem). The second point means that the dialog does not flow smoothly, which is unpleasant for the user (second problem).

[0009] In view of the above problems, an object of the present invention is to realize a voice interactive system that reduces the data size of the dialog database and provides the user with a comfortable dialog.

MEANS FOR SOLVING THE PROBLEMS

[0010] FIG. 1 shows the concept of the present invention. As shown in the figure, the invention is characterized by an assist mechanism 7 provided in the dialog processing unit (corresponding to 3 in FIG. 18). The assist mechanism 7 includes a global transition condition table 8 and an assist dialog mode generation means 9.

[0011] The global transition condition table 8 stores in advance a plurality of global transition conditions that are used in common, in any dialog state, while the user is in the utterance state. When the content of the user's utterance in the utterance state matches one of the global transition conditions, the dialog processing unit accesses the global transition condition table 8 and generates the dialog content.

[0012] As a result, the dialog script data no longer needs to describe every dialog transition, which solves the first problem. The assist dialog mode generation means 9, for its part, enables the dialog processing unit to operate in an assist dialog mode in addition to the basic dialog mode. The basic dialog mode forms the normal series of dialog exchanges that collects the information requested by an application program such as a navigation device. The assist dialog mode is formed in abnormal situations in which that series of exchanges cannot be maintained.

[0013] As a result, the user can move smoothly through the series of dialog exchanges and comfort improves, which solves the second problem.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0014] FIG. 2 shows an example of the overall configuration of a voice interactive system according to the present invention. As shown in the figure, the voice interactive system 1 according to the present invention basically comprises: a voice input unit 11 that receives, for example via a microphone, the speech uttered by the user in the utterance state; a speech recognition unit 12 that recognizes the speech from the voice input unit 11 and analyzes its meaning; a dialog processing unit 13 that, based on the recognition result from the speech recognition unit 12, generates the dialog content to be returned to the user while referring to dialog script data stored in advance in a dialog database 14; and a voice output unit 16 that returns synthesized speech to the user based on that dialog content. Each time the user speaks, the system moves through the series of dialog states defined by the dialog script data, and in this way collects from the user, in dialog form, the information requested by the application program.

[0015] The first feature of the present invention is the addition of a global transition condition table 15 (corresponding to 8 in FIG. 1) that is referred to by the dialog processing unit 13. The global transition condition table 15 stores in advance a plurality of global transition conditions that are used in common, in any dialog state, during the utterance state. When the content of an utterance in the utterance state matches one of the global transition conditions, the dialog processing unit 13 accesses the global transition condition table 15 and generates the dialog content.

[0016] Before the specific contents of the global transition condition table 15 are described, the components of FIG. 2 are explained in more detail. The speech input from the voice input unit 11, which is a microphone, is segmented into individual words and converted into text by the continuous word recognition engine 17, which is part of the speech recognition unit 12 (2 in FIG. 18). This text is input to the utterance adaptation unit 18, which is also part of the speech recognition unit 12. The unit 18 refers to the utterance model data (5 in FIG. 18), understands the meaning of the input text (words), and sends the recognition result to the dialog processing unit 13 (corresponding to 3 in FIG. 18). In a navigation device, for example, if the recognized text (speech) is "I want to go", the recognition result is "destination".

[0017] Having obtained the recognition result, the dialog processing unit 13 determines and assembles the flow of the dialog based on the dialog script data (6 in FIG. 18) in the dialog database 14. For the "destination" above, for example, it determines the flow "Amusement park?" → "△△ Prefecture?" → "○○ Land?". Because every step proceeds as a dialog with the user, the next dialog is synthesized by the speech synthesis unit 19 (4 in FIG. 18), which is part of the voice output unit 16, and returned to the user through the speaker.

[0018] The result obtained from the series of dialog exchanges, for example "destination / amusement park / △△ Prefecture / ○○ Land", is passed as the dialog result to the application program that drives the navigation device. The global transition condition table 15 is described next. The global transition condition table 15 specifies one of a plurality of global transition files, each of which stores the individual dialog content corresponding to one of a plurality of predetermined global transition conditions.

[0019] Some examples of the global transition conditions are as follows.

i) Not recognized
ii) Recognition error
iii) Return (want to go back)
iv) Cancel (abort)
v) Don't know
vi) Mistake

Conditions i) and ii) are global transition conditions issued by the speech recognition unit 12 itself, and iii) to vi) are global transition conditions issued by the user. All of them are events that can occur in common in any dialog pattern, which is why they are called "global".

[0020] Conventionally, these global transition conditions were built into every dialog pattern, so the data size of the dialog database became enormous. FIG. 3 schematically shows the global transition files. The global transition files G-1, G-2, ..., G-6 in the figure correspond to the global transition conditions i), ii), ..., vi) above, respectively. G-3 to G-5 are omitted from the figure.

[0021] These files G-1 to G-6 may be added to the dialog database 14 of FIG. 2, or they may exist independently as files G-1 to G-6. In either case, the dialog processing unit 13 reads the contents of the file via the global transition condition table 15. FIG. 4 schematically shows the contents of the global transition condition table 15.

[0022] If the recognition result from the speech recognition unit 12 is "not recognized", the table specifies that file G-1 should be accessed. "Not recognized" means that the user is in the utterance state but says nothing. If the recognition result from the speech recognition unit 12 is "recognition error", the table specifies that file G-2 should be accessed. "Recognition error" means that the user is speaking, but the speech recognition unit 12 cannot recognize the speech correctly because the ambient noise is loud, the voice is too loud, or the voice is too quiet.

[0023] When the user says "return", processing jumps to file G-3 and enters a dialog sequence for going back to the previous dialog state. When the user says "cancel", processing jumps to file G-4 and enters a dialog sequence that ends the dialog. FIG. 5 schematically shows an example of the data structure of the global transition files. Only files G-1 and G-2 are shown.

[0024] As shown in the figure, file G-1 contains information instructing that dialog sequence "S-5" of the dialog script data be selected and that node ID "2" within dialog sequence "S-5" be selected. File G-2 is similar: it contains dialog ID "S-21" and node ID "4" within that sequence.

[0025] That is, the global transition files G-1, G-2, ... store dialog ID information that specifies the transition destination dialog ID, and node ID information that specifies the transition destination node ID within the series of dialog sequences developed by that transition destination dialog ID. FIG. 6 shows an example of dialog sequences. Dialog sequence S-1, for example, is a sequence of the normal basic dialog mode. At its node 1, destination setting is started when the user says, for example, "I want to go".

[0026] At node 2, the system asks the user "Please say your destination". At node 3, the user says "Hyogo Prefecture". At node 4, the system asks the user "Please say the name of the city, town, or village".

[0027] The dialog continues from there. This series of exchanges is what should normally proceed, but suppose that at node 2 the recognition result from the speech recognition unit 12 is "not recognized". The dialog processing unit 13 then accesses the global transition condition table 15. As shown in FIG. 5, it starts the dialog sequence corresponding to "not recognized" under dialog number G-1 and, following predetermined guidance, draws a new instruction from the user. It then follows the transition destination dialog ID "S-5" and the transition destination node ID "2" written in global transition file G-1, and transitions to the double-circled "2" in FIG. 6.
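The lookup described in this passage can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the Python names (`TransitionTarget`, `GLOBAL_TRANSITION_TABLE`, `GLOBAL_TRANSITION_FILES`, `resolve_global_transition`) and the condition strings are hypothetical. Only the file labels G-1 to G-4 and the targets S-5 / node 2 and S-21 / node 4 are taken from the text.

```python
from typing import NamedTuple, Optional


class TransitionTarget(NamedTuple):
    dialog_id: str  # transition destination dialog ID, e.g. "S-5"
    node_id: int    # transition destination node ID inside that sequence


# Global transition condition table (FIG. 4): condition -> file label.
# The condition strings are hypothetical identifiers for this sketch.
GLOBAL_TRANSITION_TABLE = {
    "not_recognized": "G-1",
    "recognition_error": "G-2",
    "return": "G-3",
    "cancel": "G-4",
}

# Global transition files (FIG. 5). Only G-1 and G-2 are spelled out
# in the text, so only those two are filled in here.
GLOBAL_TRANSITION_FILES = {
    "G-1": TransitionTarget("S-5", 2),
    "G-2": TransitionTarget("S-21", 4),
}


def resolve_global_transition(result: str) -> Optional[TransitionTarget]:
    """Return the fixed target for a global condition.

    Returns None when the result is not a global condition, or when this
    sketch has no fixed target recorded for the matching file.
    """
    file_label = GLOBAL_TRANSITION_TABLE.get(result)
    if file_label is None:
        return None  # ordinary utterance: stay in the normal dialog script
    return GLOBAL_TRANSITION_FILES.get(file_label)
```

Because the table is consulted from any dialog state, the individual dialog sequences do not each need their own copy of these transitions, which is the data-size saving the text describes.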

[0028] FIG. 7 schematically shows a modification of the data structure of FIG. 5. Under dialog number G-3, the transition is to the immediately preceding u (the node at which the dialog processing unit 13 spoke before the move) in the immediately preceding dialog ID (the dialog ID before the move). Under dialog number G-4, the transition is to the immediately preceding u (the node at which the user spoke before the move) in the immediately preceding dialog ID (the dialog ID before the move).

[0029] That is, in the modification of FIG. 7, the dialog ID information and the node ID information described above are each written as codes that express the various transition destinations functionally (PREV, for "previous"). With this functional expression it is no longer necessary to specify a concrete dialog sequence ID for each individual case, so the data size of the files (G-1 to G-6) can be made even smaller.
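One way to picture the functional codes is as placeholders that are resolved against the dialog history at run time. The sketch below is an assumption-laden illustration: the source names only the PREV ("previous") code, so the constants `PREV_SYSTEM` and `PREV_USER`, the history format, and the function `resolve_target` are all hypothetical.

```python
PREV = "PREV"                # hypothetical code: "the previous dialog ID"
PREV_SYSTEM = "PREV_SYSTEM"  # hypothetical code: previous node spoken by the system
PREV_USER = "PREV_USER"      # hypothetical code: previous node spoken by the user

_SPEAKER_FOR_CODE = {PREV_SYSTEM: "system", PREV_USER: "user"}


def resolve_target(dialog_code, node_code, history):
    """Resolve a (dialog, node) pair that may contain functional codes.

    history is a list of (dialog_id, node_id, speaker) tuples, oldest first,
    where speaker is "system" or "user".
    """
    if dialog_code != PREV:
        # Literal entry, as in the FIG. 5 style files: use it unchanged.
        return (dialog_code, node_code)
    speaker = _SPEAKER_FOR_CODE[node_code]
    # Walk backwards to the most recent node spoken by the requested party.
    for dialog_id, node_id, who in reversed(history):
        if who == speaker:
            return (dialog_id, node_id)
    raise LookupError("no earlier node spoken by " + speaker)
```

With a history of S-1 node 1 (user), node 2 (system), and node 3 (user), `resolve_target(PREV, PREV_SYSTEM, history)` yields S-1 node 2, so a single file entry serves every dialog sequence instead of one literal entry per sequence.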

[0030] Next, the assist dialog mode generation means 9 of the assist mechanism 7 in FIG. 1 is described. With the assist dialog mode generation means 9 active, the dialog processing unit 13 can operate in the assist dialog mode in addition to the basic dialog mode. The basic dialog mode forms the normal series of dialog exchanges that collects the information requested by the application program (see the upper part of FIG. 2). The assist dialog mode is formed in abnormal situations in which that series of exchanges cannot be maintained.

[0031] FIG. 8 is a flowchart of the first operation mode of the assist dialog mode generation means 9. In the first operation mode, the dialog processing unit 13 shifts to the assist dialog mode when the speech recognition unit 12 gives a "not recognized" recognition result consecutively.

[0032] Referring to FIG. 8:

Step S11: The dialog processing unit 13 determines whether the "not recognized" recognition result received from the speech recognition unit 12 is the first one.

Step S12: If the determination is No, that is, if the "not recognized" result has been received two or more times in succession, the dialog processing unit 13 shifts to the assist dialog mode of this first operation mode.

[0033] Step S13: If the determination is Yes, that is, if the "not recognized" result has been received for the first time, processing returns to the previous dialog ID and node ID.

The assist dialog mode is now described concretely. Suppose the question in the basic dialog mode was "Please say the name of the prefecture", and the recognition results for the user's response were "not recognized" consecutively. The system then shifts to the assist dialog mode. In the assist dialog mode, the question is rephrased. In this example, it is changed to a different wording such as "Please say it like 'Osaka Prefecture' or 'Hyogo Prefecture'".
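The escalation of steps S11 to S13 can be sketched as a small counter. This is a hedged illustration only: the class name `NotRecognizedEscalator`, the result string `"not_recognized"`, and the `(mode, prompt)` return shape are assumptions introduced for the example, and the prompts are the ones quoted in the text.

```python
class NotRecognizedEscalator:
    """Tracks consecutive "not recognized" results (FIG. 8, steps S11 to S13)."""

    def __init__(self, basic_prompt, assist_prompt):
        self.basic_prompt = basic_prompt
        self.assist_prompt = assist_prompt
        self.consecutive = 0  # number of "not recognized" results in a row

    def handle(self, result):
        """Return (mode, prompt_to_speak) for one recognition result."""
        if result != "not_recognized":
            self.consecutive = 0
            return ("basic", None)  # normal flow continues, nothing to re-ask
        self.consecutive += 1
        if self.consecutive == 1:
            # S11 Yes -> S13: first occurrence, go back and ask again.
            return ("basic", self.basic_prompt)
        # S11 No -> S12: two or more in a row, rephrase in assist mode.
        return ("assist", self.assist_prompt)
```

For example, with the basic prompt "Please say the name of the prefecture", two silent turns in a row produce first the same prompt again and then the rephrased assist prompt; any recognized answer resets the counter.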

[0034] This allows the user to go on speaking, and a comfortable dialog flows smoothly. FIG. 9 is a flowchart of the second operation mode of the assist dialog mode generation means 9. In the second operation mode, the dialog processing unit 13 operates in a stop dialog mode, which stops voice input, when the speech recognition unit 12 gives a "recognition error" recognition result consecutively.

[0035] Referring to FIG. 9:

Step S21: The dialog processing unit 13 determines whether the "recognition error" recognition result received from the speech recognition unit 12 is the first one.

Step S22: If the determination is No, that is, if the "recognition error" result has been received two or more times in succession, the dialog processing unit 13 shifts to the stop dialog mode of this second operation mode.

[0036] Step S23: If the determination is Yes, that is, if the "recognition error" result has been received for the first time, processing returns to the previous dialog ID and node ID.

The stop dialog mode is now described concretely. When "recognition error" results occur consecutively in response to a question in the basic dialog mode, the stop dialog mode can help the user with a variety of stop dialogs. A few examples follow.

[0037] Examples include "Your voice is hard to make out, so voice input will be stopped (or: do you want to stop voice input?)", "Your voice cannot be recognized correctly at the moment", and "Please speak a little louder and more clearly".

[0038] This lets the user know why the dialog is not flowing smoothly and frees the user from the unpleasant waiting of conventional systems. FIG. 10 is a flowchart of the third and fourth operation modes of the assist dialog mode generation means 9. In the third operation mode, the dialog processing unit 13 executes timeout processing when no recognition result is given by the speech recognition unit 12 even after a predetermined fixed time has elapsed.

[0039] In this case, if the mode during that fixed time is the basic dialog mode, executing the timeout processing forms the assist dialog mode. The fourth operation mode arises when the mode during the fixed time is the assist dialog mode. In that case, executing the timeout processing causes a shift to a pause dialog, which temporarily suspends the dialog.

[0040] Referring to FIG. 10:

Step S31: The dialog processing unit 13 determines whether a recognition result has been received from the speech recognition unit 12. If so (Yes), it proceeds to the next dialog (END).

Step S32: If nothing has been received (No), the dialog processing unit 13 executes the timeout processing and determines whether the fixed time has elapsed. If it has not (No), steps S31 and S32 are repeated.

[0041] Step S33: If the fixed time has elapsed (Yes), the dialog processing unit 13 determines whether the mode during that fixed time (the current mode) is the basic dialog mode.

Step S34: If the determination shows that the mode is the basic dialog mode (Yes), processing shifts to the assist dialog mode.

Step S35: If the determination shows that the mode is not the basic dialog mode (No), processing shifts to the pause dialog.

[0042] Thus, on entering step S34, the user can be guided into a dialog that helps the user. On entering step S35, the dialog processing unit 13 returns a message such as "The dialog will be paused here" to the user, so that the user can confirm the current situation. FIG. 11 is a flowchart of the fifth and sixth operation modes of the assist dialog mode generation means 9.

[0043] In the fifth operation mode, when timeout processing is executed after the user has turned on the dialog resume switch (SW in FIG. 1) during the pause dialog of the fourth operation mode, executing that timeout processing causes a shift to the stop dialog, which stops voice input. In the sixth operation mode, when the fixed time elapses again while the dialog processing unit 13 is operating in the assist dialog mode, it shifts to another assist dialog mode that differs from the current one and has more detailed dialog content.

[0044] Referring to FIG. 11:

Step S41: The dialog processing unit 13 determines whether a recognition result has been received from the speech recognition unit 12. If so (Yes), it proceeds to the next dialog (END).

Step S42: If nothing has been received (No), the dialog processing unit 13 executes the timeout processing and determines whether the fixed time has elapsed. If it has not (No), steps S41 and S42 are repeated.

[0045] Step S43: If the fixed time has elapsed (Yes), the dialog processing unit 13 determines whether the mode during that fixed time (the current mode) is the pause dialog.

Step S44: If the determination shows that the mode is the pause dialog (Yes), processing shifts to the stop dialog.

[0046] Step S45: If the determination shows that the mode is not the pause dialog (No), the dialog processing unit 13 determines whether the current dialog state is the assist dialog mode. If No, processing proceeds to a) in FIG. 10.

Step S46: If the determination is Yes, processing shifts to another assist dialog mode that differs from the current one and has more detailed dialog content.
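The branch taken after the timeout in FIG. 11 (steps S43 to S46) can be written as a pure function of the current mode. This is a sketch under assumptions: the function name `on_timeout`, the mode strings, and the integer `assist_level` used to stand for "a more detailed assist dialog" are hypothetical. The entry point a) of FIG. 10 is visible only in the figure, so the sketch returns a marker for it rather than guessing where it leads.

```python
def on_timeout(mode, assist_level=0):
    """Decide the next state once the fixed time has elapsed (FIG. 11).

    mode is the current mode: "pause", "assist", or anything else.
    assist_level counts how detailed the current assist dialog is.
    Returns (next_state, assist_level).
    """
    if mode == "pause":
        # S43 Yes -> S44: a timeout during the pause dialog stops voice input.
        return ("stop", assist_level)
    if mode == "assist":
        # S45 Yes -> S46: move to a different, more detailed assist dialog.
        return ("assist", assist_level + 1)
    # S45 No: continue at entry point a) of FIG. 10 (marker only).
    return ("fig10_a", assist_level)
```

Keeping the level as a counter means repeated timeouts in the assist dialog mode step through successively more detailed help without extra state.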

[0047] The user can thus receive a message that is phrased differently once more. FIG. 12 is a flowchart showing the seventh and eighth operation modes of the assist dialogue mode generation means 9. In the seventh operation mode, the dialogue processing unit 13 shifts to the assist dialogue mode when, during the user's utterance state, the voice recognition unit 12 gives a recognition result indicating a specific voice input.

[0048] In the eighth operation mode, the dialogue processing unit 13 shifts to the assist dialogue mode when it detects, during the user's utterance state, that a specific switch (SW2 in FIG. 1) has been turned on. Referring to FIG. 12:
Step S51: The dialogue processing unit 13 determines whether the system is in a dialogue state that accepts voice input from the user. If the result is No, the process ends (END).

[0049] Step S52: If the result is Yes, it is determined whether a recognition result for a voice input indicating a shift to the assist dialogue mode has been received from the voice recognition unit 12.
Step S53: If the result is Yes, the process shifts to the assist dialogue mode.

[0050] Step S54: If the result is No, it is determined whether the user has turned on the specific switch SW2. If this result is No, the process ends (END); if Yes, the process shifts to the assist dialogue mode. In this way the user can obtain a more in-depth assist message from the voice output unit 16.
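
The checks in steps S51 to S54 can be sketched as a single predicate. This is a minimal illustration under assumptions: the function name, its parameters, and the example phrase "help" are hypothetical and do not appear in the patent, which does not say what the specific voice input is.

```python
from typing import FrozenSet, Optional

# Hypothetical set of phrases treated as "the specific voice input".
ASSIST_PHRASES: FrozenSet[str] = frozenset({"help"})


def should_enter_assist(accepting_voice: bool,
                        recognized: Optional[str],
                        sw2_on: bool) -> bool:
    """Return True when the flow of FIG. 12 ends in the assist dialogue mode."""
    if not accepting_voice:            # S51 No -> END
        return False
    if recognized in ASSIST_PHRASES:   # S52 Yes -> S53
        return True
    return sw2_on                      # S54: Yes -> assist mode, No -> END


if __name__ == "__main__":
    print(should_enter_assist(True, None, True))
```

The voice trigger is tested before the switch, matching the order of S52 and S54; either one alone is sufficient.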

[0051] The specific switch SW2 may be a mechanical switch, a touch-panel switch, or a GUI (Graphical User Interface) element. If giving the instruction by voice is, on the contrary, troublesome for the user, such a switch operation can be used instead. FIG. 13 is a flowchart showing the ninth operation mode of the assist dialogue mode generation means 9, and FIG. 14 is a diagram schematically showing the ninth operation mode.

[0052] In this ninth operation mode, the dialogue processing unit 13 stores the usage history of each of the basic dialogue mode and the assist dialogue mode in, for example, a memory (not shown), and when the usage frequency of the assist dialogue mode exceeds that of the basic dialogue mode, it operates so as to swap the assist dialogue mode and the basic dialogue mode.
Referring first to FIG. 14, suppose that a certain basic dialogue has been written in the dialogue script data so as to proceed through dialogue nodes 1 → 2 → 4, and that the corresponding assist dialogue has likewise been written so as to proceed through dialogue nodes 1 → 3 → 5.

[0053] Suppose that the usage history of the dialogue nodes shows that the path through dialogue nodes 1 → 3 → 5 is used more frequently than the path through dialogue nodes 1 → 2 → 4. Based on this fact, the dialogue processing unit 13 sets the sequence so that, on reaching dialogue node 1, the dialogue transitions to dialogue nodes 3 → 5 from the outset.

[0054] In this way, a dialogue sequence closer to the user's preference can be offered to the user from the start, which improves comfort. Referring next to FIG. 13:
Step S61: With respect to the number of times the user has passed through the dialogue nodes of FIG. 14, it is determined whether the count for node 3 is greater than the count for node 2.

[0055] Step S62: If the result is Yes, dialogue nodes 2 and 3 are swapped.
Step S63: If the result is No, the current arrangement is kept.
FIG. 15 is a diagram (part 1) for explaining the tenth operation mode of the assist dialogue mode generation means 9, FIG. 16 is part 2 of the same, and FIG. 17 is part 3.
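
Steps S61 to S63 of the ninth operation mode amount to comparing two visit counters. The following sketch assumes the node numbering of FIG. 14 (node 2 on the basic path, node 3 on the assist path); the function name and the dictionary of visit counts are illustrative assumptions, not part of the patent.

```python
from typing import Dict, List

BASIC_NODE = 2    # successor of node 1 on the basic path 1 -> 2 -> 4
ASSIST_NODE = 3   # successor of node 1 on the assist path 1 -> 3 -> 5


def successor_order(visits: Dict[int, int]) -> List[int]:
    """Return the successors of node 1, preferred one first."""
    if visits.get(ASSIST_NODE, 0) > visits.get(BASIC_NODE, 0):  # S61 Yes
        return [ASSIST_NODE, BASIC_NODE]                        # S62: swap
    return [BASIC_NODE, ASSIST_NODE]                            # S63: keep


if __name__ == "__main__":
    print(successor_order({2: 1, 3: 4}))
```

Because the comparison is strict, equal counts leave the original order unchanged, which matches the "greater than" condition in S61.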

[0056] In this tenth operation mode, when the basic dialogue mode and the assist dialogue mode are each composed of a series of dialogue nodes forming a hierarchical structure, the usage history of each dialogue node is stored in, for example, a memory (not shown) for at least one of the two modes, and the hierarchy levels are swapped one after another so that nodes with higher usage frequency come before nodes with lower usage frequency.

[0057] Referring first to FIG. 15, the example in this figure shows the flow of dialogue that transitions to dialogue node 2, where nodes 1 → 2 → 8 belong to the basic dialogue mode and nodes 3 → 4 → 5 → 7 belong to the assist dialogue mode. Suppose that the usage history of the dialogue nodes leading to dialogue node 2 turns out as shown, for example, in FIG. 16. Then dialogue node 5 is ultimately passed through four times, dialogue node 4 three times, and dialogue node 3 twice, so the dialogue nodes in descending order of usage frequency are 5 → 4 → 3.

[0058] Rearranging the dialogue nodes on the basis of this result gives the arrangement shown in FIG. 17. The dialogue thus shifts automatically toward the flow that is most natural for the user, which improves comfort.
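
The reordering of the tenth operation mode can be sketched as a sort by pass count. The function name and the pass log below are assumptions: the log is constructed only to reproduce the totals stated in the text (node 5 four times, node 4 three times, node 3 twice) and is not the actual content of FIG. 16.

```python
from collections import Counter
from typing import Iterable, List


def reorder_by_usage(nodes: List[int], passes: Iterable[int]) -> List[int]:
    """Order nodes from most to least frequently passed.

    sorted() is stable, so nodes with equal counts keep their original order.
    """
    counts = Counter(passes)
    return sorted(nodes, key=lambda n: counts[n], reverse=True)


if __name__ == "__main__":
    # Hypothetical log matching the totals given in the description.
    log = [5, 5, 5, 5, 4, 4, 4, 3, 3]
    print(reorder_by_usage([3, 4, 5], log))  # prints [5, 4, 3]
```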

【００５９】[0059]

[Effects of the Invention] As described above, according to the present invention, first, the data size of the dialogue script data stored in the dialogue database 5 can be made substantially smaller than before. Second, a dialogue environment more comfortable for the user than before can be provided.

[Brief description of the drawings]

FIG. 1 is a diagram illustrating the concept of the present invention.

FIG. 2 is a diagram showing an example of the overall configuration of a voice dialogue system according to the present invention.

FIG. 3 is a diagram schematically showing the global transition files.

FIG. 4 is a diagram schematically showing the contents of the global transition condition table 15.

FIG. 5 is a diagram schematically showing an example of the data structure of a global transition file.

FIG. 6 is a diagram showing an example of a dialogue sequence.

FIG. 7 is a diagram schematically showing a modification of the data structure of FIG. 5.

FIG. 8 is a flowchart showing the first operation mode of the assist dialogue mode generation means 9.

FIG. 9 is a flowchart showing the second operation mode of the assist dialogue mode generation means 9.

FIG. 10 is a flowchart showing the third and fourth operation modes of the assist dialogue mode generation means 9.

FIG. 11 is a flowchart showing the fifth and sixth operation modes of the assist dialogue mode generation means 9.

FIG. 12 is a flowchart showing the seventh and eighth operation modes of the assist dialogue mode generation means 9.

FIG. 13 is a flowchart showing the ninth operation mode of the assist dialogue mode generation means 9.

FIG. 14 is a diagram schematically showing the ninth operation mode.

FIG. 15 is a diagram (part 1) for explaining the tenth operation mode of the assist dialogue mode generation means 9.

FIG. 16 is a diagram (part 2) for explaining the tenth operation mode of the assist dialogue mode generation means 9.

FIG. 17 is a diagram (part 3) for explaining the tenth operation mode of the assist dialogue mode generation means 9.

FIG. 18 is a diagram showing the basic form of a general voice dialogue system.

[Explanation of symbols]

7 ... Assist function
8 ... Global transition condition table
9 ... Assist dialogue mode generation means
11 ... Voice input unit
12 ... Voice recognition unit
13 ... Dialogue processing unit
14 ... Dialogue database
15 ... Global transition condition table
16 ... Voice output unit
17 ... Continuous word recognition engine
18 ... Utterance adaptation unit
19 ... Voice synthesis unit
SW1 ... Dialogue resume switch
SW2 ... Specific switch

Continuation of the front page
(51) Int.Cl.⁷  Identification symbol  FI  Theme code (reference)
G10L 15/28  G10L 3/00 551Q  // G01C 21/00
F-term (reference) 2F029 AA02 AB13 AC18 5D015 KK02 KK04 5D045 AB21 AB30 5E501 AA20 AA22 AA23 AC03 BA05 BA12 CA08 CB15 DA11 EA21

Claims

[Claims]

1. A voice dialogue system comprising: a voice input unit for inputting voice uttered by a user in an utterance state; a voice recognition unit for recognizing the voice from the voice input unit and analyzing its meaning; a dialogue processing unit that generates dialogue content to be returned to the user on the basis of the recognition result from the voice recognition unit while referring to dialogue script data stored in advance in a dialogue database; and a voice output unit that returns synthesized voice to the user on the basis of the dialogue content, the system transitioning successively through a series of dialogue states defined by the dialogue script data each time the user utters a voice, and collecting from the user, in dialogue form, various information requested by an application program, wherein a global transition condition table referred to by the dialogue processing unit is further provided, the global transition condition table stores in advance a plurality of global transition conditions that are used in common, in the utterance state, under any of the dialogue states, and the dialogue processing unit generates the dialogue content by accessing the global transition condition table when the content of the utterance in the utterance state satisfies one of the global transition conditions.

2. The voice dialogue system according to claim 1, wherein the global transition condition table specifies a plurality of global transition files that store individual dialogue contents corresponding to each of a plurality of predetermined global transition conditions.

3. The voice dialogue system according to claim 1, wherein each of the plurality of global transition files stores dialogue ID information specifying a transition-destination dialogue ID, and node ID information specifying a transition-destination node ID within the series of dialogue sequences developed by the specified transition-destination dialogue ID.

4. The voice dialogue system according to claim 3, wherein the dialogue ID information and the node ID information are described by codes representing various transition destinations.

5. A voice dialogue system comprising: a voice input unit for inputting voice uttered by a user in an utterance state; a voice recognition unit for recognizing the voice from the voice input unit and analyzing its meaning; a dialogue processing unit that generates dialogue content to be returned to the user on the basis of the recognition result from the voice recognition unit while referring to dialogue script data stored in advance in a dialogue database; and a voice output unit that returns synthesized voice to the user on the basis of the dialogue content, the system transitioning successively through a series of dialogue states defined by the dialogue script data each time the user utters a voice, and collecting from the user, in dialogue form, various information requested by an application program, wherein the dialogue processing unit operates in an assist dialogue mode in addition to a basic dialogue mode, the basic dialogue mode is a mode that forms the normal series of dialogue flows for collecting the various information requested by the application program, and the assist dialogue mode is a mode formed when an abnormality arises such that the series of dialogue flows cannot be maintained.

6. The voice dialogue system according to claim 5, wherein the dialogue processing unit operates so as to shift to the assist dialogue mode when a recognition result indicating "not recognized" is given continuously from the voice recognition unit.

7. The voice dialogue system according to claim 5, wherein the dialogue processing unit operates so as to shift to a stop dialogue mode for stopping voice input when a recognition result indicating "there is a recognition error" is given continuously from the voice recognition unit.

8. The voice dialogue system according to claim 5, wherein the dialogue processing unit executes a timeout process when no recognition result is given from the voice recognition unit even after a predetermined time has elapsed, and, when the mode during the predetermined time is the basic dialogue mode, forms the assist dialogue mode by executing the timeout process.

9. The voice dialogue system according to claim 5, wherein the dialogue processing unit executes a timeout process when no recognition result is given from the voice recognition unit even after a predetermined time has elapsed, and, when the mode during the predetermined time is the assist dialogue mode, shifts, by executing the timeout process, to a pause dialogue for temporarily suspending the dialogue.

10. The voice dialogue system according to claim 9, wherein, when the timeout process is executed after a dialogue resume switch has been turned on under the pause dialogue for temporarily suspending the dialogue, the system shifts, by executing the timeout process, to a stop dialogue for stopping voice input.

11. The voice dialogue system according to claim 1, wherein the dialogue processing unit shifts to another assist dialogue mode having dialogue content different from that of the assist dialogue mode when the predetermined time has elapsed again while operating in the assist dialogue mode.

12. The voice dialogue system according to claim 5, wherein the dialogue processing unit shifts to the assist dialogue mode when a recognition result indicating a specific voice input is given from the voice recognition unit during the utterance state of the user.

13. The voice dialogue system according to claim 5, wherein the dialogue processing unit shifts to the assist dialogue mode when it detects that a specific switch has been turned on during the utterance state of the user.

14. The voice dialogue system according to claim 5, wherein the usage history of each of the basic dialogue mode and the assist dialogue mode is stored, and when the usage frequency of the assist dialogue mode exceeds that of the basic dialogue mode, the assist dialogue mode and the basic dialogue mode are swapped.

15. The voice dialogue system according to claim 5, wherein, when the basic dialogue mode and the assist dialogue mode are each composed of a series of dialogue nodes forming a hierarchical structure, the usage history of each dialogue node is stored for at least one of the basic dialogue mode and the assist dialogue mode, and the hierarchy levels are swapped one after another so that nodes with higher usage frequency come before nodes with lower usage frequency.