JP2003108191A

JP2003108191A - Voice interacting device

Info

Publication number: JP2003108191A
Application number: JP2001305683A
Authority: JP
Inventors: Tsukasa Shimizu; 司清水; Ryuta Terajima; 立太寺嶌; Iko Terasawa; 位好寺澤; Hiroyuki Hoshino; 博之星野; Toshihiro Wakita; 敏裕脇田
Original assignee: Toyota Central R&D Labs Inc
Current assignee: Toyota Central R&D Labs Inc
Priority date: 2001-10-01
Filing date: 2001-10-01
Publication date: 2003-04-11

Abstract

PROBLEM TO BE SOLVED: To provide a voice dialogue device that does not lower work efficiency. SOLUTION: Voice dialogue input during work increases the cognitive information processing load and lowers work efficiency. To address this, a voice dialogue device having a voice recognition unit 12 is provided with a work state detection unit 20, a workload estimation unit 25, and a dialogue control unit 13. The voice recognition unit 12 switches the recognition language dictionary and the recognition acoustic dictionary according to the workload, which improves the recognition rate. When the workload is estimated to be heavy, the dialogue control unit 13 limits its output to a single question sentence or a single confirmation sentence. When the workload is light, it allows several question sentences or a question sentence containing an implicit confirmation. The cognitive information processing load imposed by dialogue input is thus kept at a suitable level at all times, and work efficiency does not decrease. In other words, changing the dialogue method makes work and dialogue input compatible.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a voice dialogue device that enables dialogue input under various environments and in various worker states. In particular, it relates to a voice dialogue device that makes work and dialogue input compatible under various conditions. The invention is applicable to a voice dialogue device that grasps the vehicle type or the running state of a vehicle and the state of the driver, and lets the driver carry out dialogue input accurately.

[0002]

[Prior Art] Voice dialogue devices that enable dialogue input under various environments are already known. Examples are the voice dialogue device disclosed in Japanese Unexamined Patent Publication No. H10-20884 and the navigation device disclosed in Japanese Unexamined Patent Publication No. H10-104009. As shown in FIG. 11, the former adds a dialogue management unit 106, a dialogue storage unit 105, and a proficiency level detection unit 110 to a conventional voice dialogue device consisting of a voice input unit 101, a voice recognition unit 102, a dictionary storage unit 103, a dictionary selection unit 104, a guidance selection unit 107, and a voice output unit 109. Briefly, when the driver inputs speech through the voice input unit 101, the voice recognition unit 102 performs voice recognition using the recognition dictionary that the dictionary selection unit 104 has selected from the dictionary storage unit 103. The dialogue management unit 106 then manages the flow of the dialogue according to the contents of the dialogue storage unit 105 and the recognition result of the voice recognition unit 102. The proficiency level detection unit 110 determines the user's proficiency from the speed of the dialogue. The guidance selection unit 107 selects the voice guidance to be output from the guidance storage unit 106 according to the result of the proficiency level detection unit 110 (the proficiency level) and the contents of the dialogue storage unit 105. Finally, the voice output unit 109 outputs the guidance. The voice dialogue device disclosed in Japanese Unexamined Patent Publication No. H10-20884 is thus characterized by changing the guidance contents according to proficiency so that the dialogue proceeds efficiently.

[0003] The latter conventional example is a vehicle-mounted navigation device that places various restrictions on the provision of voice information depending on the running state of the vehicle. For example, it stops providing voice information when the vehicle turns sharply or brakes suddenly. Alternatively, during sharp turning or sudden braking it provides only the voice information of registered specific media. It may also change the order in which voice information is provided, limit the total amount of voice information, or restrict the time the information takes. Furthermore, when the provision of voice information is stopped, the information is provided as text instead.

[0004]

[Problems to Be Solved by the Invention] As described above, the voice dialogue device disclosed in Japanese Unexamined Patent Publication No. H10-20884 changes its voice response according to the worker's proficiency. It does not take into account the work state, such as environmental noise around the worker, or the state of the work device. For example, when a voice dialogue device is applied to a vehicle, engine sound, wind noise, road noise, and the like may be mixed into the voice input. Voice recognition then becomes difficult, and voice dialogue may become difficult as a result. In addition, the worker (driver) may be operating another device during voice dialogue, for example accelerating or braking the vehicle. Conducting a voice dialogue in such a situation increases the cognitive information processing load, which lowers attention and may delay vehicle operation.

[0005] The navigation device disclosed in Japanese Unexamined Patent Publication No. H10-104009 takes the running state of the vehicle as a parameter and restricts voice dialogue in various ways according to that parameter. Depending on the running state, for example, it stops voice dialogue altogether. It therefore does not make vehicle operation and voice dialogue fully compatible. Moreover, this navigation system uses only the running state of the vehicle as a parameter and not the state of the driver (worker). It is known, for example, that when the driver is concentrating on driving (is tense), the driver's speech characteristics change and the voice recognition rate falls. To carry out an accurate voice dialogue, it is therefore necessary to grasp the driver state (the driver's heart rate, eye movement, and so on) and change the voice recognition conditions accordingly. A navigation system that does not use the driver state as a parameter cannot always guarantee an accurate voice dialogue.

[0006] The present invention has been made to solve the problems described above. Its object is to enable dialogue input without lowering work efficiency, under various environments and in various worker states, by appropriately controlling the voice dialogue method, the recognition dictionaries, and the like on the basis of the worker's workload. In other words, the object is to make work and dialogue input compatible. A further object is to apply the voice dialogue device of the invention to voice dialogue in the passenger compartment of an automobile and to realize smooth voice dialogue while maintaining driving safety. These objects should not be understood to mean that each invention must achieve all of them. Rather, each object is to be understood as the object achieved by the corresponding individual invention.

[0007]

[Means for Solving the Problems] The voice dialogue device according to claim 1 is a voice dialogue device that uses voice recognition technology to enable dialogue input during work, and comprises: a work state detection unit; a workload estimation unit that estimates the worker's workload based on the detection result of the work state detection unit; a voice recognition unit that recognizes input to a voice input unit using one or more recognition language dictionaries and one or more recognition acoustic dictionaries; a dialogue control unit that selects confirmation items and/or question items and voice characteristics that do not increase the worker's workload, based on the recognition result of the voice recognition unit, the database search result corresponding to that recognition result, dialogue progress information composed of that database search result and its certainty, and the workload level estimated by the workload estimation unit; and a guidance generation unit that generates confirmation guidance and/or question guidance for the worker based on the result of the dialogue control unit.

[0008] The voice dialogue device according to claim 2 is the voice dialogue device according to claim 1, wherein the voice recognition unit selects, based on the estimation result of the workload estimation unit, the optimal recognition language dictionary and recognition acoustic dictionary from a plurality of recognition language dictionaries and a plurality of recognition acoustic dictionaries, and performs recognition with them. The voice dialogue device according to claim 3 is the voice dialogue device according to claim 1, wherein the voice recognition unit selects, based on the detection result of the work state detection unit and/or the dialogue progress information of the dialogue control unit, the optimal recognition language dictionary and recognition acoustic dictionary from a plurality of recognition language dictionaries and a plurality of recognition acoustic dictionaries, and performs recognition with them.

[0009] The voice dialogue device according to claim 4 is the voice dialogue device according to any one of claims 1 to 3, wherein the work state detection unit comprises a worker state detection unit that detects the worker state, and/or a device state detection unit that detects the state of the work device, and/or a noise detection unit that detects ambient noise.

[0010] The voice dialogue device according to claim 5 is the voice dialogue device according to claim 4, wherein the worker state detection unit detects the worker state from physiological indices of the worker. The voice dialogue device according to claim 6 is the voice dialogue device according to claim 5, wherein the physiological indices of the worker are heart rate, amount of perspiration, and gaze-related values. The voice dialogue device according to claim 7 is the voice dialogue device according to any one of claims 4 to 6, wherein the device state detection unit detects the device state from physical quantities of the work device and the frequency of use of the device.

[0011] The voice dialogue device according to claim 8 is the voice dialogue device according to any one of claims 1 to 7, wherein the dialogue progress information is composed of confirmation items and, for each confirmation item, one of an unknown state, a confirmation-required state, and a confirmed state representing its certainty. The voice dialogue device according to claim 9 is the voice dialogue device according to any one of claims 1 to 8, wherein, when the workload is small or medium, the dialogue control unit uses a confirmation item in the confirmation-required state to prompt confirmation of that item and also sets a new question item.

[0012] The voice dialogue device according to claim 10 is the voice dialogue device according to any one of claims 1 to 8, wherein, when the workload is large, the dialogue control unit sets the question items to a format of one single question per answer. The voice dialogue device according to claim 11 is the voice dialogue device according to any one of claims 1 to 10, wherein, when the workload during a predetermined time and/or after a predetermined time is estimated to be large, the dialogue control unit adjusts the dialogue method, the dialogue time interval and/or the utterance length, and the utterance speed so that the dialogue load does not increase during and/or after that predetermined time.

[0013] The voice dialogue device according to claim 12 is the voice dialogue device according to any one of claims 1 to 11, wherein the work state detection unit and the workload estimation unit are processes independent of the voice recognition unit, the dialogue control unit, and the guidance generation unit.

[0014] The voice dialogue device according to claim 13 is the voice dialogue device according to any one of claims 1 to 12, wherein the work device is a vehicle and the worker is the driver of that vehicle. The voice dialogue device according to claim 14 is the voice dialogue device according to claim 13, wherein the physical quantities of the work device include at least one of the distance to the vehicle ahead and/or behind, the vehicle position, speed, acceleration, turning speed, turning acceleration, and the amount of accelerator or brake pedal depression. The voice dialogue device according to claim 15 is the voice dialogue device according to claim 13 or claim 14, wherein the voice dialogue device is provided in an information providing device mounted on the vehicle.

[0015]

[Operation and Effects] This section mainly describes the operation and effects of the invention recited in each claim. To make the invention easier to understand, the description uses concrete examples, but these do not limit the configuration of the claims. The parts described by way of concrete example also serve as a description of embodiments of the invention. According to the voice dialogue device of claim 1, during voice dialogue input while working, the work state detection unit first detects the work state. The work state includes the spare capacity and the degree of complexity relating to the worker's workload, the driving state of the work device, and the like. The workload estimation unit then estimates the worker's workload based on the detection result of the work state detection unit. Workload here means cognitive information processing load: a small workload means that there is a large margin to accept a new cognitive information processing load, and conversely a large workload means that this margin is small. The workload is detected and updated, for example, at predetermined time intervals. The voice recognition unit has one or more recognition language dictionaries and one or more recognition acoustic dictionaries, and recognizes the worker's utterance by referring to them.
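
As an illustration only, the workload estimation described in this paragraph might be sketched as follows. The three normalized input scores, their equal weighting, and the two thresholds are assumptions made for this sketch. The specification only states that a workload level is estimated from the detected work state and updated periodically.

```python
from dataclasses import dataclass
from enum import Enum


class Workload(Enum):
    SMALL = 1
    MEDIUM = 2
    LARGE = 3


@dataclass
class WorkState:
    # All three fields are hypothetical scores normalized to 0.0 .. 1.0.
    margin: float           # spare capacity of the worker (1.0 = fully relaxed)
    complexity: float       # how complicated the current task is
    device_activity: float  # how actively the work device is being driven


def estimate_workload(state: WorkState) -> Workload:
    """Map a detected work state to a workload level (assumed equal weights)."""
    # A small margin means a large load, so the margin is inverted.
    score = ((1.0 - state.margin) + state.complexity + state.device_activity) / 3.0
    if score < 1.0 / 3.0:
        return Workload.SMALL
    if score < 2.0 / 3.0:
        return Workload.MEDIUM
    return Workload.LARGE
```

In such a sketch, `estimate_workload` would be called once per update period with the latest detection result, and the returned level would be passed to the voice recognition unit and the dialogue control unit.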

[0016] The dialogue control unit then creates dialogue progress information based on the recognition result of the voice recognition unit, the database search result corresponding to that recognition result (including whether such a database exists), and the search result and its certainty. The dialogue progress information consists of dialogue items based on the recognition result or the search result, together with their certainty. For example, if the work device is a vehicle and the target of dialogue input is a navigation device, the dialogue items are the name of the destination, the type of business, the address, the detailed address, and so on. The certainty is the degree to which an item is settled and is expressed, for example, as 1, 2, or 3. The number of confirmed items (the number of items with certainty 3) is the degree of progress. Furthermore, taking into account the workload level estimated by the workload estimation unit, the dialogue control unit selects the next confirmation item and/or question item and the voice characteristics so as not to increase the worker's information processing load, and outputs them to the guidance generation unit in the next stage. For example, when the workload is large, it outputs a single confirmation item or a single question item. The next stage then produces a single-sentence question such as 'Is the place Nagoya City?'. The worker (driver) answers with, for example, 'yes' or 'no', so the cognitive information processing load is small and work efficiency (the driving operation) is not lowered. Conversely, when the workload is small, the dialogue control unit outputs, for example, both a confirmation item and a question item. The next stage then produces a question such as 'What is the name of the restaurant in Nagoya City?'. Here 'Nagoya City' is the previous recognition result and database search result and serves as an implicit confirmation item, while 'What is the name of the restaurant?' is the question item. Because the workload is small in this case, work efficiency is not lowered even though the worker has to take in several items.
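
The choice between a single explicit question and a question carrying an implicit confirmation can be sketched as below. This is a minimal sketch, not the implementation of the specification: the function name, the string labels for the workload level, and the English sentence templates are all assumptions chosen to mirror the Nagoya City example.

```python
from typing import Optional


def build_guidance(workload: str, to_confirm: Optional[str], question: str) -> str:
    """Return one guidance sentence.

    workload   -- "small", "medium" or "large" (labels assumed for this sketch)
    to_confirm -- value of an item still awaiting confirmation, or None
    question   -- the next question item, phrased as a full sentence
    """
    if workload == "large":
        # Heavy load: one item per turn, answerable with yes or no if possible.
        if to_confirm is not None:
            return f"Is the place '{to_confirm}'?"
        return question
    if to_confirm is not None:
        # Light load: embed the unconfirmed value as an implicit confirmation.
        return f"In '{to_confirm}', {question[0].lower()}{question[1:]}"
    return question
```

With a large workload the driver only hears one yes-or-no confirmation per turn. With a small or medium workload the same pending value is folded into the next question, so the dialogue advances two items in a single exchange.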

[0017] Finally, based on the output of the dialogue control unit, the guidance generation unit outputs the confirmation guidance and/or question guidance described above by speech synthesis with the selected voice characteristics. The operations of the voice recognition unit and the dialogue control unit are repeated until all items of the dialogue progress information have been determined. Because the dialogue control unit changes the dialogue items according to the work state in this way, work efficiency does not fall, and the work operation is never interrupted. Work and voice dialogue input can therefore be made compatible, giving a highly convenient voice dialogue device that lets multiple tasks proceed smoothly.

[0018] The voice dialogue device according to claim 2 is the voice dialogue device according to claim 1, wherein the voice recognition unit selects, based on the estimation result of the workload estimation unit, the optimal recognition language dictionary and recognition acoustic dictionary from a plurality of recognition language dictionaries and recognition acoustic dictionaries, and performs recognition with them. It is generally known that a worker's wording and pronunciation change with workload. The voice recognition unit therefore selects the recognition language dictionary and recognition acoustic dictionary best suited to the estimation result (the workload) of the workload estimation unit and performs voice recognition with them. Because recognition always uses the optimal dictionaries, the worker's utterances are recognized accurately, which raises the recognition rate and allows the dialogue to proceed smoothly at all times.

[0019] The voice dialogue device according to claim 3 is the voice dialogue device according to claim 1, wherein the voice recognition unit selects, based on the detection result of the work state detection unit and/or the dialogue progress information of the dialogue control unit, the optimal recognition language dictionary and recognition acoustic dictionary from a plurality of recognition language dictionaries and a plurality of recognition acoustic dictionaries, and performs recognition with them. It is generally known that the acoustic and linguistic characteristics of a worker's utterance change with the work situation and the noise situation. The voice recognition unit therefore prepares recognition language dictionaries and recognition acoustic dictionaries corresponding to various work and noise situations and selects among them according to the detection result of the work state detection unit. This improves the recognition rate. The voice recognition unit also obtains information needed for recognition as feedback from the dialogue progress information of the dialogue control unit. For example, once 'Nagoya City' has been confirmed during address determination, the detailed address must end in 'XX-ku' (XX ward). The voice recognition unit then need not recognize the trailing 'ku' and only has to recognize the XX part. This too improves the recognition rate, and hence the success rate of the voice dialogue.
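
The two selection mechanisms in this paragraph, choosing an acoustic dictionary from the noise situation and narrowing the language dictionary from the dialogue progress information, might look like the following sketch. The dictionary names, the 70 dB threshold, and the three-ward vocabulary are hypothetical placeholders introduced only for illustration.

```python
from typing import Dict, List

# Hypothetical, deliberately tiny vocabulary: ward-name stems per confirmed city.
WARD_STEMS: Dict[str, List[str]] = {"Nagoya City": ["Naka", "Chikusa", "Showa"]}


def select_acoustic_dictionary(noise_level_db: float, threshold_db: float = 70.0) -> str:
    """Pick an acoustic dictionary name from the detected noise level."""
    return "acoustic_noisy" if noise_level_db >= threshold_db else "acoustic_quiet"


def select_vocabulary(confirmed_items: Dict[str, str]) -> List[str]:
    """Narrow the language dictionary using the dialogue progress information."""
    city = confirmed_items.get("city")
    if city in WARD_STEMS:
        # The city is confirmed, so the answer must end in "-ku";
        # only the stem before the suffix has to be recognized.
        return WARD_STEMS[city]
    # Nothing confirmed yet: fall back to the list of known cities.
    return sorted(WARD_STEMS)
```

Restricting the active vocabulary to the stems of one city is what shrinks the search space, which is the mechanism the paragraph credits for the improved recognition rate.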

[0020] The voice dialogue device according to claim 4 is the voice dialogue device according to any one of claims 1 to 3, wherein the work state detection unit comprises a worker state detection unit that detects the worker state, and/or a device state detection unit that detects the state of the work device, and/or a noise detection unit that detects ambient noise. The worker state detection unit directly detects, for example, the worker's spare capacity, which is derived from, for example, heart rate, posture changes, and the number of hand and foot movements. The device state detection unit directly detects the device state. When the device is a vehicle, the device state is, for example, the engine speed or the traveling speed. The noise detection unit detects the ambient noise level, because a high noise level lowers the recognition rate for continuous speech. Configuring the work state detection unit in this way raises the voice recognition rate. It also lets the dialogue control unit downstream of voice recognition further optimize the dialogue and conduct a more accurate voice dialogue.

[0021] The voice dialogue device according to claim 5 is the voice dialogue device according to claim 4, wherein the worker state detection unit detects the worker state from physiological indices of the worker. Physiological indices are parameters representing the worker's physiological state, for example heart rate and amount of perspiration, which change with tension. These indices change autonomically with the environment (work state) and allow, for example, the worker's spare capacity to be detected accurately. Applying them to dictionary selection in the subsequent voice recognition stage allows speech to be recognized more accurately. Applying them to the dialogue control unit via the workload estimation unit allows work and voice dialogue to be carried out more smoothly.

[0022] The voice dialogue device according to claim 6 is the voice dialogue device according to claim 5, wherein the physiological indices of the worker are heart rate, amount of perspiration, and gaze-related values. These are the physiological indices from which the worker's spare capacity and tension can be detected most easily. For example, when tension is high, heart rate and perspiration increase and the gaze becomes fixed. Sensors for detecting heart rate and perspiration can easily be attached to the worker, and a simple camera for monitoring the line of sight can easily be attached to the work device. Monitoring heart rate, perspiration, and gaze-related values with their respective sensors therefore makes it easy to grasp the worker's spare capacity and tension, and the voice dialogue device of claim 5 can be realized easily.
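
One hedged way to combine the three indices is to count how many of the tension signs named above are present relative to a resting baseline. The function below is a sketch under that assumption. The 15 % rise ratio, the 2-degree gaze threshold, and the use of gaze spread as the gaze-related value are illustrative choices that do not come from the specification.

```python
def tension_score(heart_rate: float, sweat_rate: float, gaze_spread_deg: float,
                  rest_heart_rate: float, rest_sweat_rate: float,
                  rise_ratio: float = 1.15, fixed_gaze_deg: float = 2.0) -> float:
    """Return the fraction (0.0 .. 1.0) of tension signs that are present."""
    signs = [
        heart_rate > rise_ratio * rest_heart_rate,  # heart rate has risen
        sweat_rate > rise_ratio * rest_sweat_rate,  # perspiration has increased
        gaze_spread_deg < fixed_gaze_deg,           # gaze has become fixed
    ]
    return sum(signs) / len(signs)
```

A score near 1.0 would correspond to high tension and little spare capacity, and could feed the workload estimation as one of its inputs.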

[0023] The voice dialogue device according to claim 7 is the voice dialogue device according to any one of claims 4 to 6, wherein the device state detection unit detects the device state from physical quantities of the work device and the frequency of use of the device. In general, the worker state and the state of the work device are correlated. For example, if the work device is a vehicle, the driver's spare capacity (the worker state) is low during high-speed driving and high during low-speed driving. The frequency of use of the device also correlates with the worker state: when switch operations and the like are frequent, the worker's spare capacity is small. The worker state can therefore also be grasped from the operating state of the work device, which allows it to be grasped more accurately. As a result, a voice dialogue device that makes work and dialogue compatible even more smoothly can be realized.

[0024] The voice dialogue device according to claim 8 is the voice dialogue device according to any one of claims 1 to 7, wherein the dialogue progress information consists of confirmation items and, for each item, one of three states expressing its degree of certainty: unknown, confirmation required, or confirmed. As described above, the device of the present invention holds dialogue progress information, which consists of the confirmation items and the degree of certainty of each. The degree of certainty has three levels: unknown (= 1), confirmation required (= 2), and confirmed (= 3). For example, if the confirmation item is the address and it is in the unknown state, the device asks for the address directly, e.g. 'Please say the address.' If the item is in the confirmation-required state, it is confirmed implicitly when the next item is asked, e.g. 'What is the name of the restaurant in Nagoya City?' The dialogue ends when every item to be asked has reached the confirmed state. Because the device advances the dialogue step by step using the dialogue progress information, it neither reduces work efficiency nor forces the work to stop. It can also track how far the dialogue has progressed and reliably carry it through.
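The three certainty levels and the rule for choosing what to do next can be sketched as a small data structure. This is a minimal illustration only: the names `Certainty` and `next_action` and the dictionary layout are assumptions made for the example and do not appear in the specification.

```python
from enum import IntEnum


class Certainty(IntEnum):
    UNKNOWN = 1              # nothing recognized for this item yet
    NEEDS_CONFIRMATION = 2   # a value was recognized but not yet confirmed
    CONFIRMED = 3            # the worker has accepted the value


def next_action(items):
    """items maps an item name to a (value, Certainty) pair.

    Returns (names to confirm implicitly, names to ask directly, finished?).
    """
    to_confirm = [n for n, (_, c) in items.items()
                  if c == Certainty.NEEDS_CONFIRMATION]
    to_ask = [n for n, (_, c) in items.items() if c == Certainty.UNKNOWN]
    finished = all(c == Certainty.CONFIRMED for _, c in items.values())
    return to_confirm, to_ask, finished
```

With the address recognized but unconfirmed and the restaurant name still unknown, the function returns the address as the item to confirm and the restaurant name as the item to ask, which corresponds to the question 'What is the name of the restaurant in Nagoya City?'.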

[0025] The voice dialogue device according to claim 9 is the device according to any one of claims 1 to 8, wherein, when the workload is small or medium, the dialogue control unit uses a confirmation item in the confirmation-required state to prompt confirmation of that item and, at the same time, sets a new question item. When the worker's workload is small or medium, the worker is considered to have enough margin both to check the recognition result for the previous question and to take in a new question. For example, if the worker's previous response was 'Piccolo in Nagoya City' and the new item is the business category, the device sets the question 'Piccolo in Nagoya City: what kind of shop is it?' Here 'Piccolo in Nagoya City' is the previous confirmation item and 'what kind of shop is it?' is the new question item. Repeating the previous item prompts the worker to confirm it implicitly: if the worker makes no correction, the previous recognition was correct; if the worker corrects it, the device asks again. A dialogue control unit of this kind can thus set a new question item while obtaining implicit acknowledgment of the previous one.

[0026] The voice dialogue device according to claim 10 is the device according to any one of claims 1 to 8, wherein, when the workload is large, the dialogue control unit sets the question as a single item in one-question-one-answer form. When the workload is large, the worker is considered to lack the margin to both check the previous response and take in a new question. For example, if the worker's previous response was 'Piccolo in Nagoya City' and the new question item is the business category, the device asks a single-sentence question such as 'What kind of shop is it?' or 'What is the business category?', without reference to the previous response. Alternatively, before asking the new question, it explicitly confirms 'Nagoya City' and 'Piccolo' one at a time, each in a single sentence: 'Is it Nagoya City?' or 'Is it Piccolo?' Using single sentences in this way reduces the worker's cognitive information-processing load, so the work can continue while the dialogue is also carried through. Work and voice dialogue can thus be reconciled even when the workload is large.
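The contrast between implicit confirmation (small or medium workload) and one-question-one-answer form (large workload) can be sketched as follows. The function name `build_prompts`, the workload labels, and the English sentence templates are assumptions for illustration. For the large-workload case the sketch takes the second option described above, explicit confirmation of each pending value before the new question.

```python
def build_prompts(workload, pending, question):
    """Return the system sentences to speak, one list entry per turn.

    workload: "small", "medium" or "large" (assumed labels)
    pending:  recognized values that still need confirmation
    question: the new question sentence, or None if nothing is left to ask
    """
    if workload == "large":
        # One short sentence per turn: explicit confirmations, then the question.
        prompts = [f"Is it {value}?" for value in pending]
        if question:
            prompts.append(question)
        return prompts
    if pending and question:
        # Repeat the pending values inside the new question (implicit confirmation).
        return [f"{', '.join(pending)}: {question}"]
    if pending:
        return [f"{', '.join(pending)}, is that right?"]
    return [question] if question else []
```

Under a small workload, `build_prompts("small", ["Nagoya City", "Piccolo"], "What kind of shop is it?")` yields a single combined sentence, whereas under a large workload the same inputs yield three separate single-sentence turns.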

[0027] The voice dialogue device according to claim 11 is the device according to any one of claims 1 to 10, wherein, when the workload during a predetermined time and/or after a predetermined time is estimated to be large, the dialogue control unit adjusts the dialogue style, the interval between dialogue turns, and/or the utterance length and speech rate so that the dialogue load does not increase during that time and/or after it. For example, when the work device is a vehicle, the work state detection unit may detect that the vehicle is accelerating. In that case the worker (driver) is considered to be under a large workload, in particular a large cognitive information-processing load, for the predetermined time until the speed change settles. Likewise, an in-vehicle navigation device may anticipate a left turn at an intersection after a predetermined time, in which case the driver's workload is expected to be large after that time. In the present invention, when such a large workload is estimated, the dialogue style, the interval between dialogue turns, and/or the utterance length and speech rate are adjusted. For example, the question is reduced to one simple sentence, the interval between turns is lengthened, or the utterance length and speech rate of the question are adjusted. Because the increase in workload then does not coincide with an increase in dialogue load, work and dialogue can both proceed more smoothly. Adjusting the turn interval, utterance length, or speech rate does not mean interrupting the dialogue: the dialogue is carried through to the last item regardless of the time required.
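One way to picture this adjustment is a function that maps the current and the predicted workload to delivery settings for the next system utterance. The sketch below is hypothetical: the `Delivery` fields and the numeric values (a 4-second pause, a 0.8 relative speech rate) are placeholders chosen for the example, and the specification does not state in which direction the speech rate is adjusted.

```python
from dataclasses import dataclass


@dataclass
class Delivery:
    single_sentence: bool  # restrict the prompt to one simple sentence
    pause_s: float         # interval before the next system prompt, in seconds
    rate: float            # relative speech rate, 1.0 = normal


def plan_delivery(load_now, load_predicted):
    """Choose delivery settings so dialogue load does not peak with workload."""
    if "large" in (load_now, load_predicted):
        # Placeholder values: shorter prompt, longer gap, slower speech.
        return Delivery(single_sentence=True, pause_s=4.0, rate=0.8)
    return Delivery(single_sentence=False, pause_s=1.0, rate=1.0)
```

Note that the function only changes how the next prompt is delivered. It never cancels the dialogue, in keeping with the statement that the dialogue is carried through to the last item.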

[0028] The voice dialogue device according to claim 12 is the device according to any one of claims 1 to 11, wherein the work state detection unit and the workload estimation unit run as a process independent of the voice recognition unit, the dialogue control unit, and the guidance generation unit. 'Independent' means that there is no master-slave relationship between the processes. Because each runs on its own, the voice recognition unit and the dialogue control unit, for example, can obtain work state information from the workload estimation unit immediately whenever they need it. This raises the response speed, so a voice dialogue device with excellent responsiveness can be realized.
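The idea of an estimator that runs on its own and a dialogue side that simply reads the latest value can be sketched with a lock-protected store. This example uses Python threads rather than separate operating-system processes, which is a simplification, and the names `WorkloadStore` and `run_estimator` are assumptions made for the example.

```python
import threading


class WorkloadStore:
    """Holds the most recent workload estimate for readers on other threads."""

    def __init__(self, initial="small"):
        self._lock = threading.Lock()
        self._value = initial

    def set(self, value):
        with self._lock:
            self._value = value

    def get(self):
        # Returns immediately with the latest stored estimate.
        with self._lock:
            return self._value


def run_estimator(store, samples):
    """Stand-in for the detection/estimation process: publish each estimate."""
    for sample in samples:
        store.set(sample)
```

The dialogue side calls `store.get()` whenever it needs the workload and never waits for a measurement cycle to finish, which is the responsiveness benefit the paragraph describes.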

[0029] The voice dialogue device according to claim 13 is the device according to any one of claims 1 to 12, wherein the work device is a vehicle and the worker is the driver of that vehicle. The device according to any one of claims 1 to 12 is characterized by enabling voice dialogue without hindering operation of the work device. When the work device is a vehicle, voice dialogue is therefore possible without hindering driving, and the safety of vehicle travel is not affected. In other words, the result is a voice dialogue device that guarantees safe driving.

[0030] The voice dialogue device according to claim 14 is the device according to claim 13, wherein the physical quantities of the work device detected by the work state detection unit, that is, the parameters indicating the vehicle state, include at least one of: the distance to the vehicle ahead and/or the vehicle behind, the vehicle position, speed, acceleration, turning speed, turning acceleration, and the accelerator/brake pedal depression amount. The distance to the vehicle ahead and/or behind expresses how close other vehicles are; the dialogue style, utterance length, and so on are adjusted according to this closeness. The vehicle position expresses, for example, whether the vehicle is in the driving lane or the overtaking lane. Since the driver's margin is presumed to differ by lane, the dialogue style, utterance length, and so on are adjusted according to the lane. Speed, acceleration, and accelerator/brake pedal depression amount indicate that the vehicle is accelerating or decelerating. The driver's margin is considered to change during acceleration or deceleration, so the dialogue style, utterance length, and so on are adjusted. Turning speed and turning acceleration indicate, for example, that the vehicle is turning right or left. The driver's margin is considered to be small while turning, so the same adjustments are made. Determining the vehicle state from physical quantities in this way makes voice dialogue input possible while safe driving is maintained.

[0031] The voice dialogue device according to claim 15 is the device according to claim 13 or claim 14, provided in an information providing device mounted on a vehicle. Examples of such devices are a navigation device, an audio device, and a telephone device. If these are equipped with the voice dialogue device according to claim 13 or 14, information can be retrieved from them smoothly while safe driving is maintained. The result is a voice dialogue device that offers both vehicle safety and the convenience of easy information retrieval.

[0032]

[Embodiments of the Invention] Preferred embodiments of the present invention are described below. Some aspects of the embodiments have already been described in the section on operation and effects above. (First Embodiment) The voice dialogue device of the present invention is described below with reference to the drawings. FIG. 1 shows one embodiment. The device comprises a voice input unit 10 that picks up the worker's voice, a voice recognition unit 12, a recognition language dictionary storage unit 11, a recognition acoustic dictionary storage unit 16, a database search unit 17, a dialogue control unit 13, a guidance generation unit 14, a voice output unit 15, a work state detection unit 20, and a workload estimation unit 25. Applying this device to an information providing device 30, such as a navigation device, yields a voice-interactive information providing device. Although not illustrated, the components other than the information providing device 30 are implemented by a computer consisting of a CPU, ROM, RAM, an A/D converter, a D/A converter, a system bus, an external bus, external storage memory, sensor devices, and a program stored in the ROM.

[0033] FIG. 2 shows the work state detection unit 20. It consists of a worker state detection unit 21, which detects physiological indices of the worker; a device state detection unit 22, which detects the operating state of the work device as physical quantities; and a noise detection unit 23, which detects the noise environment around the worker. The worker state detection unit 21 is, for example, an image processing device that monitors the worker's movements and pupil movement (a line-of-sight-related value). That is, it detects the worker's margin from body and gaze movement, and outputs the result as signal A to the workload estimation unit 25 of FIG. 1. The device state detection unit 22 is, for example when the work device is a vehicle, a sensor device that detects the accelerator/brake pedal depression amount, and outputs the detected amount as signal B to the workload estimation unit 25. The sensor device is not limited to pedal depression; it may also detect vehicle speed, acceleration, turning acceleration, and the like. Any sensor device that detects a physical quantity correlated with the worker's margin will do.

[0034] The noise detection unit 23 is, for example, a microphone device that detects noise around the worker, and outputs the result as signal C to the workload estimation unit 25. The workload estimation unit 25 estimates the worker's workload from the three signals A, B, and C. For example, when the accelerator/brake pedal depression amount is small, the workload is taken to be small; when either pedal is depressed by a large amount, the workload is taken to be large. This detection and estimation is performed at fixed intervals, for example by a timer interrupt. In this embodiment the work device is a vehicle, but this is only an example; it may be any device operated by a worker. The voice dialogue device of the present invention is applicable to a variety of equipment, such as aircraft, ships, production lathes, and automatic assembly machines. Similarly, the information providing device 30 is a navigation device here, but this too is only an example. The invention is applicable to any information providing device that can supply information in response to a worker's voice input, such as an audio device, a computer, or the central unit of an automatic production machine.

[0035] In FIG. 1, the voice input unit 10 consists of, for example, a microphone, an A/D converter, and a memory (none illustrated). It picks up the worker's voice with the microphone, converts it to digital data with the A/D converter, and stores the data in the memory. The voice recognition unit 12 extracts feature quantities from the digital voice signal sent from the voice input unit 10 and identifies the input speech by referring to the recognition language dictionary storage unit 11 and the recognition acoustic dictionary storage unit 16. In doing so, the voice recognition unit 12 obtains the workload from the workload estimation unit 25 and switches the recognition language dictionary and the recognition acoustic dictionary accordingly. This is because the worker's speech changes under workload, whether from noise or from tension. Switching dictionaries improves the recognition rate.
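Dictionary switching of this kind amounts to a lookup keyed by the estimated workload. The sketch below assumes three workload labels and uses placeholder dictionary identifiers; neither the identifiers nor the particular pairing is taken from the specification.

```python
# Hypothetical (language dictionary, acoustic dictionary) pairs per workload.
DICTIONARY_SETS = {
    "small": ("language_normal", "acoustic_quiet"),
    "medium": ("language_normal", "acoustic_noisy"),
    "large": ("language_short_phrases", "acoustic_stressed"),
}


def select_dictionaries(workload):
    """Return the dictionary pair to load before recognizing the next utterance."""
    return DICTIONARY_SETS[workload]
```

Keeping the mapping in a table rather than in branching code makes it simple to add further workload levels or dictionaries tuned to particular noise conditions.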

[0036] The dialogue control unit 13 searches the relevant database based on the recognition result from the voice recognition unit 12 and, based on the search result (for example, whether a matching entry exists in the database), successively builds the dialogue progress information. The dialogue progress information consists of dialogue information and progress information. The dialogue information consists of the confirmation or question items needed for the dialogue and a parameter indicating the degree of certainty of each. For example, if the information providing device 30 is a navigation device, the confirmation items are the destination's facility name, address, detailed address, business category, and so on, and the degree of certainty is a number expressing how sure each value is, for example 1, 2, or 3: certainty 1 is the unknown state, certainty 2 the confirmation-required state, and certainty 3 the confirmed state.

[0037] The progress information is the proportion of all confirmation items that are in the confirmed state. For example, if there are four confirmation items in total and two are confirmed, the progress is 50%. The dialogue ends when progress reaches 100%. The dialogue control unit advances the dialogue turn by turn, builds the dialogue progress information, and, taking the result from the workload estimation unit 25 into account, outputs the required items and the voice features for speech synthesis to the guidance generation unit 14. The guidance generation unit 14 creates a question sentence (question guidance) or a confirmation sentence (confirmation guidance) from the items supplied by the dialogue control unit 13, and passes the specified voice features to the voice output unit 15 in the next stage. When the progress in the dialogue progress information reaches 100%, it ends the dialogue and outputs, for example, the coordinates (latitude and longitude) of the destination to the target information providing device 30. The voice output unit 15 uses speech synthesis to output the question or confirmation sentence received from the guidance generation unit 14 with the specified voice features. The voice dialogue device of this embodiment is configured as described above.
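The progress figure is simple arithmetic over the certainty values. A minimal sketch follows, in which the function name `progress_percent` is an assumption and certainty 3 denotes the confirmed state as defined above.

```python
def progress_percent(certainties):
    """Percentage of confirmation items whose certainty is 3 (confirmed)."""
    if not certainties:
        return 0.0
    confirmed = sum(1 for c in certainties if c == 3)
    return 100.0 * confirmed / len(certainties)
```

For the example in the text, four items of which two are confirmed, `progress_percent([3, 3, 1, 2])` evaluates to 50.0, and the dialogue would end once the value reaches 100.0.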

[0038] Next, the operation of a voice-interactive information providing device that uses the voice dialogue device of this embodiment is described with reference to the flowchart of FIG. 3 and the dialogue example of FIG. 4. Here the device is described as an in-vehicle navigation device. As FIG. 3 shows, its operation divides into two processes: the process of the work state detection unit 20, shown as steps S30 to S32, and the voice dialogue process, shown as steps S10 to S20. The process of the work state detection unit 20 is a program driven, for example, by a timer interrupt every 3 seconds to detect the workload. In step S30 it detects the work state, which is output as the worker state (signal A), the device state (signal B), and the noise state around the worker (signal C) described above. In step S31 it estimates the workload from signals A, B, and C. For example, the amount of gaze change (signal A), the accelerator/brake depression amount (signal B), and the noise level (signal C) are combined to classify the workload as, say, large, medium, or small. In step S32 the result is stored in memory as the workload at that moment, and the process ends. The workload is a quantity that changes from moment to moment.
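Step S31 can be pictured as a function that combines the three signals into one of three classes. The specification does not say how the signals are combined, so the sketch below makes several assumptions: each signal is normalized to the range 0 to 1, the largest of the three decides the class, and the thresholds 0.3 and 0.7 are arbitrary placeholders.

```python
def estimate_workload(gaze_change, pedal, noise):
    """Classify workload from signals A, B and C (each assumed to be in 0..1).

    Assumption: the most demanding signal dominates, so the maximum is used.
    """
    score = max(gaze_change, pedal, noise)
    if score >= 0.7:   # placeholder threshold
        return "large"
    if score >= 0.3:   # placeholder threshold
        return "medium"
    return "small"
```

This is consistent with the pedal example given earlier, where a large depression of either pedal leads to a large workload, but a weighted sum or a learned classifier would fit the description equally well.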

[0039] The voice dialogue process, on the other hand, starts at step S10, where the dialogue progress information described above is initialized. For example, as shown in the top row of FIG. 4(b), it is initialized to facility name (?, 1), address 1 (?, 1), address 2 (?, 1), business category (?, 1). The process then moves to step S11, where the dialogue control unit 13 obtains the current workload stored in step S32. This is done because the worker's voice characteristics change with the workload, for example with the worker's degree of tension. In step S12, the voice recognition unit 12 selects the recognition language dictionary and recognition acoustic dictionary that correspond to the obtained workload. This improves the recognition rate in the later recognition step (step S18).

[0040] The process then moves to step S13, where the dialogue control unit 13 selects question items and/or confirmation items based on the dialogue progress information and the workload. Immediately after initialization the dialogue progress information is facility name (?, 1), address 1 (?, 1), address 2 (?, 1), business category (?, 1), as in the top row of FIG. 4(b), so the unit selects, for example, facility name (?, 1) and address 1 (?, 1) and moves to step S14. In step S14 the guidance generation unit 14 receives these items and creates the first guidance, 'Please say the name and address of the shop.' In step S15 this is output by speech synthesis (Sys1 in FIG. 4(a)). FIG. 4(a) is an example of a dialogue obtained with this device, and FIG. 4(b) shows how the dialogue progress information changes as that dialogue advances.

[0041] The process then moves to step S16, where it is determined whether the dialogue has ended. If not, the process moves to step S17. How the end is determined is described later. In step S17 the device waits for voice input from the driver; when input arrives, it is captured and the process moves to step S18. In step S18 the voice recognition unit 12 performs recognition using the recognition language dictionary and recognition acoustic dictionary selected in step S12 to suit the workload. For example, from the driver's utterance (Drv1) 'Er, Piccolo in Nagoya City, please', it recognizes the facility name 'Piccolo' and address 1 'Nagoya City'. The process then moves to step S19.

[0042] In step S19 the dialogue control unit 13 checks, by database search, whether the recognition result from step S18 exists in the database, and updates the current dialogue progress information to facility name (Piccolo, 2), address 1 (Nagoya City, 2), address 2 (?, 1), business category (?, 1) (FIG. 4(b)). The process then returns to step S11 and repeats the routine. Take step S13 onward in the second pass as an example. If the workload is small, the driver is considered to have ample margin, and since the facility name and address 1 were left in the confirmation-required state (2) by the first pass, the unit selects 'Nagoya City' and 'Piccolo' as items to be confirmed implicitly and another item, for example the business category, as the question item. That is, it selects 'Nagoya City' and 'Piccolo', which are in the confirmation-required state, together with 'business category', which is in the unknown state (1), and moves to step S14.

[0043] In the second pass through step S14, guidance is generated from the selected items. For example, if 'Nagoya City' and 'Piccolo' are the items requiring confirmation and the business category is the question item, the question sentence 'Piccolo in Nagoya City: what kind of shop is it?' (Sys2) is set. In step S15 it is output to the driver by speech synthesis with the specified voice features. The dialogue repeats in this way and the dialogue progress information is updated each time. That is, the further exchanges shown from Drv2 onward in FIG. 4(a) take place, and the progress in the dialogue progress information of FIG. 4(b) finally reaches 100%.

[0044] How the dialogue ends is determined from this dialogue progress degree information. If the degree of confirmation of every item in the dialogue progress degree information is 3, the dialogue is judged to be complete; in other words, the dialogue ends when the progress reaches 100%. At that point the dialogue progress degree information is facility name (Piccolo, 3), address 1 (Nagoya City, 3), address 2 (Naka Ward, 3), and business type (restaurant, 3). The process then moves to the final step S20. In step S20, the converted data (latitude, longitude) for 'Restaurant Piccolo' is output to the information providing device 30, which is, for example, a navigation device. The information providing device 30 then outputs, for example, map information for 'Restaurant Piccolo' to a display device (not shown). This is how a voice-interactive information providing device using the voice interaction device of the present invention operates.
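The end-of-dialogue test described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Slot` class and both function names are hypothetical, and the rule that progress equals the share of items at degree 3 is an assumption (the text only states that all items at degree 3 corresponds to 100%).

```python
from dataclasses import dataclass
from typing import List, Optional

# Degrees of confirmation as used in the description.
UNCONFIRMED, NEEDS_CONFIRMATION, CONFIRMED = 1, 2, 3


@dataclass
class Slot:
    """One item of the dialogue progress degree information (hypothetical type)."""
    name: str
    value: Optional[str]  # None stands for the "?" placeholder
    degree: int = UNCONFIRMED


def progress_percent(slots: List[Slot]) -> int:
    # Assumption: progress is the share of items that have reached degree 3.
    confirmed = sum(1 for s in slots if s.degree == CONFIRMED)
    return 100 * confirmed // len(slots)


def dialogue_finished(slots: List[Slot]) -> bool:
    # The dialogue ends when every item has degree of confirmation 3.
    return all(s.degree == CONFIRMED for s in slots)


after_first_pass = [
    Slot("facility name", "Piccolo", NEEDS_CONFIRMATION),
    Slot("address 1", "Nagoya City", NEEDS_CONFIRMATION),
    Slot("address 2", None),
    Slot("business type", None),
]
final_state = [
    Slot("facility name", "Piccolo", CONFIRMED),
    Slot("address 1", "Nagoya City", CONFIRMED),
    Slot("address 2", "Naka Ward", CONFIRMED),
    Slot("business type", "restaurant", CONFIRMED),
]
```

With these two example states, only `final_state` satisfies `dialogue_finished`, which is the point at which the process would move on to step S20.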

[0045] As described above, in this embodiment the recognition language dictionary and the recognition acoustic dictionary are selected according to the work state, so voice recognition is performed accurately. Furthermore, because the dialogue control unit changes the dialogue method according to the work state, the worker does not have to stop operating the working device and work efficiency does not drop. That is, the work and the voice dialogue can proceed together. In particular, when the present invention is applied to a vehicle as described above, voice dialogue input becomes possible while the driver's driving operation is kept safe. In other words, dialogue input is possible while safe driving is maintained.

[0046] In the second pass through step S13, the work load was taken to be small, 'Nagoya City' and 'Piccolo' were the confirmation-required items, the business type was the additional question item, and the question response sentence 'What kind of store is Piccolo in Nagoya City?' was set. This sentence was derived from the correspondence table between the estimated work load and the dialogue control method shown in FIG. 5. Sentences for other work loads are derived from the same table. In brief, there are the following three cases.
(1) When the estimated work load is small, the dialogue control method uses multiple question items or, if any item is in the confirmation-required state, implicit confirmation items and a single question item. For example: 'Please say the name and address of the store' (multiple question items). If an item is in the confirmation-required state: 'What kind of store (business type) is Piccolo in Nagoya City?' (implicit confirmation items + multiple or single question items).
(2) When the work load is medium, the dialogue control method uses a single question item or, if any item is in the confirmation-required state, an implicit confirmation item and a single question item. For example: 'Please say the name of the store' (single question item). If an item is in the confirmation-required state: 'What is the name of the store (facility name) in Nagoya City?' (implicit confirmation item + single question item).
(3) When the work load is large, the dialogue control method uses a single question item or, if any item is in the confirmation-required state, an explicit confirmation item. For example: 'Please say the name of the store' (single question item), or 'Is it Nagoya City?' (explicit confirmation item).
In step S13, items are selected according to the correspondence table shown in FIG. 5, and the dialogue control method is determined accordingly. In the figure, "slot" is used to mean "item".
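The three cases above can be read as a small lookup from work load level to dialogue control method. The sketch below is one possible reading, not the contents of FIG. 5 itself: the function name `select_items`, the string labels, and the choice to take items in dictionary order are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Each slot maps to (value, degree of confirmation); 1 = unconfirmed,
# 2 = confirmation required, 3 = confirmed, as in the description.
Slots = Dict[str, Tuple[Optional[str], int]]


def select_items(workload: str, slots: Slots) -> Tuple[str, List[str], List[str]]:
    """Return (confirmation style, items to confirm, items to ask)."""
    pending = [name for name, (_, degree) in slots.items() if degree == 2]
    unknown = [name for name, (_, degree) in slots.items() if degree == 1]
    # Assumption: when several items qualify, they are taken in dictionary order.
    if workload == "small":
        if pending:
            return "implicit", pending, unknown[:1]
        return "none", [], unknown            # multiple question items
    if workload == "medium":
        if pending:
            return "implicit", pending[:1], unknown[:1]
        return "none", [], unknown[:1]        # single question item
    if workload == "large":
        if pending:
            return "explicit", pending[:1], []  # e.g. 'Is it Nagoya City?'
        return "none", [], unknown[:1]        # single question item
    raise ValueError(f"unknown work load level: {workload!r}")


example = {
    "facility name": ("Piccolo", 2),
    "address 1": ("Nagoya City", 2),
    "address 2": (None, 1),
    "business type": (None, 1),
}
```

For the `example` state and a small work load, both pending items are confirmed implicitly and one further item is asked, matching the shape of the Sys2 sentence; under a large work load only one explicit confirmation is issued.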

[0047] (Second Embodiment) In the first embodiment, the voice recognition unit 12 also used the work load estimated by the work load estimation unit 25 as a parameter for selecting the recognition language dictionary and/or the recognition acoustic dictionary. Alternatively, the dialogue progress degree information created by the dialogue control unit 13 in the following stage can be fed back to the voice recognition unit 12 and used as a parameter. That is, with the configuration shown in FIG. 6, the dialogue control unit 13 may feed this information back to the voice recognition unit 12 as signal D. For example, the dialogue progress degree information of the first embodiment includes a case of address 1 (Nagoya City, 3) and address 2 (?, 1). Once Nagoya City in the destination address has reached the confirmed state 3, the address 2 item is necessarily a ward name of the form "XX-ku". The final sound of the utterance is therefore 'ku' and does not need to be recognized. The feedback thus fixes the ending to 'ku' (ward) and directs recognition to the remainder. This improves both the recognition rate and the recognition speed. Similarly, when business type (restaurant, 3), address 1 (Nagoya City, 3), and address 2 (Naka Ward, 3) have been obtained, the voice recognition unit 12 can refer to a database of restaurants in Naka Ward when recognizing the facility name. The recognition result can then be the store name in that database that matches, or is closest to, the utterance. The recognition rate improves in this case as well.
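The two feedback examples in this paragraph can be illustrated on text hypotheses. This is only a sketch under stated assumptions: a real recognizer would apply the constraints to acoustic and language models, whereas here string matching with Python's standard `difflib` stands in for acoustic similarity, and the function names and record layout are hypothetical.

```python
import difflib
from typing import Dict, List, Optional


def constrain_ward(hypotheses: List[str]) -> List[str]:
    # With address 1 confirmed as a city with wards, keep only
    # hypotheses whose ending is 'ku' (ward).
    return [h for h in hypotheses if h.endswith("ku")]


def closest_facility(heard: str, facilities: List[Dict[str, str]],
                     business_type: str, ward: str) -> Optional[str]:
    # Narrow the database to the confirmed business type and ward,
    # then return the entry closest to what was heard.
    names = [f["name"] for f in facilities
             if f["type"] == business_type and f["ward"] == ward]
    # cutoff=0.0 returns the best candidate even for a weak match.
    best = difflib.get_close_matches(heard, names, n=1, cutoff=0.0)
    return best[0] if best else None


facilities = [
    {"name": "Piccolo", "type": "restaurant", "ward": "Naka-ku"},
    {"name": "Piccadilly", "type": "restaurant", "ward": "Higashi-ku"},
    {"name": "Pico Books", "type": "bookstore", "ward": "Naka-ku"},
    {"name": "Trattoria Roma", "type": "restaurant", "ward": "Naka-ku"},
]
```

Narrowing the candidate set first means the matcher only has to discriminate between the restaurants in the confirmed ward, which is the source of the recognition-rate gain the paragraph describes.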

[0048] (Modifications) The above embodiments are only examples, and various modifications are possible. For example, in the first embodiment, one component of the work load at a given moment was determined from the accelerator/brake pedal depression amount. However, even after the accelerator goes from ON to OFF, the vehicle remains at high speed for several seconds, so the work load is expected to stay large, as shown in FIG. 7 (arrow). In such a case, the work load may be corrected to large even if its value at that moment is medium, and the interval between the system's question response Sys2 and confirmation response Sys3 may be set longer than usual. In other words, the settings are chosen so that the work load (the driver's tension) and the response load (the driver's cognitive information processing load) do not coincide, that is, so that the dialogue load does not increase. This further reduces the load on the driver and helps maintain safe driving.
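The correction described in this modification can be sketched as below. The hold time of 3 seconds, the base interval, and the stretch factor are hypothetical values chosen for illustration (the text says only "several seconds" and "longer than usual"), and the sketch raises only a medium work load, since that is the only case the text mentions.

```python
from typing import Optional, Tuple


def dialogue_timing(workload: str,
                    seconds_since_accel_off: Optional[float],
                    hold_seconds: float = 3.0,
                    base_interval: float = 1.0,
                    stretch: float = 2.0) -> Tuple[str, float]:
    """Return (corrected work load, interval in seconds before the next system prompt)."""
    recently_released = (seconds_since_accel_off is not None
                         and seconds_since_accel_off < hold_seconds)
    if workload == "medium" and recently_released:
        # The vehicle is still at high speed: treat the load as large
        # and leave a longer gap between system prompts.
        return "large", base_interval * stretch
    return workload, base_interval
```

Passing `None` for `seconds_since_accel_off` represents the case where the accelerator has not been released, so no correction is applied.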

[0049] The worker state detection unit 22 of the first embodiment was an image processing device that monitors the movement of the driver's pupils, a line-of-sight-related value, as the worker's physiological index. It may instead be, for example, a heart rate sensor or a perspiration sensor mounted on the steering wheel, and the same effect is obtained. In short, any sensor device whose output correlates with the worker's margin or tension will do. Likewise, the device state detection unit 23 of the first embodiment used the vehicle's accelerator/brake pedal depression amount as the detected physical quantity, but the depression frequency may be used instead. In short, any parameter of the device (vehicle) that correlates with the driver's load will do.

[0050] The first and second embodiments did not address the order in which items are selected; the order is arbitrary. For example, in the dialogue example of FIG. 7 the items are settled in the order address 1, address 2, facility name, business type, whereas in the dialogue example of FIG. 8 they are settled in the order address 1, facility name, address 2, business type. All that matters is that the dialogue progress degree information finally reaches 100%; the order is irrelevant.

[0051] In the first embodiment, the voice recognition unit 12 had multiple recognition language dictionaries and recognition acoustic dictionaries and selected the optimal ones based on the estimate from the work load estimation unit 25. The voice recognition unit 12 may instead refer directly to the detection result of the work state detection unit 20 and select the optimal dictionaries on that basis (FIG. 9). For example, the ambient noise level or the worker state may be obtained directly from the work state detection unit 20 and the corresponding dictionary selected. Because the work state is grasped directly, a more appropriate dictionary can be selected. In addition, the system configuration of FIG. 9 bypasses the processing by the work load estimation unit 25, so the recognition language dictionary and recognition acoustic dictionary can be selected quickly. In this case, in step S12 of FIG. 3 the voice recognition unit 12 refers to the detection result of the work state detection unit 20 rather than to the work load from the work load estimation unit 25.
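Selecting an acoustic dictionary directly from a detected noise level might look like the following sketch. The decibel limits and dictionary names are invented for illustration; the text does not specify how many dictionaries exist or where their boundaries lie.

```python
from typing import List, Tuple

# Hypothetical table: (upper noise limit in dB, dictionary name), ascending.
ACOUSTIC_DICTIONARIES: List[Tuple[float, str]] = [
    (50.0, "quiet-cabin"),
    (70.0, "moderate-noise"),
    (float("inf"), "high-noise"),
]


def select_acoustic_dictionary(noise_db: float,
                               table: List[Tuple[float, str]] = ACOUSTIC_DICTIONARIES) -> str:
    # Pick the first dictionary whose noise range covers the measured level.
    for upper_limit, name in table:
        if noise_db <= upper_limit:
            return name
    # Fall back to the last entry if the table has no open-ended range.
    return table[-1][1]
```

Because the lookup reads the detector output directly, no work load estimate is needed before recognition starts, which corresponds to the shortcut shown in FIG. 9.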

[0052] Also, in the first embodiment, the voice recognition unit 12 had multiple recognition language dictionaries and recognition acoustic dictionaries and selected the optimal ones based on the estimate from the work load estimation unit 25. However, if, for example, the work environment and the worker are always the same, a single dictionary of each kind may be used. That is, the voice recognition unit 12 may always perform recognition with the same dictionaries without referring to the estimate from the work load estimation unit 25 (FIG. 10). An equivalent effect is obtained. In this case, step S12 in FIG. 3 is omitted.

[Brief description of drawings]

FIG. 1 is a system block diagram of the voice interaction device according to the first embodiment of the present invention.

FIG. 2 is a configuration diagram of the work state detection unit according to the first embodiment of the present invention.

FIG. 3 is a flowchart showing the operation of the voice interaction device according to the first embodiment of the present invention.

FIG. 4 shows (a) a dialogue example produced by the voice interaction device according to the first embodiment of the present invention and (b) the resulting changes in the dialogue progress degree information.

FIG. 5 is a correspondence table between the estimated work load and the dialogue control method for the voice interaction device according to the first embodiment of the present invention.

FIG. 6 is a system block diagram of the voice interaction device according to the second embodiment of the present invention.

FIG. 7 is a modification of the dialogue example according to the first embodiment of the present invention.

FIG. 8 is a modification of the dialogue example according to the first embodiment of the present invention.

FIG. 9 is a system block diagram of a voice interaction device according to a modification of the first embodiment of the present invention.

FIG. 10 is a system block diagram of a voice interaction device according to a modification of the first embodiment of the present invention.

FIG. 11 is a system block diagram of a conventional voice interaction device.

[Explanation of symbols]

10 ... Voice input unit
11 ... Recognition language dictionary storage unit
12 ... Voice recognition unit
13 ... Dialogue control unit
14 ... Guidance generation unit
15 ... Voice output unit
16 ... Recognition acoustic dictionary storage unit
17 ... Database search unit
18 ... Database
20 ... Work state detection unit
21 ... Worker state detection unit
22 ... Device state detection unit
23 ... Noise detection unit
30 ... Information providing device

Continued from front page

(72) Inventor: Iko Terasawa, c/o Toyota Central R&D Labs Inc, 41-1 Aza Yokomichi, Oaza Nagakute, Nagakute-cho, Aichi-gun, Aichi Prefecture
(72) Inventor: Hiroyuki Hoshino, c/o Toyota Central R&D Labs Inc, 41-1 Aza Yokomichi, Oaza Nagakute, Nagakute-cho, Aichi-gun, Aichi Prefecture
(72) Inventor: Toshihiro Wakita, c/o Toyota Central R&D Labs Inc, 41-1 Aza Yokomichi, Oaza Nagakute, Nagakute-cho, Aichi-gun, Aichi Prefecture

F-term (reference): 5D015 HH06 HH13 KK02 LL10

Claims

[Claims]

1. A voice interaction device that uses voice recognition technology to enable voice input during work, comprising: a work state detection unit; a work load estimation unit that estimates the work load of a worker based on a detection result of the work state detection unit; a voice recognition unit that recognizes input from a voice input unit using one or more recognition language dictionaries and one or more recognition acoustic dictionaries; a dialogue control unit that, based on the recognition result of the voice recognition unit, a database search result corresponding to the recognition result, dialogue progress degree information composed of the database search result and its degree of confirmation, and the level of the work load estimated by the work load estimation unit, selects confirmation items and/or question items and voice features that do not increase the work load of the worker; and a guidance generation unit that generates confirmation guidance and/or question guidance for the worker based on the result from the dialogue control unit.

2. The voice interaction device according to claim 1, wherein the voice recognition unit selects the optimal recognition language dictionary and recognition acoustic dictionary from the plurality of recognition language dictionaries and the plurality of recognition acoustic dictionaries based on the estimation result of the work load estimation unit, and performs recognition.

3. The voice interaction device according to claim 1, wherein the voice recognition unit selects the optimal recognition language dictionary and recognition acoustic dictionary from the plurality of recognition language dictionaries and the plurality of recognition acoustic dictionaries based on the detection result of the work state detection unit and/or the dialogue progress degree information of the dialogue control unit, and performs recognition.

4. The voice interaction device according to any one of claims 1 to 3, wherein the work state detection unit comprises a worker state detection unit that detects the state of the worker, a device state detection unit that detects the state of a working device, and/or a noise detection unit that detects ambient noise.

5. The voice interaction device according to claim 4, wherein the worker state detection unit detects the worker state from a physiological index of the worker.

6. The voice interaction device according to claim 5, wherein the physiological index of the worker is a heart rate, a perspiration amount, and a line-of-sight-related value.

7. The voice interaction device according to claim 4, wherein the device state detection unit detects the device state based on a physical quantity of the working device and an operation frequency of the device.

8. The voice interaction device according to claim 7, wherein the dialogue progress degree information is composed of a confirmation item and one of an unknown state, a confirmation-required state, and a confirmed state indicating the degree of confirmation of that item.

9. The voice interaction device according to any one of claims 1 to 8, wherein, when the work load is small or medium, the dialogue control unit uses an item in the confirmation-required state to prompt confirmation of that item and also sets a new question item.

10. The voice interaction device according to any one of claims 1 to 8, wherein, when the work load is large, the dialogue control unit sets a question item in a single question-and-answer form.

11. The voice interaction device according to claim 10, wherein, when the work load during a predetermined time and/or after a predetermined time is estimated to be large, the dialogue control unit adjusts the dialogue method, the dialogue time interval, and/or the utterance length and speech speed so that the dialogue load does not increase.

12. The voice interaction device according to any one of claims 1 to 11, wherein the work state detection unit and the work load estimation unit run as processes independent of the voice recognition unit, the dialogue control unit, and the guidance generation unit.

13. The voice interaction device according to claim 1, wherein the working device is a vehicle and the worker is the driver of the vehicle.

14. The voice interaction device according to claim 13, wherein the physical quantity of the working device includes at least one of the inter-vehicle distance to a preceding vehicle and/or a following vehicle, the position of the vehicle, its speed, acceleration, turning speed, turning acceleration, and accelerator/brake pedal depression amount.

15. The voice interaction device according to claim 13, wherein the voice interaction device is provided in an information providing device mounted on the vehicle.
15. The voice interactive apparatus according to claim 13, wherein the voice interactive apparatus is provided in an information providing device mounted on the vehicle.