JPH0883097A

JPH0883097A - Device operation instruction method and device for executing the method

Info

Publication number: JPH0883097A
Application number: JP6216882A
Authority: JP
Inventors: Masanobu Abe; 匡伸阿部
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1994-09-12
Filing date: 1994-09-12
Publication date: 1996-03-26

Abstract

PURPOSE: To improve the efficiency of the human interface by referring to past synthesized voice instruction outputs when outputting synthesized voice instructions based on the recognition of human motions and operations and of machine operations. CONSTITUTION: When giving instructions for putting a car into a garage, for example, a distance calculation section 201 receives the data obtained by a sensor and computes the distance between the car body and the garage wall. Instruction voice storage memories 202 to 205 store the instruction voices synthesized and output in the past, or their contents. A situation judgment processing section 206 judges the situation based on the distance computed by the section 201 and the past instruction voice outputs, and decides which response sentence to generate. The synthesized voice response sentences are registered beforehand in a response voice storage device 207. A response sentence selection processing section 208 selects, from the sentences stored in the device 207, the synthesized response sentence corresponding to the judgment result of the section 206, and outputs it.

Description

Detailed Description of the Invention

[0001]

[Industrial field of application] The present invention relates to a device operation instruction method and an apparatus for carrying out the method, and more particularly to a device operation instruction method, and an apparatus for carrying out the method, that improve the efficiency of the human interface by recognizing human motions and/or operations and the operation of a device and outputting appropriate synthesized voice based on the recognition result.

[0002]

[Prior art] As computers are downsized, opportunities to use them increase, and it is no longer unusual to use a computer for form preparation, document preparation, and other everyday tasks. Cases in which beginners unfamiliar with computers use them are also increasing. In view of this situation, the human interface of the computer becomes important.

[0003] For example, a person who prepares forms as a daily task may make a careless mistake out of familiarity with the work, or may fill in the wrong field out of fatigue from continuous work. In such cases, if the error is pointed out and the way to deal with it is given by voice, recovery from the error becomes easy and the waste of continuing erroneous work can be avoided. Errors can also be pointed out visually, but because the operator's eyes are always on the data to be entered, voice is considered more effective for making the operator notice an error immediately. Voice can also be better than vision for giving various instructions to a beginner unfamiliar with computers. For example, because a computer screen displays a wide variety of information, a beginner struggles even to find where on the screen the needed information is displayed. In such cases, well-timed voice instructions allow the beginner to learn to operate the computer smoothly.

[0004] Instruction by voice output is effective not only in exchanges with computer equipment but also when applied to cooperative work between a human and equipment in general, or to monitoring a human's operation and cautioning the operator about it. For example, when a driver backs a car into a garage, the distance between the car and the garage wall is measured, and steering instructions corresponding to that distance, such as "turn to the right" or "you can back up so many more centimeters", are output by voice. It is known from everyday experience that operation instructions given by voice in this way are very convenient.

[0005]

[Problem to be solved by the invention] As described above, giving instructions by voice output in order to improve the efficiency of the human interface between a device and the person operating it has itself been practiced conventionally. However, conventional device operation instruction devices that use voice output merely output messages registered in advance, and can hardly be said to output voice appropriate to various situations. Simply outputting registered messages without judging the situation cannot be expected to improve the efficiency of the human interface greatly.

[0006] The present invention provides a device operation instruction method, and an apparatus for carrying out the method, that improve the efficiency of the human interface by recognizing human motions and/or operations and the operation of a device and outputting appropriate voice based on the recognition result.

[0007]

[Means for solving the problem] In a device operation instruction method that recognizes motions and/or operations and the operation of a device and generates a synthesized voice instruction output based on the recognition result, a device operation instruction method is configured that generates the synthesized voice instruction output with reference to past synthesized voice instruction outputs. Further, a device operation instruction device is configured that comprises detection devices 101 and 102 for detecting motions and/or operations and the operation of the device, instruction output memories 202 to 205 for storing past synthesized voice instruction outputs, and a synthesized voice response sentence generation device 207, 208, 209 for generating the synthesized voice instruction output based on the detection results of the detection devices and the stored contents of the instruction output memories.
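
The instruction output memory described above can be pictured as a small fixed-size buffer of recent outputs. The following is only a minimal sketch under stated assumptions: the class name `InstructionOutputMemory`, its methods, and the sample instruction strings are hypothetical, and the default of four slots simply mirrors the four reference numerals 202 to 205.

```python
from collections import deque
from typing import List, Optional


class InstructionOutputMemory:
    """Hypothetical stand-in for the instruction output memories 202 to 205.

    Keeps only the most recent instructions; older ones are discarded.
    """

    def __init__(self, slots: int = 4) -> None:
        # deque(maxlen=...) drops the oldest entry automatically when full.
        self._slots = deque(maxlen=slots)

    def record(self, instruction: str) -> None:
        """Store an instruction that has just been output."""
        self._slots.append(instruction)

    def latest(self) -> Optional[str]:
        """Return the most recent instruction, or None if nothing was output."""
        return self._slots[-1] if self._slots else None

    def contents(self) -> List[str]:
        """Return the stored instructions, oldest first."""
        return list(self._slots)


memory = InstructionOutputMemory()
for text in ["back up", "turn right", "turn right", "turn more", "stop"]:
    memory.record(text)
```

After five instructions have been recorded, the oldest one ("back up") is no longer held, so a generator that consults `memory.latest()` or `memory.contents()` only sees the recent past.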

[0008] Also, a device operation instruction device is configured in which the synthesized voice response sentence generation device is an edited voice synthesis device or a rule-based voice synthesis device. Furthermore, a device operation instruction device is configured in which the synthesized voice response sentence generation device has a voice conversion processing unit 209 that changes the shape of the fundamental frequency pattern of the voice and/or the durations of phonemes.
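
As a rough illustration of what a voice conversion step of this kind might do, the sketch below applies the simplest possible change: it scales every point of a fundamental frequency contour and every phoneme duration by a constant factor. The `Utterance` type, the `convert_voice` function, and the numeric values are all assumptions made for illustration; the text does not specify how the pattern shape or the durations are actually modified.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    """Hypothetical prosody description of one response sentence."""
    f0_hz: List[float]         # fundamental frequency contour, one value per frame
    durations_ms: List[float]  # duration of each phoneme


def convert_voice(utt: Utterance, f0_scale: float = 1.0,
                  duration_scale: float = 1.0) -> Utterance:
    """Return a new utterance with scaled pitch and phoneme durations.

    Uniform scaling is an assumed simplification; a real converter could
    reshape the contour in other ways.
    """
    return Utterance(
        f0_hz=[f * f0_scale for f in utt.f0_hz],
        durations_ms=[d * duration_scale for d in utt.durations_ms],
    )


neutral = Utterance(f0_hz=[100.0, 120.0], durations_ms=[80.0, 100.0])
# Higher pitch and shorter phonemes, intended to sound more urgent.
urgent = convert_voice(neutral, f0_scale=1.5, duration_scale=0.5)
```

The original utterance is left unchanged, so the same stored sentence can be rendered in several variants.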

[0009]

[Embodiment] An embodiment of the present invention will be described with reference to FIGS. 1 and 2. FIG. 1 shows the overall configuration of the device operation instruction device of the present invention. FIG. 2 shows the synthesized voice response sentence generation device that forms part of the device operation instruction device. In FIG. 1, 101 denotes an input device such as a keyboard, mouse, or touch panel, and 102 denotes a sensor that detects physical quantities, such as a magnetic sensor, optical sensor, image recognition device, or voice recognition device. These input devices and sensors provide the data needed to recognize human motions, human operations, and the operation of the device. 103 denotes an output device that outputs sound and/or images and is used to output instruction voice or to display the state of the device itself. 104 denotes a main unit, which consists of a storage device 105 and an arithmetic processing unit 106. Based on the data detected by the input device 101 and the sensor 102, the main unit 104 recognizes human motions, human operations, and the operation of the device, and generates response sentences. 107 denotes a disk device, a storage device that stores the data and programs used by the main unit 104.

[0010] Here, the manner of giving device operation instructions will be described concretely, taking the parking of a car in a garage as an example. Referring to FIG. 2, 201 denotes a distance calculation unit, which receives the data obtained by the sensor 102 and calculates the distance between the car body and the garage wall. 202 to 205 denote instruction voice storage memories, which store the synthesized instruction voice output in the past, or its content. In this example, four instruction voices output in the past are stored. 206 denotes a situation judgment processing unit, which judges the situation based on the distance between the car body and the wall calculated by the distance calculation unit 201 and on the past instruction voice outputs, and decides what response sentence should be generated. For example, it generates synthesized voice response sentences such as "10 cm to the rear wall" or "turn the wheel a little more to the right". Situation judgment using past instruction voice outputs means referring to the earlier voice instruction and giving an instruction that emphasizes it further: if the driver is turning left even though the instruction "turn right" was given immediately before, the instruction "right, right" is given, and if the amount of turning is too small, the instruction "turn more" is given. Such synthesized voice response sentences are assumed to be registered in advance in the response voice storage device 207. A rule-based voice synthesis device can also be used instead of registered voice. 208 denotes a response sentence selection processing unit, which selects and outputs, from the synthesized voice response sentences stored in the response voice storage device 207, the appropriate sentence corresponding to the judgment result of the situation judgment processing unit 206. 209 denotes a voice conversion processing unit, which changes the fundamental frequency and duration of the voice to increase the variety of response sentences. For an instruction such as "right, right", raising the fundamental frequency and shortening the duration can produce an expression that conveys tension.
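
The situation judgment in the garage example can be sketched roughly as follows. This is not the patented logic, only one possible reading of it: the function `judge_situation`, the target steering angle, the tolerance, the sign convention (positive means right), and the exact response wordings are all hypothetical. The text states only that the distance and the past instruction outputs are used to decide the response.

```python
from collections import deque
from typing import Deque


def judge_situation(distance_cm: float, steering_deg: float, target_deg: float,
                    history: Deque[str], tolerance_deg: float = 5.0) -> str:
    """Pick a response sentence from the current state and the last instruction.

    Angles are positive to the right. target_deg and tolerance_deg are
    assumed inputs for illustration, not values given in the text.
    """
    error = target_deg - steering_deg
    previous = history[-1] if history else None
    if abs(error) <= tolerance_deg:
        # Steering is acceptable: report the remaining distance.
        response = f"{distance_cm:.0f} cm to the rear wall"
    else:
        wanted = "right" if error > 0 else "left"
        opposite = steering_deg < 0 if wanted == "right" else steering_deg > 0
        if previous is None or wanted not in previous:
            response = f"turn {wanted}"               # first request
        elif opposite:
            response = f"{wanted}, {wanted}"          # driver turned the wrong way
        else:
            response = f"turn more to the {wanted}"   # right way, but not enough
    history.append(response)  # remember what was said for the next judgment
    return response


history: Deque[str] = deque(maxlen=4)  # four slots, as in memories 202 to 205
responses = [
    judge_situation(50.0, 0.0, 20.0, history),    # nothing said yet
    judge_situation(45.0, -10.0, 20.0, history),  # driver steers left instead
    judge_situation(40.0, 8.0, 20.0, history),    # steers right, too little
    judge_situation(10.0, 18.0, 20.0, history),   # steering now acceptable
]
```

In this run the second and third responses differ from the first only because the previous output is consulted; without the history, the function would repeat "turn right" each time.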

[0011]

[Effect of the invention] As described above, the present invention recognizes human motions, human operations, and the operation of a device and gives device operation instructions by appropriate synthesized voice output based on the recognition result. It thereby facilitates device operation, improves work efficiency, and reduces the human stress caused by device operation.

[Brief description of drawings]

[FIG. 1] A diagram showing an embodiment.

[FIG. 2] A diagram showing the synthesized voice response sentence generation device.

[Explanation of symbols]

101 input device
102 sensor
103 output device
104 main unit
105 storage device
106 arithmetic processing unit
107 disk device
201 distance calculation unit
202 to 205 instruction voice storage memories
206 situation judgment processing unit
207 response voice storage device
208 response sentence selection processing unit
209 voice conversion processing unit

Claims

[Claims]

1. A device operation instruction method for recognizing motions and/or operations and the operation of a device and generating a synthesized voice instruction output based on the recognition result, characterized in that the synthesized voice instruction output is generated with reference to past synthesized voice instruction outputs.

2. A device operation instruction device comprising: a detection device for detecting motions and/or operations and the operation of a device; an instruction output memory for storing past synthesized voice instruction outputs; and a synthesized voice response sentence generation device for generating a synthesized voice instruction output based on the detection result of the detection device and the stored contents of the instruction output memory.

3. The device operation instruction device according to claim 2, wherein the synthesized voice response sentence generation device is an edited voice synthesis device or a rule-based voice synthesis device.

4. The device operation instruction device according to claim 3, wherein the synthesized voice response sentence generation device has a voice conversion processing unit for changing the shape of the fundamental frequency pattern of the voice and/or the durations of phonemes.