JPH0527790A

JPH0527790A - Voice input/output device

Info

Publication number: JPH0527790A
Application number: JP3202267A
Authority: JP
Inventors: Yuji Honda (本多 裕司); Tetsuya Sato (佐藤 哲也)
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1991-07-18
Filing date: 1991-07-18
Publication date: 1993-02-05

Abstract

PURPOSE: To determine the user's type from the user's voice and to change the response message to suit that type. CONSTITUTION: The voice input/output device comprises a frequency analysis device 2 that determines the user's sex; a voice recognition device 3 that determines how much of a hurry the user is in; a response time measurement device 4 that determines how familiar the user is with the operation; a comprehensive judgment device 5 that judges the user's type by comparing these determination results with data, stored in a reference data memory 6, indicating the tendencies of past users; and a voice synthesizer 8 that composes a response message corresponding to the identified user type from voice data stored in a voice data memory 7 and outputs it.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a voice input/output device for operating a target device by voice.

[0002]

[Prior Art] In recent years, information processing terminals such as cash deposit and withdrawal machines have become increasingly multimedia-oriented: to guide users through operation, they not only display text on a display device but also show graphics and give operating instructions by voice. Voice guidance in particular is very effective in enabling users unfamiliar with the machine to operate it without error. In addition, voice recognition technology can now recognize simple responses with considerable accuracy, and expectations are therefore rising for information processing terminals that can be operated by voice while the user converses with the terminal.

[0003] In conventional information processing terminals in which operation proceeds through an exchange of responses with the device, however, the device outputs only one kind of response message for each operation. To accommodate every type of user, that message is detailed, slow, and polite, and it was often redundant.

[0004]

[Problems to Be Solved by the Invention] Such uniform and redundant response messages are unnecessary for users who are accustomed to the operation and want to proceed quickly, and they actually increase the burden on those users. Conversely, for users unfamiliar with the operation the explanation can still be insufficient, so that they hesitate or make operating errors.

[0005] The present invention was made to solve these problems. Its object is to provide a voice input/output device that determines the user's type from the user's voice, changes the response message to the one best suited to that type, and thereby responds to each user in a finely tuned manner.

[0006]

[Means for Solving the Problems] To achieve this object, the present invention provides a voice input/output device that lets a user operate an information processing device by voice while exchanging responses with the user, comprising: means for measuring the frequency of the user's input voice, calculating its average, and comparing it with a reference frequency in order to determine the user's sex; means for measuring speaking speed from the time intervals between words in order to determine how much of a hurry the user is in; means for measuring the time required to respond and the uniformity of the time intervals between words, and for identifying the kinds of words used, in order to determine the user's familiarity with the operation; means in which data indicating the tendencies of past users, analyzed from their voices, is stored in advance as the reference for judging the user's type; means for judging the user's type by comparing the data of each determination result with the reference data; and means that holds a plurality of response messages for each operation, corresponding respectively to the user types that can be judged, and outputs the response message corresponding to the identified user type.

[0007]

[Operation] The present invention, configured as above, asks the user for a response, for example by posing an appropriate question, and analyzes that response to estimate statistically, by comparison with past tendencies, the user's type as well as the user's purpose. Based on the result, it optimizes the content and speed of the response messages.

[0008] Specifically, the user's sex can be determined from the average frequency of the user's voice: for example, if the average frequency is higher than 400 hertz the user is judged likely to be female, and if it is 400 hertz or lower the user is judged likely to be male. Whether the user is in a hurry can be determined from the speaking speed: for example, when the sentence "I want to withdraw cash" is recognized, if there is almost no time gap between "cash" ("現金を") and "want to withdraw" ("引き出したい"), the user can be judged to be [in a hurry]. Conversely, if the gap is 0.5 seconds or more, the user can be judged to be [not in much of a hurry].

[0009] Furthermore, fluency is derived from the length of time required to respond, the time intervals between words, and the kinds of words used, and from it whether the user is accustomed to using the device can be determined. For example, if the user takes 1 second or more to begin responding and the response contains an interjection such as "um" ("えーと"), the user can be judged to have hesitated for a moment over how to respond. Conversely, if an accurate response is returned immediately, the user can be judged to know well how to respond. Based on these data, users are classified into several types, and by judging which type a user belongs to, the optimum response message can be issued.

[0010]

[Embodiment] FIG. 1 is a control block diagram of a voice input/output device showing one embodiment of the present invention. The embodiment is described below taking a device built into a cash dispenser as an example. In the figure, 1 is an infrared detection device that detects whether a person is in front of the cash dispenser. 2 is a frequency analysis device that measures the frequency of the user's voice input from a microphone M, calculates its average frequency, and compares it with a predetermined frequency to provide a basis for judging the user's sex. For example, if the average frequency is higher than 400 hertz the user can be judged likely to be female, and if it is 400 hertz or lower, likely to be male.
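The comparison performed by the frequency analysis device 2 can be sketched as follows. This is only a minimal illustration: the function name `estimate_sex` and the idea that the voice arrives as a list of per-frame frequency measurements are assumptions, since the description does not say how the frequency is measured. Only the 400 Hz example threshold comes from the text.

```python
from statistics import mean

# Example reference frequency given in the description (400 Hz).
REFERENCE_HZ = 400.0


def estimate_sex(frequencies_hz):
    """Return 'female' if the mean frequency exceeds the reference, else 'male'.

    frequencies_hz is a hypothetical list of frequency measurements (Hz)
    taken from the user's utterance; how they are obtained is not specified
    in the source.
    """
    if not frequencies_hz:
        raise ValueError("at least one frequency measurement is required")
    return "female" if mean(frequencies_hz) > REFERENCE_HZ else "male"
```

Note that a mean of exactly 400 Hz falls on the "male" side, matching the "400 hertz or lower" wording.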

[0011] 3 is a voice recognition device that recognizes the content of the response and also determines from the time intervals between words whether the speaking speed is "fast", "normal", or "slow", to provide a basis for judging whether the user is in a hurry. For example, when the sentence "I want to withdraw cash" is recognized, if there is almost no time gap between "cash" ("現金を") and "want to withdraw" ("引き出したい"), the user can be judged to be [in a hurry]. Conversely, if the gap is 0.5 seconds or more, the user can be judged to be [not in much of a hurry].
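A rough sketch of the speed determination made by the voice recognition device 3 is given below. The description only states two example conditions (almost no gap, and a gap of 0.5 seconds or more), so the 0.1-second cutoff for "almost no gap", the use of the average gap, and the mapping of those two conditions onto "fast" and "slow" are all assumptions made for illustration. The input format, a list of `(start, end)` times per recognized word, is likewise hypothetical.

```python
FAST_GAP_S = 0.1  # assumption: stands in for "almost no gap"
SLOW_GAP_S = 0.5  # example value from the description


def classify_speed(word_times):
    """Classify speaking speed as 'fast', 'normal', or 'slow'.

    word_times is a list of (start_s, end_s) tuples, one per recognized
    word, in spoken order. The gap between consecutive words is the next
    word's start minus the previous word's end.
    """
    gaps = [
        next_start - prev_end
        for (_, prev_end), (next_start, _) in zip(word_times, word_times[1:])
    ]
    if not gaps:
        # A single word gives no interval to measure; treat as normal.
        return "normal"
    average_gap = sum(gaps) / len(gaps)
    if average_gap < FAST_GAP_S:
        return "fast"
    if average_gap >= SLOW_GAP_S:
        return "slow"
    return "normal"
```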

[0012] 4 is a response time measurement device that determines whether the user's response is "fluent", "normal", or "halting" from whether the time the user takes to respond is long or short, whether the time intervals between the words in the response are uniform, and the kinds of words used, to provide a basis for judging whether the user is accustomed to using the device. For example, if the user takes 1 second or more to begin responding and the response contains an interjection such as "um" ("えーと"), the user can be judged to be unaccustomed to the device and to have hesitated for a moment over how to respond. Conversely, if an accurate response is returned immediately, the user can be judged to be accustomed to using the device and to know well how to respond.
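The fluency determination of the response time measurement device 4 might look like the following sketch. The 1-second latency and the interjection "えーと" are taken from the description; the rest is assumed for illustration, including the interjection list, the three-way mapping, and the decision to ignore the uniformity of word intervals, which the description mentions but does not quantify.

```python
LATENCY_LIMIT_S = 1.0  # example value from the description
# Hypothetical interjection list; only "えーと" appears in the source.
INTERJECTIONS = {"えーと", "um", "uh"}


def classify_fluency(latency_s, words):
    """Classify a response as 'fluent', 'normal', or 'halting'.

    latency_s: seconds between the end of the prompt and the start of
    the user's response.
    words: recognized words of the response, in order.
    """
    has_interjection = any(word in INTERJECTIONS for word in words)
    slow_to_start = latency_s >= LATENCY_LIMIT_S
    if slow_to_start and has_interjection:
        # Both signs of hesitation named in the description are present.
        return "halting"
    if not slow_to_start and not has_interjection:
        # Prompt response with no filler words.
        return "fluent"
    # Assumption: exactly one sign of hesitation counts as 'normal'.
    return "normal"
```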

[0013] 5 is a comprehensive judgment device that judges the user's type based on the determination results of the frequency analysis device 2, the voice recognition device 3, and the response time measurement device 4. 6 is a reference data memory in which data indicating the tendencies of past users, analyzed from their voices, is recorded in advance as the reference used by the comprehensive judgment device 5 when judging the user's type.

[0014] 7 is a voice data memory, and 8 is a voice synthesizer that synthesizes a given response message from the voice data stored in the voice data memory 7 and outputs it at a given speed through a speaker SP. So that a response message corresponding to the user type judged by the comprehensive judgment device 5 can be output, the voice data memory 7 stores voice data from which several kinds of response message can be output for each operation. As an example of the past user tendencies, analyzed from voice, that are stored in the reference data memory 6, data is stored indicating that male users who speak fast and fluently have often used the same device many times, are accustomed to its operation, and are in a hurry, so that a quick response is desirable.

[0015] Users who have used the device before, are accustomed to operating it, and are in a hurry are classed as type A, and voice data for outputting concise response messages for type A users is stored in the voice data memory 7. Users who have used the device before but are not in a hurry are classed as type B, and voice data for outputting ordinary response messages for type B users is stored in the voice data memory 7. Users who are operating the device for the first time, or nearly so, are classed as type C, and voice data for outputting more detailed response messages for type C users is stored in the voice data memory 7.

[0016] FIG. 2 is the type determination table that, as described above, serves as the reference by which the comprehensive judgment device 5 judges the user's type. First, among users judged likely to be male by the frequency analysis device 2, a user is judged to be type A when the voice recognition device 3 determines that the user speaks fast and the response time measurement device 4 determines that the response is fluent or normal.

[0017] Among users judged likely to be male by the frequency analysis device 2, a user is judged to be type B when the voice recognition device 3 determines that the user speaks at normal speed or slowly and the response time measurement device 4 determines that the response is fluent or normal. Among users judged likely to be male, a user is judged to be type C when the voice recognition device 3 determines that the user speaks fast, at normal speed, or slowly and the response time measurement device 4 determines that the response is halting.

[0018] Among users judged likely to be female by the frequency analysis device 2, a user is judged to be type A when the voice recognition device 3 determines that the user speaks fast and the response time measurement device 4 determines that the response is fluent. Among users judged likely to be female, a user is judged to be type B in any of the following cases: the voice recognition device 3 determines that the speech is fast and the response time measurement device 4 determines that the response is normal; the speech is determined to be of normal speed and the response fluent or normal; or the speech is determined to be slow and the response fluent.

[0019] Among users judged likely to be female by the frequency analysis device 2, a user is judged to be type C in any of the following cases: the voice recognition device 3 determines that the speech is fast and the response time measurement device 4 determines that the response is halting; the speech is determined to be of normal speed and the response halting; or the speech is determined to be slow and the response normal or halting.
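The type determination rules described for FIG. 2 can be written out as a lookup table. The sketch below transcribes the eighteen combinations stated in the text; the names `TYPE_TABLE` and `judge_type` and the string labels are illustrative choices, not part of the source.

```python
# Type determination table transcribed from the description of FIG. 2.
# Outer key: estimated sex; middle key: speaking speed; inner key: fluency.
TYPE_TABLE = {
    "male": {
        "fast":   {"fluent": "A", "normal": "A", "halting": "C"},
        "normal": {"fluent": "B", "normal": "B", "halting": "C"},
        "slow":   {"fluent": "B", "normal": "B", "halting": "C"},
    },
    "female": {
        "fast":   {"fluent": "A", "normal": "B", "halting": "C"},
        "normal": {"fluent": "B", "normal": "B", "halting": "C"},
        "slow":   {"fluent": "B", "normal": "C", "halting": "C"},
    },
}


def judge_type(sex, speed, fluency):
    """Return the user type ('A', 'B', or 'C') for the three determinations."""
    return TYPE_TABLE[sex][speed][fluency]
```

Written this way, the only cells in which the two halves of the table differ are fast speech with a normal response (A for male, B for female) and slow speech with a normal response (B for male, C for female).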

[0020] FIG. 3 is an explanatory diagram showing the content of the guidance voice in this embodiment: (a) shows the case of a type A user and (b) the case of a type C user. For a type A user, who is judged to know the operating procedure well, the response messages as a whole simply prompt the user to operate. For a type C user, who is judged to be operating the device for the first time or nearly so, the response messages prompt the user to operate while explaining, for example, what to use to enter each item of information and where it is located on the device.

[0021] FIG. 4 is a flowchart of this embodiment, and the operation of the embodiment is described below. First, the user stands in front of the cash dispenser. The device then recognizes through the infrared detection device 1 that a user has appeared, and outputs a response message such as FIG. 3 (1) to prompt the user to start operating and to begin the exchange of responses from which the user's type is judged (S1, S2). Suppose the user's response is in a man's voice, spoken somewhat fast, and is accurate and fluent as shown in FIG. 3 (2). When the user's voice is input (S3), the voice recognition device 3 recognizes the input voice, and, in order to judge the user's type, the frequency analysis device 2, the voice recognition device 3, and the response time measurement device 4 evaluate the voice by the procedures described above: the frequency analysis device 2 indicates that the user is likely to be male (S4), the voice recognition device 3 recognizes that the speech is somewhat fast (S5), and the response time measurement device 4 indicates that the response is fluent (S6).

[0022] Meanwhile, as described above, data indicating the tendencies of past users, analyzed from their voices, is stored in the reference data memory 6. One such tendency is, for example, that male users who speak fast and fluently have often used the same device many times, are accustomed to its operation, and are in a hurry, so that a quick response is desirable. The comprehensive judgment device 5 compares the determination result data for the current user, obtained by the frequency analysis device 2, the voice recognition device 3, and the response time measurement device 4, with the reference data stored in the reference data memory 6, which reflects the analysis of past users' tendencies, and judges that this user is a type A user who has used the device before, is accustomed to its operation, and is in a hurry (S7).

[0023] The voice synthesizer 8 then synthesizes concise response messages such as FIG. 3 (3) to (7) from the voice data stored in the voice data memory 7 and outputs them through the speaker SP at a speed slightly faster than normal (S8). Even with concise response messages such as those of FIG. 3 (3) to (7), a user accustomed to the operation can immediately keep up with their content and speed and quickly complete the cash withdrawal. Suppose, on the other hand, that the response to the response message of FIG. 3 (1) is in a woman's voice, irregular as in (8), and can hardly be called accurate. In this case, as in the previous one, the comprehensive judgment device 5 judges from the determination results of the frequency analysis device 2, the voice recognition device 3, and the response time measurement device 4 that this user is a type C user who is using the device for the first time or nearly so.

[0024] The voice synthesizer 8 then synthesizes more detailed response messages such as FIG. 3 (9) to (13) from the voice data stored in the voice data memory 7 and outputs them through the speaker SP at a speed slightly slower than normal (S10). By outputting the more detailed response messages of FIG. 3 (9) to (13) at a slightly slower speed in this way, even a user unfamiliar with the operation can be guided reliably to accomplish the intended transaction.
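The selection made by the voice synthesizer 8, one message variant and one output speed per user type, can be sketched as below. The message texts are hypothetical placeholders, because the actual wording of FIG. 3 is not reproduced in this text, and the speed factors 1.1 and 0.9 are assumed stand-ins for "slightly faster" and "slightly slower". The description gives no speed for type B, so normal speed is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponsePlan:
    style: str           # which message variant to use
    speed_factor: float  # playback speed relative to normal (1.0)


# Styles follow the description; the numeric factors are assumptions.
RESPONSE_PLANS = {
    "A": ResponsePlan("concise", 1.1),
    "B": ResponsePlan("ordinary", 1.0),
    "C": ResponsePlan("detailed", 0.9),
}

# Hypothetical message variants for one operation (not taken from FIG. 3).
MESSAGES = {
    "enter_amount": {
        "concise": "Amount, please.",
        "ordinary": "Please enter the amount.",
        "detailed": "Please enter the amount using the numeric keys "
                    "on the panel in front of you.",
    },
}


def select_response(user_type, operation):
    """Return (message_text, speed_factor) for a user type and an operation."""
    plan = RESPONSE_PLANS[user_type]
    return MESSAGES[operation][plan.style], plan.speed_factor
```

For example, `select_response("A", "enter_amount")` yields the concise variant together with the faster speed factor, while type C yields the detailed variant at the slower one.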

[0025]

[Effects of the Invention] As described above, the present invention is a voice input/output device that lets a user operate an information processing device by voice while exchanging responses with the user, comprising: means for measuring the frequency of the user's input voice, calculating its average, and comparing it with a reference frequency in order to determine the user's sex; means for measuring speaking speed from the time intervals between words in order to determine how much of a hurry the user is in; means for measuring the time required to respond and the uniformity of the time intervals between words, and for identifying the kinds of words used, in order to determine the user's familiarity with the operation; means in which data indicating the tendencies of past users, analyzed from their voices, is stored in advance as the reference for judging the user's type; means for judging the user's type by comparing the data of each determination result with the reference data; and means that holds a plurality of response messages for each operation, corresponding respectively to the user types that can be judged, and outputs the response message corresponding to the identified user type.

[0026] As a result, in an information processing device operated by voice, the voice input/output device can judge from the user's spoken responses how skilled the user is at the operation and whether the user is in a hurry, and can output operation guidance voice that differs according to the user's type. It can therefore be expected that the problems caused by uniform guidance from the device will become less frequent: users unfamiliar with the operation will less often be unsure how to proceed, and skilled users will less often feel impatient. Furthermore, when the present invention is applied to a device serving many unspecified users, such as a cash dispenser, the time spent issuing redundant messages that some operators do not need can be saved, the device can be run more efficiently overall, and users' waiting time can consequently be shortened.

[Brief description of drawings]

FIG. 1 is a control block diagram of a voice input/output device showing one embodiment of the present invention.

FIG. 2 is a type determination table for determining the user's type.

FIG. 3 is an explanatory diagram showing examples of the response messages output for each type.

FIG. 4 is a flowchart showing the operation of this embodiment.

[Explanation of symbols]

2: frequency analysis device
3: voice recognition device
4: response time measurement device
5: comprehensive judgment device
6: reference data memory
7: voice data memory
8: voice synthesizer

Claims

What is claimed is:

1. A voice input/output device that lets a user operate an information processing device by voice while exchanging responses with the user, comprising: means for measuring the frequency of the user's input voice, calculating its average, and comparing it with a reference frequency in order to determine the user's sex; means for measuring speaking speed from the time intervals between words in order to determine how much of a hurry the user is in; means for measuring the time required to respond and the uniformity of the time intervals between words, and for identifying the kinds of words used, in order to determine the user's familiarity with the operation; means in which data indicating the tendencies of past users, analyzed from their voices, is stored in advance as the reference for judging the user's type; means for judging the user's type by comparing the data of each determination result with the reference data; and means that holds a plurality of response messages for each operation, corresponding respectively to the user types that can be judged, and outputs the response message corresponding to the identified user type.