JPH08259125A

JPH08259125A - Voice inputting device for elevator

Info

Publication number: JPH08259125A
Application number: JP7096110A
Authority: JP
Inventors: Shiyandoru Marukon; シャンドルマルコン
Original assignee: Fujitec Co Ltd
Current assignee: Fujitec Co Ltd
Priority date: 1995-03-28
Filing date: 1995-03-28
Publication date: 1996-10-08
Anticipated expiration: 2015-08-28
Also published as: JP3082618B2

Abstract

PURPOSE: To obtain a highly reliable voice input device that is easy for elevator users to use, by comparing the recognition results of a plurality of voice recognition units to judge whether a voice input is proper. CONSTITUTION: Voice input from a microphone 4 is converted to digital values by an A/D converter 10 and then band-limited by a band-pass filter unit 11. A spectrum calculation unit 30 creates a voice recognition input pattern from the voice detected by a voice section detection unit 12 and outputs it to first and second voice recognition units 31 and 32. The first voice recognition unit 31 is a multilayer perceptron neural network, and the second voice recognition unit 32 is an LVQ net. The first and second voice recognition units 31 and 32 perform voice recognition on the voice data and output the results to a voice recognition confirmation unit 33. The voice recognition confirmation unit 33 calculates the certainty factor of the voice recognition and outputs the result to an elevator control unit 18 and a notification unit 34.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a voice input device for an elevator that recognizes the voice of an elevator user and registers car calls, hall calls, and the like.

[0002]

[Prior Art] A conventional elevator apparatus having a voice recognition device will be described with reference to the drawings. This conventional example is of the type in which the destination floor is registered before boarding the elevator car. FIG. 4 is a schematic view of an elevator hall using a voice recognition device. Near the hall door 1 are installed a car position indicator 2, destination floor call registration buttons 3, a voice recognition microphone 4, an OK lamp 5 indicating that voice recognition processing was performed normally, and a reject lamp 6 indicating that voice recognition could not be performed.

[0003] When voice recognition is not used, the elevator user selects the desired floor from the destination floor call registration buttons 3 and presses it, as in a conventional system, and the lamp of that floor's button lights. When the elevator car arrives, the previously entered destination floor call is transferred to the car calls, and the user is carried to the destination floor.

[0004] When voice recognition is used, the voice recognition device is activated when the elevator user approaches the microphone 4 and speaks. Voice recognition processing is then performed. If the voice is judged normal, the OK lamp 5 lights and the lamp of the corresponding floor on the destination floor call registration buttons 3 blinks. If the voice is judged abnormal, only the reject lamp 6 lights.

[0005] The normal/abnormal judgment compares the input voice with standard patterns stored in advance. If the similarity exceeds a fixed criterion, the OK lamp 5 is lit for a fixed time (1 to 2 seconds); if it is at or below the criterion, the reject lamp 6 is lit for the same time. Consequently, even when the user says "nikai" (second floor), if the voice recognition device judges it to be the third floor, the OK lamp 5 may light and the third-floor lamp of the destination floor call registration buttons 3 may blink. While the destination floor call registration button 3 is blinking, it is only a provisional display and the elevator is not yet being controlled. The user therefore looks at the result; if it is correct, waiting for the fixed time registers the call automatically, and if it is wrong, it can be corrected by speaking again.
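The threshold judgment described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the device's actual implementation: the function name `judge_input`, the similarity scores, and the threshold value are all hypothetical.

```python
def judge_input(similarities, threshold):
    """Return ("OK", floor) or ("REJECT", None) for one utterance.

    similarities: hypothetical mapping of floor -> similarity between the
    input voice and that floor's stored standard pattern.
    threshold: assumed fixed acceptance criterion.
    """
    best_floor = max(similarities, key=similarities.get)
    if similarities[best_floor] > threshold:
        return ("OK", best_floor)   # OK lamp 5 lights, floor lamp blinks
    return ("REJECT", None)         # only reject lamp 6 lights


# The user says "nikai" (2nd floor), but the 3rd-floor pattern scores highest.
print(judge_input({2: 0.62, 3: 0.71}, threshold=0.5))  # prints ('OK', 3)
```

The example shows why the provisional blinking display is needed in the conventional device: a single similarity threshold accepts a confident but wrong match just as readily as a correct one.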

[0006] Next, the configuration of the voice recognition processing unit will be described with reference to FIG. 5. The voice input from the microphone 4 is converted to digital values by the A/D conversion unit 10, after which the band-pass filter unit 11 band-limits the voice, yielding, for example, 12-bit digital values at a sampling frequency of 12 kHz. The band-pass filter unit 11 further extracts only the features of the voice signal from these digital values and compresses the information by converting it into a spectrum sequence in units of 8 msec.

[0007] The voice section detection unit 12 detects valid voice, gathers the voice data that should actually be recognized, and outputs it to the sampling unit 13. The sampling unit 13 normalizes this voice data to match the voice section length of the standard patterns stored in the dictionary storage unit 14. The voice data is thereby converted into 256 points of data, which the CPU 15 compares with the 256-point standard patterns stored in the dictionary storage unit 14 to calculate similarities. The standard pattern with the highest similarity is output to the operation output unit 16 as the recognition result. The program storage unit 17 holds the above procedure in program form.
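The length normalization and dictionary matching described above can be sketched as follows. The text does not say how the data is resampled or how similarity is computed, so linear interpolation and cosine similarity are used here purely as assumptions; the names `resample`, `cosine_similarity`, and `best_match` are hypothetical.

```python
import math


def resample(samples, n_points=256):
    """Stretch or shrink a non-empty sequence to n_points values
    by linear interpolation (assumed normalization method)."""
    if len(samples) == 1:
        return [float(samples[0])] * n_points
    scale = (len(samples) - 1) / (n_points - 1)
    out = []
    for i in range(n_points):
        pos = i * scale
        lo = min(int(pos), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def cosine_similarity(a, b):
    """Assumed similarity measure between two equal-length patterns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(samples, dictionary):
    """Return (label, similarity) of the most similar standard pattern."""
    voice = resample(samples)
    scores = {k: cosine_similarity(voice, p) for k, p in dictionary.items()}
    label = max(scores, key=scores.get)
    return label, scores[label]


# Two toy 256-point standard patterns: a rising ramp and a falling ramp.
dictionary = {
    "up": [float(i) for i in range(256)],
    "down": [float(255 - i) for i in range(256)],
}
label, score = best_match([0, 1, 2, 3, 4], dictionary)
print(label)
```

Note that `best_match` always returns some label, which is the weakness of single-recognizer matching that the invention later addresses.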

[0008] Data from the operation output unit 16 is input to the elevator control unit 18, which controls the elevator. The recognition result notification unit 19 has means for blinking, as a provisional display, the destination floor call registration button 3 that is displayed from the call output unit 20 of the elevator control unit 18. It also controls the lighting of the OK lamp 5 and the reject lamp 6.

[0009] Next, this voice recognition processing will be described with reference to the flowcharts of FIGS. 6 and 7. First, when there is a voice input from the microphone 4 (step S1), it is detected whether the input is at the level of a user's voice or is background noise that is not judged to be voice (step S2). In step S2 the voice section detection unit 12 also detects whether proper voice has been input, and if so, the voice recognition processing of step S3 is performed.

[0010] Step S3 allows correction by re-inputting voice. As shown in FIG. 7, step S3A detects voice input after recognition processing has finished; if there is such input, the previous recognition result is invalidated (step S3B) and the recognition result of the new utterance takes priority (step S3C).

[0011] In step S4, the voice data resulting from voice recognition processing is compared with the standard patterns stored in the dictionary storage unit 14. If a similarity exceeding the fixed criterion is found, the OK lamp 5 is lit in step S5 to inform the elevator user that the voice was valid. If the similarity is at or below the criterion, the reject lamp 6 is lit in step S6 to request re-input.

[0012] Step S7 determines whether the valid voice command can be registered. If, for example, a non-stop floor was commanded, blink cycle 2 is selected in step S9 to inform the user that this differs from the normal processing of step S8. In step S10 the recognition result is blinked at the selected cycle. The blink cycle of step S8 is 0.5 seconds, and that of step S9 is about 0.3 seconds.

[0013] Step S11 determines whether the voice command has finished or has not yet been made; if it has finished, step S12 ends the lamp lighting processing described above. That is, the lamp is turned off about 1 second after the OK lamp 5 lights, and the recognition result that had been blinking is sent to the elevator control unit 18 and processed as an actually registered call.

[0014] According to the above conventional example, the elevator user stands in front of the voice recognition microphone 4, says the desired floor, and checks the result from the lighting state of the destination floor call registration buttons 3. If the result is the desired one, the call is registered after waiting about 1 second. If the result is wrong, speaking again while the destination floor call registration button 3 is blinking erases the previous input, that is, the currently blinking data, and blinks the newly input result instead. Correction is thus possible until the correct result is obtained.

[0015]

[Problems to Be Solved by the Invention] In the above conventional example, however, the destination floor call registration button 3 always blinks for a fixed time even when the similarity between the voice data and the standard pattern is extremely high. Experienced users find this tedious, while an inexperienced user may mistakenly register the next call while the destination floor call registration button 3 is blinking and thereby erase the previously input content, that is, the currently blinking data.

[0016] A method has therefore also been considered in which the call is registered immediately when the similarity between the voice data and the standard pattern exceeds a fixed criterion, and, when the similarity is at or below the criterion, the most similar word and its score (the degree of similarity expressed on a 100-point scale) are displayed. In this conventional example, however, there are only two possible judgments, namely whether or not the voice was recognized correctly, so every input whose similarity is at or below the criterion goes unregistered, and correction during blinking, as in the earlier conventional example, is not possible.

[0017] Furthermore, in each of the above conventional examples, whether the voice input is proper is decided from the similarity between the voice data and the standard pattern. In other words, voice recognition processing is performed by a single voice recognition means only, so the reliability of voice recognition is not high.

[0018]

[Means for Solving the Problems] The present invention uses a plurality of types of voice recognition means with different voice recognition processing methods as the means for recognizing whether a voice input is proper, and judges the propriety of the voice input by comparing the results of the two recognition means. Further, in the present invention, if the results of the two recognition means match, the call is registered immediately; if they do not match, the magnitude of the similarity determines whether the user is asked for confirmation or the call is left unregistered (re-registration requested).

[0019]

[Operation] According to the present invention, the propriety of a voice input is judged by comparing the results of a plurality of types of recognition means, so the reliability of voice recognition is higher than in the conventional examples, which judge the propriety of a voice input solely from the similarity between the voice data and the standard pattern. In addition, the result of voice recognition processing has three levels, namely immediate registration, a request for confirmation from the user, and non-registration (re-registration requested), which makes the device easier for elevator users to use.

[0020]

[Embodiment] An embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 corresponds to FIG. 5 of the conventional example, and the same reference numerals as in FIG. 5 denote the same components. In the figure, 30 is a spectrum calculation unit, which creates a voice recognition input pattern (voice data) from the voice detected by the voice section detection unit 12 and outputs it to the first and second voice recognition units 31 and 32.

[0021] The first and second voice recognition units 31 and 32 are both neural nets. The first voice recognition unit 31 is a multilayer perceptron neural network (an ordinary BP net), and the second voice recognition unit 32 is an LVQ net. The first and second voice recognition units 31 and 32 perform voice recognition on the voice data and output the results to the voice recognition confirmation unit 33. The voice recognition confirmation unit 33 calculates the certainty factor of the voice recognition from the recognition results and outputs the result to the elevator control unit 18 and the notification unit 34.

[0022] FIG. 2 shows the recognition processing in the voice recognition confirmation unit 33. The characteristics of the BP net and the LVQ net that make up the first and second voice recognition units 31 and 32 are as follows. The BP net has a high-performance classification capability for data it has learned, but it cannot judge an input to be "unknown". For any voice data whatsoever, it therefore outputs a recognition result saying the data matches one of the stored standard patterns, and it does not output a similarity. The LVQ net does not perform as well as the BP net, but it can output a similarity. Moreover, because the BP net and the LVQ net process the data in fundamentally different ways, a result on which both nets agree is highly reliable.
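The contrast drawn above can be illustrated with two toy stand-ins. No network is trained here: `closed_set_label` only mimics the behavior attributed to the BP net (it always picks some class), and `nearest_prototype` mimics the behavior attributed to the LVQ net (it picks the nearest stored pattern and also reports how far away it is). Both function names, the two-dimensional feature vectors, and the use of Euclidean distance are assumptions made for illustration.

```python
import math


def closed_set_label(scores):
    """Argmax over class scores: always returns one of the known labels,
    even for noise, and gives no notion of 'unknown'."""
    return max(scores, key=scores.get)


def nearest_prototype(voice, prototypes):
    """Return (label, distance) of the closest stored prototype.
    The distance plays the role of the similarity output."""
    label = min(prototypes, key=lambda k: math.dist(voice, prototypes[k]))
    return label, math.dist(voice, prototypes[label])


prototypes = {2: (0.0, 0.0), 3: (10.0, 0.0)}

# A clean input near the 3rd-floor prototype.
print(nearest_prototype((9.0, 0.0), prototypes))   # prints (3, 1.0)

# Noise far from every prototype still gets a label from the argmax,
# but the prototype distance reveals that it is a poor match.
print(closed_set_label({2: 0.1, 3: 0.2}))
print(nearest_prototype((50.0, 50.0), prototypes))
```

Because the two stand-ins reach their answers by different routes, agreement between them is stronger evidence than either answer alone, which is the premise of the confirmation unit described next.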

[0023] The output x1 of the first voice recognition unit 31 is the standard pattern given as its recognition result. The output x2 of the second voice recognition unit 32 is the standard pattern given as its recognition result, and d2 is the similarity between the voice data and x2, which is either "near" or "far". 33a is a comparison means that compares x1 and x2; it outputs the judgment coefficient d1 = 0 when x1 = x2 and d1 = 1 when x1 ≠ x2, together with x1. 33b is a gate: when d1 = 0 it outputs y = x1; when d1 = 1 it outputs y = x1 if d2 = "near" and y = 0 if d2 = "far".
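The comparison means 33a and gate 33b can be written out directly from the rules above. In this sketch, recognition results are represented as nonzero floor numbers so that y = 0 can stand for "no result"; that encoding and the function names `compare`, `gate`, and `confirm` are assumptions.

```python
NEAR, FAR = "near", "far"


def compare(x1, x2):
    """Comparison means 33a: d1 = 0 if the results agree, else 1.
    x1 is passed through together with d1."""
    d1 = 0 if x1 == x2 else 1
    return d1, x1


def gate(d1, x1, d2):
    """Gate 33b: pass x1 on agreement, or on disagreement only if d2 is near."""
    if d1 == 0:
        return x1
    return x1 if d2 == NEAR else 0


def confirm(x1, x2, d2):
    """Voice recognition confirmation unit 33: returns (d1, y)."""
    d1, x1 = compare(x1, x2)
    return d1, gate(d1, x1, d2)


print(confirm(3, 3, FAR))    # prints (0, 3): results agree
print(confirm(3, 2, NEAR))   # prints (1, 3): disagree, but similarity is near
print(confirm(3, 2, FAR))    # prints (1, 0): disagree and far
```

Note that when the results agree, d2 is ignored entirely, so agreement alone is treated as sufficient evidence.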

[0024] Next, the voice recognition processing will be described with reference to the flowchart of FIG. 3. First, step S20 determines whether the certainty factor is high. That is, if d1 = 0 the certainty factor is judged to be high and step S21 is executed; if d1 = 1, step S22 is executed. Step S21 determines whether the voice command can be registered. If, for example, a non-stop floor was commanded, an error is displayed (step S23) and the user is asked to register again; if registration is possible, registration processing is performed (step S24). A guidance display indicating that the call was registered normally may be given at this point.

[0025] Step S22 determines whether the certainty factor is medium. If y = x1 (that is, d2 = "near"), the certainty factor is judged to be medium and step S25 is executed; if y = 0 (that is, d2 = "far"), step S26 is executed. Step S25, like step S21, determines whether the voice command can be registered. If it cannot, step S26 is executed; if it can, the confirmation request of step S27 is executed. The confirmation request asks the user to judge whether the voice recognition result is correct, that is, whether the output y = x1 is acceptable, and is made, for example, by synthesized voice guidance or by a message on a display device.

[0026] In step S28, if the user confirms, the registration processing of step S24 is performed; if the user does not confirm, or denies, step S26 is executed. Step S26 is the non-registration step, which asks the user to register again. The error display and re-registration request of step S23, the guidance display of normal registration and the like in step S24, the re-registration request of step S26, and the confirmation request of step S27 are all made by the notification unit 34 of FIG. 1.

[0027] As described above, this embodiment uses two types of neural nets that perform different processing as the means for recognizing voice data, so the reliability of voice recognition is higher than in conventional devices. In addition, the result of voice recognition processing has three levels, namely immediate registration, a request for confirmation from the user, and non-registration (re-registration requested), which makes the device easy for elevator users to use.

[0028] In the above embodiment, when step S22 determines whether the certainty factor is medium, the judgment is based on whether y is x1 or 0, that is, whether d2 = "near" or d2 = "far". The certainty factor in question refers to the similarity between the voice data and x1. However, d2 indicates the similarity between the voice data and x2, so the two do not correspond. For more accurate processing, when the comparison means 33a of FIG. 2 determines that x1 ≠ x2, x1 can be input to the second voice recognition unit 32, which then re-outputs the similarity between the voice data and x1 as d2. If the only purpose is to judge whether the voice data is genuine voice or noise, however, the method of the above embodiment poses no practical problem.
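The mismatch described above, and the suggested refinement, can be made concrete with a toy prototype model. Everything here is assumed for illustration: the two-dimensional feature vectors, the Euclidean distance, the `near_radius` cutoff, and the `refine` flag that switches between measuring against x2 (the embodiment) and against x1 (the refinement).

```python
import math


def d2_value(voice, prototypes, x1, x2, near_radius, refine):
    """Classify the similarity as "near" or "far".

    refine=False: measure against x2, as in the embodiment.
    refine=True:  measure against x1, as in the suggested refinement.
    """
    target = x1 if refine else x2
    distance = math.dist(voice, prototypes[target])
    return "near" if distance <= near_radius else "far"


prototypes = {2: (0.0, 0.0), 3: (10.0, 0.0)}
voice = (1.0, 0.0)      # clearly close to the 2nd-floor prototype
x1, x2 = 3, 2           # the two recognizers disagree

print(d2_value(voice, prototypes, x1, x2, near_radius=2.0, refine=False))  # prints near
print(d2_value(voice, prototypes, x1, x2, near_radius=2.0, refine=True))   # prints far
```

In this example the embodiment would ask the user to confirm the 3rd floor because the voice is near x2, whereas the refinement reports that the voice is far from x1 and leads to a re-registration request instead.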

[0029] The above embodiment uses a BP net and an LVQ net as the first and second voice recognition units, but other neural nets may be used. For example, either a BP net or a CPN net may be used as the first voice recognition unit, and any of the LVQ, LVQ-2, MVQ, ART, RCE, or RBF nets, which can output a similarity, may be used as the second voice recognition unit. Three or more types may also be combined. A neural net may furthermore be combined with an HMM-based voice recognition device.

[0030] The above embodiment was described for the type in which the destination floor is registered at the hall, but the invention can be applied in the same way to the type having up and down call buttons, as in an ordinary elevator. It can likewise be applied to the call buttons, door open/close buttons, and the like inside the car.

[0031]

[Effects of the Invention] As described above, according to the present invention, the propriety of a voice input is judged by comparing the recognition results of a plurality of voice recognition units, so the reliability of voice recognition is higher than in conventional devices. In addition, the result of voice recognition processing has three levels, namely immediate registration, a request for confirmation from the user, and non-registration (re-registration requested), which makes the device easier for elevator users to use.

[Brief description of drawings]

FIG. 1 is a block diagram of a voice recognition processing unit showing an embodiment of the present invention.

FIG. 2 is a diagram showing the recognition processing in the voice recognition confirmation unit according to an embodiment of the present invention.

FIG. 3 is a flowchart showing the voice recognition processing according to an embodiment of the present invention.

FIG. 4 is a schematic view of a conventional elevator hall using a voice recognition device.

FIG. 5 is a block diagram showing a conventional voice recognition processing unit.

FIG. 6 is a flowchart showing conventional voice recognition processing.

FIG. 7 is a flowchart showing the processing in the voice recognition unit of FIG. 6.

[Explanation of symbols]

3: Destination floor call registration button
4: Voice recognition microphone
10: A/D conversion unit
11: Band-pass filter unit
12: Voice section detection unit
18: Elevator control unit
30: Spectrum calculation unit
31: First voice recognition unit
32: Second voice recognition unit
33: Voice recognition confirmation unit
34: Notification unit

Continuation of the front page: (51) Int.Cl.⁶ G10L 9/10, identification code 301; FI G10L 9/10 301C

Claims

1. A voice input device for an elevator, comprising a microphone provided in a hall or a car of the elevator, a voice recognition device that recognizes a voice signal input to the microphone, and an elevator control device that controls the elevator according to the recognition result of the voice recognition device, characterized in that a plurality of voice recognition units are provided, each of which processes input voice data and outputs a recognition result, and the certainty factor of voice recognition is classified into three levels according to the outputs of the plurality of voice recognition units.

2. The voice input device for an elevator according to claim 1, characterized in that one of the voice recognition units is a BP net or a CPN net and the other is a neural net capable of outputting a similarity, and in that means is provided which compares the outputs of the two nets to determine whether they match, outputs a high certainty factor when they match, and, when they do not match, outputs whether or not the certainty factor is medium according to the similarity from the other neural net.