JP2001067095A

JP2001067095A - Voice recognizing method and its device

Info

Publication number: JP2001067095A
Application number: JP24325599A
Authority: JP
Inventors: Masami Naeshirozawa; 正巳苗代澤
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1999-08-30
Filing date: 1999-08-30
Publication date: 2001-03-16

Abstract

PROBLEM TO BE SOLVED: To provide a voice recognizing method capable of identifying the voice of a user without being affected by ambient noise. SOLUTION: The voice of a user and dial information are stored in advance in a voice telephone directory memory part 19, and ambient noise is stored in an ambient noise memory part 25. In use, the voice of the user and the ambient noise are inputted from a handset 9 and from a main-body microphone 26, respectively. A standard pattern reproducing part 22 finds the difference between the voice of the user (both as registered in advance and as inputted in use) and the ambient noise, thereby reproducing the net voice. A similarity determining part 21 determines and extracts similar data in the voice telephone directory memory part 19, thereby ensuring voice recognition, and an automatic call can be made based on the registered dial information.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a speech recognition method, and an apparatus therefor, that perform recognition using pure voice information.

[0002]

[Description of the Related Art] FIG. 7 is a block diagram of a conventional speech recognition apparatus, FIG. 8 is a flowchart of speech recognition using this apparatus, and FIG. 9 is a flowchart for inputting voice information into the voice recognition memory unit of this apparatus.

[0003] The speech recognition operation will be described with reference to FIGS. 7 and 8.

[0004] In conventional speech recognition, voice information is stored in advance in the voice telephone directory memory unit 19. When the user speaks while using a device equipped with the speech recognition apparatus, the similarity determination unit 21 compares the input voice with the voice information stored in the voice telephone directory memory unit 19 and produces a similarity determination result.

[0005] The flow is as follows. When speech recognition starts (step 301), a fixed guidance message instructs the user to speak (step 302), and the user's voice is input (step 303). Next, the voice information accumulated in the voice telephone directory memory unit 19 is read out (step 304), and the similarity determination unit 21 compares it with the input voice and makes a determination (step 305).

[0006] Voice information has conventionally been stored in the voice telephone directory memory unit 19 according to the flow of FIG. 9. When voice registration starts (step 401), a fixed guidance message instructs the user to input voice through the memory input unit (step 402). The user inputs voice (step 403), and it is stored as voice information in the voice telephone directory memory unit 19 (step 404).

[0007] FIG. 11 shows the configuration of a conventional key telephone system with a speech recognition function. Reference numeral 1 denotes a key telephone device, which comprises a central office line interface unit 2, an extension interface unit 3, a main power supply unit 5, and a system control unit 4 that controls them. Reference numeral 6 denotes a key telephone connected to the key telephone device 1. It comprises a communication circuit unit 8 that controls calls with the outside, a handset 9 having an input unit, a key matrix unit 10 for key input, an LED display unit 11, an LCD display unit 12, a slave unit power supply unit 13, a speech recognition unit 20 that performs speech recognition, a main-body microphone 26 serving as an input unit, a speaker 27, and a slave unit interface unit 7 that controls them. The speech recognition unit 20 comprises an A/D conversion unit 17, a fixed guidance memory unit 18, a voice telephone directory memory unit 19, a similarity determination unit 21, and a speech recognition control unit 16 that controls them to perform speech recognition.

[0008] The basic operation of the key telephone system is as follows. When the key telephone 6 is connected to the key telephone device 1, the main power supply unit 5 supplies power to the slave unit power supply unit 13 via the extension interface unit 3, and the key telephone 6 becomes usable. When the handset 9 of the key telephone 6 is taken off-hook and a dial key is pressed, the key matrix unit 10 detects the dial information, which is transmitted to the system control unit 4 via the slave unit interface unit 7 and the extension interface unit 3. If the dial information requests a central office line call, the system control unit 4 connects the speech path between the central office line interface unit 2 and the extension interface unit 3, enabling a call with the handset 9 via the communication circuit unit 8. In hands-free mode, a call can be made with the main-body microphone 26 and the speaker 27. In addition, under the control of the system control unit 4, central office line call status and similar information can be displayed on the LED display unit 11 and the LCD display unit 12. In short, by picking up the handset 9 and dialing, the user can use the device as an ordinary telephone.

[0009] A specific example of using the system of FIG. 11 is as follows. In use, following the instruction "Please say the name" stored in the fixed guidance memory unit 18, the user speaks, for example, "Suzuki" into the handset 9 or the main-body microphone 26. The speech recognition control unit 16 recognizes "Suzuki," and the similarity determination unit 21 extracts the information relating to "Suzuki" from the information registered in the voice telephone directory memory unit 19. Entries such as "Suzuki A-o" and "Suzuki B-ko" are then each associated with a number and displayed on the LCD display unit 12 as shown in FIG. 10. Next, fixed guidance from the fixed guidance memory unit 18 prompts the user to confirm ("Who is the other party?"). When the user says "1" or enters "1" from the key matrix unit 10, the dial information ("045-***-〇×△□") stored together with that voice information ("Suzuki A-o") in the voice telephone directory memory unit 19 is transmitted to the system control unit 4, and control is performed so that the number is dialed through the central office line interface unit 2. In other words, when the user speaks the name of the person to be called, the registered data (the name and telephone number) is displayed, and when the user confirms by voice or key input, a call is automatically placed to that telephone number.

[0010] Note that, for confidentiality, the telephone number need not be displayed.

[0011]

[Problems to be Solved by the Invention] However, in the conventional telephone apparatus with a speech recognition function, voice information is stored in the voice telephone directory memory unit 19 from voice input that includes the ambient noise present at registration, so the ambient noise is inevitably recorded together with the voice. Even if there were no ambient noise at registration, the surroundings at registration and at use are never identical; the user may, for example, be in a noisy crowd at the time of use. The conventional apparatus therefore has the problem that the user's voice cannot be recognized correctly at the time of use.
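The mismatch described in [0011] can be illustrated with a small Python sketch. It assumes, purely for illustration, that an utterance is represented as a short list of feature values, that ambient noise adds to those values, and that similarity is judged by squared distance; none of these choices is stated in the patent, and all names and numbers are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def record(voice, noise):
    """Assumed additive model: what the microphone captures is voice plus noise."""
    return [v + n for v, n in zip(voice, noise)]


voice = [1.0, 4.0, 2.0]   # hypothetical noise-free features of one utterance
quiet = [0.0, 0.0, 0.0]   # registration in a silent room
crowd = [3.0, 1.0, 2.0]   # use in a noisy crowd

template = record(voice, quiet)  # what a conventional device would store

# Same utterance, same surroundings: the stored template matches exactly.
print(squared_distance(template, record(voice, quiet)))  # 0.0
# Same utterance, different surroundings: the match degrades.
print(squared_distance(template, record(voice, crowd)))  # 14.0
```

Under these assumptions the same spoken name yields a non-zero distance as soon as the surroundings change, which is the recognition failure the paragraph describes.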

[0012] The present invention solves this conventional problem, and its object is to provide an excellent speech recognition method and apparatus capable of reliably recognizing the user's voice.

[0013]

[Means for Solving the Problems] To solve the above problem, the speech recognition method of the present invention stores ambient noise in advance, obtains the difference between an input voice and the stored ambient noise, and uses this difference to search for and extract a similar voice from the voice information stored in a storage means.

[0014] With this method, pure voice, with the ambient noise removed, is extracted from voice input that contains ambient noise and can be compared with the voice stored in advance in the storage means. Speech recognition performance can therefore be improved without being affected by ambient noise.
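The difference operation described in [0013] and [0014] can be sketched in Python as follows. The patent does not say in which domain the difference is taken, so this sketch assumes that the voice and the noise are each represented as a list of per-band energy values and that negative results are clamped to zero. The function name `subtract_noise` and the sample values are hypothetical.

```python
from typing import List


def subtract_noise(noisy_voice: List[float], noise: List[float]) -> List[float]:
    """Return the per-band difference between a noisy input and stored noise.

    Assumption: both inputs are per-band energies of equal length.
    Negative differences are clamped to zero, since an energy cannot be negative.
    """
    if len(noisy_voice) != len(noise):
        raise ValueError("feature vectors must have the same length")
    return [max(v - n, 0.0) for v, n in zip(noisy_voice, noise)]


noise_profile = [2.0, 1.0, 0.5, 0.5]   # recorded while the user is silent
noisy_input = [2.5, 6.0, 4.5, 0.25]    # user's voice plus ambient noise

print(subtract_noise(noisy_input, noise_profile))  # [0.5, 5.0, 4.0, 0.0]
```

The clamp is a design choice of the sketch: when the stored noise estimate exceeds the measured value in some band, the band is treated as containing no voice.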

[0015] Further, in the speech recognition method of the present invention, voice from which ambient noise has been subtracted in advance is stored in the storage means as the voice information.

[0016] This method makes it possible to use pure voice as the reference data.

[0017] The speech recognition apparatus of the present invention comprises: voice input means for inputting ambient noise and the user's voice; first storage means for storing only the ambient noise; voice extraction means for obtaining the difference between the user's voice input from the voice input means and the ambient noise stored in the first storage means; second storage means in which voice information is stored in advance; and comparison means for comparing the voice extracted by the voice extraction means with the voice information stored in the second storage means and making a determination.
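One way to picture how the means listed in [0017] fit together is the following Python sketch. It is only an illustration under stated assumptions: voices are lists of feature values, the difference is a clamped per-element subtraction, and the comparison picks the stored entry with the smallest squared distance. The class name, method names, and directory keys are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SpeechRecognizer:
    # First storage means (ambient noise memory unit 25): noise only.
    noise_memory: List[float] = field(default_factory=list)
    # Second storage means (voice telephone directory memory unit 19).
    directory: Dict[str, List[float]] = field(default_factory=dict)

    def store_noise(self, noise: List[float]) -> None:
        """Keep the ambient noise captured while the user is silent."""
        self.noise_memory = list(noise)

    def extract(self, noisy_voice: List[float]) -> List[float]:
        """Voice extraction means: input minus stored noise (assumed clamped)."""
        return [max(v - n, 0.0) for v, n in zip(noisy_voice, self.noise_memory)]

    def compare(self, voice: List[float]) -> Optional[str]:
        """Comparison means: return the key of the closest stored voice, if any."""
        best_key, best_dist = None, float("inf")
        for key, template in self.directory.items():
            dist = sum((a - b) ** 2 for a, b in zip(voice, template))
            if dist < best_dist:
                best_key, best_dist = key, dist
        return best_key


rec = SpeechRecognizer(directory={"suzuki": [1.0, 4.0], "tanaka": [5.0, 0.0]})
rec.store_noise([1.0, 1.0])
print(rec.compare(rec.extract([2.0, 5.0])))  # suzuki
```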

[0018] This configuration makes it possible to implement the speech recognition method of claim 1, and applying it to telephone apparatus, front-door unlocking, and the like makes it possible to improve security.

[0019] Further, the voice extraction means comprises third storage means for storing the voice input from the voice input means, and obtains the difference between the contents of the first and third storage means.

[0020] Thus, the input voice is stored directly and can also be used, as needed, for purposes such as updating the stored data.

[0021] Further, the voice obtained by the voice extraction means is used as the voice information of the second storage means.

[0022] This makes possible a speech recognition apparatus that compares pure voice information stored in advance with pure input voice.

[0023]

[Embodiments of the Invention] Embodiments of the present invention will be described below with reference to FIGS. 1 to 6. Components identical to those of the conventional example are given the same reference numerals, and their description is omitted.

[0024] FIG. 1 is a block diagram of the voice dial transmission operation in the speech recognition apparatus of the first embodiment, and FIG. 2 is a flowchart of speech recognition using this apparatus. FIG. 3 is a block diagram of the voice dial registration operation in the speech recognition apparatus of the first embodiment, and that operation is described using the flowchart of FIG. 4. FIG. 5 shows the configuration of a key telephone system with a speech recognition function according to the first embodiment, and FIG. 6 shows the configuration of a standalone telephone with a speech recognition function according to the first embodiment. Reference numeral 19 denotes the voice telephone directory memory unit, which is the second storage means. A standard pattern reproducing unit 22 and an ambient noise memory unit 25, which is the first storage means, are added to the speech recognition unit 20 of the conventional example.

[0025] The speech recognition operation of the first embodiment will be described with reference to FIGS. 1 and 2.

[0026] First, only the ambient noise is input and stored in the ambient noise memory unit 25, which is the first storage means. Next, the standard pattern reproducing unit 22 obtains the difference between the input voice of the user and the ambient noise data stored in the ambient noise memory unit 25, and the similarity determination unit 21 compares this difference with the voice information stored in advance in the second storage means to obtain a similarity determination result.

[0027] The flow is as follows. When speech recognition (the voice dial transmission operation) starts (step 101), the user is first instructed to remain silent (step 102), ambient noise data is input (step 103), and this ambient noise data is stored in the ambient noise memory unit 25, which is the first storage means (step 104). Next, the user is asked to speak (step 105). When the user's voice is input, still containing ambient noise (step 106), the standard pattern reproducing unit 22 obtains the difference between the two (step 107). The voice information stored in advance in the voice telephone directory memory unit 19, which is the second storage means, is then read out (step 108), the similarity determination unit 21 repeatedly determines the similarity between this voice information and the voice reproduced by the standard pattern reproducing unit 22 (step 109), and the matching data is extracted (step 110). The result is displayed on the LCD display unit 12 (step 111), and fixed guidance has the user specify the destination (step 112). The number is dialed according to the user's selection (step 113). The display data of step 112 is the same as in the conventional example, as shown in FIG. 7.
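The voice dial transmission flow of steps 101 to 113 could be sketched in Python along the following lines. This is a simplified illustration, not the patented implementation: it assumes feature-list voices, a clamped subtraction for step 107, and a squared-distance threshold for steps 109 and 110. The names `voice_dial`, `choose`, and `threshold`, and the dial numbers, are made up for the example.

```python
from typing import Callable, List, Optional, Tuple

# (display name, stored voice information, dial information)
Entry = Tuple[str, List[float], str]


def distance(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def voice_dial(noise: List[float],
               noisy_voice: List[float],
               directory: List[Entry],
               choose: Callable[[List[str]], int],
               threshold: float = 1.0) -> Optional[str]:
    # Steps 102-104: ambient noise captured during silence is stored.
    noise_memory = list(noise)
    # Steps 105-107: the difference between the input and the stored noise.
    voice = [max(v - n, 0.0) for v, n in zip(noisy_voice, noise_memory)]
    # Steps 108-110: read each stored entry and keep those judged similar.
    candidates = [e for e in directory if distance(voice, e[1]) <= threshold]
    if not candidates:
        return None  # no matching data; the phone is still usable normally
    # Steps 111-112: show the candidates and let the user pick one.
    index = choose([name for name, _, _ in candidates])
    # Step 113: the dial information of the chosen entry is sent out.
    return candidates[index][2]


directory = [
    ("Suzuki A", [1.0, 4.0], "100"),
    ("Suzuki B", [1.0, 4.5], "200"),
    ("Tanaka", [5.0, 0.0], "300"),
]
# The user says "Suzuki" in a noisy room and then selects the first candidate.
print(voice_dial([1.0, 1.0], [2.0, 5.0], directory, lambda names: 0))  # 100
```

Passing the selection step in as a callback keeps the sketch independent of whether the user answers by voice or through the key matrix unit 10.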

[0028] As described above, according to the first embodiment of the present invention, only the ambient noise is input and stored in the ambient noise memory unit 25, which is the first storage means. The difference between the ambient noise data in the ambient noise memory unit 25 and the voice input by the user is obtained, and this difference is compared with the voice information in the voice telephone directory memory unit 19, so that speech can be recognized reliably without being affected by ambient noise.

[0029] The first embodiment of the present invention has been described using, as an example, a key telephone system to which key telephones are connected, as shown in FIG. 5, but the invention is also applicable to the standalone telephone with a speech recognition function shown in FIG. 6. In FIG. 6, reference numeral 23 denotes a control unit that performs various kinds of control, and 25 denotes the ambient noise memory unit serving as ambient noise storage means. The invention can also be applied to security-management equipment, such as front-door unlocking.

[0030] For confidentiality, the telephone number need not be displayed. If there is no matching data, speech recognition fails, but the telephone can still be used in the normal call mode.

[0031] Matters not related to the present invention are not described in detail.

[0032] The voice registration operation used for speech recognition in the first embodiment will be described with reference to FIGS. 3 and 4.

[0033] First, only the ambient noise is input and stored in the ambient noise memory unit 25, which is the first storage means. Next, the standard pattern reproducing unit 22 obtains the difference between the input voice of the user and the ambient noise data stored in the ambient noise memory unit 25, and the result is stored in the voice telephone directory memory unit 19, which is the second storage means.

[0034] The flow is as follows. When voice registration (the voice dial registration operation) starts (step 201), the fixed guidance memory unit 18 first instructs the user to remain silent for a while (step 202). While the user is silent, ambient noise (ambient noise data) is input from the handset 9 or the main-body microphone 26 (step 203) and stored as ambient noise data in the ambient noise memory unit 25, which is the first storage means (step 204). Next, the user is asked to speak (step 205). When the user's voice is input, still containing ambient noise (step 206), the standard pattern reproducing unit 22 obtains the difference between the two and reproduces the user's pure voice (step 207). The reproduced pure voice is stored as voice information in the voice telephone directory memory unit 19, which is the second storage means (step 208). Fixed guidance then asks the user to enter the telephone number (step 209), and when the user enters it (step 210), the telephone number is stored in the voice telephone directory memory unit 19 in association with the voice information (step 211). At this time, the ambient noise data and the user's voice information are, of course, also stored in association with each other.
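The registration flow of steps 201 to 211 can be sketched in the same spirit. The Python below is a minimal illustration that assumes feature-list voices and a clamped subtraction; the function name `register_entry`, the use of a text label as the directory key, and the record layout are hypothetical conveniences, since the patent does not describe how the directory is organized.

```python
from typing import Dict, List


def register_entry(directory: Dict[str, dict],
                   label: str,
                   noise: List[float],
                   noisy_voice: List[float],
                   number: str) -> dict:
    # Steps 202-204: noise captured while the user is silent is stored.
    noise_memory = list(noise)
    # Steps 205-207: subtracting the noise reproduces the pure voice.
    pure_voice = [max(v - n, 0.0) for v, n in zip(noisy_voice, noise_memory)]
    # Steps 208-211: the pure voice and the telephone number are stored
    # together. The noise data is kept alongside them, following the remark
    # that noise data and voice information are stored in association.
    directory[label] = {
        "voice": pure_voice,
        "number": number,
        "noise": noise_memory,
    }
    return directory[label]


phone_book: Dict[str, dict] = {}
register_entry(phone_book, "Suzuki A", [1.0, 1.0], [2.0, 5.0], "100")
print(phone_book["Suzuki A"]["voice"])  # [1.0, 4.0]
```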

[0035] A second embodiment of the present invention is, as set forth in claim 4, a speech recognition apparatus that includes third storage means for storing voice input together with its ambient noise. At registration, the user's voice input is stored as is (still containing ambient noise). At transmission as well, the difference between the user's voice input containing ambient noise (stored in the third storage means) and the ambient noise (stored in the first storage means) is obtained, and similarity is determined. In this case, the flowchart of FIG. 2 bypasses step 107 and proceeds step 106 → step 121 → step 122 → step 108. Likewise, the flowchart of FIG. 4 bypasses step 207 and proceeds step 206 → step 221 → step 222 → step 208.
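The role of the third storage means in this second embodiment might look like the following Python sketch. The text does not spell out what steps 121, 122, 221, and 222 do, so the sketch assumes that the first of each pair stores the raw noisy input and the second computes the difference from the two memories. The class and method names are hypothetical.

```python
from typing import List


class BufferedExtractor:
    """Voice extraction that keeps the raw input in a separate memory."""

    def __init__(self) -> None:
        self.noise_memory: List[float] = []   # first storage means
        self.input_memory: List[float] = []   # third storage means

    def store_noise(self, noise: List[float]) -> None:
        self.noise_memory = list(noise)

    def store_input(self, noisy_voice: List[float]) -> None:
        # Assumed content of step 121 / 221: keep the input as is,
        # ambient noise included.
        self.input_memory = list(noisy_voice)

    def difference(self) -> List[float]:
        # Assumed content of step 122 / 222: subtract the contents of the
        # first storage means from those of the third (clamped at zero).
        return [max(v - n, 0.0)
                for v, n in zip(self.input_memory, self.noise_memory)]


extractor = BufferedExtractor()
extractor.store_noise([1.0, 1.0])
extractor.store_input([2.0, 5.0])
print(extractor.difference())    # [1.0, 4.0]
print(extractor.input_memory)    # [2.0, 5.0]
```

Because the raw input remains in its own memory after the difference is taken, it stays available for later use, such as updating stored data.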

[0036]

[Effects of the Invention] As described above, according to the present invention, only the ambient noise is input and stored in the first storage means, and the difference between the input voice and the ambient noise data in the first storage means is obtained. This yields the user's pure voice information, which can be used for speech recognition, so speech recognition performance can be improved. Furthermore, applying this speech recognition apparatus provides effects such as reliable voice dialing and security protection.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the voice dial transmission operation of the speech recognition apparatus according to the first embodiment of the present invention.

FIG. 2 is a flowchart showing the voice dial transmission operation of the speech recognition method according to the first embodiment of the present invention.

FIG. 3 is a block diagram showing the voice dial registration operation of the speech recognition method according to the first embodiment of the present invention.

FIG. 4 is a flowchart showing the voice dial registration operation of the speech recognition method according to the first embodiment of the present invention.

FIG. 5 is a configuration diagram of a key telephone system with a speech recognition function according to the first embodiment of the present invention.

FIG. 6 is a configuration diagram of a standalone telephone with a speech recognition function according to the first embodiment of the present invention.

FIG. 7 is a diagram showing an example of data display on a telephone with a speech recognition function.

FIG. 8 is a block diagram of a conventional speech recognition operation.

FIG. 9 is a flowchart of a conventional voice dial transmission operation.

FIG. 10 is a flowchart of a conventional voice dial registration operation.

FIG. 11 is a configuration diagram of a conventional key telephone system with a speech recognition function.

[Explanation of symbols]

1 Key telephone device
2 Central office line interface unit
3 Extension interface unit
4 System control unit
5 Main power supply unit
6 Key telephone
7 Slave unit interface unit
8 Communication circuit unit
9 Handset
10 Key matrix unit
11 LED display unit
12 LCD display unit
13 Slave unit power supply unit
16 Speech recognition control unit
17 A/D conversion unit
18 Fixed guidance memory unit
19 Voice telephone directory memory unit
20 Speech recognition unit
21 Similarity determination unit
22 Standard pattern reproducing unit
23 Control unit
24 Standalone telephone
25 Ambient noise memory unit
26 Main-body microphone
27 Speaker

Claims

1. A speech recognition method comprising: storing ambient noise in advance; obtaining the difference between an input voice and the ambient noise; and searching for and extracting, based on the difference, a similar voice from voice information stored in storage means.

2. The speech recognition method according to claim 1, wherein a speech from which ambient noise has been subtracted is stored in advance in the storage means as the speech information.

3. A speech recognition apparatus comprising: voice input means for inputting ambient noise and a user's voice; first storage means for storing only the ambient noise; voice extracting means for obtaining the difference between the user's voice input from the voice input means and the ambient noise stored in the first storage means; second storage means in which voice information is stored in advance; and comparison means for comparing and judging the voice extracted by the voice extracting means against the voice information stored in the second storage means.

4. The speech recognition apparatus according to claim 3, wherein the voice extracting means includes third storage means for storing the voice input from the voice input means, and obtains the difference between the contents of the first and third storage means.

5. The speech recognition apparatus according to claim 3, wherein a speech obtained from the speech extraction means is used as the speech information of the second storage means.