JPS5934596A

JPS5934596A - Voice recognition processing system

Info

Publication number: JPS5934596A
Application number: JP14411182A
Authority: JP
Inventors: 竹内　亜紀彦
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1982-08-20
Filing date: 1982-08-20
Publication date: 1984-02-24

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

(a) Technical field of the invention
The present invention relates to a speech recognition processing system, and in particular to a system in which a speech recognition processing device intended for a specific speaker performs speech recognition processing while also taking noise patterns into account.

(b) Prior art and problems
In recent years, devices that perform speech recognition processing for a specific speaker, using a telephone or the like as the input medium, have been coming into use. In this case, the specific speaker utters predetermined words over a telephone line or the like in advance so that the speech recognition processing device registers the utterance patterns. Thereafter, speech information input from the specific speaker is compared and matched against the registered utterance patterns, and words are extracted from the speech information and used in various kinds of processing.

The problem here is that if the noise on the telephone line, or the ambient noise around the specific speaker while speaking, changes, the speech information patterns at registration time and at recognition time differ, and the recognition rate of the recognition processing in the speech recognition processing device drops.

(c) Object of the invention
In view of the above, the object of the present invention is to improve the recognition rate by performing speech recognition processing with noise patterns also taken into account.

(d) Structure of the invention
To achieve the above object, the present invention concerns a speech recognition processing device configured to compare speech information input by a specific speaker with speech information of words uttered by that speaker and registered in advance as patterns in a pattern dictionary, and to identify the words in the input speech on the basis of the comparison result. It is characterized as follows. When patterns are registered in the pattern dictionary, a pattern of the noise input while nothing is being uttered is registered in addition to the regular uttered-word patterns. Thereafter, when speech information input from the specific speaker is to be recognized, a pattern of the noise being input at that time is created prior to the speech recognition processing and matched against the noise pattern already registered in the pattern dictionary. The speech recognition processing for the specific speaker is executed only when the distance between the two noise patterns is smaller than a predetermined value.
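The gating condition described above can be written compactly. The symbols are illustrative and do not appear in the original: $N_{\mathrm{reg}}$ is the noise pattern stored in the pattern dictionary at registration, $N_{\mathrm{cur}}$ is the noise pattern created just before recognition, $d(\cdot,\cdot)$ is a pattern distance that the text leaves unspecified, and $\theta$ is the predetermined value.

```latex
\[
\text{execute recognition} \iff d\left(N_{\mathrm{reg}},\, N_{\mathrm{cur}}\right) < \theta
\]
```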

(e) Embodiment of the invention
FIG. 1 is a block diagram of a speech recognition processing device according to an embodiment of the present invention. In the figure, 1 is a feature extraction section, 2 is a pattern registration section, 3 is a dictionary section, 4 is a matching determination section, and 5 is a control section. FIG. 2 shows an example of the format of the dictionary section 3 shown in FIG. 1. FIG. 3 is an example of the recognition processing flow in the embodiment.

The embodiment operates as follows.

(1) Pattern registration operation
First, the specific speaker is connected to the speech recognition processing device shown in FIG. 1 over a telephone line or the like. When the specific speaker becomes able to input speech through a telephone or the like, the speech recognition processing device feeds the noise that is input during the silent state, before the specific speaker begins speaking, to the feature extraction section 1. The feature extraction section 1 extracts a pattern of this noise and sends it to the pattern registration section 2. The pattern registration section 2 stores this noise pattern, for example, at the head position of the dictionary section 3 as shown in FIG. 2.

After that, when a message instructing the specific speaker to speak is sent by means not shown, the specific speaker utters the predetermined words one after another. The speech information uttered by the specific speaker is successively subjected to pattern extraction in the feature extraction section 1 and, under the control of the pattern registration section 2, stored in the dictionary section 3 in the form of a word 1 pattern, a word 2 pattern, and so on, as shown in FIG. 2.
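The dictionary layout described above (noise pattern at the head, followed by the word patterns in order) can be sketched as a small data structure. This is only an illustration: the class `PatternDictionary`, its method names, and the representation of a pattern as a list of floats are assumptions, since the text does not say how patterns are encoded.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Pattern = List[float]  # assumed representation of an extracted feature pattern


@dataclass
class PatternDictionary:
    """Hypothetical in-memory stand-in for dictionary section 3 (FIG. 2)."""

    noise_pattern: Optional[Pattern] = None  # stored at the head position
    word_patterns: List[Tuple[str, Pattern]] = field(default_factory=list)

    def register_noise(self, pattern: Pattern) -> None:
        """Store the noise pattern captured during the silent state."""
        self.noise_pattern = list(pattern)

    def register_word(self, label: str, pattern: Pattern) -> None:
        """Append a word pattern in the order the words were uttered."""
        self.word_patterns.append((label, list(pattern)))

    def layout(self) -> List[str]:
        """Return entry names in storage order: noise first, then words."""
        head = ["noise"] if self.noise_pattern is not None else []
        return head + [label for label, _ in self.word_patterns]


dictionary = PatternDictionary()
dictionary.register_noise([0.1, 0.2, 0.1])
dictionary.register_word("word 1", [0.9, 0.4, 0.3])
dictionary.register_word("word 2", [0.2, 0.8, 0.7])
print(dictionary.layout())  # ['noise', 'word 1', 'word 2']
```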

(2) Recognition processing operation
As in the pattern registration operation described above, when the specific speaker becomes able to input speech through a telephone or the like, the speech recognition processing device feeds the noise that is input during the silent state, before the specific speaker begins speaking, to the feature extraction section 1. The feature extraction section 1 extracts a pattern of this noise and sends it to the matching determination section 4. The matching determination section 4 compares and matches the noise pattern sent from the feature extraction section 1 against the noise pattern already registered in the dictionary section 3, and determines whether the distance (degree of difference) between the two noise patterns is larger or smaller than a predetermined threshold. When the distance is smaller than the threshold, the device determines that recognition processing is possible, and a message instructing the specific speaker to speak is sent by means not shown.
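The threshold decision made by the matching determination section 4 might look like the following minimal sketch. The text does not name a distance measure, so Euclidean distance over equal-length vectors is an assumption, as are the function names and the example values.

```python
import math
from typing import Sequence


def pattern_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length patterns (an assumed metric)."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognition_possible(
    registered_noise: Sequence[float],
    current_noise: Sequence[float],
    threshold: float,
) -> bool:
    """True when the current noise is close enough to the registered noise."""
    return pattern_distance(registered_noise, current_noise) < threshold


registered = [0.10, 0.20, 0.10]
# Noise similar to registration time: recognition may proceed.
print(recognition_possible(registered, [0.12, 0.18, 0.11], threshold=0.1))  # True
# Much noisier line than at registration time: recognition is not started.
print(recognition_possible(registered, [0.60, 0.70, 0.50], threshold=0.1))  # False
```

In practice the threshold would have to be tuned to the chosen feature representation; the value 0.1 here is arbitrary.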

Thereafter, speech information from the specific speaker is recognized by known recognition processing using the feature extraction section 1, the matching determination section 4, and the dictionary section 3.
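The text refers only to "known recognition processing" and gives no details. As one simple possibility, a nearest-pattern match against the registered word patterns could be sketched as below. This is a simplification that assumes fixed-length feature vectors; the function name `recognize_word` and the sample patterns are hypothetical.

```python
import math
from typing import Dict, Sequence


def recognize_word(
    input_pattern: Sequence[float],
    word_patterns: Dict[str, Sequence[float]],
) -> str:
    """Return the label of the registered word pattern closest to the input."""
    if not word_patterns:
        raise ValueError("no word patterns registered")
    # math.dist computes the Euclidean distance between two points.
    return min(
        word_patterns,
        key=lambda label: math.dist(input_pattern, word_patterns[label]),
    )


registered_words = {
    "word 1": [0.9, 0.4, 0.3],
    "word 2": [0.2, 0.8, 0.7],
}
print(recognize_word([0.85, 0.45, 0.25], registered_words))  # word 1
```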

On the other hand, when the distance (degree of difference) between the two noise patterns is larger than the predetermined threshold, a re-registration operation is performed, as shown in the processing flow of FIG. 3.

This re-registration operation is the same as the pattern registration operation of item (1) above: it starts with registration of the noise pattern during the specific speaker's silent state, after which the individual words are uttered and registered. After this re-registration operation, the recognition processing operation is executed.
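The overall flow (measure noise, compare, re-register if needed, then recognize) can be sketched as follows. FIG. 3 is not reproduced in this text, so this is a reading of the prose only. The function `prepare_session`, the callback parameters, the dict-based dictionary, and the use of Euclidean distance are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List

Pattern = List[float]


def prepare_session(
    dictionary: dict,
    measure_noise: Callable[[], Pattern],
    register_words: Callable[[], Dict[str, Pattern]],
    threshold: float,
) -> bool:
    """Get the dictionary ready for recognition.

    Returns True if a re-registration was performed, False otherwise.
    """
    current_noise = measure_noise()
    if math.dist(dictionary["noise"], current_noise) < threshold:
        return False  # noise conditions match: go straight to recognition
    # Noise differs too much (the equal case is treated the same way here,
    # which the text does not specify): repeat registration, noise first.
    dictionary["noise"] = measure_noise()
    dictionary["words"] = register_words()
    return True


dictionary = {"noise": [0.1, 0.2, 0.1], "words": {"word 1": [0.9, 0.4, 0.3]}}
reregistered = prepare_session(
    dictionary,
    measure_noise=lambda: [0.6, 0.7, 0.5],
    register_words=lambda: {"word 1": [1.2, 0.9, 0.8]},
    threshold=0.1,
)
print(reregistered, dictionary["noise"])  # True [0.6, 0.7, 0.5]
```

After `prepare_session` returns, recognition would proceed against the (possibly refreshed) word patterns in either case.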

(f) Effects of the invention
According to the present invention, in speech recognition processing for a specific speaker, pattern variation between the time the speech patterns are registered and the time of the subsequent recognition processing operation is reduced, making it possible to improve the recognition rate.

[Brief description of the drawings]

FIG. 1 is a block diagram of a speech recognition processing device according to an embodiment of the present invention, FIG. 2 is an example of the format of the dictionary section, and FIG. 3 is an example of the recognition processing flow in the embodiment. In the figures, 1 is a feature extraction section, 2 is a pattern registration section, 3 is a dictionary section, and 4 is a matching determination section.

Claims

(1) A speech recognition processing system for a speech recognition processing device configured to compare speech information input by a specific speaker with speech information of words uttered by that speaker and registered in advance as patterns in a pattern dictionary, and to identify the words in the input speech on the basis of the comparison result, characterized in that: when patterns are registered in the pattern dictionary, a pattern of the noise input while nothing is being uttered is registered in addition to the regular uttered-word patterns; thereafter, when speech information input from the specific speaker is to be recognized, a pattern of the noise being input at that time is created prior to the speech recognition processing and matched against the noise pattern already registered in the pattern dictionary; and the speech recognition processing for the specific speaker is executed only when the distance between the two noise patterns is smaller than a predetermined value.

(2) A speech recognition processing system characterized in that, when the distance between the two noise patterns is greater than the predetermined value, a re-registration operation, including the noise pattern, is performed for the specific speaker.