JPH0727519Y2

JPH0727519Y2 - Voice recognizer

Info

Publication number: JPH0727519Y2
Application number: JP1988150759U
Authority: JP
Inventors: 靖彦加藤; 雅男渡; 太郎仲上; 正照赤羽; 幸田中; 泰勝又
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1988-11-21
Filing date: 1988-11-21
Publication date: 1995-06-21
Anticipated expiration: 2003-11-21
Also published as: JPH0271900U

Description

[Detailed description of the device]

[Industrial field of application]

The present invention relates to a voice recognition device suitable for use in, for example, a voice-input word processor.

[Outline of device]

The present invention is a voice recognition device that obtains voice parameters from an input voice signal, subtracts noise parameters from those voice parameters, adds a correction value that depends on the level of the voice parameters, and then applies logarithmic conversion. This stabilizes the feature parameters in low-level portions of the voice without losing the features of high-level portions.

[Conventional technology]

In conventional voice recognition devices, one known method of improving noise resistance is to subtract a noise parameter, obtained by measuring the noise, from a voice parameter that contains noise, and then apply logarithmic conversion to obtain the feature parameters of the voice.

However, by the nature of logarithmic conversion, the output changes steeply where the value of the voice parameter is small. When the signal obtained by subtracting the noise parameter from the voice parameter is logarithmically converted as described above, fluctuations in the voice and the noise are therefore emphasized, and the feature parameters of the voice become unstable.
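The sensitivity described above follows from the slope of the logarithm, which is 1/x and so grows as x shrinks. The short Python sketch below illustrates this with made-up numbers; the function name `log_swing` and all values are hypothetical and are not taken from the source.

```python
import math

def log_swing(level, fluctuation):
    """Change in the log output when `level` rises by `fluctuation`."""
    return math.log(level + fluctuation) - math.log(level)

# The same absolute fluctuation applied at a low and at a high parameter value.
low_swing = log_swing(0.1, 0.05)
high_swing = log_swing(10.0, 0.05)

print(f"swing at low level:  {low_swing:.4f}")
print(f"swing at high level: {high_swing:.4f}")
```

The fluctuation that barely moves the output at the high level produces a far larger swing at the low level, which is the instability the text refers to.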

[Problems to be solved by the device]

Incidentally, a method has been proposed that reduces the influence of such voice and noise fluctuations by subtracting the noise parameter from the voice parameter and adding a constant correction value before performing logarithmic conversion.

However, this method adds the same constant correction value to both high-level and low-level portions of the voice. Fluctuations of the voice and noise are reduced in the low-level portions, but the high-level portions, which are inherently less susceptible to such fluctuations, lose their features because of the added constant.
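The trade-off of a constant correction value can be illustrated numerically. The sketch below is one possible reading, in which "losing features" is modeled as a reduced log-domain difference between two high-level values; that interpretation, the names `log_feature` and `contrast`, and all numbers are assumptions rather than anything stated in the source.

```python
import math

def log_feature(value, correction):
    """Log conversion after adding a correction value."""
    return math.log(value + correction)

def contrast(a, b, correction):
    """Log-domain difference between two parameter values."""
    return abs(log_feature(a, correction) - log_feature(b, correction))

CONSTANT_CORRECTION = 10.0  # hypothetical fixed correction value

# Low-level portion: 0.10 and 0.15 differ only by fluctuation.
jitter_plain = contrast(0.10, 0.15, 0.0)
jitter_fixed = contrast(0.10, 0.15, CONSTANT_CORRECTION)

# High-level portion: 8.0 and 16.0 differ by genuine spectral detail.
detail_plain = contrast(8.0, 16.0, 0.0)
detail_fixed = contrast(8.0, 16.0, CONSTANT_CORRECTION)

print(f"low-level jitter:  {jitter_plain:.4f} -> {jitter_fixed:.4f}")
print(f"high-level detail: {detail_plain:.4f} -> {detail_fixed:.4f}")
```

In this toy setting the constant suppresses the low-level jitter as intended, but it also compresses the difference between the two high-level values.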

The present invention has been made in view of the above problem. Its object is to provide a voice recognition device that is not affected by voice and noise fluctuations in low-level portions of the voice and does not lose the features of high-level portions.

[Means for Solving the Problems]

To solve the above problem, the present invention is a voice recognition device that subtracts a noise parameter corresponding to the noise from a voice parameter obtained from a noise-containing input voice signal and applies logarithmic conversion to extract feature parameters. The device comprises correction value superimposing means for superimposing a correction value on the voice parameter from which the noise parameter has been subtracted, before logarithmic conversion, and voice power level calculating means for calculating the power level of the input voice signal. It is characterized in that the correction value is changed according to the output of the voice power level calculating means.

[Action]

In the voice recognition device according to the present invention, the noise parameter is subtracted from the voice parameter obtained from the input voice signal, and a correction value that depends on the level of the input voice signal is added. This yields voice feature parameters whose low-level portions are stable regardless of voice and noise fluctuations and whose high-level portions retain their features.
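The operation can be summarized in one expression. The notation is an assumption introduced here for illustration: $x_i$ is the voice parameter of band $i$, $n_i$ the corresponding noise parameter, $P$ the power level of the input voice signal, $c_i(P)$ the correction value, and $y_i$ the resulting feature parameter. Only the coefficients $\alpha$ and $\beta$ are named in the embodiment.

$$
y_i = \log\bigl(x_i - \alpha\, n_i + \beta\, c_i(P)\bigr), \qquad i = 1, \dots, n
$$

Here $c_i(P)$ is taken to decrease as $P$ increases, so the correction is large in low-level portions and small in high-level portions. The source does not give the exact form of $c_i(P)$.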

[Example]

An embodiment of the voice recognition device according to the present invention is described below with reference to the drawings.

FIG. 1 is a schematic circuit diagram showing an embodiment of the voice recognition device according to the present invention.

In the voice recognition device shown in FIG. 1, input terminal 1, which receives the input voice signal, is connected to a plurality (n) of band-pass filters 2₁, 2₂, ..., 2_n with different pass bands and to a voice power level calculation circuit 3 that calculates the power level of the input voice signal. The band-pass filters 2₁, 2₂, ..., 2_n are connected to rectifier circuits 4₁, 4₂, ..., 4_n, respectively, and the rectifier circuits 4₁, 4₂, ..., 4_n are connected to low-pass filters 5₁, 5₂, ..., 5_n, respectively. The voice signals that have passed through the low-pass filters 5₁, 5₂, ..., 5_n are input to adders 6₁, 6₂, ..., 6_n, respectively. Each of the adders 6₁, 6₂, ..., 6_n also receives the noise parameter signal output from the corresponding noise parameter output circuit 7₁, 7₂, ..., 7_n after it has been multiplied by a negative coefficient (−α) in the corresponding coefficient multiplier 8₁, 8₂, ..., 8_n, so that the adder in effect performs a subtraction. The output signals of the adders 6₁, 6₂, ..., 6_n are supplied to adders 11₁, 11₂, ..., 11_n, which also receive the correction value signals from correction value output circuits 9₁, 9₂, ..., 9_n after multiplication by a coefficient (β) in coefficient multipliers 10₁, 10₂, ..., 10_n. The signals from the adders 11₁, 11₂, ..., 11_n are input to logarithmic conversion circuits 13₁, 13₂, ..., 13_n, respectively, and each logarithmic conversion circuit is connected to its own output terminal 14₁, 14₂, ..., 14_n.

The output of the voice power level calculation circuit 3 is sent through a bias value control circuit 12 to each of the correction value output circuits 9₁, 9₂, ..., 9_n.
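The signal path of FIG. 1 for a single analysis frame can be sketched in Python as follows. This is only an illustration under stated assumptions, not the circuit itself: the function name `band_features`, the list-based interface, the default coefficient values, and the `floor` guard against taking the logarithm of a non-positive value are hypothetical and do not appear in the source. `alpha` and `beta` stand for the coefficients α and β.

```python
import math

def band_features(voice_params, noise_params, corrections,
                  alpha=1.0, beta=1.0, floor=1e-6):
    """Per-band feature extraction for one frame.

    voice_params: band levels after band-pass filter, rectifier and low-pass filter
    noise_params: outputs of the noise parameter output circuits (7)
    corrections:  outputs of the correction value output circuits (9)
    floor:        hypothetical guard, not in the source, that keeps log() defined
    """
    features = []
    for x, n, c in zip(voice_params, noise_params, corrections):
        denoised = x + (-alpha) * n      # adder 6 fed by coefficient multiplier 8
        biased = denoised + beta * c     # adder 11 fed by coefficient multiplier 10
        features.append(math.log(max(biased, floor)))  # logarithmic conversion 13
    return features

# Two bands: a strong band with no correction, a weak band with a larger one.
frame = band_features([2.0, 0.3], [1.0, 0.2], [0.0, 0.5])
print(frame)
```

The correction values are passed in as plain inputs here; in the device they are set through the bias value control circuit 12 according to the output of the voice power level calculation circuit 3.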

The operation is described next.

This embodiment assumes a system that extracts voice feature parameters from the input voice signal for each pass band of the band-pass filters 2₁, 2₂, ..., 2_n and performs voice recognition by, for example, pattern matching of the frequency spectrum.

That is, an input voice signal mixed with noise is supplied to input terminal 1, and this signal is separated into the individual pass bands by the band-pass filters 2₁, 2₂, ..., 2_n. The band-limited signals from the band-pass filters 2₁, 2₂, ..., 2_n are level-detected by the rectifier circuits 4₁, 4₂, ..., 4_n and the low-pass filters 5₁, 5₂, ..., 5_n, giving voice parameters that still contain a noise component. The voice parameter of each band is supplied to the corresponding adder 6₁, 6₂, ..., 6_n, where the noise parameter corresponding to the noise in the input voice signal is subtracted. Specifically, because the noise parameters from the noise parameter output circuits 7₁, 7₂, ..., 7_n are multiplied by the negative coefficient (−α) in the coefficient multipliers 8₁, 8₂, ..., 8_n before being supplied to the adders 6₁, 6₂, ..., 6_n, the adders act to subtract the noise parameters from the voice signals. The voice parameters from which the noise parameters have been subtracted are input to the adders 11₁, 11₂, ..., 11_n, where the correction values from the correction value output circuits 9₁, 9₂, ..., 9_n (multiplied by β) are added. Here, the voice power level of the input voice signal received at input terminal 1 is output from the voice power level calculation circuit 3 and, through the bias value control circuit 12, controls the bias value of each correction value output circuit 9₁, 9₂, ..., 9_n. The correction values therefore change according to the level of the input voice signal.
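The level detection performed in each band by the rectifier circuit and low-pass filter can be approximated in software by full-wave rectification followed by a one-pole smoothing filter. The sketch below assumes that form. The source does not specify the rectifier type or the filter characteristics, and `envelope` and `smoothing` are hypothetical names.

```python
def envelope(samples, smoothing=0.1):
    """Track the level of a band-limited signal.

    abs() plays the role of the rectifier circuit; the running update is a
    one-pole low-pass filter (an assumed, simplified filter form).
    """
    level = 0.0
    levels = []
    for s in samples:
        level += smoothing * (abs(s) - level)
        levels.append(level)
    return levels

# An alternating full-scale signal: the detected level climbs toward 1.0.
print(envelope([1.0, -1.0, 1.0, -1.0], smoothing=0.5))
```

Each value in the returned list corresponds to the voice parameter of one band at one instant, before the noise parameter is subtracted.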

This is explained with reference to FIG. 2, which shows the change in the correction value [FIG. 2(b)] in response to the change in the voice power level of the input voice signal [FIG. 2(a)]. Where the voice power level is high, the correction value is made small; conversely, where the voice power level is low and the signal is easily affected by voice and noise fluctuations, the correction value is made large. The correction value is superimposed on the voice signal from which the noise parameter has been subtracted. The voice parameters from the adders 11₁, 11₂, ..., 11_n, with the correction values superimposed in this way, are input to the logarithmic conversion circuits 13₁, 13₂, ..., 13_n and logarithmically converted. Because the correction value is large where the input voice level is low, those portions are not affected by voice and noise fluctuations; because the correction value is small where the input voice level is high, the features of the voice are not lost. Voice feature parameters with both properties can thus be taken from the output terminals 14₁, 14₂, ..., 14_n.
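The relationship shown in FIG. 2 only requires that the correction value fall as the voice power level rises. One possible mapping is sketched below. The mean-square definition of power, the reciprocal form of the mapping, and the names `frame_power`, `correction_value`, `max_correction` and `reference_level` are all assumptions; the source does not specify how the bias value control circuit 12 derives the correction.

```python
def frame_power(samples):
    """Mean-square power of one frame (assumed definition)."""
    return sum(s * s for s in samples) / len(samples)

def correction_value(power_level, max_correction=1.0, reference_level=1.0):
    """Hypothetical mapping: largest at zero power, shrinking as power grows."""
    return max_correction / (1.0 + power_level / reference_level)

quiet = correction_value(frame_power([0.1, -0.1, 0.1, -0.1]))
loud = correction_value(frame_power([3.0, -3.0, 3.0, -3.0]))

print(f"correction for quiet frame: {quiet:.3f}")
print(f"correction for loud frame:  {loud:.3f}")
```

Any monotonically decreasing function would serve the same purpose; the reciprocal form is chosen here only because it is simple and stays positive.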

[Effect of device]

The voice recognition device according to the present invention superimposes a correction value that depends on the power level of the input voice signal on the voice parameter from which the noise parameter has been subtracted, and then performs logarithmic conversion. It can thereby extract voice feature parameters in which the low-level portions of the input voice are stable, unaffected by voice and noise fluctuations, and the features of the high-level portions are not lost.

[Brief description of drawings]

FIG. 1 is a schematic circuit diagram showing the voice recognition device of an embodiment according to the present invention, and FIG. 2 is a waveform diagram for explaining the relationship between the input voice signal and the correction value.

1: input terminal
2₁, 2₂, ..., 2_n: band-pass filters
3: voice power level calculation circuit
6₁, 6₂, ..., 6_n: adders
7₁, 7₂, ..., 7_n: noise parameter output circuits
8₁, 8₂, ..., 8_n: coefficient multipliers
9₁, 9₂, ..., 9_n: correction value output circuits
10₁, 10₂, ..., 10_n: coefficient multipliers
11₁, 11₂, ..., 11_n: adders
12: bias value control circuit
13₁, 13₂, ..., 13_n: logarithmic conversion circuits
14₁, 14₂, ..., 14_n: output terminals

Continuation of front page

(72) Inventor: 赤羽正照, c/o Sony Corporation, 6-7-35 Kita-Shinagawa, Shinagawa-ku, Tokyo
(72) Inventor: 田中幸, c/o Sony Corporation, 6-7-35 Kita-Shinagawa, Shinagawa-ku, Tokyo
(72) Inventor: 勝又泰, c/o Sony Corporation, 6-7-35 Kita-Shinagawa, Shinagawa-ku, Tokyo
(56) References: JP-A 56-88199 (JP, A); JP-U 56-159400 (JP, U); JP-B 63-34477 (JP, B2); JP-B 61-2960 (JP, B2)

Claims

[Scope of utility model registration request]

1. A voice recognition device that subtracts a noise parameter corresponding to noise from a voice parameter obtained from a noise-containing input voice signal and applies logarithmic conversion to extract a feature parameter, the device comprising: correction value superimposing means for superimposing a correction value on the voice parameter from which the noise parameter has been subtracted, before logarithmic conversion; and voice power level calculating means for calculating the power level of the input voice signal, wherein the correction value is changed according to the output of the voice power level calculating means.