JP2797616B2 - Noise suppression device

Info

Publication number: JP2797616B2
Application number: JP2067706A
Authority: JP
Inventors: 良二鈴木; 正之三崎
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1990-03-16
Filing date: 1990-03-16
Publication date: 1998-09-17
Anticipated expiration: 2013-09-17
Also published as: JPH03266899A

Description

[Field of industrial application]

The present invention relates to a noise suppression device that reduces the noise in a speech signal on which noise is superimposed and extracts only the speech.

[Prior art]

Noise suppression devices have conventionally been used to reduce noise that interferes with wireless communication.

The conventional noise suppression device mentioned above is described below with reference to the drawings.

FIG. 8 is a configuration diagram of a conventional noise suppression device. In FIG. 8, 800 is a band dividing means; 801, 802, 803, 804 are level detecting means that receive the output of the band dividing means 800; 805, 806, 807, 808 are noise level estimating means that receive the outputs of the level detecting means 801, 802, 803, 804; 809, 810, 811, 812 are speech level ratio estimating means that receive the outputs of the level detecting means 801, 802, 803, 804 and of the noise level estimating means 805, 806, 807, 808; 813, 814, 815, 816 are multiplying means that multiply the output of the band dividing means 800 by the outputs of the speech level ratio estimating means 809, 810, 811, 812; and 817 is an adding means that adds the outputs of the multiplying means 813, 814, 815, 816.

The operation of the noise suppression device configured as above is described below, assuming that the input is divided into four bands.

First, the band dividing means 800 divides the input signal into four frequency bands. The level detecting means 801, 802, 803, 804 then obtain the signal level of each band. The noise level estimating means 805, 806, 807, 808 estimate the noise level of each band whenever the input signal is judged to be non-voice. Next, the speech level ratio estimating means 809, 810, 811, 812 calculate, for each band, an estimate of the proportion of the input signal level that is due to the speech signal. The multiplying means 813, 814, 815, 816 weight the signal of each band by the outputs of the speech level ratio estimating means 809, 810, 811, 812. Finally, the adding means 817 adds all the outputs of the multiplying means 813, 814, 815, 816 and outputs the sum.

[Problem to be solved by the invention]

With the above configuration, however, the noise level is reduced but the speech itself is also suppressed.

In view of this problem, the present invention provides a noise suppression device that can suppress noise while lowering speech intelligibility as little as possible.

[Means for solving the problem]

To achieve this object, the noise suppression device of the present invention comprises a band dividing means (110) that divides an input speech signal into a plurality of frequency band signals, a plurality of band emphasizing means (100) to which the divided frequency band signals are supplied, and an adder (120) that adds and outputs the output signals of the plurality of band emphasizing means (100). Each band emphasizing means (100) comprises a level detecting means (101), a noise level estimating means (102), a subtractor (103), a speech level ratio estimating means (105), a spectrum emphasizing means (104), and a multiplier (106). The level detecting means (101) calculates the level of the frequency band signal. The noise level estimating means (102) calculates the noise level by detecting non-voice portions. The subtractor (103) calculates the speech level by subtracting the noise level from the signal level. The speech level ratio estimating means (105) calculates the speech level ratio, which is the ratio of the speech level to the level of the frequency band signal. The spectrum emphasizing means (104) calculates a spectrum emphasis coefficient by convolving stored emphasis characteristic coefficients with the speech levels of the band emphasizing means (100). The multiplier (106) multiplies together the frequency band signal, the speech level ratio, and the spectrum emphasis coefficient and outputs the product.

[Operation]

With this configuration, the multiplying means multiplies the input signal by the coefficient obtained by the spectrum emphasizing means, which restores the speech intelligibility degraded by noise suppression.

[Embodiments]

The present invention provides a noise suppression device that can suppress noise while lowering speech intelligibility as little as possible.

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a configuration diagram of a noise suppression device according to an embodiment of the present invention. For convenience, this embodiment is described for the case where the frequency range of the input signal is divided into four bands. In FIG. 1, 100a, 100b, 100c, 100d are band emphasizing means, each made up of a level detecting means 101a, 101b, 101c, 101d, a noise level estimating means 102a, 102b, 102c, 102d, a subtractor 103a, 103b, 103c, 103d, a speech level ratio estimating means 105a, 105b, 105c, 105d, a spectrum emphasizing means 104a, 104b, 104c, 104d, and a multiplier 106a, 106b, 106c, 106d. 110 is a band dividing means. 101a, 101b, 101c, 101d are level detecting means, each receiving the signal of a different frequency band output from the band dividing means 110. 102a, 102b, 102c, 102d are noise level estimating means that receive the outputs of the level detecting means 101a, 101b, 101c, 101d, respectively. 103a, 103b, 103c, 103d are subtracting means that subtract the outputs of the noise level estimating means 102a, 102b, 102c, 102d from the outputs of the level detecting means 101a, 101b, 101c, 101d, respectively. 105a, 105b, 105c, 105d are speech level ratio estimating means that receive the outputs of the level detecting means 101a, 101b, 101c, 101d and of the subtracting means 103a, 103b, 103c, 103d, respectively. 104a, 104b, 104c, 104d are spectrum emphasizing means that receive the outputs of the subtracting means 103a, 103b, 103c, 103d for the corresponding band and for several bands before or after it. 106a, 106b, 106c, 106d are multiplying means that receive the output of the band dividing means 110, the outputs of the speech level ratio estimating means 105a, 105b, 105c, 105d, and the outputs of the spectrum emphasizing means 104a, 104b, 104c, 104d, respectively. 120 is an adding means that receives the outputs of the multiplying means 106a, 106b, 106c, 106d, that is, the outputs of all bands.

The operation of the noise suppression device configured as above is described below.

First, the band dividing means 110 divides the input signal into four frequency bands. The level detecting means 101a, 101b, 101c, 101d then obtain the signal level of each band. The noise level estimating means 102a, 102b, 102c, 102d estimate the noise level of each band whenever the input signal is judged to be non-voice. Next, the subtracting means 103a, 103b, 103c, 103d estimate the speech level of each band by subtracting the outputs of the noise level estimating means 102a, 102b, 102c, 102d from the outputs of the level detecting means 101a, 101b, 101c, 101d, respectively. The speech level ratio estimating means 105a, 105b, 105c, 105d then obtain, from the outputs of the level detecting means 101a, 101b, 101c, 101d and of the subtracting means 103a, 103b, 103c, 103d, an estimate of the proportion of each band's input signal level that is due to the speech signal.
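As an illustration only, the per-band chain described above (level detection, noise level estimation, subtraction, speech level ratio) might be sketched in Python as follows. The function names, the use of mean absolute amplitude as the "level", the exponential smoothing of the noise estimate, and the clamping of the speech level at zero are all assumptions made for this sketch; the specification does not fix any of them.

```python
def detect_level(band_frame):
    """Level of one band frame; mean absolute amplitude is an assumed choice."""
    return sum(abs(x) for x in band_frame) / len(band_frame)


def update_noise_level(noise_level, level, is_voice, smoothing=0.9):
    """Update the noise estimate only while the input is judged non-voice.

    The exponential smoothing and its constant are assumptions.
    """
    if is_voice:
        return noise_level
    return smoothing * noise_level + (1.0 - smoothing) * level


def speech_level(level, noise_level):
    """Subtractor: band level minus noise level, clamped at zero (assumption)."""
    return max(level - noise_level, 0.0)


def speech_level_ratio(level, speech):
    """Proportion of the band level attributed to speech."""
    return speech / level if level > 0.0 else 0.0
```

In such a sketch each band would hold its own noise_level state, mirroring the separate means 102a to 102d.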

Next, each of the spectrum emphasizing means 104a, 104b, 104c, 104d obtains a coefficient for emphasizing the spectrum from the outputs of the subtracting means 103a, 103b, 103c, 103d for its own band and the bands before and after it. The multiplying means 106a, 106b, 106c, 106d each multiply the output of the band dividing means 110 by the output of the speech level ratio estimating means 105a, 105b, 105c, 105d and by the output of the spectrum emphasizing means. The adding means 120 then sums the outputs of the multiplying means 106a, 106b, 106c, 106d, that is, the outputs of all bands.
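The multiplication and summation stage can be illustrated with a short Python sketch. The function name suppress_frame and the representation of each band as a list of samples are assumptions for illustration; the ratios and emphasis coefficients are taken as already computed.

```python
def suppress_frame(band_frames, ratios, emphasis):
    """Weight each band by its speech level ratio and emphasis coefficient,
    then add all bands together.

    band_frames: list of per-band sample lists, all of equal length.
    ratios, emphasis: one value per band.
    """
    n = len(band_frames[0])
    out = [0.0] * n
    for frame, sy, e in zip(band_frames, ratios, emphasis):
        gain = sy * e                      # role of the multiplying means 106
        for k in range(n):
            out[k] += gain * frame[k]      # role of the adding means 120
    return out
```

For example, with two bands [1.0, 2.0] and [10.0, 20.0], ratios 0.5 and 1.0, and emphasis coefficients 2.0 and 0.5, the per-band gains are 1.0 and 0.5 and the summed output is [6.0, 12.0].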

FIG. 2 is a configuration diagram of the spectrum emphasizing means in this embodiment. In FIG. 2, 200 is a storage means that stores the characteristic used to emphasize the spectrum, 201 is a data selector, and 202 is a convolution means that receives the output of the storage means 200 and the output of the data selector 201.

That is, the speech level estimates for the four bands output from the subtracting means 103a, 103b, 103c, 103d are input to the data selector 201, and the convolution means 202 performs a product-sum operation on the contents of the storage means 200 and the output of the data selector 201 and outputs the result.
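A minimal sketch of this product-sum operation is given below. The function name, the odd-length kernel layout with its centre on the band's own position, and the treatment of bands beyond the edges as absent are assumptions of this sketch and are not stated in the specification.

```python
def emphasis_coefficient(speech_levels, kernel, band):
    """Product-sum of the stored characteristic with the speech levels of
    `band` and its neighbours.

    kernel has odd length; its centre entry applies to `band` itself.
    """
    half = len(kernel) // 2
    total = 0.0
    for offset in range(-half, half + 1):
        j = band + offset
        if 0 <= j < len(speech_levels):    # selector: skip bands that do not exist
            total += kernel[offset + half] * speech_levels[j]
    return total
```

With a symmetric kernel this product-sum coincides with a discrete convolution evaluated at the given band. For speech levels [1.0, 2.0, 3.0, 4.0] and kernel [-0.5, 2.0, -0.5], band 1 gives -0.5 + 4.0 - 1.5 = 2.0.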

FIG. 3 is a characteristic diagram of the contents stored in the storage means that forms part of the spectrum emphasizing means of the present invention, drawn for a larger number of bands than in this embodiment. The characteristic takes the form of a difference of Gaussian error functions and models the lateral inhibition circuit of nerve cells known from physiology.
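One possible reading of such a characteristic is a difference of two Gaussian curves of different widths, which is positive at the centre and negative in the surround, as in lateral inhibition. The sketch below builds such a kernel; the function name, the two width parameters, and their default values are assumptions and are not taken from FIG. 3.

```python
import math


def dog_kernel(half_width, sigma_center=0.8, sigma_surround=2.0):
    """Difference-of-Gaussians kernel over band offsets -half_width..half_width.

    Both sigma values are illustrative assumptions.
    """
    def gauss(x, sigma):
        return math.exp(-(x * x) / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

    return [gauss(x, sigma_center) - gauss(x, sigma_surround)
            for x in range(-half_width, half_width + 1)]
```

With these defaults the entry at offset 0 is positive and the entries at the outer offsets are negative, so a band whose speech level stands above that of its neighbours receives a larger coefficient.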

In FIG. 3, the vertical axis is the output value of the storage means and the horizontal axis is the band produced by the band dividing means, with 0 denoting the band to which the spectrum emphasizing means itself corresponds.

As described above, according to this embodiment each spectrum emphasizing means obtains a coefficient for emphasizing the spectrum from the speech level estimates output by the subtracting means for its own band and the bands before and after it, and the multiplying means multiplies the output of the band dividing means by the output of the speech level ratio estimating means and the output of the spectrum emphasizing means. Noise is thereby suppressed while only the speech is emphasized. In addition, because the spectrum emphasizing means obtains its output by a product-sum operation in the convolution means on the data held in the storage means and the output of the data selector, the speech emphasis coefficient is obtained simply. Although this embodiment has been described with four bands, the invention is not limited to that number, and the same effect is obtained when the number of bands is increased.

An embodiment of the second invention is described below with reference to the drawings.

This invention provides a noise suppression method that can suppress noise while lowering speech intelligibility as little as possible.

FIG. 4 is a flowchart of the noise suppression method according to an embodiment of the second invention. Its operation is described below.

First, step 400 performs frequency analysis on the input signal, and the N elements obtained are denoted Yi (i = 1 to N). Step 401 then determines whether the input signal is a voice section or a non-voice section. If step 401 determines that it is a voice section, step 402 estimates for each frequency the proportion SYi (i = 1 to N) of the input signal that is due to the speech signal, and step 403 calculates for each frequency the spectrum emphasis value Ei (i = 1 to N). If step 401 determines that it is a non-voice section, step 404 sets SYi (i = 1 to N) to its minimum value for each frequency, and step 405 sets Ei (i = 1 to N) to 1 for each frequency. Step 406 then multiplies the frequency-analyzed input signal Yi (i = 1 to N) by SYi and Ei, which performs noise suppression and speech emphasis. Step 407 performs frequency synthesis on Yi to return it to a time-domain waveform and outputs it, after which processing returns to step 400 and repeats.
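Steps 401 to 406 for a single analysis frame might be sketched as follows. The function name and argument layout are assumptions; the frequency analysis and synthesis of steps 400 and 407 and the estimators of steps 402 and 403 are left outside the sketch, and the default minimum of 0.5 is borrowed from the value 1/2 given for the third invention.

```python
def suppress_spectrum(spectrum, is_voice, ratios=None, emphasis=None, sy_min=0.5):
    """Apply SY_i and E_i to one frame of frequency-analyzed input Y_i.

    ratios / emphasis are the results of steps 402 / 403 and are only
    used when the frame is judged to be a voice section.
    """
    n = len(spectrum)
    if is_voice:
        sy, e = ratios, emphasis             # steps 402 and 403 (computed elsewhere)
    else:
        sy, e = [sy_min] * n, [1.0] * n      # steps 404 and 405
    return [y * s * g for y, s, g in zip(spectrum, sy, e)]   # step 406
```

For a non-voice frame [2.0, 4.0] this yields [1.0, 2.0], that is, every component is attenuated by the minimum ratio and left unemphasized.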

As described above, according to this embodiment, in a voice section step 403 obtains the spectrum emphasis value Ei (i = 1 to N), and step 406 multiplies the frequency-analyzed input signal Yi (i = 1 to N) by Ei and by the proportion SYi of the input signal that is due to the speech signal. Noise is thereby suppressed while only the speech is emphasized. In a non-voice section, step 404 sets SYi (i = 1 to N) to its minimum value, step 405 sets Ei (i = 1 to N) to the value that has no effect (1), and step 406 multiplies the frequency-analyzed input signal Yi (i = 1 to N) by SYi and Ei, which increases the amount of noise suppression in non-voice sections.

An embodiment of the third invention is described below with reference to the drawings.

This invention provides a speech level ratio estimation method whose resulting coefficient, when multiplied by the frequency-analyzed speech signal, suppresses noise while lowering speech intelligibility as little as possible.

FIG. 5 is a flowchart of the speech level ratio estimation method according to an embodiment of the third invention. Its operation is described below.

First, step 500 determines whether the input signal is a voice section or a non-voice section. If step 500 determines that it is a voice section, step 501 initializes the frequency number i. Step 502 reads the i-th element Yi of the input signal, which has been frequency-analyzed and converted to a level. Step 503 reads the i-th element Ni of the estimated noise level, likewise frequency-analyzed and converted to a level. Step 504 then calculates, for each frequency, the estimate SYi = 1/2 + (Yi − Ni)/(2Yi) of the proportion of the input signal that is due to the speech signal. Step 505 increments the frequency number i. If step 506 finds that SYi has not yet been calculated for all frequencies, processing returns to step 502; if the calculation is complete, processing ends. If step 500 determines that the input signal is a non-voice section, step 507 sets SYi (i = 1 to N) to its minimum value of 1/2 for each frequency, and processing ends.
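The procedure of FIG. 5 might be sketched as below. The function name is an assumption, and two guards are added that the specification does not mention: a zero input level is mapped to 1/2, and results are clamped so they never fall below 1/2, on the reading that 1/2 is the minimum value of SYi.

```python
def speech_ratio(levels, noise_levels, is_voice):
    """Estimate SY_i = 1/2 + (Y_i - N_i) / (2 * Y_i) for every frequency."""
    if not is_voice:
        return [0.5] * len(levels)                  # step 507
    ratios = []
    for y, n in zip(levels, noise_levels):          # loop of steps 501 to 506
        if y <= 0.0:
            ratios.append(0.5)                      # guard (assumption)
            continue
        sy = 0.5 + (y - n) / (2.0 * y)              # step 504
        ratios.append(max(sy, 0.5))                 # clamp (assumption)
    return ratios
```

For input levels [4.0, 2.0] and noise levels [1.0, 2.0] in a voice section, the result is [0.875, 0.5].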

As described above, according to this embodiment, in a voice section step 504 calculates the proportion SYi of the input signal that is due to the speech signal from the frequency-analyzed input signal level Yi (i = 1 to N) and the estimated noise level Ni (i = 1 to N). A coefficient that suppresses noise non-linearly according to the noise strength is thus obtained using only simple arithmetic operations. In a non-voice section, step 507 sets SYi (i = 1 to N) to its minimum value of 1/2, which increases the amount of noise suppression in non-voice sections.
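The expression of step 504 can be rearranged as follows; the rearrangement is added here for illustration and is not text from the specification.

```latex
SY_i = \frac{1}{2} + \frac{Y_i - N_i}{2Y_i} = 1 - \frac{N_i}{2Y_i}
```

When N_i is negligible relative to Y_i, SY_i approaches 1, and when N_i equals Y_i, SY_i equals 1/2, the value assigned in non-voice sections. Because N_i appears divided by Y_i, the coefficient varies non-linearly with the input level.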

An embodiment of the fourth invention is described below with reference to the drawings.

This invention provides a voice detection method that can extract only the voice sections from speech on which noise is superimposed.

FIG. 6 is a flowchart of the noise characteristic estimation in the voice detection method according to an embodiment of the fourth invention. Its operation is described below.

First, step 600 determines whether the input signal is a voice section or a non-voice section. If it is a non-voice section, step 601 initializes the frequency number i. Step 602 then estimates the noise variance Vi from the i-th element Yi of the input signal, which has been frequency-analyzed and converted to a level. This estimate can be obtained simply, for example with an integrating circuit that has a long fall time constant. Step 603 increments the frequency number i. If step 604 finds that Vi has not yet been calculated for all frequencies, processing returns to step 602 and repeats; if the calculation is complete, step 605 decides whether the estimation of Vi is finished. If it is not, processing returns to step 600 and repeats; if it is, processing proceeds to step 606. Step 606 initializes the counter MAX, which holds the frequency width over which the noise exceeds the noise variance Vi. Step 607 then determines whether the input signal is a voice section or a non-voice section, and if it is a non-voice section, step 608 initializes the frequency number i.
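The remark that Vi can be estimated with an integrating circuit that has a long fall time constant suggests a follower that rises quickly and decays slowly. A software analogue of one update of step 602 might look like the sketch below; the function name, the immediate rise, and the decay constant are assumptions.

```python
def update_noise_spread(v, y, fall=0.99):
    """One update of V_i from the level Y_i of a non-voice frame.

    Rises to a larger level at once (assumption) and falls toward a
    smaller level slowly, imitating a long fall time constant.
    """
    if y > v:
        return y
    return fall * v + (1.0 - fall) * y
```

Applied frame after frame during non-voice sections, such an update keeps Vi close to the larger excursions of the noise level in each frequency.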

Next, step 609 initializes the counter j. Step 610 compares the i-th element Yi of the input signal, which has been frequency-analyzed and converted to a level, with the estimated noise variance Vi, and if Yi is larger than Vi, step 611 increments the counter j. Step 612 increments the frequency number i. If step 613 finds that Yi and Vi have not yet been compared for all frequencies, processing returns to step 610 and repeats; if the comparison is complete, processing proceeds to step 614. Step 614 compares the counter j with MAX, and if j exceeds MAX, MAX is updated to j. Step 616 then decides whether the measurement of MAX is finished. If it is not, processing returns to step 607 and repeats; if it is, the measurement of the frequency width over which the noise exceeds the noise variance Vi ends.
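The measurement of MAX might be sketched as follows, assuming that the frames passed in have already been judged non-voice and that a fixed set of frames stands in for the end-of-measurement decision of step 616. The function names are assumptions.

```python
def count_exceeding(levels, spread):
    """Number of frequencies whose level Y_i exceeds V_i (steps 608 to 613)."""
    return sum(1 for y, v in zip(levels, spread) if y > v)


def measure_max(noise_frames, spread):
    """Largest count of exceeding frequencies seen over the non-voice frames."""
    max_count = 0                                  # step 606
    for levels in noise_frames:                    # frames judged non-voice (step 607)
        j = count_exceeding(levels, spread)
        if j > max_count:                          # step 614 and the update of MAX
            max_count = j
    return max_count
```

With spread [1.0, 1.0, 1.0] and noise frames [0.5, 2.0, 0.5], [2.0, 2.0, 0.5], [0.0, 0.0, 0.0], the counts are 1, 2 and 0, so MAX becomes 2.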

FIG. 7 is a flowchart of the voice/non-voice decision in the voice detection method according to an embodiment of the fourth invention. Its operation is described below.

First, step 700 initializes the frequency number i. Step 701 then initializes the counter j. Step 702 compares the i-th element Yi of the input signal, which has been frequency-analyzed and converted to a level, with the estimated noise variance Vi, and if Yi is larger than Vi, step 703 increments the counter j. Step 704 increments the frequency number i. If step 705 finds that Yi and Vi have not yet been compared for all frequencies, processing returns to step 702 and repeats; if the comparison is complete, processing proceeds to step 706. Step 706 compares the counter j with MAX. If j exceeds MAX, the input signal is judged to be voice; if j does not exceed MAX, the input signal is judged to be non-voice, and processing ends.
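The decision of FIG. 7 reduces to counting and one comparison, as in the sketch below. The function name is an assumption; spread and max_count stand for the values Vi and MAX obtained by the noise characteristic estimation.

```python
def is_voice(levels, spread, max_count):
    """Judge a frame as voice when more frequencies exceed V_i than MAX."""
    j = sum(1 for y, v in zip(levels, spread) if y > v)   # steps 700 to 705
    return j > max_count                                  # step 706
```

For example, with spread [1.0, 1.0, 1.0] and MAX equal to 2, the frame [2.0, 2.0, 2.0] is judged voice and the frame [2.0, 2.0, 0.0] is judged non-voice.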

As described above, according to this embodiment, the noise variance Vi (i = 1 to N) estimated in step 602 is used as a threshold, and step 610 measures the frequency width MAX over which the noise exceeds Vi, so that the characteristics of the noise are captured simply and reliably. Further, step 702 uses the noise variance Vi (i = 1 to N) as a threshold to measure the frequency width j over which the frequency-analyzed input signal level Yi exceeds Vi, and step 706 judges the input to be voice if j exceeds MAX and non-voice if it does not, so that the voice/non-voice decision is made simply.

[Effects of the invention]

As described above, by providing the speech level ratio estimating means, the spectrum emphasizing means, and the multiplying means, the first invention realizes an excellent noise suppression device that suppresses noise while lowering intelligibility as little as possible.

The second invention obtains a spectrum emphasis value in voice sections and multiplies the frequency-analyzed input signal by that value and by the proportion of the input signal that is due to the speech signal, thereby suppressing noise while emphasizing only the speech. In non-voice sections, it sets the proportion of the input signal that is due to the speech signal to its minimum value and the spectrum emphasis value to the value that has no effect, and multiplies the frequency-analyzed input signal by both, thereby increasing the amount of noise suppression in non-voice sections.

The third invention calculates, in voice sections, the proportion of the input signal that is due to the speech signal from the frequency-analyzed input signal level and the estimated noise level, so that a coefficient that suppresses noise non-linearly according to the noise strength is obtained using only simple arithmetic operations. In non-voice sections, it sets that proportion to its minimum value, thereby increasing the amount of noise suppression in non-voice sections.

The fourth invention uses the estimated noise variance as a threshold and measures the frequency width over which the noise exceeds that variance, so that the characteristics of the noise are captured simply and reliably. It further uses the noise variance as a threshold to measure the frequency width over which the frequency-analyzed input signal level exceeds the variance, and judges the input to be non-voice when that width does not exceed the width measured for the noise, so that the voice/non-voice decision is made simply.

[Brief description of the drawings]

FIG. 1 is a configuration diagram of a noise suppression device according to an embodiment of the present invention. FIG. 2 is a configuration diagram of the spectrum emphasizing means of the noise suppression device according to an embodiment of the present invention. FIG. 3 is a characteristic diagram of the storage means that forms part of the spectrum emphasizing means of the noise suppression device according to an embodiment of the present invention. FIG. 4 is a flowchart of the noise suppression method according to an embodiment of the second invention. FIG. 5 is a flowchart of the speech level ratio estimation method according to an embodiment of the third invention. FIG. 6 is a flowchart of the noise characteristic estimation in the voice detection method according to an embodiment of the fourth invention. FIG. 7 is a flowchart of the voice/non-voice decision in the voice detection method according to an embodiment of the fourth invention. FIG. 8 is a configuration diagram of a conventional noise suppression device.

100a, 100b, 100c, 100d: band emphasizing means; 110: band dividing means; 101a, 101b, 101c, 101d: level detecting means; 102a, 102b, 102c, 102d: noise level estimating means; 103a, 103b, 103c, 103d: subtracting means; 104a, 104b, 104c, 104d: spectrum emphasizing means; 105a, 105b, 105c, 105d: speech level ratio estimating means; 106a, 106b, 106c, 106d: multiplying means; 120: adding means.

Claims

(57) [Claims]

1. A noise suppression device comprising a band dividing means (110) that divides an input speech signal into a plurality of frequency band signals, a plurality of band emphasizing means (100) to which the divided frequency band signals are supplied, and an adder (120) that adds and outputs the output signals of the plurality of band emphasizing means (100), wherein each band emphasizing means (100) comprises a level detecting means (101), a noise level estimating means (102), a subtractor (103), a speech level ratio estimating means (105), a spectrum emphasizing means (104), and a multiplier (106); the level detecting means (101) calculates the level of the frequency band signal; the noise level estimating means (102) calculates the noise level by detecting non-voice portions; the subtractor (103) calculates the speech level by subtracting the noise level from the signal level; the speech level ratio estimating means (105) calculates a speech level ratio, which is the ratio of the speech level to the level of the frequency band signal; the spectrum emphasizing means (104) calculates a spectrum emphasis coefficient by convolving stored emphasis characteristic coefficients with the speech levels of the band emphasizing means (100); and the multiplier (106) multiplies together the frequency band signal, the speech level ratio, and the spectrum emphasis coefficient and outputs the product.

2. The noise suppression device according to claim 1, wherein the speech level ratio estimating means (105) calculates the speech level ratio as SY = 1/2 + (Y − N)/(2Y), where SY is the speech level ratio, Y is the level of the frequency band signal, and N is the noise level.