JP3205463B2 - Noise reduction method

Info

Publication number: JP3205463B2
Application number: JP10537294A
Authority: JP
Inventors: 博人須田; 俊雄三木; 義則三木
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 1993-05-19
Filing date: 1994-05-19
Publication date: 2001-09-04
Anticipated expiration: 2016-09-04
Also published as: JPH0738454A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a noise reduction method applied to fixed or mobile voice communication systems that may be used in environments with high background noise. The method determines filter coefficients from speech statistics and noise statistics, and filters the input signal with those coefficients to reduce the noise superimposed on the original speech.

[0002]

[Prior Art] As a conventional noise reduction method, one that presupposes that the statistical characteristics of the noise are known in advance has been studied by J. D. Gibson et al. ("Filtering of Colored Noise for Speech Enhancement and Coding", IEEE Trans. SP, Vol. 39, No. 8, 1991). This method consists of Kalman filtering performed frame by frame. Kalman filtering requires statistics of both the speech and the noise. In the method of Gibson et al., the speech statistics of each frame are treated as unknown and estimated frame by frame, while the noise statistics of each frame are treated as known. Consequently, the accuracy of the noise statistics that are assumed to be known determines the effectiveness of the noise reduction.

[0003]

[Problems to Be Solved by the Invention] However, estimating accurate noise statistics in advance is not easy. The noise signal alone must be recorded beforehand, free from the influence of speech, and the noise statistics must be estimated from that recording. This adds the labor of a noise recording step before each call and leads to higher system cost. Moreover, in applications such as mobile radio, the noise statistics may vary over time. In that case the accuracy of the statistics estimated in advance degrades with time, and as a result the noise cannot be reduced sufficiently.

[0004] To follow temporal variations, one conceivable approach is to provide a microphone for noise recording alongside the microphone for speech recording, that is, a noise recording system separate from the speech recording system. This improves the accuracy of the estimated noise statistics but increases system cost. In addition, conventional noise reduction methods do not take the state of the input signal into account, that is, whether the input is noise only, speech only, speech with superimposed noise, speech onset, or speech offset. As a result, the output can sound unnatural.

[0005] An object of the present invention is to provide a noise reduction method capable of reducing noise sufficiently. Another object of the invention is to provide a noise reduction method that reduces noise sufficiently by improving the accuracy of the estimated noise statistics without increasing cost. Yet another object of the invention is to provide a noise reduction method that enhances the noise reduction effect by processing adaptively according to the state of the speech and the state of the noise.

[0006]

[Means for Solving the Problems] According to the invention of claim 1, silent sections in the input signal are detected, and the noise statistics are estimated from the input signal in the detected silent sections. In a two-way conversation such as a telephone call, each speaker spends a certain amount of time not speaking (the time during which the other party is talking). During that time only noise is input, so the noise statistics can be estimated with high accuracy.

[0007] Furthermore, the states of speech and noise in the input signal are classified into at least four predetermined states, and these states are stored as a frame state table. The input signal is processed in fixed time units (frames), the state of each frame is determined by referring to the frame state table, and the speech statistics and noise statistics are adaptively modified according to the determined frame state.

[0008] In the invention of claim 2, the frame state table of the invention of claim 1 is prepared, noise is recorded with a noise microphone, and the noise statistics are estimated from the input signal recorded by that noise microphone. The state of the speech and noise is determined for each frame from the input signals of both the speech microphone and the noise microphone, and the speech statistics and noise statistics are adaptively modified according to that state by referring to the frame state table.

[0009] According to the invention of claim 3, the frame state table of the invention of claim 1 is prepared and the noise statistics are estimated before the call. Using these noise statistics and the speech statistics estimated from the input signal, the input signal is processed in fixed time units (frames), the state of each frame is determined by referring to the frame state table, and the speech statistics and noise statistics are adaptively modified according to the determined frame state.

[0010] According to the invention of claim 4, in any of the above inventions, a signal-to-noise ratio is obtained from the estimated speech and noise statistics, and the adaptive modification is controlled according to that signal-to-noise ratio or according to the noise power.

[0011] According to the invention of claim 7, in any of the inventions of claims 1 to 4, the filter coefficients are updated at every sample for the first n samples of each frame (n is an integer of 2 or more and smaller than the number of samples N in one frame), and the filter coefficients are not updated for the remaining (N − n) samples.

[0012]

[Embodiments] FIG. 1 shows the processing procedure of an embodiment of the invention of claim 1, and FIG. 2 shows a noise reduction apparatus to which the embodiment is applied. The input signal from input terminal 11 is divided into fixed time units (frames), for example every several tens of milliseconds (S₁), and the state determination unit 12 determines the state of each frame (S₂). Specifically, the relationship between speech and noise in the input signal is classified, for example as shown in FIG. 3A, into state 10, in which there is no background noise and only speech is present; state 0, complete silence, in which neither background noise nor speech is present; state 20, in which only background noise is present; state 21, in which background noise is present and the speech is at its onset or tail; and state 22, in which background noise is present and the speech is steady. Codes are assigned to these states and stored in advance in the frame state table 13.

[0013] The input signal makes transitions among these states as shown in FIG. 3B. The state is determined from the speech power in the input signal and its variation, the speech spectral characteristics and their variation, and the like. When the state determination for a frame yields state 20, in which only noise is present (S₃), the noise statistic estimation unit 14 estimates the noise statistics, for example the AR model and power of the noise, from the input signal at that time (S₄). In states other than state 20, the speech statistic estimation unit 15 estimates the speech statistics, for example the AR model and power of the speech (S₅).
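
The routing of steps S₃ to S₅ can be sketched as follows. This is a minimal illustration, not the apparatus of FIG. 2: it tracks only the power (not the AR model), it takes the frame states as given rather than deriving them, and the helper names `split_frames`, `mean_power` and `track_statistics` are hypothetical.

```python
def split_frames(signal, frame_len):
    """Cut the signal into consecutive, non-overlapping frames (step S1)."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def mean_power(frame):
    """Average power per sample of one frame."""
    return sum(s * s for s in frame) / len(frame)


def track_statistics(frames, states):
    """Update the noise power in state 20 and the speech power otherwise.

    Returns one (speech_power, noise_power) tuple per frame; an entry is
    None until the first estimate of that quantity has been made.
    """
    speech_power = None
    noise_power = None
    history = []
    for frame, state in zip(frames, states):
        power = mean_power(frame)
        if state == 20:
            noise_power = power   # noise-only frame: refresh noise estimate (S4)
        else:
            speech_power = power  # any other state: refresh speech estimate (S5)
        history.append((speech_power, noise_power))
    return history
```

Because the noise estimate is refreshed whenever a state 20 frame occurs, it follows slow changes in the background noise without a separate recording step.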

[0014] The estimated noise statistics and speech statistics are adaptively modified by the modification units 16 and 17, respectively, according to the determined state of the frame (S₆). This modification is performed, for example, as shown in FIG. 4. In state 20, to apply strong noise reduction, the power value of the speech signal is set to a value smaller than the power of the input signal (the noise power). The coefficient a by which the input signal power is multiplied takes a value of 1 or less, for example 0.1, which makes the speech power small relative to the noise power so that the Kalman filter performs strong noise reduction. In this case the AR model of the speech is smoothed, and the AR model and power of the noise estimated at that time are used as they are.

[0015] In state 21, the noise power obtained in the immediately preceding state 20 is multiplied by a coefficient of 1 or less, for example 0.7; that is, the noise power is made smaller to weaken the noise reduction. The AR model and power of the speech obtained in that frame are used as they are, and the AR model of the noise obtained in the immediately preceding state 20 is used as it is. In state 22, the AR model and power of the speech estimated in that frame are used as they are, and the AR model and power of the noise estimated in the immediately preceding state 20 are used as they are; that is, noise reduction of normal strength is performed. In other words, whereas the estimated speech and noise statistics were conventionally used unchanged, this embodiment modifies the estimated statistics adaptively in the states other than state 22: in state 20 the estimated speech power is made smaller to strengthen the noise reduction, and in state 21 the estimated noise power is made smaller to weaken it.
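
The power modifications described for states 20, 21 and 22 can be sketched as below. The function name and argument layout are hypothetical, the example values 0.1 and 0.7 are the ones given in the text, and the handling of the AR models (smoothing in state 20) is left out of this sketch.

```python
def modify_statistics(state, input_power, speech_power, noise_power,
                      a=0.1, noise_scale=0.7):
    """Return (speech_power, noise_power) after the state-dependent step S6.

    state 20: speech power is replaced by a * input power (strong reduction)
    state 21: noise power from the preceding state 20 is scaled down (weak)
    others  : both estimates are used unchanged (normal strength)
    """
    if state == 20:
        return a * input_power, noise_power
    if state == 21:
        return speech_power, noise_scale * noise_power
    return speech_power, noise_power
```

Lowering the speech power relative to the noise power makes the filter trust the observation less, which is why state 20 gives the strongest suppression.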

[0016] The filter coefficients are calculated from the speech and noise statistics adaptively modified in this way (S₇) and set in the Kalman filter 18, and the input signal from terminal 11 is filtered by the filter 18 for each corresponding frame to suppress the noise component in the input signal (S₈). The signal filtered by the filter 18 is speech-coded as necessary and transmitted or stored.

[0017] Because the noise statistics are estimated in silent sections, that is, in state 20, highly accurate estimation is achieved, there is no need to estimate the noise statistics in advance, and a single recording system suffices. When the noise conditions change, the noise statistics are estimated accordingly, and the noise is reduced that much more effectively. In addition, the degree of noise reduction is adaptively strengthened or weakened according to the state, so noise reduction is performed effectively.

[0018] The number of states and their definitions are not limited to the above example. For instance, in the embodiment above the speech onset and tail were together defined as state 21, but the onset may be defined as state 21 and the tail as state 23. In state 23, as shown in FIG. 5, the coefficient by which the speech signal power is multiplied is decreased gradually, for example 0.9, 0.89, 0.88, 0.87, ..., 0.7. With this operation, when the signal moves from voiced (state 21 or 22) to silent (state 20), the multiplier a is controlled according to the state transition even though the same state 20 is determined. In this example the noise is reduced gradually, and a natural sound is obtained.
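
The gradually decreasing coefficient for the tail can be sketched as a simple schedule. The start value 0.9, step 0.01 and floor 0.7 follow the example sequence in the text; the function name and the choice of a per-frame linear step are assumptions of this sketch.

```python
def tail_coefficient(frames_since_tail_start, start=0.9, step=0.01, floor=0.7):
    """Coefficient for the k-th frame after the speech tail begins.

    Decreases linearly from `start` by `step` per frame and never
    goes below `floor`.
    """
    return max(floor, start - step * frames_since_tail_start)
```

For example, frames 0, 1, 2 give approximately 0.9, 0.89, 0.88, and every frame from the 20th onward gives 0.7.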

[0019] FIG. 7 shows an embodiment with more complex state transitions. Compared with the embodiment shown in FIG. 3, there are more states and the state transitions are more varied. In this embodiment the configuration shown in FIG. 7 is adopted to prevent noise reduction from being applied erroneously when no noise is superimposed. The conditions for the state transitions in FIG. 7 are shown in FIGS. 8 and 9, and the variables used in FIGS. 8 and 9 are described below.

[0020] P: average power per sample of the input signal.
Pratio: with P the average power of the current frame and P' the average power of the immediately preceding frame, Pratio = P / P'.
st-cnt: auxiliary variable for transition control. Its initial value is 8. It is decremented by 1 under transition conditions (a1, b1, d1, e1, f1, g1, j1). At number (e1) it is initialized to 8.

[0021] Ps1, Pn, Psp: constants. Thresholds on the average power of the input speech. Ps1 = 3.0, Pn = 640.0, Psp = 2500.0.
spg: variable, dimensionless. Its initial value is 1.0. When transition condition (i2) and 'P ≤ Psp or Pratio ≤ 0.3 or Pratio ≥ 1.0/0.3' hold 10 times in succession, psg = 3.0 × Pmin / Psp. It is set to 1.0 under transition condition (c1).

[0022] Pmin: Pmin = MIN(Pmin, P) (when number (i2) and 'P ≤ Psp or Pratio ≤ 0.3 or Pratio ≥ 1.0/0.3' hold); Pmin = maximum positive value (otherwise).
Pow10: variable with the dimension of power. Its initial value is 0.0. At number (e2), Pow10 = 0.9 × Pow10 + 0.1 × P is computed.

[0023] Pow10cnt: variable. Its initial value is 0. At number (e2), Pow10cnt = Pow10cnt + 1 (when Pow10 > 2.0 × Pn) or Pow10cnt = 0 (when Pow10 ≤ 2.0 × Pn).
Pr: Pr = P / Pow20.
Pow20: Pow20 = P (under condition d1); Pow20 = 0.9 × Pow20 + 0.1 × P (under condition l1); Pow20 = Min(P) (when st22 has continued 50 times in succession).
K: K = (1.0 − ref1 × ref1) × (1.0 − ref2 × ref2), where ref1 and ref2 are the k parameters obtained when the input is passed through a second-order linear prediction filter.
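
As an illustration of how ref1, ref2 and K might be obtained, the sketch below runs a second-order Levinson-Durbin recursion on the frame autocorrelation. The text does not say how the linear prediction is computed, so the autocorrelation method, the sign convention of the k parameters, and the function name `lpc2_stats` are all assumptions. The sketch also returns the prediction residual power per sample.

```python
def lpc2_stats(frame):
    """Second-order linear prediction of one frame.

    Returns (ref1, ref2, K, residual_power) where
    K = (1 - ref1^2) * (1 - ref2^2) and residual_power is the
    prediction error energy divided by the frame length.
    """
    n = len(frame)
    # Autocorrelation at lags 0, 1, 2.
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag)) for lag in range(3)]
    if r[0] == 0.0:
        return 0.0, 0.0, 1.0, 0.0          # all-zero frame: nothing to predict
    ref1 = -r[1] / r[0]
    e1 = r[0] * (1.0 - ref1 * ref1)        # error energy after order 1
    if e1 <= 0.0:
        return ref1, 0.0, 0.0, 0.0         # perfectly predictable at order 1
    ref2 = -(r[2] + ref1 * r[1]) / e1
    k = (1.0 - ref1 * ref1) * (1.0 - ref2 * ref2)
    return ref1, ref2, k, r[0] * k / n
```

K is the fraction of the frame energy left after prediction, so it is close to 1 for noise-like frames and close to 0 for strongly predictable ones.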

[0024] Var: power of the residual when the input is passed through a second-order linear prediction filter.
sstt: takes one of the values 0, 2, 4, 6, 8, 10. Under transition conditions (k1, k2, l1, m1, o1, o2, p1, q1, s1, t1, t2), its value is obtained from the following rules, where Var' is the value of Var in the immediately preceding frame and Ratio is defined as Ratio = Var' / Var.

[0025] When sstt = 0 or sstt = 2:
0.6 < Ratio < 1.4: sstt = sstt + 2
0.5 < Ratio ≤ 0.6 or 1.4 ≤ Ratio < 1.5: sstt = 2
Ratio ≤ 0.5 or 1.5 ≤ Ratio: sstt = 2

When sstt = 4:
0.7 < Ratio < 1.3: sstt = 6
0.6 < Ratio ≤ 0.7 or 1.3 ≤ Ratio < 1.4: sstt = 4
0.5 < Ratio ≤ 0.6 or 1.4 ≤ Ratio < 1.5: sstt = 2
Ratio ≤ 0.5 or 1.5 ≤ Ratio: sstt = 0

When sstt = 6:
0.8 < Ratio < 1.2: sstt = 8
0.7 < Ratio ≤ 0.8 or 1.2 ≤ Ratio < 1.3: sstt = 6
0.6 < Ratio ≤ 0.7 or 1.3 ≤ Ratio < 1.4: sstt = 4
0.5 < Ratio ≤ 0.6 or 1.4 ≤ Ratio < 1.5: sstt = 2
Ratio ≤ 0.5 or 1.5 ≤ Ratio: sstt = 0

When sstt = 8 or sstt = 10:
0.9 < Ratio < 1.1: sstt = 10
0.8 < Ratio ≤ 0.9 or 1.1 ≤ Ratio < 1.2: sstt = 8
0.7 < Ratio ≤ 0.8 or 1.2 ≤ Ratio < 1.3: sstt = 6
0.6 < Ratio ≤ 0.7 or 1.3 ≤ Ratio < 1.4: sstt = 4
0.5 < Ratio ≤ 0.6 or 1.4 ≤ Ratio < 1.5: sstt = 2
Ratio ≤ 0.5 or 1.5 ≤ Ratio: sstt = 0

States 10 and 11 in FIG. 7 are states in which the background noise is estimated to be sufficiently small. Making the conditions for transition from these states to other states (20 and above) sufficiently strict makes it possible to avoid running the noise reduction filter erroneously when there is no background noise.
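
The sstt update rules can be condensed into a small function. This is a sketch under two assumptions: the band printed as "0.7 < Ratio ≤ 0.7" is read as 0.7 < Ratio ≤ 0.8 so that the bands tile without a gap, and the rules are rewritten as "largest level whose band contains Ratio, but rising by at most 2 per frame". The names `band_level` and `update_sstt` are hypothetical.

```python
# (level, lower bound, upper bound): Ratio strictly inside (lo, hi) reaches that level.
_BANDS = [(10, 0.9, 1.1), (8, 0.8, 1.2), (6, 0.7, 1.3), (4, 0.6, 1.4), (2, 0.5, 1.5)]


def band_level(ratio):
    """Highest stationarity level whose band contains Ratio = Var' / Var."""
    for level, lo, hi in _BANDS:
        if lo < ratio < hi:
            return level
    return 0


def update_sstt(sstt, ratio):
    """Apply one frame of the sstt rules listed for sstt = 0..10."""
    band = band_level(ratio)
    if sstt in (0, 2):
        # Low levels: climb by 2 when 0.6 < Ratio < 1.4, otherwise go to 2.
        return sstt + 2 if band >= 4 else 2
    # Higher levels: may climb by at most 2, but fall straight to the band level.
    return min(band, sstt + 2)
```

With Ratio held near 1.0, sstt climbs 0, 2, 4, 6, 8, 10 over successive frames, while a single frame with a large change in residual power drops it immediately.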

[0026] Normally the coefficients of the Kalman filter 18 are calculated at every sample. The processing required to calculate these coefficients is greater than that of the filtering operation itself. On the other hand, the AR model coefficients and power values used to calculate the filter coefficients are constant within each frame (for example, the number of samples N in one frame is 160), so the filter coefficients converge to constant values toward the latter part of the frame. Therefore, instead of updating the filter coefficients at every sample, the coefficients are updated for the first n samples of the frame, for example 3 samples, and filtering is performed with the coefficients obtained; for the remaining (N − n) samples the coefficients are not updated, and filtering is performed with the coefficients obtained at the n-th sample. In this way the amount of computation and the processing time can be reduced.
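
The idea of updating the coefficients only for the first n samples can be illustrated with a deliberately simplified Kalman filter. The sketch assumes a scalar AR(1) speech model with white observation noise, and it treats the Kalman gain as the "filter coefficient"; neither simplification comes from the text, and the names `kalman_frame`, `ar_coeff`, `drive_var` and `noise_var` are hypothetical.

```python
def kalman_frame(observations, ar_coeff, drive_var, noise_var, n_update,
                 x0=0.0, p0=1.0):
    """Filter one frame, refreshing the gain only for the first n_update samples.

    Model (assumed): x[k] = ar_coeff * x[k-1] + w[k],  y[k] = x[k] + v[k],
    with Var(w) = drive_var and Var(v) = noise_var (> 0).
    Returns (estimates, gains), one entry per input sample.
    """
    x, p, gain = x0, p0, 0.0
    estimates, gains = [], []
    for i, y in enumerate(observations):
        x_pred = ar_coeff * x
        if i < n_update:
            # Covariance and gain update: the expensive part, done n times only.
            p_pred = ar_coeff * ar_coeff * p + drive_var
            gain = p_pred / (p_pred + noise_var)
            p = (1.0 - gain) * p_pred
        # Measurement update with the current (possibly frozen) gain.
        x = x_pred + gain * (y - x_pred)
        estimates.append(x)
        gains.append(gain)
    return estimates, gains
```

Because the model parameters are constant within a frame, the gain settles quickly, and freezing it after a few samples changes the output only slightly while removing most of the per-sample coefficient computation.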

[0027] In the above, the coefficient a by which the input signal power is multiplied is controlled adaptively according to the levels of the estimated noise power and speech power. That is, when the noise power is small or the input SNR (signal-to-noise ratio) is high, the coefficient is set so that the Kalman filter 18 operates strongly, as described above; conversely, when the noise power is large or the input SNR is low, it is set so that the Kalman filter 18 operates weakly. Applying strong noise reduction when the input SNR is low increases the unnaturalness of the speech signal, and the coefficient a is controlled adaptively to avoid this. As shown in FIG. 10A, for example, the coefficient a takes a large value where the SNR is small and a small value where the SNR is large, and varies linearly between the two.
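
The shape described for FIG. 10A can be sketched as a clamped linear map from SNR to the coefficient a. The breakpoints (0 dB and 20 dB) and the end values (0.7 and 0.1) are placeholders chosen for illustration; the figure's actual values are not given in the text.

```python
def coefficient_a(snr_db, snr_low=0.0, snr_high=20.0, a_at_low=0.7, a_at_high=0.1):
    """Coefficient a as a function of input SNR in dB.

    Large (weak reduction) at or below snr_low, small (strong reduction)
    at or above snr_high, and linear in between.
    """
    if snr_db <= snr_low:
        return a_at_low
    if snr_db >= snr_high:
        return a_at_high
    t = (snr_db - snr_low) / (snr_high - snr_low)
    return a_at_low + t * (a_at_high - a_at_low)
```

With these placeholder values, an input at 10 dB SNR gives a of about 0.4.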

[0028] Further, as shown in FIG. 10B, the input signal amplitude may be set to zero when it is at or below a predetermined value, namely 1/α of the maximum amplitude (for example, α = 1000 when the maximum amplitude is taken as 1), before the processing shown in FIG. 2 is performed; that is, a low-level input signal may be regarded as noise and suppressed. As shown in FIG. 10C, when the power of the input signal is at or below a predetermined value β (for example, β / maximum input power = 30 dB), the input signal may be attenuated according to its power. In other words, the processing shown in FIG. 2 may be preceded by processing like that of a so-called expander. Temporal control (using time constants) may be applied to the attenuation by this expander. The expander may be given time constants in the same way as the expander in the syllabic compander described in "Application of the Syllabic Compander to the 800 MHz Band Automobile Telephone System and Its Effects", Kenkyu Jitsuyoka Hokoku, Vol. 30, No. 3 (1981), pp. 187-195; for example, the attack time may be about 3.0 ± 0.1 ms and the recovery time about 1.35 ± 0.6 ms.
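
The two pre-processing options can be sketched as follows. `center_clip` zeroes samples at or below 1/α of the maximum amplitude, as described for FIG. 10B. `expander_gain` is only one possible attenuation law (a square-root gain below the threshold β, with no time constants); the actual curve of FIG. 10C is not given in the text, so that law and both function names are assumptions.

```python
def center_clip(samples, max_amplitude=1.0, alpha=1000.0):
    """Set samples whose magnitude is at or below max_amplitude / alpha to zero."""
    threshold = max_amplitude / alpha
    return [0.0 if abs(s) <= threshold else s for s in samples]


def expander_gain(power, beta):
    """Amplitude gain for a frame of the given power (assumed law).

    Unity at or above the threshold beta; below it the gain falls with
    the square root of power / beta, so weaker input is attenuated more.
    """
    if power >= beta:
        return 1.0
    return (power / beta) ** 0.5
```

Either step would run on the input before the noise reduction processing; attack and recovery time constants would be added by smoothing the gain over time.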

[0029] The various signal processing operations shown in FIG. 10 need not use the level (amplitude or power) of the input signal itself; the power of the estimated noise statistics or of the estimated speech statistics may be used instead. Moreover, these operations need not be performed by the signal processing means 31 before the signal is supplied to the noise reduction means 32 shown in FIG. 2, as in FIG. 11A; the various operations shown in FIG. 10 may instead be applied to the output of the noise reduction means 32, as shown in FIG. 11B.

[0030] The signal processed by the noise reduction means 32 shown in FIG. 2 may be encoded by the speech encoding means 33 and output, as shown in FIG. 11C. As the encoding means 33, VSELP, PSI-CELP (IEICE Technical Report RC593-78, November 1993, "Pitch Synchronous Innovation CELP (PSI-CELP)") or the like is used.

[0031] As shown in FIG. 11D, the processing means 31 shown in FIG. 10 may be inserted between the noise reduction means 32 and the encoding means 33. As shown in FIG. 11E, the noise reduction means 32 may be provided on the output side of the speech decoding means 34 corresponding to the speech encoding means 33. Although not shown, in FIG. 11E the signal processing means 31 may be provided before or after the noise reduction means 32.

[0032] In FIG. 1, the modification of the statistics in step S₆ may be omitted; this is an embodiment of the invention of claim 1. Also in FIG. 1, the check for a silent section in step S₃ and the estimation of the noise statistics in step S₄ may be omitted; instead, the noise statistics are estimated before the call, and the statistics are adaptively modified according to the state of each frame using these estimates. This is an embodiment of the invention of claim 4 and is particularly effective when the noise conditions do not vary greatly, as in fixed communication. Furthermore, a noise microphone may be provided in addition to the speech microphone, the speech statistics estimated from the input signal of the speech microphone, the noise statistics estimated from the input signal of the noise microphone, the various states and state transitions detected from both input signals, and the statistics adaptively modified accordingly. This is the invention of claim 3.

[0033] In FIGS. 10 and 11, the noise reduction means 32 may be based not only on the invention of claim 1 but also on the invention of claim 2 or 3. The filtering is not limited to Kalman filtering.

[0034]

[Effects of the Invention] As described above, according to the invention of claim 1, the labor of estimating the noise statistics before every call is eliminated, and a single recording system suffices, so the system cost is low. Moreover, when the noise conditions vary over time, as in mobile communication, the noise statistics are estimated so as to follow the variation, and the noise is reduced that much more effectively.

[0035] According to the invention of claim 2, noise reduction is strengthened or weakened according to the short-time state of the input signal, which improves the perceived sound quality. According to the invention of claim 7, the amount of computation is reduced and the processing becomes faster. FIG. 6 shows the relationship between the SN ratio of the input signal (before noise reduction) and the SN ratio of the output signal (after noise reduction) when the invention of claim 1 is applied. The figure shows that the SN ratio of the output signal is always higher than that of the input signal, and that the effect is particularly large when no input signal is present, that is, with background noise only.

[0036] As described above, the noise reduction effect of the present invention is large. Therefore, when a speech signal is to be encoded by any of various coding methods, performing the noise reduction processing of this invention before encoding yields a high-quality coded speech signal.

[Brief description of the drawings]

【図１】請求項１の発明の実施例を示す流れ図。FIG. 1 is a flowchart showing an embodiment of the invention of claim 1 ;

【図２】請求項１の発明を適用した雑音軽減装置を示す
ブロック図。FIG. 2 is a block diagram showing a noise reduction apparatus to which the invention of claim 1 is applied.

【図３】Ａは入力信号の状態の定義例を示す図、Ｂは入
力信号の状態遷移を示す図である。3A is a diagram illustrating an example of a definition of a state of an input signal, and FIG. 3B is a diagram illustrating a state transition of an input signal;

【図４】請求項１の発明における各状態における推定統
計量の修正例を示す図。FIG. 4 is a view showing an example of correction of an estimated statistic in each state according to the invention of claim 1 ;

【図５】その他の例を示す図。FIG. 5 is a diagram showing another example.

【図６】請求項１の発明を適用した場合の入力信号のＳ
／Ｎと、出力信号のＳ／Ｎとの関係例を示す図。FIG. 6 is a diagram showing an example of the relationship between the S/N of the input signal and the S/N of the output signal when the invention of claim 1 is applied.

【図７】Ａは入力信号の状態の定義の他の例を示す図、
Ｂは入力信号の状態遷移の他の例を示す図である。FIG. 7A is a diagram showing another example of the definition of the states of the input signal, and FIG. 7B is a diagram showing another example of the state transitions of the input signal.

【図８】図７における状態遷移の条件を示す図。FIG. 8 is a diagram showing conditions for state transition in FIG. 7;

【図９】図８の続きを示す図。FIG. 9 is a view showing a continuation of FIG. 8;

【図１０】雑音軽減処理の前後に用いられる各種信号処
理の特性例を示す図。FIG. 10 is a view showing an example of characteristics of various signal processings used before and after the noise reduction processing.

【図１１】Ａ及びＢは雑音軽減手段と信号処理手段との
組合せを示すブロック図、雑音軽減手段と音声符号化、
復号化手段との組合せを示すブロック図である。FIGS. 11A and 11B are a block diagram showing a combination of noise reduction means and signal processing means, and a block diagram showing a combination of noise reduction means and speech encoding and decoding means.

Continuation of front page

(56) References
JP-A-62-294315 (JP, A)
JP-A-5-49054 (JP, A)
Robert J. McAulay and Marilyn L. Malpass, "Speech Enhancement Using a Soft-Decision Noise Suppression Filter", IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-28, No. 2 (April 1980), pp. 137-145
Sharon Gannot, David Burshtein and Weinstein, "Iterative and Sequential Kalman Filter-Based Speech Enhancement Algorithms", IEEE Transactions on Speech and Audio Processing, Vol. 6, No. 4 (July 1998), pp. 373-385

(58) Fields surveyed (Int. Cl.⁷, DB name)
H04B 1/10 - 1/14
H04B 15/00
H03H 17/00 - 21/00

Claims

(57) [Claims]

1. A noise reduction method in which a speech statistic is estimated from an input signal, a filter coefficient is calculated using the speech statistic and a statistic of the noise in the input signal, and the input signal is filtered using the filter coefficient, characterized in that: the states of speech and noise in the input signal are divided into at least a state in which only speech is present, a state in which only background noise is present, a state in which background noise is present and the speech is at the beginning or end of an utterance, and a state in which background noise is present and the speech is in a steady state; each state is stored in a frame state table; the input signal is processed at fixed time intervals (frames), and the state of each frame is determined by referring to the frame state table; when the determined frame state is the state in which background noise is present and the speech is in a steady state, the speech power estimated in that frame is used as is for the speech, and the noise power estimated when only background noise was present is used as is for the noise; when only background noise is present, the speech power is corrected to a value smaller than the power estimated in that frame, and the power estimated in that frame is used as is for the noise; when background noise is present and the speech is at the beginning or end of an utterance, the power estimated in that frame is used as is for the speech, and the noise power estimated when only background noise was present immediately before is corrected to a smaller value and used for the noise, so as to weaken the noise reduction processing; whereby the speech statistic and the noise statistic are adaptively corrected.
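
The per-state correction recited in claim 1 can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the patented implementation: the names `FrameState` and `correct_powers` and the scale factors `speech_scale` and `noise_scale` (both 0.5 here) are hypothetical, and the claim does not specify how much smaller the corrected powers should be or what is done in the speech-only state.

```python
from enum import Enum


class FrameState(Enum):
    """Frame states named in claim 1 (identifier names are hypothetical)."""
    SPEECH_ONLY = "speech_only"
    NOISE_ONLY = "noise_only"
    ONSET_OR_OFFSET = "onset_or_offset"   # background noise + beginning/end of utterance
    STEADY_SPEECH = "steady_speech"       # background noise + steady-state speech


def correct_powers(state, speech_power, noise_power, last_noise_only_power,
                   speech_scale=0.5, noise_scale=0.5):
    """Return the (speech_power, noise_power) pair to use for filter design.

    speech_power, noise_power: powers estimated in the current frame.
    last_noise_only_power: noise power estimated in the most recent
        noise-only frame.
    speech_scale, noise_scale: assumed factors (< 1) for "corrected to a
        smaller value"; the claim gives no numeric values.
    """
    if state is FrameState.STEADY_SPEECH:
        # Speech as estimated; noise taken from the noise-only estimate, as is.
        return speech_power, last_noise_only_power
    if state is FrameState.NOISE_ONLY:
        # Speech power reduced; noise as estimated in this frame.
        return speech_scale * speech_power, noise_power
    if state is FrameState.ONSET_OR_OFFSET:
        # Speech as estimated; noise-only estimate reduced to weaken the filtering.
        return speech_power, noise_scale * last_noise_only_power
    # Speech only: no correction is recited in the claim, so leave both unchanged
    # (an assumption of this sketch).
    return speech_power, noise_power
```

For example, `correct_powers(FrameState.ONSET_OR_OFFSET, 4.0, 3.0, 1.0)` keeps the speech power at 4.0 and halves the stored noise-only power to 0.5, which makes the subsequent filter less aggressive at utterance boundaries.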

2. A noise reduction method in which a speech statistic is estimated from an input signal picked up by a microphone for speech, a noise statistic is estimated from an input signal picked up by a microphone for noise, a filter coefficient is calculated using these statistics, and the input signal is filtered using the filter coefficient, characterized in that: the states of speech and noise in the input signal are divided into at least a state in which only speech is present, a state in which only background noise is present, a state in which background noise is present and the speech is at the beginning or end of an utterance, and a state in which background noise is present and the speech is in a steady state; each state is stored in a frame state table; the input signal is processed at fixed time intervals (frames), and the state of each frame is determined by referring to the frame state table; when the determined frame state is the state in which background noise is present and the speech is in a steady state, the speech power estimated in that frame is used as is for the speech, and the noise power estimated when only background noise was present is used as is for the noise; when only background noise is present, the speech power is corrected to a value smaller than the power estimated in that frame, and the power estimated in that frame is used as is for the noise; when background noise is present and the speech is at the beginning or end of an utterance, the power estimated in that frame is used as is for the speech, and the noise power estimated when only background noise was present immediately before is corrected to a smaller value and used for the noise, so as to weaken the noise reduction processing; whereby the speech statistic and the noise statistic are adaptively corrected.

3. A noise reduction method in which a speech statistic is estimated from an input signal, a filter coefficient is calculated using the speech statistic and a noise statistic, and the input signal is filtered using the filter coefficient, characterized in that: the noise statistic is estimated from the input signal in a silent section before a call; the states of speech and noise in the input signal are divided into predetermined states, namely at least a state in which only speech is present, a state in which only background noise is present, a state in which background noise is present and the speech is at the beginning or end of an utterance, and a state in which background noise is present and the speech is in a steady state; each state is stored in a frame state table; the input signal is processed at fixed time intervals (frames), and the state of each frame is determined by referring to the frame state table; when the determined frame state is the state in which background noise is present and the speech is in a steady state, the speech power estimated in that frame is used as is for the speech, and the noise power estimated when only background noise was present is used as is for the noise; when only background noise is present, the speech power is corrected to a value smaller than the power estimated in that frame, and the power estimated in that frame is used as is for the noise; when background noise is present and the speech is at the beginning or end of an utterance, the power estimated in that frame is used as is for the speech, and the noise power estimated when only background noise was present immediately before is corrected to a smaller value and used for the noise, so as to weaken the noise reduction processing; whereby the speech statistic and the noise statistic are adaptively corrected.

4. The noise reduction method according to any one of the above claims, wherein the correction amount of the adaptive correction is adaptively controlled so that the filtering process is emphasized when the signal-to-noise ratio is high or the noise power is low.

5. The noise reduction method according to claim 1, wherein, when the level of the input signal or the output signal is at or below a predetermined value, the input signal or the output signal is attenuated according to that level.

6. The noise reduction method according to claim 5, wherein temporal control is performed on the attenuation.
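
As a rough illustration of claims 5 and 6, the following Python sketch attenuates a signal whose level is at or below a threshold and smooths the resulting gain across frames. Everything specific here is an assumption made for the sketch: the linear gain curve, the `threshold` and `floor` values, and first-order smoothing as the form of "temporal control" are not taken from the claims, which do not state how the attenuation or its timing is computed.

```python
def attenuation_gain(level, threshold=0.01, floor=0.1):
    """Gain for one frame given its (non-negative) signal level.

    Above the threshold the signal passes unchanged (gain 1.0). At or below
    it, the gain falls linearly with the level down to `floor` at level 0.
    The linear shape, threshold and floor are assumptions of this sketch.
    """
    if level > threshold:
        return 1.0
    return floor + (1.0 - floor) * (level / threshold)


def smooth_gains(target_gains, alpha=0.5, initial=1.0):
    """Apply first-order smoothing to per-frame gains.

    Each output gain is alpha * previous_gain + (1 - alpha) * target, so the
    attenuation is reached gradually rather than switched abruptly. This is
    one possible reading of "temporal control", not the patent's definition.
    """
    gains = []
    gain = initial
    for target in target_gains:
        gain = alpha * gain + (1.0 - alpha) * target
        gains.append(gain)
    return gains
```

A caller would compute `attenuation_gain` once per frame from the measured level, pass the sequence through `smooth_gains`, and multiply each frame by its smoothed gain.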

7. The noise reduction method according to claim 1, wherein, in the first n samples of each frame (n is an integer of 2 or more and smaller than the number N of samples in one frame), the filter coefficient is updated for each sample, and the filter coefficient is not updated in the remaining (N−n) samples.
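
The update schedule of claim 7 can be sketched in Python as follows. The function `filter_frame` and its callback parameters `update_coeffs` and `apply_filter` are hypothetical stand-ins introduced for this illustration; the actual coefficient recursion is not given in the claim and is deliberately left to the caller.

```python
def filter_frame(samples, n_update, coeffs, update_coeffs, apply_filter):
    """Filter one frame, updating coefficients only for the first n_update samples.

    samples: the N input samples of the frame.
    n_update: n in claim 7; must satisfy 2 <= n < N.
    coeffs: coefficient state carried in from the previous frame.
    update_coeffs(coeffs, x): hypothetical per-sample coefficient update.
    apply_filter(coeffs, x): hypothetical per-sample filtering step.
    Returns (output_samples, final_coeffs).
    """
    n_total = len(samples)
    if not (2 <= n_update < n_total):
        raise ValueError("n_update must satisfy 2 <= n_update < len(samples)")

    output = []
    for i, x in enumerate(samples):
        if i < n_update:
            # First n samples: refresh the coefficients on every sample.
            coeffs = update_coeffs(coeffs, x)
        # Remaining N - n samples reuse the last updated coefficients.
        output.append(apply_filter(coeffs, x))
    return output, coeffs
```

Because the update step runs n times instead of N times per frame, the cost of coefficient updating is reduced by roughly the factor n/N, which is consistent with the reduced amount of calculation attributed to claim 7.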