JP3244252B2

JP3244252B2 - Background noise average level prediction method and apparatus

Info

Publication number: JP3244252B2
Application number: JP26573095A
Authority: JP
Inventors: 陽一羽田; 昭二牧野; 雅史田中; 潤子清水; 順治小島
Original assignee: Nippon Telegraph and Telephone Corp; NTT Inc
Current assignee: Nippon Telegraph and Telephone Corp; NTT Inc
Priority date: 1995-10-13
Filing date: 1995-10-13
Publication date: 2002-01-07
Anticipated expiration: 2015-10-13
Also published as: JPH09113350A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a background noise average level prediction method and apparatus that predict the average level (average power or average amplitude value) of background noise from a signal picked up by a sound receiver. It is applicable, for example, to detecting a speech signal within a speech signal mixed with background noise, and to predicting the average amplitude value of background noise in echo cancellers used in two-wire/four-wire conversion systems and loudspeaker communication systems.

[0002]

[Prior art] Accurately predicting the background noise average level, which is the square of the background noise average amplitude value, is a very important technique in acoustic information devices such as speech detectors, voice switches, and echo cancellers. In general, when speech is picked up with a microphone, background noise such as air-conditioning noise is always mixed in. Conventionally, to detect only the voiced sections of speech, a background noise level predicted in advance is set as a threshold $P_t$, and the microphone signal, with level $P_x$, is judged to be in a voiced section whenever $P_x > P_t$. However, if the background noise is at or above the predetermined threshold, the signal is always judged to be in a voiced section.

[0003] To address this, a method has been proposed that adaptively predicts the background noise level and determines the threshold from the predicted value. In one such method, the microphone signal is divided into fixed time intervals, the average power of each interval is computed and stored in memory for a certain duration, and the smallest of the stored average powers is taken as the background noise level. With this method, however, the background noise level is predicted to be lower than it actually is when, for example, the background noise drops momentarily.
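The minimum-tracking approach described above can be sketched as follows. This is only an illustration of the prior-art idea, not code from the patent: the function names, the block length of 80 samples, and the synthetic signal are all assumptions, chosen to show how a single momentary dip drags the estimate down.

```python
def block_powers(x, block_len):
    """Average power of each consecutive, non-overlapping block of block_len samples."""
    return [
        sum(v * v for v in x[i:i + block_len]) / block_len
        for i in range(0, len(x) - block_len + 1, block_len)
    ]


def min_power_estimate(x, block_len):
    """Prior-art style estimate: the smallest stored block power."""
    return min(block_powers(x, block_len))


# Synthetic noise (assumed): amplitude +/-10 everywhere except one quiet block at +/-1.
BLOCK = 80  # hypothetical block length: 10 ms at an assumed 8 kHz sampling rate
signal = []
for b in range(10):
    amp = 1.0 if b == 4 else 10.0
    signal += [amp if n % 2 == 0 else -amp for n in range(BLOCK)]

powers = block_powers(signal, BLOCK)
estimate = min_power_estimate(signal, BLOCK)  # set by the single quiet block
typical = sorted(powers)[len(powers) // 2]    # median block power, for comparison
```

In this made-up signal nine of the ten blocks have the same power, yet the estimate equals the power of the one quiet block, which is the underestimation the paragraph describes.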

[0004] Another method exploits the fact that the level difference between the average powers of successive time intervals is smaller for background noise than for speech: when that level difference is at or below a fixed value, the average power is taken as the background noise level. With this method too, the acceptable level difference has to be determined by trial and error, so the method lacks generality.

[0005]

[Problem to be solved by the invention] Conventional methods that predict the background noise level from the average power of fixed time intervals have two problems: using the minimum of the interval average powers makes the predicted background noise level too small, and using a predetermined threshold lacks generality. An object of the present invention is to provide a method and apparatus that can predict, in a general-purpose manner, the level of background noise mixed into an information signal, even when the information signal (such as speech) and the background noise are always present together.

[0006] In the background noise average level prediction of the present invention, a predetermined short-time average level of the input signal is obtained; the appearance frequency of the level section, among a plurality of divided level sections, to which the short-time average level belongs is obtained; among the peak sections, that is, the sections whose appearance frequency is higher than that of the sections immediately before and after them, the smallest section value is obtained; an average level of the input signal over a time longer than the short time is obtained; the level corresponding to the smallest section value is compared with the long-time average level; and when the long-time average level is judged to be the smaller of the two continuously for a predetermined time, the long-time average level is taken as the average level of the background noise.

[0007] Furthermore, to follow changes in the background noise level, the past histogram data is multiplied by a forgetting factor each time the short-time average level is calculated. With this configuration, because a histogram of the short-time average level is used, the average value can be obtained reliably even for background noise that drops momentarily, without being affected by the momentary drop.
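The effect of the forgetting factor can be illustrated with a small sketch. It is not taken from the patent: the function names, the 8-section histogram, the 8 kHz rate, and in particular the mapping from a time constant to the factor (1 minus the reciprocal of the time constant in samples) are assumptions made for the example. Counts are added and then scaled once per sample, so a section that stops receiving counts fades geometrically.

```python
def forgetting_factor(time_constant_ms, fs_hz=8000):
    """Assumed mapping: lambda = 1 - 1/T, with T the time constant in samples."""
    t_samples = time_constant_ms * fs_hz / 1000.0
    return 1.0 - 1.0 / t_samples


def count_and_forget(hist, section, lam):
    """Add one count to `section`, then scale every section by lam."""
    hist[section] += 1.0
    for i in range(len(hist)):
        hist[i] *= lam


lam = forgetting_factor(500)      # 4000 samples at the assumed 8 kHz rate
hist = [0.0] * 8                  # hypothetical histogram with 8 sections

for _ in range(8000):             # 1 s during which the level falls in section 5
    count_and_forget(hist, 5, lam)
stale_before = hist[5]

for _ in range(16000):            # 2 s during which the level falls in section 1
    count_and_forget(hist, 1, lam)
stale_after = hist[5]             # old evidence has largely been forgotten
```

After the level moves to section 1, the stale count in section 5 shrinks to a small fraction of its former value, so the histogram peak follows the new noise level.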

[0008]

[Embodiments of the invention] FIG. 1 shows an embodiment of the apparatus according to the present invention. The output of the microphone 10 is converted into a digital signal by the AD converter 11 at a sampling period of, for example, 1/(8 kHz). The short-time power calculation unit 12 and the long-time power calculation unit 13 calculate the square root of the short-time average power and the square root of the long-time average power, respectively. The histogram calculation unit 14 calculates a histogram of the short-time average power square root, and the adaptive threshold determination unit 15 determines a threshold $P_t$ from that histogram. The comparison unit 16 compares the threshold $P_t$ with the long-time average power square root, and the counting unit 17 counts the number of consecutive comparison results in which the long-time value is at or below the threshold. When the count reaches a predetermined value, the long-time average power square root at that time is output from the background noise average amplitude value determination unit 18.

[0009] FIG. 2 shows the processing procedure of the background noise average level prediction method according to the present invention, using the apparatus shown in FIG. 1. The operation of the apparatus and the method of the invention are described with reference to FIGS. 1 and 2. First, the count value k of the counting unit 17 is initialized to 0 (S1). The sound signal $x(n)$ (n denotes the sampling time), picked up by the microphone 10 and digitized by the AD converter 11, is supplied to the short-time power calculation unit 12 and the long-time power calculation unit 13. These units calculate, by the following equations, the average power $P_S(n)$ over a predetermined short time, for example 5 to 10 ms (making this longer makes it impossible to distinguish background noise from speech), and the average power $P_L(n)$ over a predetermined long time, for example 100 to 300 ms (a longer time improves the averaging accuracy but lets the speech signal mix in).

[0010]

$$P_S(n) = \alpha_1 P_S(n-1) + (1-\alpha_1)\,x^2(n) \qquad (1)$$

$$P_L(n) = \alpha_2 P_L(n-1) + (1-\alpha_2)\,x^2(n) \qquad (2)$$

Here $\alpha_1$ and $\alpha_2$ are quantities not exceeding 1 that determine the power averaging time; with $T$ the integration time (averaging time), the relation $T = 1/(1-\alpha)$ holds, and therefore $\alpha_1 < \alpha_2 < 1$. The square roots of these average powers, $\sqrt{P_S(n)}$ and $\sqrt{P_L(n)}$, that is, the average amplitudes, are then obtained (S2, S3). These operations are performed for every n, that is, every sampling period of the sound signal.
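Equations (1) and (2) are first-order recursive averages, which can be sketched as below. The function names, the 8 kHz sampling rate, the choice of 10 ms and 200 ms averaging times, and the alternating test signal are assumptions for illustration; the coefficient is derived from the stated relation T = 1/(1 − α), with T counted in samples (also an assumption about units).

```python
import math


def alpha_from_ms(avg_ms, fs_hz=8000):
    """Coefficient from T = 1/(1 - alpha), taking T in samples (assumed unit)."""
    t_samples = avg_ms * fs_hz / 1000.0
    return 1.0 - 1.0 / t_samples


def update_power(prev, x, alpha):
    """One step of P(n) = alpha * P(n-1) + (1 - alpha) * x(n)^2."""
    return alpha * prev + (1.0 - alpha) * x * x


alpha1 = alpha_from_ms(10)    # short-time average, 80 samples
alpha2 = alpha_from_ms(200)   # long-time average, 1600 samples

p_s = p_l = 0.0               # both averages start from zero
for n in range(20000):        # 2.5 s of a constant-amplitude test signal
    x = 100.0 if n % 2 == 0 else -100.0
    p_s = update_power(p_s, x, alpha1)
    p_l = update_power(p_l, x, alpha2)

amp_s = math.sqrt(p_s)        # short-time average amplitude
amp_l = math.sqrt(p_l)        # long-time average amplitude
```

With a constant-amplitude input both averages settle at that amplitude; the long-time average simply takes longer to get there because its coefficient is closer to 1.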

[0011] Next, the histogram calculation unit 14 calculates a histogram of the short-time power square root $\sqrt{P_S(n)}$ (S4). Specifically, the expected range of $\sqrt{P_S(n)}$, for example up to half the maximum speech amplitude (background noise is usually considerably smaller than speech), is divided at equal intervals $\delta$ into a plurality of amplitude sections (0, 1, ..., N), and each time $\sqrt{P_S(n)}$ is calculated, the count of the amplitude section to which the value belongs is incremented by 1. That is, the following operation is performed.

[0012]

$$h\big(\mathrm{int}(\sqrt{P_S(n)}/\delta)\big) = h\big(\mathrm{int}(\sqrt{P_S(n)}/\delta)\big) + 1 \qquad (3)$$

Here int(A) denotes truncating the fractional part of A to obtain an integer. For example, with $\delta = 40$, if $\sqrt{P_S(n)}$ is 50, then $h(\mathrm{int}(50/40)) = h(1)$, so 1 is added to amplitude section 1; if $\sqrt{P_S(n)}$ is 90, then $h(\mathrm{int}(90/40)) = h(2)$, so 1 is added to amplitude section 2.
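The histogram update of equation (3) can be sketched as follows, reusing the worked values 50 and 90 with a section width of 40. The function names and the choice of N = 10 are hypothetical, and ignoring values that fall beyond the last section is an assumption; the text does not say how out-of-range values are handled.

```python
import math


def make_histogram(num_sections):
    """Histogram with sections 0..N, all counts zero."""
    return [0.0] * (num_sections + 1)


def update_histogram(hist, p_s, delta):
    """Increment the section containing sqrt(p_s); return its index, or None if out of range."""
    idx = int(math.sqrt(p_s) / delta)   # int() truncates the fractional part
    if idx < len(hist):
        hist[idx] += 1.0
        return idx
    return None                         # assumed behavior: ignore out-of-range values


DELTA = 40.0
hist = make_histogram(10)               # N = 10 is an arbitrary choice
first = update_histogram(hist, 50.0 ** 2, DELTA)     # amplitude 50
second = update_histogram(hist, 90.0 ** 2, DELTA)    # amplitude 90
outside = update_histogram(hist, 1000.0 ** 2, DELTA) # beyond section N
```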

[0013] To make this histogram follow changes in the background noise level more easily, the count of each amplitude section 0 to N is multiplied by a forgetting factor $\lambda$ not exceeding 1 (chosen to correspond to a time constant of, for example, about 500 to 1000 ms), as in the following equation (S5).

$$h(i) = \lambda\,h(i), \qquad i = 0, 1, 2, \ldots, N \qquad (4)$$

Next, the calculated histogram $h(i)$ ($i = 0, \ldots, N$) is transferred to the adaptive threshold determination unit 15. The adaptive threshold determination unit 15 searches $h(i)$ for peaks; that is, it looks for the amplitude sections whose appearance frequency $h(i)$ is greater than the appearance frequencies $h(i-1)$ and $h(i+1)$ of the adjacent sections, and stores their section numbers (S6). It then finds the smallest section number i among the peak amplitude sections found and sets it as w (S7). When the signal containing both speech and background noise is as shown in FIG. 3A and the histogram $h(i)$ of its short-time power square root is as shown in FIG. 3B, the appearance frequency $h(1)$ of amplitude section 1 is a peak and its section number is the smallest among the peaks, so $w = 1$. The adaptive threshold $P_t(n)$ is then obtained by the following equation, which takes into account the truncation performed by the int() function (S8).

[0014]

$$P_t(n) = \delta \times (w + 1) \qquad (5)$$

The determination of the adaptive threshold is explained further using an actual signal as an example. FIG. 3A shows the signal $x(n)$ observed at the microphone 10; $x(n)$ is background noise with a speech signal added. FIG. 3B shows the histogram counted by the histogram calculation unit 14 with the amplitude section width $\delta$ set to 40. The adaptive threshold determination unit 15 searches the peaks for the one with the smallest amplitude section number, which FIG. 3B shows to be section 1. According to equation (5), the adaptive threshold is therefore calculated as

$$P_t(n) = 40 \times (1 + 1) = 80 \qquad (6)$$

This corresponds to the fact that amplitude section 1, which counts absolute amplitude values from 40 to 80, is the peak section with the smallest section number.
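The peak search and the threshold of equation (5) can be sketched as below. The function names and the histogram counts are made up for illustration (they are not the values of FIG. 3B), and two choices are assumptions: sections 0 and N, which have only one neighbor, are not treated as peaks, and `None` is returned when no peak exists.

```python
def peak_sections(hist):
    """Section numbers whose count exceeds both neighbors (interior sections only)."""
    return [
        i for i in range(1, len(hist) - 1)
        if hist[i] > hist[i - 1] and hist[i] > hist[i + 1]
    ]


def adaptive_threshold(hist, delta):
    """delta * (w + 1), where w is the smallest peak section number; None if no peak."""
    peaks = peak_sections(hist)
    if not peaks:
        return None
    w = min(peaks)
    return delta * (w + 1)   # upper edge of section w, since int() truncates


# Made-up counts: a noise peak in section 1 and a speech peak in section 5.
example_hist = [2, 9, 4, 1, 3, 6, 2]
peaks = peak_sections(example_hist)
threshold = adaptive_threshold(example_hist, 40)
flat = adaptive_threshold([3, 3, 3, 3], 40)   # no strict peak
```

Adding 1 to w places the threshold at the upper edge of the lowest peak section, so every amplitude counted in that section lies at or below the threshold.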

[0015] The comparison unit 16 compares the adaptive threshold $P_t(n)$ determined by the adaptive threshold determination unit 15 with the long-time average power square root $\sqrt{P_L(n)}$ calculated by the long-time power calculation unit 13 (S9). If $\sqrt{P_L(n)} < P_t(n)$, the count value k of the counting unit 17 is incremented by 1 (S10), and it is determined whether the count has reached the predetermined value K (S11). If the comparison unit 16 finds that $\sqrt{P_L(n)}$ is not less than $P_t(n)$, the count value k is reset to 0 and processing returns to step S2 (S12). If $k < K$ in step S11, processing also returns to step S2. If $k \ge K$ in step S11, that is, if $\sqrt{P_L(n)} < P_t(n)$ has held continuously for the predetermined time, the background noise average amplitude value determination unit 18 takes $\sqrt{P_L(n)}$ as the average amplitude value of the background noise (S13). The value of K corresponds to a duration of, for example, about 1 second, that is, a duration for which the conversation has clearly paused.
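The consecutive-count decision of steps S9 to S13 can be sketched as follows. The function name and the sample values are hypothetical, and returning at the first decision is a simplification: the text does not state what happens to the counter after an estimate has been output.

```python
def first_noise_estimate(long_amps, thresholds, K):
    """Return (n, amplitude) at the first n where the long-time amplitude has been
    below the threshold for K consecutive samples, or None if that never happens."""
    k = 0
    for n, (amp_l, p_t) in enumerate(zip(long_amps, thresholds)):
        if amp_l < p_t:
            k += 1                 # S10: one more consecutive sample below threshold
            if k >= K:             # S11: run is long enough
                return n, amp_l    # S13: take this value as the noise amplitude
        else:
            k = 0                  # S12: run broken, start counting again
    return None


# Made-up long-time amplitudes against a constant threshold of 80, with K = 3.
amps = [50.0, 90.0, 60.0, 55.0, 52.0, 51.0]
decision = first_noise_estimate(amps, [80.0] * len(amps), 3)
never = first_noise_estimate([50.0, 90.0, 50.0, 90.0], [80.0] * 4, 2)
```

In practice K would be set to the number of samples in roughly one second (8000 at an assumed 8 kHz rate); a small K is used here only to keep the example short.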

[0016] As described above, each process in FIG. 2 is performed at the sampling period of the sound signal. In the long-time power calculation unit 13, the previous value of $P_L(n)$ is set to zero in the initial state, so the processing of the invention operates correctly only after the long averaging time has elapsed. In the embodiment above, the square root of the average power was taken to predict the average amplitude, but the processing may instead be carried out on the average power without taking the square root, so as to predict the average power of the background noise. In other words, "level" in this invention means either amplitude or power.

[0017]

[Effects of the invention] As explained above, according to the present invention, the background noise average level can be predicted adaptively as an adaptive threshold without making any prior assumption about the background noise level, even when speech and background noise are present together. In addition, an accurate background noise average level that is at or below this adaptive threshold can be measured. Because the invention can measure an accurate background noise average level, using it for speech detection, noise reduction, and similar applications can improve the accuracy of speech detection and the amount of noise suppression.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of the apparatus of the present invention.

FIG. 2 is a flowchart showing an embodiment of the method of the present invention.

FIG. 3A is a diagram showing a signal containing both speech and background noise, used to demonstrate the effect of the present invention; FIG. 3B is a diagram showing the calculation result of the histogram calculation unit 14 of the present invention.

Continued from the front page

(72) Inventor: Junko Shimizu, c/o Nippon Telegraph and Telephone Corporation, 1-1-6 Uchisaiwaicho, Chiyoda-ku, Tokyo
(72) Inventor: Junji Kojima, c/o Nippon Telegraph and Telephone Corporation, 1-1-6 Uchisaiwaicho, Chiyoda-ku, Tokyo
(56) References: JP-A-2-272835 (JP, A); JP-A-4-119028 (JP, A); JP-A-4-120927 (JP, A); JP-A-6-197049 (JP, A); JP-A-7-74709 (JP, A)
(58) Fields searched (Int. Cl.7, DB name): H04B 3/00 - 3/44

Claims

(57) [Claims]

1. A background noise average level prediction method comprising: a step of obtaining a predetermined short-time average level of an input signal; a step of obtaining the appearance frequency of the section, among a plurality of divided level sections, to which the short-time average level belongs; a step of obtaining, among the peak sections whose appearance frequency is higher than the appearance frequencies of the sections before and after them, the section value of the peak section whose section value is smallest; a step of obtaining an average level of the input signal over a time longer than the short time; and a step of comparing the level corresponding to the smallest section value with the long-time average level and, when the long-time average level is determined to be the lower of the two continuously for a predetermined time, taking the long-time average level as the average level of the background noise.

2. The background noise average level prediction method according to claim 1, wherein the short-time average level $P_S(n)$ (n is the sampling time) is calculated as $\alpha_1 P_S(n-1) + (1-\alpha_1)\,x^2(n)$ ($x(n)$ is the input signal), and the long-time average level $P_L(n)$ is calculated as $\alpha_2 P_L(n-1) + (1-\alpha_2)\,x^2(n)$ ($\alpha_1 < \alpha_2 < 1$).

3. A background noise average level prediction apparatus comprising: short-time level calculation means for calculating a predetermined short-time average level of an input signal; long-time level calculation means for calculating an average level of the input signal over a time longer than the short time; frequency calculation means for calculating the appearance frequency of the section, among a plurality of divided level sections, to which the short-time average level belongs; adaptive threshold determination means for determining, as a threshold, the level corresponding to the smallest section value among the peak sections whose appearance frequency is higher than the appearance frequencies of the sections before and after them; comparison means for comparing the threshold with the long-time average level; and background noise average level determination means for outputting the long-time average level at that time as the background noise level when a comparison result of being at or below the threshold is obtained continuously a predetermined number of times.