JPH09113350A

JPH09113350A - Method and apparatus for predicting average level of background noise

Info

Publication number: JPH09113350A
Application number: JP26573095A
Authority: JP
Inventors: Yoichi Haneda; 陽一羽田; Shoji Makino; 昭二牧野; Masafumi Tanaka; 雅史田中; Jiyunko Shimizu; 潤子清水; Junji Kojima; 順治小島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1995-10-13
Filing date: 1995-10-13
Publication date: 1997-05-02
Anticipated expiration: 2015-10-13
Also published as: JP3244252B2

Abstract

PROBLEM TO BE SOLVED: To determine the average level of background noise by dividing the range of the background noise level into a plurality of sections, determining how frequently the short-time average level of a received sound signal falls in each section, and judging an independently obtained long-time average level against a threshold corresponding to the lowest section among the peak sections. SOLUTION: The output of a microphone 10 is passed through an A/D converter 11 and fed to power calculating sections 12 and 13, which determine the square roots of the short-time and long-time average powers. A histogram calculating section 14 builds a histogram of the short-time root average power, and an adaptive threshold determining section 15 determines a threshold corresponding to the lowest section among the sections where the frequency of occurrence peaks. The long-time root average power is compared with this threshold, and a counting section 17 counts the number of consecutive times it is below the threshold. When a predetermined count is reached, a background noise mean amplitude determining section 18 outputs the long-time root average power, thus determining an accurate mean amplitude of the background noise.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a background noise average level prediction method and apparatus that predicts the average level (average power or average amplitude value) of background noise from a signal received by a sound receiver. It is applied, for example, to detecting speech in a speech signal mixed with background noise, and to predicting the background noise average amplitude value in echo cancellers used in two-wire/four-wire conversion systems and loudspeaker communication systems.

[0002]

[Prior Art] Accurately predicting the background noise average level, which is the square of the background noise average amplitude value, is a very important technique in sound information devices such as speech detectors, voice switches, and echo cancellers. When speech is picked up with a microphone, background noise such as air-conditioning noise is generally always mixed in. Conventionally, when only the voiced sections of speech are to be detected, a background noise level predicted in advance is used as a threshold $P_t$, and the microphone signal level $P_x$ is judged to belong to a voiced section if $P_x > P_t$. However, if the background noise is at or above the predetermined threshold, the signal is always judged to be a voiced section, which is a problem.
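The weakness of a fixed threshold can be seen in a few lines. This is only an illustrative sketch: the function name `voiced_flags_fixed`, the level values, and the threshold of 100 are invented for the example and do not come from the patent.

```python
def voiced_flags_fixed(levels, p_t):
    """Fixed-threshold detection: a frame is 'voiced' when P_x > P_t."""
    return [p_x > p_t for p_x in levels]

# Hypothetical frame levels: noise, noise, speech, speech, noise.
quiet_room = [20.0, 25.0, 300.0, 280.0, 22.0]
noisy_room = [120.0, 125.0, 400.0, 380.0, 122.0]

flags_quiet = voiced_flags_fixed(quiet_room, 100.0)
flags_noisy = voiced_flags_fixed(noisy_room, 100.0)
```

In the quiet room only the two speech frames are flagged, but in the noisy room every frame exceeds the threshold, so the whole signal is treated as voiced.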

[0003] Methods have therefore been proposed that predict the background noise level adaptively and determine the threshold from the predicted value. In one such method, the microphone signal is divided into fixed time intervals, the average power of each interval is computed and stored in memory for a certain period, and the smallest of the stored average powers is taken as the background noise level. With this method, however, the background noise level is predicted to be lower than it actually is whenever the background noise momentarily drops.
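The underestimation described for the minimum-power method can be reproduced with a small sketch. The helper names, the block length of 10 samples, and the synthetic signal are assumptions made for illustration only.

```python
def block_average_powers(x, block_len):
    """Average power of each consecutive, non-overlapping block of samples."""
    n_blocks = len(x) // block_len
    return [
        sum(s * s for s in x[b * block_len:(b + 1) * block_len]) / block_len
        for b in range(n_blocks)
    ]

def min_power_noise_estimate(x, block_len):
    """Prior-art style estimate: the smallest stored block power."""
    return min(block_average_powers(x, block_len))

# Steady noise of amplitude 10 (power 100) with one momentary dip to amplitude 1.
signal = [10.0] * 40 + [1.0] * 10 + [10.0] * 40
powers = block_average_powers(signal, 10)
estimate = min_power_noise_estimate(signal, 10)
```

Eight of the nine blocks have power 100, yet the estimate is 1, because the single quiet block determines the minimum.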

[0004] Another method exploits the fact that the difference in average power between successive time intervals is smaller for background noise than for speech: when the level difference between the average powers of successive intervals is at or below a fixed value, that average power is taken as the background noise level. This method also lacks generality, because the acceptable level difference has to be determined by trial and error.

[0005]

[Problems to Be Solved by the Invention] Conventional methods that predict the background noise level from the average power level of fixed time intervals have two problems: using the minimum of the interval average powers makes the predicted background noise level too low, and using a predetermined threshold lacks generality. The object of the present invention is to provide a method and apparatus that can predict, in a general-purpose manner, the level of background noise mixed into an information signal, even when the information signal (such as speech) and the background noise are always present together.

[0006]

[Means for Solving the Problems] In the present invention, the short-time average level of the signal received by a microphone is calculated for each short time. The range of the background noise level is divided into a plurality of sections, and the frequency of occurrence of the calculated short-time average level in each section, that is, a histogram, is computed. Among the sections that form peaks of this histogram, the value corresponding to the level of the lowest section is taken as an adaptive threshold. The average level of the received signal over a time longer than the short time is also obtained, and when this long-time average level stays at or below the adaptive threshold continuously for a fixed time, the long-time average level is taken as the background noise average level.

[0007] Furthermore, to follow changes in the background noise level, the past histogram data are multiplied by a forgetting factor each time the short-time average level is calculated. Because this configuration uses a histogram of the short-time average level, the average value can be obtained reliably even for background noise that momentarily drops, without being affected by the momentary drop.

[0008]

[Embodiments of the Invention] FIG. 1 shows an embodiment of the apparatus according to the present invention. The output of the microphone 10 is converted into a digital signal by the AD converter 11 at a sampling period of, for example, 1/(8 kHz). The short-time power calculation unit 12 and the long-time power calculation unit 13 calculate the square root of the short-time average power and the square root of the long-time average power, respectively. The histogram calculation unit 14 computes a histogram of the short-time average power square root, and the adaptive threshold determination unit 15 determines a threshold $P_t$ from that histogram. The comparison unit 16 compares the threshold $P_t$ with the long-time average power square root, and the counting unit 17 counts the number of consecutive comparisons in which the latter is below the threshold. When the count reaches a predetermined value, the background noise average amplitude value determination unit 18 outputs the long-time average power square root at that time.

[0009] FIG. 2 shows the processing procedure of the background noise average level estimation method according to the present invention using the apparatus shown in FIG. 1. The operation of the apparatus and the method of the invention are described with reference to FIGS. 1 and 2. First, the count value $k$ of the counting unit 17 is initialized to 0 (S1). The received signal $x(n)$ ($n$ denotes the sampling time), picked up by the microphone 10 and digitized by the AD converter 11, is supplied to the short-time power calculation unit 12 and the long-time power calculation unit 13. These calculate the average power $P_S(n)$ over a predetermined short time, for example 5 to 10 milliseconds (if this is made longer, background noise and speech can no longer be distinguished), and the average power $P_L(n)$ over a predetermined long time, for example 100 to 300 milliseconds (a longer time improves the averaging accuracy but lets the speech signal leak in), according to the following equations.

[0010]

$$P_S(n) = \alpha_1 P_S(n-1) + (1-\alpha_1)\,x^2(n) \qquad (1)$$

$$P_L(n) = \alpha_2 P_L(n-1) + (1-\alpha_2)\,x^2(n) \qquad (2)$$

Here $\alpha_1$ and $\alpha_2$ are quantities not greater than 1 that are related to the power averaging time; with $T$ denoting the integration time (averaging time), the relation $T = 1/(1-\alpha)$ holds. Accordingly, $\alpha_1 < \alpha_2 < 1$. The square roots of these average powers are then obtained: $\sqrt{P_S(n)}$, that is, the short-time average amplitude, and $\sqrt{P_L(n)}$, that is, the long-time average amplitude (S2, S3). These operations are performed for every $n$, that is, at every sampling period of the received signal.
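Equations (1) and (2) are first-order recursive averages, which can be sketched as follows. The sketch assumes that $T$ in $T = 1/(1-\alpha)$ is counted in samples (the text does not state the unit), and picks 10 ms and 200 ms from the ranges given above; the function names and the constant-amplitude test signal are illustrative.

```python
import math

def alpha_from_time(avg_time_s, fs_hz):
    """Smoothing coefficient from T = 1 / (1 - alpha), assuming T is in samples."""
    t_samples = avg_time_s * fs_hz
    return 1.0 - 1.0 / t_samples

def smooth_power(prev_power, sample, alpha):
    """One update step of equation (1) or (2)."""
    return alpha * prev_power + (1.0 - alpha) * sample * sample

FS = 8000                               # sampling frequency from the embodiment
ALPHA_S = alpha_from_time(0.010, FS)    # short time: 10 ms, about 80 samples
ALPHA_L = alpha_from_time(0.200, FS)    # long time: 200 ms, about 1600 samples

p_s = p_l = 0.0
for n in range(2 * FS):                 # two seconds of a +/-100 square wave
    x = 100.0 if n % 2 == 0 else -100.0
    p_s = smooth_power(p_s, x, ALPHA_S)
    p_l = smooth_power(p_l, x, ALPHA_L)

amp_s, amp_l = math.sqrt(p_s), math.sqrt(p_l)   # steps S2 and S3
```

Under these assumptions $\alpha_1 \approx 0.9875$ and $\alpha_2 \approx 0.999375$, and both amplitudes settle near 100, with the long-time average converging more slowly.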

[0011] Next, the histogram calculation unit 14 computes a histogram of the short-time power square root $\sqrt{P_S(n)}$ (S4). Specifically, the expected range of $\sqrt{P_S(n)}$, for example up to half the maximum speech amplitude (background noise is normally considerably smaller than speech), is divided at equal intervals $\delta$ into a plurality of amplitude sections $(0, 1, \ldots, N)$. Each time $\sqrt{P_S(n)}$ is calculated, the count of the amplitude section to which its value belongs is incremented by 1. That is, the following operation is performed.

[0012]

$$h\bigl(\mathrm{int}(\sqrt{P_S(n)}/\delta)\bigr) = h\bigl(\mathrm{int}(\sqrt{P_S(n)}/\delta)\bigr) + 1 \qquad (3)$$

Here $\mathrm{int}(A)$ denotes truncating the fractional part of $A$ to give an integer. For example, with $\delta = 40$, if $\sqrt{P_S(n)}$ is 50 then $h(\mathrm{int}(50/40)) = h(1)$ and 1 is added to amplitude section 1; if $\sqrt{P_S(n)}$ is 90 then $h(\mathrm{int}(90/40)) = h(2)$ and 1 is added to amplitude section 2.
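The update of equation (3) can be sketched as below. Clamping values above the expected range into the top section is an assumption of this sketch (the text does not say how out-of-range values are handled), and the function names and the choice of eight sections are illustrative.

```python
def bin_index(amplitude, delta, n_max):
    """Section number int(amplitude / delta) per equation (3), clamped to n_max."""
    return min(int(amplitude / delta), n_max)

def update_histogram(hist, amplitude, delta):
    """Add 1 to the count of the section that the amplitude falls in."""
    hist[bin_index(amplitude, delta, len(hist) - 1)] += 1.0
    return hist

DELTA = 40.0
hist = [0.0] * 8                        # sections 0..N with N = 7
for a in (50.0, 90.0, 79.9, 10.0):      # 50 and 90 are the values used in the text
    update_histogram(hist, a, DELTA)
```

The amplitudes 50 and 79.9 both land in section 1, 90 lands in section 2, and 10 lands in section 0, matching the worked example in the text.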

[0013] To make this histogram follow changes in the background noise level more easily, the count of each amplitude section 0 to $N$ is multiplied by a forgetting factor $\lambda$ of 1 or less (corresponding to, for example, about 500 to 1000 milliseconds), as in the following equation (S5).

$$h(i) = \lambda \times h(i), \quad i = 0, 1, 2, \ldots, N \qquad (4)$$

The computed histogram $h(i)$ ($i = 0, \ldots, N$) is then passed to the adaptive threshold determination unit 15. This unit searches $h(i)$ for peaks, that is, it looks for amplitude sections whose appearance frequency $h(i)$ is greater than the appearance frequencies $h(i-1)$ and $h(i+1)$ of the neighboring sections, and stores their section numbers (S6). It then finds the smallest section number $i$ among the peak sections found and sets it as $w$ (S7). When the signal containing both speech and background noise is as shown in FIG. 3A and the histogram $h(i)$ of its short-time power square root is as shown in FIG. 3B, the appearance frequency $h(1)$ of amplitude section 1 is a peak and its section number is the smallest, so $w = 1$. The adaptive threshold $P_t(n)$ is then obtained from the following equation, which takes into account the truncation performed by the $\mathrm{int}()$ function (S8).

[0014]

$$P_t(n) = \delta \times (w + 1) \qquad (5)$$

The determination of the adaptive threshold is now explained further using an actual signal as an example. FIG. 3A shows the signal $x(n)$ observed by the microphone 10; $x(n)$ is a speech signal added to background noise. FIG. 3B shows the histogram counted by the histogram calculation unit 14 with the amplitude section width $\delta$ set to 40. The adaptive threshold determination unit 15 searches for the peak with the smallest amplitude section number, which FIG. 3B shows to be section 1. The adaptive threshold is therefore calculated from equation (5) as

$$P_t(n) = 40 \times (1 + 1) = 80 \qquad (6)$$

This corresponds to amplitude section number 1, which covers absolute amplitude values from 40 to 80, being the smallest section number among the peak sections.
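Steps S5 to S8 (forgetting, peak search, and threshold) might be sketched as follows. The histogram counts are invented for the example and are not the data of FIG. 3B; the value 0.999 for $\lambda$ and the treatment of missing neighbors at the two ends as zero are also assumptions of this sketch.

```python
def apply_forgetting(hist, lam):
    """Equation (4): scale every section count by the forgetting factor."""
    return [lam * h for h in hist]

def lowest_peak_section(hist):
    """Smallest i with h(i) > h(i-1) and h(i) > h(i+1); None if there is no peak."""
    for i in range(len(hist)):
        left = hist[i - 1] if i > 0 else 0.0              # assumed edge handling
        right = hist[i + 1] if i + 1 < len(hist) else 0.0
        if hist[i] > left and hist[i] > right:
            return i
    return None

def adaptive_threshold(hist, delta):
    """Equation (5): P_t = delta * (w + 1), or None when no peak exists."""
    w = lowest_peak_section(hist)
    return None if w is None else delta * (w + 1)

# Hypothetical counts with peaks at sections 1 and 4.
hist = apply_forgetting([5.0, 30.0, 12.0, 8.0, 15.0, 9.0, 2.0, 0.0], 0.999)
w = lowest_peak_section(hist)
p_t = adaptive_threshold(hist, 40.0)
```

With these counts the lowest peak is section 1, so the threshold is 80, the same value as in equation (6). The factor $(w + 1)$ places the threshold at the upper edge of the peak section, which compensates for the truncation in $\mathrm{int}()$.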

[0015] The comparison unit 16 compares the adaptive threshold $P_t(n)$ determined by the adaptive threshold determination unit 15 with the long-time average power square root $\sqrt{P_L(n)}$ calculated by the long-time power calculation unit 13 (S9). If $\sqrt{P_L(n)} < P_t(n)$, the count value $k$ of the counting unit 17 is incremented by 1 (S10), and it is then determined whether the count has reached the predetermined value $K$ (S11). If the comparison unit 16 finds that $\sqrt{P_L(n)}$ is not less than $P_t(n)$, the count value $k$ of the counting unit 17 is reset to 0 (S12) and processing returns to step S2. If $k < K$ in step S11, processing also returns to step S2. If $k \geq K$ in step S11, that is, if $\sqrt{P_L(n)} < P_t(n)$ has held continuously for the predetermined time, the background noise average amplitude value determination unit 18 takes $\sqrt{P_L(n)}$ as the average amplitude value of the background noise (S13). The value of $K$ corresponds to a time of, for example, about 1 second, that is, a time for which the conversation is clearly interrupted.
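The counting logic of steps S9 to S13 can be sketched as a small state machine. The class name and the behavior after the count is first reached (the sketch keeps reporting the current long-time level while the condition holds) are assumptions, since the text does not describe what happens after S13; $K = 3$ is used only to keep the example short.

```python
class NoiseLevelDecider:
    """Counts consecutive samples whose long-time level is below the threshold."""

    def __init__(self, k_required):
        self.k_required = k_required   # predetermined value K
        self.k = 0                     # count value k, initialised in S1

    def step(self, long_level, threshold):
        """Return long_level once it has been below threshold K times in a row."""
        if long_level < threshold:         # S9
            self.k += 1                    # S10
            if self.k >= self.k_required:  # S11
                return long_level          # S13
        else:
            self.k = 0                     # S12
        return None

decider = NoiseLevelDecider(k_required=3)
levels = [70.0, 60.0, 90.0, 50.0, 55.0, 52.0, 51.0]
outputs = [decider.step(v, 80.0) for v in levels]
```

The level 90 resets the count, so the first estimate (52) appears only after three further consecutive values below the threshold of 80.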

[0016] As described above, each process in FIG. 2 is performed at the sampling period of the received signal. In the long-time power calculation unit 13, the preceding $P_L(n)$ is set to zero in the initial state, so the processing of the invention operates correctly only after the long averaging time has elapsed. In the embodiment above, the square root of the average power is taken and the average amplitude is predicted, but the processing may instead be carried out on the average power without taking the square root, so that the average power of the background noise is predicted. In other words, "level" in this invention means either amplitude or power.
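The steps of FIG. 2 can be combined into one per-sample loop, sketched below in the amplitude domain. This is a minimal sketch, not the patented implementation: the parameter defaults (10 ms, 200 ms, $\lambda = 0.9999$, 26 sections, $K$ equal to one second of samples), the clamping of out-of-range values, the edge handling in the peak search, and the synthetic square-wave signal are all assumptions.

```python
import math

def estimate_background_amplitude(x, fs=8000, delta=40.0, n_sections=26,
                                  t_short=0.010, t_long=0.200,
                                  lam=0.9999, hold_s=1.0):
    """Return the most recent background-noise amplitude estimate, or None."""
    a_s = 1.0 - 1.0 / (t_short * fs)       # assumption: T counted in samples
    a_l = 1.0 - 1.0 / (t_long * fs)
    k_required = int(hold_s * fs)          # K expressed as a number of samples
    hist = [0.0] * n_sections
    p_s = p_l = 0.0
    k = 0
    estimate = None
    for sample in x:
        sq = sample * sample
        p_s = a_s * p_s + (1.0 - a_s) * sq                # eq. (1)
        p_l = a_l * p_l + (1.0 - a_l) * sq                # eq. (2)
        amp_s, amp_l = math.sqrt(p_s), math.sqrt(p_l)
        idx = min(int(amp_s / delta), n_sections - 1)     # eq. (3), clamped
        hist[idx] += 1.0
        hist = [lam * h for h in hist]                    # eq. (4)
        w = None
        for i in range(n_sections):                       # lowest peak section
            left = hist[i - 1] if i > 0 else 0.0
            right = hist[i + 1] if i + 1 < n_sections else 0.0
            if hist[i] > left and hist[i] > right:
                w = i
                break
        if w is None:                                     # no peak yet
            k = 0
            continue
        p_t = delta * (w + 1)                             # eq. (5)
        if amp_l < p_t:
            k += 1
            if k >= k_required:
                estimate = amp_l
        else:
            k = 0
    return estimate

def square_wave(amplitude, n):
    """Deterministic stand-in for a signal of constant amplitude."""
    return [amplitude if i % 2 == 0 else -amplitude for i in range(n)]

FS = 8000
signal = (square_wave(50.0, int(1.5 * FS))       # background only
          + square_wave(1000.0, int(0.5 * FS))   # loud burst standing in for speech
          + square_wave(50.0, 3 * FS))           # background only
noise_amplitude = estimate_background_amplitude(signal, fs=FS)
```

For this synthetic input the estimate settles close to the background amplitude of 50, and the loud burst does not raise it because the count is reset while the long-time level exceeds the threshold. To work in the power domain, as the paragraph above allows, the square roots would be dropped and `delta` chosen in power units.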

[0017]

[Effects of the Invention] As described above, according to the present invention the background noise average level can be predicted adaptively, in the form of the adaptive threshold, without any prior assumption about the background noise level, even when speech and background noise are present together. Furthermore, an accurate background noise average level value lying below this adaptive threshold can be measured. Because the invention can measure an accurate background noise average level, using it for speech detection, noise reduction, and the like makes it possible to improve the accuracy of speech detection and the amount of noise removed.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of the apparatus of the present invention.

FIG. 2 is a flow chart showing an embodiment of the method of the present invention.

FIG. 3A is a diagram showing the signal containing both speech and background noise that was used to demonstrate the effect of the present invention, and FIG. 3B is a diagram showing the calculation result of the histogram calculation unit 14 of the present invention.

Continuation of front page: (72) Inventor Junko Shimizu, c/o Nippon Telegraph and Telephone Corporation, 1-1-6 Uchisaiwaicho, Chiyoda-ku, Tokyo. (72) Inventor Junji Kojima, c/o Nippon Telegraph and Telephone Corporation, 1-1-6 Uchisaiwaicho, Chiyoda-ku, Tokyo.

Claims

1. A background noise average level prediction method comprising: a step of obtaining the average level of an input signal over a predetermined short time; a step of obtaining, for each of a plurality of sections into which a predetermined range of the background noise level is divided, the frequency with which the obtained short-time average level appears; a step of obtaining the section value of the lowest section among peak sections, a peak section being a section whose appearance frequency is higher than the appearance frequencies of the sections before and after it; a step of obtaining the average level of the input signal over a time longer than the short time; a step of comparing the level corresponding to the section value of the lowest section with the long-time average level; and a step of taking the long-time average level as the average level of the background noise when the comparison has judged the long-time average level to be the smaller continuously for a predetermined time.

2. The background noise average level prediction method according to claim 1, wherein the minimum section of the peak section is obtained after multiplying the appearance frequency of each section by a forgetting factor.

3. A background noise average level prediction apparatus comprising: a microphone for receiving incoming sound; short-time level calculating means for obtaining the average level of the output of the microphone over a predetermined short time; long-time level calculating means for obtaining the average level of the output of the microphone over a time longer than the short time; histogram calculating means for obtaining, for each of a plurality of sections into which a predetermined range of the background noise level is divided, the frequency with which the obtained short-time average level appears; adaptive threshold determining means for obtaining, as a threshold, the level corresponding to the section value of the lowest section among peak sections whose appearance frequency is higher than the appearance frequencies of the sections before and after them; comparing means for comparing the threshold with the long-time average level; and background noise average level determining means for outputting the long-time average level at that time as the background noise level when the comparison has found the long-time average level to be at or below the threshold a predetermined number of consecutive times.