JPH09311696A - Automatic gain control device

Info

Publication number: JPH09311696A
Application number: JP8125697A
Authority: JP
Inventors: Masahide Mizushima (水島 昌英); Kenzo Ito (伊藤 憲三)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1996-05-21
Filing date: 1996-05-21
Publication date: 1997-12-02

Abstract

PROBLEM TO BE SOLVED: To allow only the target sound to be heard at an appropriate volume and without distortion, even when the loudness of the speaker's voice or the distance from the speaker to the microphone varies. SOLUTION: The input signal is first processed by a noise suppression means 100 to suppress background noise. For frames judged to be noise, a threshold marking the boundary between compression and expansion is then determined from the effective value (RMS) of the residual noise after suppression. To find the compression ratio for a given frame, the effective values of the frames over the last several seconds, including that frame, are smoothed. The compression ratio is calculated from the smoothed effective value, and each input frame is multiplied by it together with the required gain obtained from the threshold and the target average effective value; the frames are then resynthesized and output.

Description

Detailed Description of the Invention

[0001]

[Technical field of the invention] The present invention relates to an automatic gain control device for audio equipment such as loudspeaker communication devices and assistive listening devices used by hearing-impaired people.

[0002]

[Description of the related art] In video conferencing and teleconferencing systems, loudspeaker communication is often used because several people on each side need to converse. The pickup microphone is typically placed on a desk or similar surface, and its distance from each speaker's mouth is often not uniform. As a result, the input level at the microphone varies from speaker to speaker, which is one cause of listening difficulty on the receiving side. In addition, a great many hearing-impaired people who have little trouble with one-on-one conversation while wearing a hearing aid complain that they can no longer understand what is being said when the talker is far away, as at meetings or lectures. One likely reason is, again, a drop in the input level to the hearing aid.

[0003]

In general, the input sound pressure level falls as the distance between the sound source and the pickup position increases. How much it falls depends on the directivity of the source and the reverberation of the room. For example, in a free field (a hypothetical space with no sound-reflecting boundaries such as walls, floor, or ceiling), with the source assumed to be a point source (a hypothetical source that radiates sound equally in all directions), the sound pressure level falls by 6 dB each time the distance between the source and the listening point doubles. In a real room, reflections from the boundaries add to the direct sound, so the attenuation is smaller; even so, at a distance of 5 m the level is about 10 dB below the level at 1 m.
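The 6 dB figure follows from the inverse-distance law for a point source in a free field, under which the level difference between distances r_ref and r is 20 log10(r / r_ref) dB. A minimal Python sketch (the function name is illustrative and does not come from the patent) evaluates it for a doubling of distance and for the 1 m to 5 m case:

```python
import math

def free_field_attenuation_db(r_ref: float, r: float) -> float:
    """Level drop in dB of a point source in a free field when moving from r_ref to r."""
    if r_ref <= 0 or r <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(r / r_ref)

print(round(free_field_attenuation_db(1.0, 2.0), 2))  # 6.02  (doubling the distance)
print(round(free_field_attenuation_db(1.0, 5.0), 2))  # 13.98 (1 m to 5 m, free field)
```

The free-field drop at 5 m is about 14 dB, so the roughly 10 dB cited for a real room is consistent with reflections returning part of the energy.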

[0004]

To keep the input sound pressure level constant, devices such as the so-called compressor have conventionally been used. A compressor compresses the dynamic range of the input signal so that relatively low-level inputs receive more gain. However, such a device observes the input level at every time instant and changes the compression ratio at every time instant accordingly. Here, "every time instant" means the smallest time unit of the system: every sample in a digital system, and continuously, moment by moment, in an analog system. Compensating for the sound pressure drop caused by distance attenuation, which spans a relatively wide range of levels, requires a large compression ratio; as a result, the target speech is distorted and its intelligibility suffers.

[0005]

Hearing-impaired users, for their part, adjust the gain with the volume control of the hearing aid. However, readjusting the gain every time the input sound pressure level changes (for example, when the speaker changes) is very tedious, and many users say they do not touch the volume control except when the sound is unbearably loud (for example, inside a subway car). The result is that "people with quiet voices and people speaking from far away are hard to hear."

[0006]

Furthermore, ordinary acoustic spaces often contain background noise in addition to the target sound. If the target sound is held at a constant sound pressure level, the background noise level fluctuates with the level of the target sound, which makes listening very difficult.

[0007]

To prevent this, the background noise alone can be suppressed in a stage preceding gain control. One conventional method for doing so is the "noise suppression device" (Japanese Patent Application No. Hei 8-14874) proposed by the present inventor. It is briefly described here with reference to FIG. 5.

[0008]

In FIG. 5, 200 denotes the noise suppression device, 201 the auditory weighting side, and 202 the loss control side. 21 is the input signal terminal, 22 a frequency analysis circuit, 23 a linear prediction analysis circuit, 24 an autocorrelation analysis circuit, 25 a maximum value selection circuit, and 26 a speech/non-speech discrimination circuit, whose output turns the switches 27A and 27B (described later) on and off.

[0009]

28 is a noise spectrum characteristic calculation and storage circuit, in which the auditory weighting is performed. 29 is a subtraction means, and 30 is an inverse frequency analysis circuit, which performs the operations of the frequency analysis circuit 22 in reverse order. These elements make up the auditory weighting side 201.

[0010]

31 is an average noise level storage circuit, 32 a loss control coefficient calculation circuit, 33 an output signal calculation circuit, 34 a calculation means, and 35 the output signal terminal. These elements make up the loss control side 202.

[0011]

The operation is as follows. The input signal is taken in through the input signal terminal 21, and, as in the conventional method, the frequency analysis circuit 22 obtains its power spectrum S(f) and phase information P(f). At the same time, the linear prediction analysis circuit 23 extracts the linear prediction residual signal (hereinafter, the residual signal) from the input signal. The residual signal is passed to the autocorrelation analysis circuit 24, which computes its autocorrelation function Cor[i]. The maximum value selection circuit 25 then finds the peak (maximum) value of the autocorrelation coefficients, here called Rmax, and the speech/non-speech discrimination circuit 26 uses Rmax to classify the input signal: the signal is judged to be speech when Rmax exceeds a certain value (for example, Th) and noise otherwise. Rmax is widely used as a feature that expresses well how strongly periodic a signal waveform is. Most noise signals have random characteristics in the time or frequency domain, whereas speech consists largely of voiced sounds, which are periodic. Classifying signal segments without periodicity as noise is therefore effective. Speech does, of course, contain unvoiced consonants, so a periodicity feature alone cannot give accurate speech/non-speech discrimination. However, accurately detecting very low-level unvoiced consonants (for example p, t, k, s, h, f) amid various kinds of environmental noise is extremely difficult. The device of FIG. 5 therefore performs speech/non-speech discrimination based on the idea of identifying signal segments that are almost certainly not speech and obtaining their long-term average spectral characteristics.
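The discrimination chain described above (linear prediction residual, autocorrelation, peak selection, comparison with Th) can be sketched in plain Python. This is a minimal illustration rather than the circuit of FIG. 5: the prediction order, the lag search range, the normalisation by the zero-lag value, the threshold of 0.5, and the synthetic test frames are all assumptions chosen for the example.

```python
import random

def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def lpc(x, order):
    """Levinson-Durbin recursion; returns a[1..order] with x[n] ~ sum(a[k] * x[n-k])."""
    r = autocorr(x, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:          # silent or perfectly predictable frame
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]

def residual(x, coeffs):
    """Linear prediction residual e[n] = x[n] - sum(a[k] * x[n-k])."""
    p = len(coeffs)
    return [x[n] - sum(coeffs[k] * x[n - 1 - k] for k in range(p))
            for n in range(p, len(x))]

def rmax(x, order=10, min_lag=20, max_lag=160):
    """Peak of the normalised residual autocorrelation over the searched lags."""
    e = residual(x, lpc(x, order))
    r = autocorr(e, max_lag)
    if r[0] <= 0.0:
        return 0.0
    return max(r[k] / r[0] for k in range(min_lag, max_lag + 1))

def is_speech(x, th=0.5):
    """Speech if Rmax exceeds the (hypothetical) threshold Th, noise otherwise."""
    return rmax(x) > th

# Synthetic frames of 480 samples: a periodic "voiced" signal and white noise.
voiced, y = [], 0.0
for n in range(480):
    y = 0.9 * y + (1.0 if n % 80 == 0 else 0.0)  # pulse train through a one-pole filter
    voiced.append(y)

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(480)]

print(is_speech(voiced), is_speech(noise))
```

The periodic frame leaves a residual that is close to a pulse train, so its autocorrelation has a strong peak at the pulse period, while the residual of white noise has no comparable peak.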

[0012]

In other words, it suffices to obtain the average spectral characteristics of signals that are almost certainly noise, and a representative noise spectrum characteristic can be obtained by setting Rmax to a small value.

[0013]

Only when the signal has been classified as noise does switch 27A close, and the frequency-analyzed signal spectrum S(f) is then accumulated in the noise spectrum characteristic calculation and storage circuit 28 as the noise spectrum S_ns(f). When the input signal at time t is judged to be noise, the noise spectrum characteristic is updated according to Equation (1).

[0014]

[Equation 1]

Here, S_new(t, f) is the updated noise spectrum, S_old(f) is the noise spectrum before the update, and S_t(f) is the noise spectrum at the time the input signal was classified as noise. β is the weighting coefficient of the average.
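Equation (1) itself is not reproduced in this text. One common form that matches the symbols defined above is a recursive weighted average of the old and new spectra, and the sketch below assumes exactly that form. The function name and the convention that β weights the old spectrum are assumptions made for illustration, not the patent's stated formula.

```python
def update_noise_spectrum(s_old, s_t, beta):
    """Assumed recursive average: S_new(f) = beta * S_old(f) + (1 - beta) * S_t(f)."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    if len(s_old) != len(s_t):
        raise ValueError("spectra must have the same number of bins")
    return [beta * old + (1.0 - beta) * new for old, new in zip(s_old, s_t)]

s_ns = [4.0, 2.0, 1.0]
s_ns = update_noise_spectrum(s_ns, [2.0, 2.0, 3.0], beta=0.5)
print(s_ns)  # [3.0, 2.0, 2.0]
```

Under this assumed form, a β close to 1 makes the stored spectrum change slowly, while a β close to 0 makes it follow the most recent noise frame.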

[0015]

For the noise suppression processing, the weight W(f) given by Equation (2) is used.

[0016]

[Equation 2]

This W(f) serves to make the aforementioned residual noise as inaudible as possible, and its effect is enhanced by using the form of Equation (3). That is, replacing the frequency f in W(f) with the frequency point i, the weight is expressed by Equation (3).

[0017]

[Equation 3]

Here, fc is a value corresponding to the frequency band of the input signal, and B and K are weighting coefficients; the larger they are, the greater the suppression. The auditory weighting coefficient is not limited to Equation (3): besides the characteristic shown in Equation (3), one that approximates the average characteristic of the noise naturally has a similar effect. Furthermore, although the weighting coefficients B and K may be fixed at certain values in the device, noise suppression can be made more efficient by changing them adaptively and successively according to the type and level of the noise.

[0018]

Through the above processing, the average spectrum of the noise superimposed on the input signal is removed, and a new spectrum S'(f) is obtained from the subtraction means 29. The inverse frequency analysis circuit 30 processes this spectrum together with the previously analyzed phase information P(f) and returns it from the frequency domain to the time domain to obtain the signal waveform. Because the noise frequency components have been suppressed, only the speech signal remains in this waveform.

[0019]

[Problem to be solved by the invention] The conventional noise suppression method shown in FIG. 5 is extremely effective at suppressing background noise, but it has difficulty coping appropriately with changes in the loudness of the speaker's voice or in the distance to the microphone.

[0020]

An object of the present invention is to enable audio equipment such as loudspeaker communication devices and assistive listening devices used by hearing-impaired people to cope with changes in the loudness of the speaker's voice and in the distance to the microphone, to suppress stationary background noise, and to let the listener hear only the target speech at an appropriate volume and without distortion.

[0021]

[Means for solving the problem] To achieve the above object, the present invention comprises: noise suppression processing means that measures the level and the average power spectrum of the background noise in the input signal and suppresses the background noise based on its average power spectrum; threshold calculation means that updates the compression/expansion threshold based on the level of the background noise; effective value calculation means that calculates the effective value (RMS) of the input signal; effective value smoothing means that smooths the effective value of the input signal; compression ratio calculation means that calculates a compression ratio from the smoothed effective value; gain calculation means that calculates the gain required to obtain the target average effective value of the output; and multiplication means that multiplies the input signal by the compression ratio obtained by the compression ratio calculation means and the gain obtained by the gain calculation means and outputs the multiplied signal.

[0022]

In the present invention, the input sound signal is divided into frames, and each frame is judged to be either a speech signal or a non-speech signal, that is, noise. A frame here is a segment of the signal cut out at a fixed time interval; in short, it is the smallest unit on which frequency analysis and effective-value calculation are performed. First, each frame is transformed into the frequency domain by a frequency analysis circuit such as a fast Fourier transform (FFT); the power spectra of frames judged to be noise are stored for a suitable number of frames, and their mean, the average noise power spectrum, is calculated. The average noise power spectrum is suitably weighted and subtracted from the power spectrum of each frame, and the result is returned to a time-domain signal by an inverse frequency analysis circuit such as an inverse fast Fourier transform (IFFT). Next, the threshold that marks the boundary between compression and expansion is determined from the effective values of several seconds' worth of the noise-suppressed signal (residual noise) in frames judged to be noise. Then, to calculate the compression ratio for a given frame, the effective values of the frames over the past several seconds, including that frame, are used to smooth the effective value. The compression ratio is calculated from the smoothed effective value and a time constant, and each input frame is multiplied by it together with the required gain obtained from the threshold and the target average effective value; the frames are then resynthesized and output.
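As a small illustration of the frame as the unit of effective-value calculation, the following Python sketch splits a sample sequence into frames and computes the RMS of each. The text does not say whether frames overlap or how a trailing partial frame is handled; the non-overlapping frames and the dropped remainder here are assumptions, as are the function names.

```python
import math

def split_frames(samples, frame_len):
    """Cut the signal into consecutive, non-overlapping frames (assumption);
    a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def frame_rms(frame):
    """Effective value (root mean square) of one frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

signal = [3.0, -3.0, 3.0, -3.0, 0.5, -0.5, 0.5, -0.5, 1.0]
print([frame_rms(f) for f in split_frames(signal, 4)])  # [3.0, 0.5]
```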

[0023]

[Embodiment of the invention] FIG. 1 is a block diagram of one embodiment of the present invention. 1 is the input terminal; the signal input here is a frame-by-frame signal delimited at suitable time intervals. First, the non-speech discrimination circuit 2 judges whether the frame is noise. One way to do this is the method of the aforementioned Japanese Patent Application No. Hei 8-14874 shown in FIG. 5, which exploits the randomness of noise and uses the maximum value of the autocorrelation of the linear prediction residual signal. That is, the linear prediction analysis circuit 23, the autocorrelation analysis circuit 24, and the maximum value selection circuit 25 of FIG. 5 are used, and the non-speech discrimination circuit 2 judges from their result whether the frame is non-speech.

[0024]

3A and 3B are switches; 4 is a frequency analysis circuit such as an FFT; 5 is an average noise spectrum calculation circuit, which operates while the non-speech discrimination circuit 2 identifies the input as non-speech. 6 is a subtraction circuit, 7 an inverse frequency analysis circuit, 8 an effective value calculation circuit, 9 an effective value smoothing circuit, 10 a compression ratio calculation circuit, 11 a compression ratio smoothing circuit, 12 an average noise effective value calculation circuit that calculates the average noise effective value from the output of the effective value calculation circuit 8, 13 a threshold calculation circuit, 14 a gain calculation circuit, 15A and 15B multiplication circuits, and 16 the output terminal.

[0025]

Next, the operation of the embodiment of FIG. 1 is described.

[0026]

The input signal from the input terminal 1 is applied to the non-speech discrimination circuit 2 and, in parallel, sent to the frequency analysis circuit 4 and transformed into the frequency domain. The power spectrum S(f) of a frame judged to be noise by the non-speech discrimination circuit 2 is routed by switch 3A to the average noise spectrum calculation circuit 5 and stored there. The spectra of a suitable number of past frames are kept, and their mean is calculated and taken as the average noise power spectrum S_ns(f). This is multiplied by a suitable weighting function W(f), described below, and the subtraction circuit 6 subtracts the product from the power spectrum S(f). From the power spectrum S'(f) of the noise-subtracted signal and the phase information P(f) of the original signal, the inverse frequency analysis circuit 7 restores a time-domain signal. The processing up to this point is that of the noise suppression processing means 100.
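The subtraction step of the noise suppression processing means 100 can be sketched as follows. The code works on power spectra and phases that are already available, so the FFT and IFFT stages are left out. Clamping negative results to zero and taking the square root of the power to rebuild the amplitude are assumptions that the text does not spell out; the function names and the example values are illustrative.

```python
import cmath
import math

def average_noise_spectrum(noise_frames):
    """Mean power spectrum S_ns(f) over the stored noise frames."""
    count = len(noise_frames)
    return [sum(frame[i] for frame in noise_frames) / count
            for i in range(len(noise_frames[0]))]

def subtract_noise(power, phase, noise_avg, weight):
    """Build the complex spectrum from S'(f) = S(f) - W(f) * S_ns(f) and P(f)."""
    spectrum = []
    for s, p, n, w in zip(power, phase, noise_avg, weight):
        cleaned = max(s - w * n, 0.0)  # zero floor: an assumption, not from the text
        spectrum.append(cmath.rect(math.sqrt(cleaned), p))
    return spectrum

s_ns = average_noise_spectrum([[1.0, 2.0], [3.0, 4.0]])
out = subtract_noise([6.0, 1.0], [0.0, math.pi / 2], s_ns, [1.0, 1.0])
print(s_ns, [round(abs(c), 3) for c in out])  # [2.0, 3.0] [2.0, 0.0]
```

The returned complex spectrum is what an inverse transform would take as input to recover the time-domain frame.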

[0027]

FIG. 2 illustrates the weighting function W(f), which is given by Equation (4).

[0028]

[Equation 4]

As shown in FIG. 2, W(f) subtracts more where the noise power spectrum is larger. This reduces both the residual noise left in the low band, where the noise power is large, and the over-subtraction in the high band, where the power is small.

[0029]

Of the effective values rms of the input signal calculated by the effective value calculation circuit 8, those of the past L frames judged to be noise are stored in the average noise effective value calculation circuit 12. These can be regarded as the effective values of the residual noise left over after the noise suppression processing means 100. When an effective value Ns(N) newly judged to be (residual) noise is stored, the oldest value Ns(N-L) is discarded. Their mean <Ns(N)> is then calculated, and the threshold calculation circuit 13 multiplies <Ns(N)> by a constant of 1 or more to obtain the threshold thd. This is done, through the switching of switches 3A and 3B, only when the non-speech discrimination circuit 2 newly judges the input signal to be noise.
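The threshold update described above amounts to a fixed-length history of noise-frame effective values whose mean is scaled by a constant. A minimal Python sketch follows; the class name, the parameter names, and the example values are illustrative, and the value of the constant is not given in the text.

```python
from collections import deque

class NoiseThreshold:
    """Keeps the last L noise-frame RMS values and derives the threshold thd."""

    def __init__(self, history_len, factor):
        if factor < 1.0:
            raise ValueError("factor must be 1 or more")
        self._noise_rms = deque(maxlen=history_len)  # oldest value drops out automatically
        self._factor = factor
        self.thd = None  # undefined until the first noise frame arrives

    def update(self, rms, is_noise):
        """Update thd only when the frame was judged to be noise."""
        if is_noise:
            self._noise_rms.append(rms)
            self.thd = self._factor * sum(self._noise_rms) / len(self._noise_rms)
        return self.thd

tracker = NoiseThreshold(history_len=2, factor=2.0)
for rms, is_noise in [(1.0, True), (3.0, True), (100.0, False), (5.0, True)]:
    print(tracker.update(rms, is_noise))
```

Speech frames leave the threshold untouched, so a loud utterance does not raise thd.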

[0030]

Meanwhile, the effective values rms calculated by the effective value calculation circuit 8 are stored in the effective value smoothing circuit 9 for the past M consecutive frames. When a new effective value rms(N) is stored, the oldest value rms(N-M) is discarded. From the stored sequence (rms(N-M+1), ..., rms(N-1), rms(N)), the smoothed effective value <rms(N)> is obtained by the following rules.

[0031]

(1) During speech: the values at or above the threshold thd in the stored sequence (rms(N-M+1), ..., rms(N-1), rms(N)) are averaged.

[0032]

(2) Speech onset: when <rms(N)> is markedly larger than the smoothed effective value <rms(N-1)> of the previous frame, the frame is judged to be a speech onset or a sudden sound, and rms(N) is used as <rms(N)> as it is.

[0033]

(3) No speech: when inputs at or below thd continue for K (< M) consecutive frames, the state is judged to be silence, and the rms values of all K frames are averaged.
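The three rules above can be sketched as a single function. Several details are assumptions made only for this illustration: the order in which the rules are tested (silence first, then onset, then ordinary speech), the use of the raw rms(N) in the onset comparison, and the hypothetical parameter `onset_ratio` that stands in for "markedly larger".

```python
def smooth_rms(history, thd, prev_smoothed, k, onset_ratio=4.0):
    """history holds the last M frame RMS values, oldest first; history[-1] is rms(N)."""
    current = history[-1]
    recent = history[-k:]
    # Rule (3): K consecutive frames at or below thd are treated as silence.
    if len(recent) == k and all(v <= thd for v in recent):
        return sum(recent) / k
    # Rule (2): a marked jump over the previous smoothed value is treated as an onset.
    if prev_smoothed is not None and current > onset_ratio * prev_smoothed:
        return current
    # Rule (1): during speech, average only the values at or above thd.
    loud = [v for v in history if v >= thd]
    return sum(loud) / len(loud) if loud else current

print(smooth_rms([0.1, 0.1, 0.1], thd=0.2, prev_smoothed=0.1, k=3))  # silence
print(smooth_rms([0.1, 0.1, 2.0], thd=0.2, prev_smoothed=0.1, k=3))  # onset
print(smooth_rms([1.0, 0.1, 3.0], thd=0.2, prev_smoothed=1.0, k=3))  # speech
```

In the third call the short pause (0.1) is excluded from the average, so the smoothed value does not sag during brief gaps within an utterance.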

[0034]

Using the smoothed effective value <rms(N)>, the compression ratio calculation circuit 10 calculates the compression ratio p(N) according to Equation (5).

[0035]

[Equation 5]

Furthermore, to suppress abrupt changes in the compression ratio p, particularly at speech onsets and offsets, the compression ratio smoothing circuit 11 smooths it according to Equation (6).

[0036]

[Equation 6]

Here, <p(N-1)> is the smoothed compression ratio of the previous frame and C0 + C1 = 1.0; the larger C1 is, the more smoothly the ratio changes. In the gain calculation circuit 14, with d1 denoting the target average effective value of the output speech, the gain G is calculated by Equation (7).
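Equation (6) is not reproduced in this text. The description of C0, C1, and <p(N-1)> is consistent with a first-order recursive smoother, and the sketch below assumes that form; it is an illustration under that assumption, and the function name and example values are hypothetical.

```python
def smooth_ratio(p_now, p_prev_smoothed, c1):
    """Assumed form: <p(N)> = C0 * p(N) + C1 * <p(N-1)>, with C0 + C1 = 1."""
    if not 0.0 <= c1 < 1.0:
        raise ValueError("c1 must lie in [0, 1)")
    c0 = 1.0 - c1
    return c0 * p_now + c1 * p_prev_smoothed

smoothed = 1.0
for p in [1.0, 0.2, 0.2, 0.2]:  # the raw ratio drops abruptly at a speech onset
    smoothed = smooth_ratio(p, smoothed, c1=0.75)
    print(round(smoothed, 4))
```

With a larger C1 the smoothed ratio approaches the new raw value more gradually, which matches the statement that a larger C1 gives a smoother change.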

[0037]

[Equation 7]

The multiplication circuits 15A and 15B multiply the input signal by <p> and G, and the result is output from the output terminal 16.

[0038]

FIG. 3 shows the input/output relationship on log-log axes. All inputs at or above the threshold thd are compressed and amplified to the target value d1, while inputs below thd are expanded and attenuated.
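Equations (5) and (7) and FIG. 3 are not reproduced in this text, so the following Python sketch of the static input/output curve rests on assumed forms. It takes p = thd / <rms> at or above the threshold and G = d1 / thd, which together map every input at or above thd to d1 as the paragraph states. Below the threshold it uses a hypothetical power-law expander whose exponent is chosen only for illustration.

```python
def static_output_level(rms_in, thd, d1, expansion_exponent=2.0):
    """Steady-state output RMS for a constant input RMS (assumed curve shape)."""
    gain_g = d1 / thd                      # assumed form of the required gain G
    if rms_in >= thd:
        p = thd / rms_in                   # assumed compression branch
    else:
        # Hypothetical expansion branch: output falls faster than the input.
        p = (rms_in / thd) ** (expansion_exponent - 1.0)
    return rms_in * p * gain_g

thd, d1 = 0.1, 0.3
for level in [0.01, 0.05, 0.1, 0.5, 2.0]:
    print(level, round(static_output_level(level, thd, d1), 4))
```

Under these assumptions the curve is continuous at thd, flat at d1 above it, and falls with a slope steeper than unity on log-log axes below it.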

[0039]

FIG. 4 shows an example of a processing result obtained with the present invention. FIG. 4(a) is the original waveform: the latter speech signal is attenuated by 15 dB relative to the former, and stationary air-conditioning noise, attenuated by 10 dB relative to the latter signal, has been added. FIG. 4(b) is the processed waveform: the latter speech is amplified to nearly the same level as the former, while the background noise is, conversely, suppressed. FIG. 4(c) shows the time variation of the total gain (<p> × G) as a level. The figure shows that there are no fine fluctuations and that no large distortion is introduced into the speech.

[0040]

In the embodiment described above, each part of the block diagram is shown as a "circuit"; since these parts may also be implemented as a program, they are more generally expressed as "means".

[0041]

[Effect of the invention] As described above, according to the present invention, gain control is performed after noise suppression, so only the target sound can be brought to a constant sound pressure level. In addition, because the threshold is set automatically and the effective value used to calculate the compression ratio is smoothed, an appropriate gain is obtained quickly and smoothly in response to the loudness of the speaker's voice and the distance to the microphone. As a result, the speech distortion that arises when the dynamic range is compressed with a conventional compressor does not occur, and clear, undistorted speech can always be heard at a constant average sound pressure level.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention.

FIG. 2 is an explanatory diagram of the weighting function used in the present invention.

FIG. 3 is a diagram showing the input/output relationship of the present invention.

FIG. 4 is a diagram showing a processing example of the present invention.

FIG. 5 is a block diagram showing the previously proposed noise suppression device.

[Explanation of symbols]

1 Input terminal
2 Non-speech discrimination circuit
3A Switch
3B Switch
4 Frequency analysis circuit
5 Average noise spectrum calculation circuit
6 Subtraction circuit
7 Inverse frequency analysis circuit
8 Effective value calculation circuit
9 Effective value smoothing circuit
10 Compression ratio calculation circuit
11 Compression ratio smoothing circuit
12 Average noise effective value calculation circuit
13 Threshold calculation circuit
14 Gain calculation circuit
15A Multiplication circuit
15B Multiplication circuit
16 Output terminal
100 Noise suppression processing means

Claims

[Claims]

1. An automatic gain control device for audio equipment such as a loudspeaker communication device or an assistive listening device used by a hearing-impaired person, characterized by comprising: noise suppression processing means for measuring the level and the average power spectrum of background noise in an input signal and suppressing the background noise based on the average power spectrum of the background noise; threshold calculation means for updating a compression/expansion threshold based on the level of the background noise; effective value calculation means for calculating the effective value of the input signal; effective value smoothing means for smoothing the effective value of the input signal; compression ratio calculation means for calculating a compression ratio from the smoothed effective value; gain calculation means for calculating the gain required to obtain a target average effective value of the output; and multiplication means for multiplying the input signal by the compression ratio obtained by the compression ratio calculation means and the gain obtained by the gain calculation means, and outputting the multiplied signal.