JP2003309493A - Method, device and program for reducing echo

Info

Publication number: JP2003309493A
Application number: JP2002115157A
Authority: JP
Inventors: Akira Emura; 暁江村; Yoichi Haneda; 陽一羽田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2002-04-17
Filing date: 2002-04-17
Publication date: 2003-10-31
Anticipated expiration: 2022-04-17
Also published as: JP3756839B2

Abstract

PROBLEM TO BE SOLVED: To provide an echo reduction method that suppresses only the echo components in a picked-up signal on which the voice of a nearby speaker is superimposed. SOLUTION: In a loudspeaker communication system, the power spectra and cross spectra of the M-channel reproduced signals and the picked-up signal are calculated, the coherence between the M-channel reproduced signals and at least one channel of picked-up signal is calculated, the proportion of the picked-up signal occupied by echo components is estimated from these for each frequency band, an echo suppression gain is calculated from that proportion, and the short-time spectrum of the picked-up signal is multiplied by the echo suppression gain to suppress the echo. COPYRIGHT: (C)2004,JPO

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to an echo reduction method, an echo reduction device, and an echo reduction program for suppressing acoustic echo, which causes howling in a loudspeaker communication system.

[0002]

[Prior Art] In a loudspeaker communication system, the received speech is played through a loudspeaker and picked up by the microphone, producing an acoustic echo. If this echo is transmitted as is, it disturbs the conversation and causes discomfort. Moreover, when the loop gain of the closed loop formed together with the far-end loudspeaker communication system exceeds 1, the acoustic echo causes howling and makes conversation impossible. To overcome these problems and realize a natural conversation environment, an echo canceller (echo cancelling device) is used to cancel the echo caused by acoustic coupling from the loudspeaker to the microphone. The echo canceller estimates the echo path from the loudspeaker reproduced signal and the picked-up signal, predicts the acoustic echo signal, and cancels the echo by subtracting the predicted echo signal from the picked-up signal.

[0003] An acoustic echo canceller composed of an M-channel reproduction system and a one-channel sound pickup system cancels acoustic echo with the configuration shown in FIG. 7. The received signal from receiving terminal 1_m (m = 1, ..., M) is reproduced as an acoustic signal by loudspeaker 2_m (m = 1, ..., M) and reaches microphone 3 via the acoustic echo path. At the same time, it is input to the predicted echo generation unit 41, which generates a predicted echo signal. The subtractor 42 takes the difference between the picked-up signal y(k) from microphone 3 and the predicted echo signal, and this residual signal is fed back to the echo path estimation unit 43. When there are N microphones, N of the M-input, one-output echo cancellation units 4 shown in FIG. 7 are arranged in parallel. Let h_m(k) be the impulse response of the acoustic echo path from loudspeaker 2_m to the microphone, and L its length. When the number of receiving channels is M = 1, the input signal and the picked-up signal are related by

[Equation 3]

and when the impulse response and the input signal are vectorized as

[Equation 4]

the relationship between the input signal and the picked-up signal is written compactly as follows.

[0004]

[Equation 5]

When the number of receiving channels is M ≥ 2, vectorizing the impulse responses and the input signals as

[Equation 6]

allows the relationship between the input signals and the picked-up signal to be written in the same way as for M = 1. Inside the echo cancellation unit 4, the predicted echo generation unit 41 generates the predicted echo signal. Based on the difference e(k) from the actual picked-up signal and on the past received signals, the adaptive filter coefficients for predicted echo generation are updated successively so that the difference between the picked-up signal and the predicted echo signal becomes small. When the NLMS method is used to update the adaptive filter coefficients, the adaptive filter is updated by

[Equation 7]

where μ is the step size, set to a fixed value between 0 and 1 to stabilize the estimation, and e(k) is the residual signal obtained by subtracting the echo signal predicted by the adaptive filter from the picked-up signal y(k). Once the adaptive filter has converged, it predicts an echo signal whose amplitude and phase match in every frequency band, and the echo can be cancelled sufficiently.
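The echo model and NLMS update described above can be sketched as follows. Equations 3 to 7 are not reproduced in this text, so the sketch assumes the textbook forms: the echo is the convolution of the reproduced signal with the echo-path impulse response, and the filter is updated by adding μ e(k) x(k) / (x(k)ᵀx(k) + δ). The regularization constant `delta`, the function names, and the white-noise test signal are illustrative assumptions, not taken from the patent.

```python
import random

def simulate_echo(x, h):
    """Echo as the convolution of the reproduced signal x with impulse response h."""
    return [sum(h[l] * x[k - l] for l in range(len(h)) if k - l >= 0)
            for k in range(len(x))]

def nlms_echo_canceller(x, y, taps, mu=0.5, delta=1e-8):
    """Single-channel NLMS sketch (textbook form, assumed here).

    x: reproduced (loudspeaker) samples, y: picked-up samples.
    Returns the residual signal e(k) and the final filter estimate.
    """
    h_hat = [0.0] * taps          # adaptive filter coefficients
    x_vec = [0.0] * taps          # [x(k), x(k-1), ..., x(k-taps+1)]
    residual = []
    for xk, yk in zip(x, y):
        x_vec = [xk] + x_vec[:-1]
        y_hat = sum(h * v for h, v in zip(h_hat, x_vec))    # predicted echo
        e = yk - y_hat                                       # residual signal
        norm = sum(v * v for v in x_vec) + delta             # regularized input power
        h_hat = [h + mu * e * v / norm for h, v in zip(h_hat, x_vec)]
        residual.append(e)
    return residual, h_hat

# Noise-free example: no near-end speech, fixed echo path.
random.seed(0)
h_true = [0.5, -0.3, 0.2, 0.1]
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
y = simulate_echo(x, h_true)
e, h_est = nlms_echo_canceller(x, y, taps=4)
```

In this idealized setting the estimate converges to `h_true`. Adding near-end speech to `y` while the update keeps running degrades the estimate, which is the double-talk problem the following paragraphs describe.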

[0005]

[Problems to Be Solved by the Invention] In actual use, however, an echo canceller cannot always cancel the echo sufficiently. A large amount of information is needed before a predicted echo signal whose amplitude and phase match in every frequency band can be generated. With speech input, convergence takes several seconds even with a fast adaptive algorithm such as the projection algorithm. The echo path changes easily, for example when a person moves, and the residual echo increases for the several seconds immediately after a change.

[0006] In addition, the adaptive filter estimates the echo path on the assumption that only the acoustic echo signal, produced when the far-end talker's speech is reproduced by the loudspeaker, is picked up and that no near-end talker speech is present. Echo path estimation therefore becomes unstable in the double-talk state, in which the far-end and near-end talkers speak at the same time. To avoid this, the double-talk state is normally detected and the step size μ is set to 0, which halts echo path estimation. However, while the echo path estimate is still inadequate, the residual signal contains both the residual echo signal and the transmitted speech, so the double-talk detection rate drops. If double talk is misjudged and the adaptive filter is updated while the picked-up signal contains signals other than echo, the echo path estimation error grows and the residual echo signal increases.

[0007] Furthermore, in a multichannel echo canceller, because the inter-channel correlation of the received signals is high, the estimated echo path does not necessarily coincide with the true echo path even when the echo is being cancelled. This is analyzed in detail in M. M. Sondhi, D. R. Morgan, and J. L. Hall, "Stereophonic Acoustic Echo Cancellation - An Overview of the Fundamental Problem," IEEE Signal Processing Letters, vol. 2, no. 8, pp. 148-151 (1995). When the estimated and true echo paths do not coincide, the acoustic echo suddenly stops being cancelled at the moment the talker changes and the inter-channel cross-correlation of the received signals changes.

[0008] When the estimated echo path deviates from the true echo path in this way, residual echo arises and speech quality degrades. Suppose, then, that there were a method for estimating at any time the proportion of the picked-up signal occupied by the echo component in each frequency band. It would then be possible to attenuate the amplitude of the picked-up signal in each frequency band by the amount corresponding to the acoustic echo y_E(k). Because the speech spectrum of the near-end talker and the spectrum of the echo often overlap little, the spectrum of the acoustic echo can be suppressed while the near-end speech spectrum is preserved as far as possible, and an improvement in speech quality can be expected.

[0009] This processing does not estimate the phase of the echo in each frequency band. Compared with an adaptive filter, which tries to match the amplitude and phase of the predicted echo to those of the picked-up echo, less information is needed for the estimation, so responsiveness to echo path changes and the like may improve. Moreover, if the proportion of the echo component in the picked-up signal can be obtained regardless of whether near-end speech is present, the echo can be suppressed without being affected by the result of double-talk detection. Until now, however, estimating the echo component ratio in the picked-up signal has been considered difficult, because the picked-up signal is the echo signal with near-end speech superimposed on it, and the echo signal alone cannot be separated and extracted from the picked-up signal. If the power of the near-end talker's speech signal were constant and its level known in advance, the power ratio of the echo component in the picked-up signal could be calculated. In ordinary cases, however, the speech level varies from moment to moment and cannot be regarded as constant.

[0010]

[Means for Solving the Problems] The present invention proposes an echo reduction method for a loudspeaker communication system in which M loudspeakers (M is an integer of 2 or more) and N microphones (N is an integer of 1 or more) are placed in a common sound field, M-channel received signals are reproduced from the loudspeakers, and the picked-up signals from the microphones are processed into transmission signals. In this method, the power spectra and cross spectra of the M-channel reproduced signals and the picked-up signal are obtained; the coherence between the M-channel reproduced signals and at least one channel of picked-up signal is obtained; from these, the proportion of the picked-up signal occupied by the echo component is estimated for each frequency band; an echo suppression gain is calculated from this proportion; and the short-time spectrum of the picked-up signal is multiplied by the echo suppression gain, thereby suppressing the echo.

[0011] The present invention further proposes an echo reduction method in which, in the above echo reduction method, a first coherence γ²_{1y}(f) is obtained from the short-time spectrum of the first-channel reproduced signal and the short-time spectrum of the picked-up signal y(k); an m-th coherence γ²_{my(m-1)}(f) is obtained from the short-time spectrum of the signal obtained by removing from the m-th channel reproduced signal (m ≥ 2) its components correlated with the first to (m-1)-th channel reproduced signals, and the short-time spectrum of the signal obtained by removing from the picked-up signal y(k) its components correlated with the first to (m-1)-th channel reproduced signals; and the ratio of the echo component in the picked-up signal is obtained from the first to M-th coherences by

[Equation 8]

[0012] The present invention further proposes an echo reduction method in which, in any of the above echo reduction methods, with γ²(f) denoting the power ratio of the received echo component in the picked-up signal specified for each frequency band, the setting

[Equation 9]

is used. The present invention further proposes an echo reduction method in which, in any of the above echo reduction methods, the signal output to the loudspeaker is input to a pseudo echo path to generate a pseudo echo signal, the pseudo echo signal is subtracted from the picked-up signal from the microphone, the pseudo echo path is updated on the basis of the residual signal, and the echo is suppressed in the picked-up signal that has undergone this echo cancellation processing. The present invention further proposes an echo reduction device that is connected to M loudspeakers (M is an integer of 2 or more) and N microphones (N is an integer of 1 or more) placed in a common sound field and that comprises: means for obtaining the power spectra of the M-channel reproduced signals and the picked-up signal, and means for obtaining their cross spectra; means for obtaining the coherence between the M-channel reproduced signals and the picked-up signal from the power spectrum and cross spectrum information and estimating, for each frequency band, the proportion of the picked-up signal occupied by the echo component; means for calculating an echo suppression gain from this proportion; and means for multiplying the short-time spectrum of the picked-up signal by the echo suppression gain.

[0013] The present invention further proposes an echo reduction device that, in the above echo reduction device, comprises: means for obtaining a first coherence from the short-time spectrum of the first-channel reproduced signal and the short-time spectrum of the picked-up signal; and means for obtaining an m-th coherence from the short-time spectrum of the signal obtained by removing from the m-th channel reproduced signal (m ≥ 2) its components correlated with the first to (m-1)-th channel reproduced signals and the short-time spectrum of the signal obtained by removing from the picked-up signal its components correlated with the first to (m-1)-th channel reproduced signals. The present invention further proposes an echo reduction device that, in any of the above echo reduction devices, operates on a picked-up signal that has passed through echo cancellation processing means comprising: means for inputting the reproduced signal output to the loudspeaker into a pseudo echo path to generate a pseudo echo signal; means for subtracting the pseudo echo signal from the picked-up signal from the microphone; and means for updating the pseudo echo path on the basis of the residual signal. The present invention further proposes an echo reduction program that causes a computer to execute any of the above echo reduction methods.

[0014] Operation: To estimate the proportion of the picked-up signal occupied by the echo component, the coherence can be used, that is, the squared magnitude of the complex function obtained by normalizing the cross spectrum by the power spectra. Let y_E(k) be the acoustic echo signal and y_I(k) the signals other than echo, such as the near-end talker's speech. The picked-up signal is then

[Equation 10]

For the case of M = 1 reproduction channel, the coherence between the reproduced signal and the picked-up signal is defined by

[Equation 11]

where
S_xx(f): power spectrum of the reproduced signal
S_yy(f): power spectrum of the picked-up signal
S_xy(f): cross spectrum of the reproduced signal and the picked-up signal.
Normally, the reproduced signal x(k) and the non-echo signal y_I(k), and likewise the echo signal y_E(k) and the non-echo signal y_I(k), can be regarded as uncorrelated, so

[Equation 12]

holds. Furthermore, when the transfer characteristic from the reproduced signal x(k) to the echo signal y_E(k) can be regarded as nearly constant, the cross spectrum of the two signals satisfies

[Equation 13]

Hence the coherence between the reproduced signal x(k) and the picked-up signal y(k) satisfies

[Equation 14]
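The claim that single-channel coherence equals the echo power ratio can be checked numerically at one frequency bin. Equations 10 to 14 are not reproduced in this text, so the sketch below assumes the standard definition γ²(f) = |S_xy(f)|² / (S_xx(f) S_yy(f)), a fixed echo-path response `H`, and a plain mean over frames as the time average. The signal model and all names are illustrative assumptions.

```python
import random

def coherence(X, Y):
    """Magnitude-squared coherence at one frequency bin.

    X, Y: lists of complex short-time spectrum values, one per frame.
    Time averaging is a plain mean over frames (an assumption).
    """
    n = len(X)
    s_xx = sum(abs(a) ** 2 for a in X) / n
    s_yy = sum(abs(b) ** 2 for b in Y) / n
    s_xy = sum(a.conjugate() * b for a, b in zip(X, Y)) / n
    return abs(s_xy) ** 2 / (s_xx * s_yy)

def complex_noise(rng, scale):
    """Zero-mean complex Gaussian sample."""
    return complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale))

rng = random.Random(1)
H = 0.8 + 0.6j                       # fixed echo-path response, |H|^2 = 1
frames = 20000
X = [complex_noise(rng, 1.0) for _ in range(frames)]       # reproduced signal
Y_E = [H * a for a in X]                                    # echo component
Y_I = [complex_noise(rng, 1.0) for _ in range(frames)]      # near-end signal
Y = [e + i for e, i in zip(Y_E, Y_I)]                       # picked-up signal

gamma2 = coherence(X, Y)
echo_ratio = sum(abs(e) ** 2 for e in Y_E) / sum(abs(b) ** 2 for b in Y)
```

With equal echo and near-end power, both `gamma2` and `echo_ratio` come out near 0.5, even though the coherence is computed without access to `Y_E`.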

[0015] According to Equation 14, this coherence is the fraction of the power spectrum of the picked-up signal occupied by the component correlated with the reproduced signal. In other words, the coherence between the reproduced signal and the picked-up signal represents the power ratio of the echo component in the picked-up signal. Coherence is described in detail in, for example, Hino, "Spectral Analysis" (Asakura Shoten), and analysis using coherence in, for example, Morishita and Obata, "Signal Processing" (Society of Instrument and Control Engineers). In terms of signal power, therefore, the proportion of the picked-up signal occupied by signals other than echo is given by

[Equation 15]

For Y(f), obtained by dividing the picked-up signal y(k) into frames and transforming it to the frequency domain by the Fourier transform, the echo component can be suppressed by processing each frequency band as

[Equation 16]

Transforming the result Z(f) to the time domain yields the signal z(k) in which the echo component is suppressed. The proportion of the picked-up signal occupied by non-echo signals and the echo suppression gain can be obtained from γ²(f) in several ways, using soft-decision estimation, minimum mean square error (MMSE) estimation, or maximum likelihood estimation. Each estimation method is described in detail in P. Scalart and J. V. Filho, "Speech Enhancement Based on A Priori Signal to Noise Estimation," Proc. ICASSP96, pp. 629-632 (1996).
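A minimal sketch of turning the echo power ratio into a per-band gain and applying it to a spectrum follows. Equations 15 and 16 are not reproduced in this text. The sketch assumes the non-echo power fraction is 1 - γ²(f) and shows two common gain rules: a power-domain (Wiener-like) gain 1 - γ²(f), and an amplitude gain sqrt(1 - γ²(f)). Which rule the patent uses is not confirmed here; the function names and the optional `floor` parameter are illustrative assumptions.

```python
import math

def suppression_gain(gamma2, rule="wiener", floor=0.0):
    """Map the echo power ratio gamma2 (0..1) to a real suppression gain."""
    non_echo = min(max(1.0 - gamma2, 0.0), 1.0)   # clamp estimation noise to [0, 1]
    if rule == "wiener":
        gain = non_echo                # power-domain fraction used directly
    elif rule == "amplitude":
        gain = math.sqrt(non_echo)     # amplitude corresponding to that power fraction
    else:
        raise ValueError("unknown rule: " + rule)
    return max(gain, floor)            # optional lower bound on the gain

def suppress(Y, gamma2_per_bin, rule="wiener"):
    """Multiply each bin of the short-time spectrum Y by its gain."""
    return [suppression_gain(g, rule) * y for y, g in zip(Y, gamma2_per_bin)]

Y = [1 + 1j, 2 + 0j, -3j]              # spectrum of one frame (3 bins)
gamma2 = [0.0, 0.75, 1.0]              # no echo, mostly echo, echo only
Z_power = suppress(Y, gamma2, rule="wiener")
Z_amp = suppress(Y, gamma2, rule="amplitude")
```

Bins with no echo pass unchanged and bins that are pure echo are zeroed. The two rules differ only in how strongly they attenuate the bins in between.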

[0016] Similarly, for any number of reproduction channels, coherence can be used as the proportion of the picked-up signal occupied by the echo component. For example, when the number of reproduction channels is M = 2, the proportion of the echo component in the picked-up signal is

[Equation 17]

where
γ²_{1y}(f): coherence between x_1(k) and y(k)
γ²_{2y(1)}(f): coherence obtained from x_2(k) and y(k) after the components correlated with x_1(k) have been removed from each.
The coherence of such multiple-input, single-output systems is described in detail in J. S. Bendat and A. G. Piersol, "Engineering Applications of Correlation and Spectral Analysis," John Wiley & Sons (1980).
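For two reproduction channels, the ordinary and partial coherences can be combined as in the sketch below. Equation 17 is not reproduced in this text. The sketch assumes the standard multiple-coherence identity for multiple-input, single-output systems, echo ratio = 1 - (1 - γ²_{1y})(1 - γ²_{2y(1)}), and removes correlated components by least-squares projection at a single frequency bin. The signal model and all names are illustrative assumptions.

```python
import random

def mean_cross(A, B):
    """Time-averaged cross spectrum E[conj(A) * B] at one frequency bin."""
    return sum(a.conjugate() * b for a, b in zip(A, B)) / len(A)

def coherence(A, B):
    """Magnitude-squared coherence between A and B."""
    return abs(mean_cross(A, B)) ** 2 / (mean_cross(A, A).real * mean_cross(B, B).real)

def remove_correlated(target, ref):
    """Subtract from target the part that is linearly predictable from ref."""
    w = mean_cross(ref, target) / mean_cross(ref, ref).real
    return [t - w * r for t, r in zip(target, ref)]

def echo_ratio_two_channel(X1, X2, Y):
    """Echo power ratio from the first coherence and the partial coherence."""
    g1 = coherence(X1, Y)
    g2 = coherence(remove_correlated(X2, X1), remove_correlated(Y, X1))
    return 1.0 - (1.0 - g1) * (1.0 - g2)

def cn(rng, scale):
    """Zero-mean complex Gaussian sample."""
    return complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale))

rng = random.Random(2)
frames = 20000
S = [cn(rng, 1.0) for _ in range(frames)]                 # common far-end source
X1 = [s + cn(rng, 0.5) for s in S]                        # channel 1, correlated with channel 2
X2 = [0.7 * s + cn(rng, 0.5) for s in S]                  # channel 2
H1, H2 = 0.6 - 0.2j, -0.3 + 0.5j                          # echo-path responses
E = [H1 * a + H2 * b for a, b in zip(X1, X2)]             # echo component
Y = [e + cn(rng, 1.0) for e in E]                         # echo plus near-end signal

ratio = echo_ratio_two_channel(X1, X2, Y)
true_ratio = sum(abs(e) ** 2 for e in E) / sum(abs(y) ** 2 for y in Y)
```

The estimate tracks `true_ratio` even though the two channels are strongly correlated. If `X2` were an exact multiple of `X1`, its residual would be zero and the partial coherence undefined, so a practical version would need a guard for that case.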

[0017]

[Embodiments of the Invention] The echo reduction method according to the present invention is realized by the signal flow of the acoustic echo suppression unit 6 in FIG. 1. The acoustic echo suppression unit 6 shown in FIG. 1 can be used in place of the echo cancellation unit 4 in the signal flow diagram of the loudspeaker communication system (FIG. 7). The following describes the case in which each signal is blocked every L/D samples with one frame = 2L samples.

Step 1: Time domain to frequency domain transform units (hereinafter, TF transform units). TF transform units 61_1 to 61_M transform the reproduced signals x_1(k) to x_M(k), and TF transform unit 62 transforms the picked-up signal y(k), into frequency-domain coefficients as

[Equation 18]

[0018] Step 2: The echo component ratio estimation unit 63 takes as input the short-time spectra of the M-channel reproduced signals and the picked-up signal at time k = jL/D, and estimates the proportion of the picked-up signal occupied by the echo component for each frequency band according to the signal flow of FIG. 2. The correlation removal unit 631_m (m = 2, ..., M) obtains the short-time spectrum X_{m(m-1)}(f, j) of the signal obtained by removing from the m-th channel reproduced signal x_m(k) its components correlated with the first to (m-1)-th channel reproduced signals. Its inputs, however, are the first-channel reproduced signal and the second to (m-1)-th channel reproduced signals that have already undergone correlation removal, and the actual processing is described by the following equation.

[Equation 19]

Here ε[ ] denotes time averaging. One way to perform the time averaging is, for example,

[Equation 20]

which uses the result from the previous frame and a smoothing constant β taking a value between 0 and 1. The correlation removal unit 632_m (m = 2, ..., M) obtains the short-time spectrum of the signal obtained by removing from the picked-up signal y(k) its components correlated with the first to (m-1)-th channel reproduced signals. This processing is

[Equation 21]

and is almost the same as that of the correlation removal unit 631_m described above. The coherence calculation unit 633_1 then obtains the coherence between the first-channel reproduced signal and the picked-up signal, and the coherence calculation units 633_m (m = 2, ..., M) obtain the coherence from the short-time spectra of the reproduced signal and the picked-up signal that have undergone correlation removal.

[Equation 22]

Using these coherences, the echo component ratio calculation unit 634 obtains the proportion of the picked-up signal occupied by the echo component.

[Equation 23]
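The frame-by-frame time averaging in Step 2 can be sketched as a first-order recursive smoother. Equation 20 is not reproduced in this text, so the sketch assumes the common form ε[A(f, j)] = β ε[A(f, j-1)] + (1 - β) A(f, j), applied to the power and cross spectra of one reproduced channel and the picked-up signal. The class name, the `eps` guard, and the two-bin test signal are illustrative assumptions.

```python
import random

class SmoothedCoherence:
    """Per-bin coherence from recursively averaged spectra (assumed smoother form)."""

    def __init__(self, num_bins, beta=0.9, eps=1e-12):
        self.beta = beta                  # smoothing constant, between 0 and 1
        self.eps = eps                    # guards against division by zero
        self.s_xx = [0.0] * num_bins      # smoothed power spectrum of X
        self.s_yy = [0.0] * num_bins      # smoothed power spectrum of Y
        self.s_xy = [0j] * num_bins       # smoothed cross spectrum of X and Y

    def update(self, X, Y):
        """Feed one frame of spectra and return the coherence per bin."""
        b = self.beta
        out = []
        for f, (x, y) in enumerate(zip(X, Y)):
            self.s_xx[f] = b * self.s_xx[f] + (1 - b) * abs(x) ** 2
            self.s_yy[f] = b * self.s_yy[f] + (1 - b) * abs(y) ** 2
            self.s_xy[f] = b * self.s_xy[f] + (1 - b) * x.conjugate() * y
            out.append(abs(self.s_xy[f]) ** 2 / (self.s_xx[f] * self.s_yy[f] + self.eps))
        return out

def noise(rng):
    """Zero-mean complex Gaussian sample."""
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))

rng = random.Random(3)
est = SmoothedCoherence(num_bins=2, beta=0.98)
for _ in range(500):
    x0, x1 = noise(rng), noise(rng)
    X = [x0, x1]
    Y = [0.5 * x0, noise(rng)]      # bin 0: pure echo, bin 1: unrelated near-end signal
    gamma2 = est.update(X, Y)
```

After enough frames, `gamma2[0]` is close to 1 and `gamma2[1]` is close to 0. A larger β gives a steadier estimate but slower tracking, and in the first few frames the estimate is biased toward 1 because a single frame is always perfectly coherent with itself.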

[0019] Step 3: For each frequency band, the attenuation ratio calculation unit 64 obtains an amplitude attenuation factor from the echo component ratio in the picked-up signal, and the multiplier 65 attenuates the amplitude of the picked-up signal. As one example of an amplitude attenuation factor derived from the echo component ratio, the following can be considered.

[Equation 24]

This suppresses the echo.

[Equation 25]

[0020] Step 4: The result of the frequency-domain processing is transformed into a time-domain block signal by the FT transform unit 66 using the inverse FFT, as

[Equation 26]

From this block signal, the echo-suppressed signal may be obtained by cutting out part of the frame, for example as

[Equation 27]

or by windowing multiple frames and combining the overlapping sections.
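The second option in Step 4, windowing frames and combining the overlapping sections, can be sketched as a weighted overlap-add loop. Equations 26 and 27 are not reproduced in this text. The sketch assumes a periodic Hann analysis window with 50% overlap (the patent's own frame shift is L/D) and uses a naive DFT from the standard library in place of an FFT. All names and parameters are illustrative assumptions.

```python
import cmath
import math

def dft(block):
    """Naive discrete Fourier transform (stands in for an FFT)."""
    n = len(block)
    return [sum(block[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]

def idft(spec):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[f] * cmath.exp(2j * math.pi * f * t / n) for f in range(n)).real / n
            for t in range(n)]

def process_overlap_add(signal, frame_len, gain_for_frame):
    """Window, transform, apply per-bin gains, inverse transform, overlap-add."""
    hop = frame_len // 2
    # Periodic Hann window: copies shifted by frame_len / 2 sum to exactly 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len) for t in range(frame_len)]
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_len + 1, hop):
        block = [w * s for w, s in zip(window, signal[start:start + frame_len])]
        spec = dft(block)
        gains = gain_for_frame(spec)                # one real gain per frequency bin
        block_out = idft([g * v for g, v in zip(gains, spec)])
        for t, v in enumerate(block_out):
            out[start + t] += v                     # combine overlapping sections
    return out

signal = [math.sin(0.3 * n) for n in range(128)]
unity = process_overlap_add(signal, 16, lambda spec: [1.0] * len(spec))
muted = process_overlap_add(signal, 16, lambda spec: [0.0] * len(spec))
```

With unity gains the interior of the signal is reconstructed exactly, which confirms that the framing adds no distortion of its own. Only the first and last half-frames, which are covered by a single window, are attenuated.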

[0021] Embodiment 2: FIG. 3 shows another acoustic echo suppression method of the present invention. In the configuration shown in FIG. 3, the acoustic echo suppression method of the present invention is combined with an acoustic echo cancellation method based on an adaptive filter. Here, the acoustic echo suppression processing performed by the acoustic echo suppression unit 6 is applied not to the picked-up signal but to the signal that has undergone echo cancellation in the echo cancellation unit 4. It is known that when noise is picked up by the microphone together with the acoustic echo, the noise causes the echo path estimation accuracy to level off, and an audible acoustic echo persists. Even in such acoustically demanding cases, combining the acoustic echo suppression method with the adaptive-filter acoustic echo cancellation method makes it possible to keep speech quality high.

[0022] Embodiment 3: The third embodiment describes the case in which the coherence-based echo suppression method is combined with the multichannel adaptive algorithm proposed in Emura and Haneda, "Frequency-domain stereo adaptive algorithm with additional signal enhancement," Acoustical Society of Japan 2001 Autumn Meeting, pp. 537-538 (2001). This adaptive algorithm obtains the correction vector of the adaptive filter from a correction signal instead of the input signal. Therefore, in addition to the M-channel received signals (M is an integer of 2 or more), the M-channel additional signals generated by the correlation variation processing 8_1 to 8_M are also input to the M-channel echo cancellation unit 7 in FIG. 4. Correlation variation processing is a commonly used means of improving the echo path estimation performance of multichannel echo cancellers. The acoustic echo suppression processing executed by the acoustic echo suppression unit 6 is applied to the signal that has undergone echo cancellation in the M-channel echo cancellation unit 7.

[0023] In the M-channel echo cancellation unit 7 in FIG. 4, the adaptive filter coefficients are updated according to Steps 1 to 6 below. The acoustic echo suppression unit 6 then performs echo suppression according to Steps 7 to 9 below.

Step 1: From the received signal u_m(k) of each channel and the additional signal g_m(u_m(k)) for correlation variation processing, the reproduced signal x_m(k) and the correction signal v_m(k) are generated by

[Equation 28]

where a is a value greater than 0 and at most 1. The signals are then blocked into signal vectors of length 2L every L/D samples and transformed to the frequency domain using the FFT as

[Equation 29]

The function diag( ) converts a vector into a matrix with the vector's elements on the diagonal.

[Equation 30]

Step 2:

[Equation 31]

【００２４】ステップ３ Step 3

【数３２】ステップ４ [Equation 32] Step 4

【数３３】ステップ５ [Expression 33] Step 5

【数３４】ステップ６ [Equation 34] Step 6

【数３５】そして各チャネルの適応フィルタを次式で更新する。[Equation 35] Then, the adaptive filter of each channel is updated by the following equation.

【数３６】ただしμは０〜１の値をとるステップサイズである。ステップ７ FFTを用いて時刻jL/Dでの残差信号からなるベクトルを[Equation 36] However, μ is a step size that takes a value of 0 to 1. Step 7 Use FFT to find the vector consisting of the residual signal at time jL / D

【数３７】のように周波数領域に変換する。[Equation 37] To the frequency domain.
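The block framing used in Steps 1 and 7 can be sketched as follows. This is only an illustration under stated assumptions: the equations themselves are not reproduced in this text, so the sketch assumes that the block at index j holds the most recent 2L samples and that consecutive blocks are L/D samples apart. The names `dft` and `block_spectra` are hypothetical, and a naive DFT stands in for the FFT so that the example needs no external library.

```python
import cmath

def dft(x):
    """Naive O(n^2) DFT, used here only as a dependency-free stand-in for an FFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def block_spectra(signal, L, D):
    """Return one length-2L spectrum per hop of L/D samples (assumed framing)."""
    hop = L // D
    spectra = []
    end = 2 * L                          # one past the newest sample of the first block
    while end <= len(signal):
        block = signal[end - 2 * L:end]  # the most recent 2L samples
        spectra.append(dft(block))       # these values would sit on the diagonal of diag()
        end += hop
    return spectra

# A constant signal of 16 samples with L = 4 and D = 2 gives blocks of 8 samples every 2 samples.
spectra = block_spectra([1.0] * 16, L=4, D=2)
```

Each returned list corresponds to the diagonal of the matrix that diag( ) would build. Storing only the diagonal is a design choice of this sketch, since the off-diagonal entries are zero.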

[0025] Step 8: The correlation removal section of the echo component ratio estimation unit 63 takes as input the short-time spectra of the M-channel reproduced signals and of the residual signal at time k = jL/D. Following the flow of FIG. 2, it obtains the short-time spectrum of the signal in which the components correlated with the 1st to (m-1)th channel reproduced signals have been removed from the m-th channel reproduced signal x_m(k) by

[Equation 38]

The short-time spectrum of the signal in which the components correlated with the 1st to (m-1)th channel reproduced signals have been removed from the residual signal is obtained by

[Equation 39]

where ε[ ] denotes time averaging. Time averaging can be performed, for example, as in

[Equation 40]

using a smoothing constant β (0 to 1) and the result of the previous frame. The coherence is then obtained from the decorrelated short-time spectra by

[Equation 41]

The coherence between the first-channel reproduced signal and the picked-up signal is also obtained. These coherences are then used to obtain the ratio of the echo component in the residual signal.

[Equation 42]
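As a rough illustration of the time averaging and coherence computation in Step 8, the sketch below tracks the ordinary magnitude-squared coherence between one reference spectrum and one observed spectrum. It is not the procedure of Equations 38 to 42, which are not reproduced here: the correlation removal across channels is omitted, and the recursive average `beta * previous + (1 - beta) * current` is an assumed convention that may weight the terms differently from Equation 40. The class name `CoherenceTracker` and the regularizer `eps` are hypothetical.

```python
import cmath

class CoherenceTracker:
    """Per-bin magnitude-squared coherence with recursive time averaging (assumed form)."""

    def __init__(self, n_bins, beta, eps=1e-12):
        self.beta = beta
        self.eps = eps                   # avoids division by zero in silent bins
        self.pxx = [0.0] * n_bins        # averaged power spectrum of the reference
        self.pyy = [0.0] * n_bins        # averaged power spectrum of the observation
        self.pxy = [0j] * n_bins         # averaged cross spectrum

    def update(self, X, Y):
        b = self.beta
        coherence = []
        for f, (x, y) in enumerate(zip(X, Y)):
            x, y = complex(x), complex(y)
            self.pxx[f] = b * self.pxx[f] + (1 - b) * abs(x) ** 2
            self.pyy[f] = b * self.pyy[f] + (1 - b) * abs(y) ** 2
            self.pxy[f] = b * self.pxy[f] + (1 - b) * x.conjugate() * y
            coherence.append(abs(self.pxy[f]) ** 2 /
                             (self.pxx[f] * self.pyy[f] + self.eps))
        return coherence

# Observation that is a scaled copy of the reference: coherence stays close to 1.
tracker = CoherenceTracker(n_bins=1, beta=0.9)
for j in range(40):
    x = cmath.exp(0.7j * j)
    coherent = tracker.update([x], [2 * x])[0]

# Observation whose phase rotates relative to the reference: coherence stays close to 0.
other = CoherenceTracker(n_bins=1, beta=0.9)
for j in range(40):
    incoherent = other.update([1.0 + 0j], [cmath.exp(0.5j * cmath.pi * j)])[0]
```

A bin dominated by echo behaves like the first case, while a bin dominated by near-end speech behaves like the second, which is why the coherence can serve as an estimate of the echo share per frequency band.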

[0026] Step 9: From the ratio γ²(f) of the echo component in the residual signal obtained in Step 8, an amplitude attenuation factor is obtained for each frequency band and applied to the residual signal in the frequency domain to suppress the residual echo. As one example of a method of obtaining the amplitude attenuation factor from the echo component ratio, an attenuation factor such as the following can be considered.

[Equation 43]

E_s(j) is then returned to the time domain by the inverse FFT to obtain a signal in which the residual echo is suppressed.

[Equation 44]
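Step 9 can be illustrated with the short sketch below. Equation 43 is not reproduced in this text, so the gain rule used here, the square root of one minus the echo power ratio, is only an assumption chosen because it leaves the non-echo share of the power in each band. The function names `suppression_gain`, `suppress`, and `idft` are hypothetical, and a naive inverse DFT stands in for the inverse FFT.

```python
import cmath
import math

def suppression_gain(echo_ratio):
    """Assumed gain rule: sqrt(1 - ratio), i.e. keep the non-echo share of the power."""
    r = min(max(echo_ratio, 0.0), 1.0)   # clamp estimation noise into [0, 1]
    return math.sqrt(1.0 - r)

def suppress(E, ratios):
    """Scale each bin of the residual spectrum E by its suppression gain."""
    return [suppression_gain(r) * e for e, r in zip(E, ratios)]

def idft(X):
    """Naive inverse DFT, a dependency-free stand-in for the inverse FFT."""
    n = len(X)
    return [sum(X[f] * cmath.exp(2j * cmath.pi * f * t / n) for f in range(n)) / n
            for t in range(n)]

# Two-bin toy spectrum: 75 % of the power in bin 0 is echo, bin 1 is all echo.
E_s = suppress([2 + 0j, 4 + 0j], [0.75, 1.0])
e_s = [v.real for v in idft(E_s)]        # back to the time domain
```

With this rule a band that is entirely echo is muted, a band with no echo passes unchanged, and intermediate bands are attenuated in proportion to their estimated echo share.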

[0027] "Demonstration of Effect" FIGS. 5 and 6 show the results of a numerical simulation of the method of Embodiment 1. In this simulation, the number of input channels was M = 2 and the sampling frequency was 8 kHz. The acoustic echo was generated using, as the acoustic echo path, a room transfer function measured in a room with a reverberation time of 200 ms and truncated to 700 taps. The two-channel input signals were generated by simulating stereo pickup of the speech of two far-end talkers using measured room transfer functions. The talker changes at t = 5.3 s. As inter-channel correlation variation processing, the half-wave rectification method proposed in P. Eneroth, T. Gaensler, S. Gay and J. Benesty, "Studies of a Wideband Stereophonic Acoustic Echo Canceller," Proc. IWAENC, pp. 207-210 (1999), was applied to this two-channel signal with an additional gain of 0.25.
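For readers unfamiliar with half-wave rectification as a correlation variation process, the sketch below shows the form commonly described in the stereo echo cancellation literature: a scaled half-wave rectified copy of each channel is added to that channel, using the positive half on one channel and the negative half on the other. How this maps onto Equation 28 of this document is not shown in the text, so the function name `add_half_wave` and its parameters are hypothetical.

```python
def add_half_wave(u, gain=0.25, positive=True):
    """Add a scaled half-wave rectified copy of u to u (commonly described form)."""
    out = []
    for s in u:
        # (s + |s|) / 2 keeps positive samples, (s - |s|) / 2 keeps negative samples.
        rectified = (s + abs(s)) / 2 if positive else (s - abs(s)) / 2
        out.append(s + gain * rectified)
    return out

# Use opposite polarities on the two channels so the added signals differ.
left = add_half_wave([1.0, -1.0], gain=0.25, positive=True)
right = add_half_wave([1.0, -1.0], gain=0.25, positive=False)
```

Because the added components differ between channels, the inter-channel correlation varies over time, which is what helps a multichannel adaptive filter identify the individual echo paths.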

[0028] Using this signal, the conventional method of cancelling echo with an adaptive filter was compared with the echo suppression method of Embodiment 1. The adaptive algorithm was a multichannel extension of the algorithm proposed in D. Mansour and A. H. Gray, "Unconstrained Frequency-Domain Adaptive Filter," IEEE Trans. on Acoust., Speech, Signal Processing, vol. ASSP-30, No. 5, pp. 726-734 (1982). The number of adaptive filter taps per channel was L = 512, and D = 2 was chosen so that the adaptive filter is updated every 256 samples, that is, every 32 ms. The step size was set to μ = 0.3.
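The update interval quoted above follows directly from L, D, and the sampling frequency. The small helper below, whose name `update_interval` is hypothetical, simply reproduces that arithmetic.

```python
def update_interval(L, D, fs_hz):
    """Return the hop in samples (L / D) and the same interval in milliseconds."""
    hop = L // D
    return hop, 1000.0 * hop / fs_hz

hop_samples, hop_ms = update_interval(L=512, D=2, fs_hz=8000)

# Durations of the truncated echo path and of one adaptive filter at 8 kHz.
echo_path_ms = 1000.0 * 700 / 8000
filter_ms = 1000.0 * 512 / 8000
```

With the simulation settings this gives 256 samples and 32 ms. By the same arithmetic, the 700-tap echo path spans 87.5 ms while each 512-tap adaptive filter spans 64 ms.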

[0029] FIG. 5 shows the picked-up signal (the acoustic echo signal plus the near-end talker's speech signal), the signal after echo cancellation by the adaptive filter, and the near-end talker's speech signal. In the interval t = 0.3 to 1 s, residual echo is conspicuous because the estimation by the adaptive filter is still insufficient. Immediately after the talker change (t = 5.3 s), the residual echo increases abruptly. In the same manner, FIG. 6 shows the picked-up signal (the acoustic echo signal plus the near-end talker's speech signal), the signal after the echo suppression processing of the present invention, and the near-end talker's speech signal. β = 0.25 was used as the smoothing constant for coherence estimation. According to this graph, the echo is well suppressed even in the interval t = 0.3 to 1 s. Immediately after the talker change (t = 5.3 s), no abrupt increase in echo such as that seen with the adaptive filter is observed. In the interval t = 7.5 to 9 s, the amplitude is attenuated but the general shape of the near-end talker's speech is preserved. The proposed echo suppression method was thus found to respond well to changes in acoustic conditions.

[0030] The echo reduction method according to the present invention described above is realized by executing, on a computer, a program written in computer-readable code. The program is downloaded to the computer from a recording medium such as a CD-ROM or via a communication line, installed, and executed by arithmetic means such as a CPU.

[0031] [Effects of the Invention] The present invention estimates, for each frequency band, the ratio of the echo component in the picked-up signal and attenuates the amplitude of the picked-up signal by the amount corresponding to the echo. This makes it possible to suppress acoustic echo even when the acoustic conditions change abruptly, for example at a talker change.

[Brief description of drawings]

FIG. 1 is a block diagram illustrating an example of an acoustic echo suppression unit that executes the echo reduction method of the present invention.

FIG. 2 is a block diagram illustrating the internal configuration of the echo component ratio estimation unit used in the acoustic echo suppression unit shown in FIG. 1.

FIG. 3 is a block diagram illustrating an embodiment in which the echo reduction method according to the present invention is combined with a conventional echo cancellation method.

FIG. 4 is a block diagram illustrating an embodiment in which the echo reduction method according to the present invention is combined with a conventional M-channel echo cancellation method.

FIG. 5 is a graph illustrating the echo cancellation effect of the conventional technique.

FIG. 6 is a graph illustrating the echo suppression effect of the echo reduction method according to the present invention.

FIG. 7 is a block diagram illustrating the conventional technique.

[Explanation of symbols]

6: acoustic echo suppression unit
61_1 to 61_M, 62: TF conversion units
63: echo component ratio estimation unit
64: attenuation ratio calculation unit
65: multiplication unit
66: FT conversion unit

Continued from front page. F-terms (reference): 5K027 BB03 DD07 DD10 HH03; 5K046 HH01 HH24 HH25 HH31 HH78 HH79

Claims

1. An echo reduction method for a loudspeaker communication system in which M speakers (M is an integer of 2 or more) and N microphones (N is an integer of 1 or more) are arranged in a common sound field, M-channel received signals are reproduced from the speakers, and the picked-up signals are processed into transmission signals, the method characterized by: obtaining the power spectra and cross spectra of the M-channel reproduced signals and the picked-up signal; obtaining the coherence between the M-channel reproduced signals and the picked-up signal of at least one channel; estimating from these, for each frequency band, the ratio of the echo component in the picked-up signal; calculating an echo suppression gain from this ratio; and multiplying the short-time spectrum of the picked-up signal by the echo suppression gain to suppress the echo.

2. The echo reduction method according to claim 1, wherein a first coherence γ²_{1y}(f) is obtained from the short-time spectrum of the first-channel reproduced signal and the short-time spectrum of the picked-up signal y(k); an m-th coherence γ²_{my(m-1)}(f) is obtained from the short-time spectrum of the signal in which the components correlated with the 1st to (m-1)th channel reproduced signals have been removed from the m-th channel reproduced signal (m ≧ 2) and the short-time spectrum of the signal in which the components correlated with the 1st to (m-1)th channel reproduced signals have been removed from the picked-up signal y(k); and the ratio of the echo component in the picked-up signal is obtained from the 1st to M-th coherences by [Equation].

3. The echo reduction method according to claim 1, wherein, with the power ratio of the echo component in the picked-up signal estimated for each frequency band denoted by γ²(f), the echo suppression gain is set to [Equation].

4. The echo reduction method according to claim 1, 2, or 3, wherein the echo is suppressed in a picked-up signal that has undergone echo cancellation processing in which the signal output to the speaker is input to a pseudo echo path to generate a pseudo echo signal, and the pseudo echo path is updated based on the residual signal between the picked-up signal from the microphone and the pseudo echo signal.

5. An echo reduction device connected to M speakers (M is an integer of 2 or more) and N microphones (N is an integer of 1 or more) arranged in a common sound field, comprising: means for obtaining the power spectra of the M-channel reproduced signals and the picked-up signal; means for obtaining their cross spectra; means for obtaining, from the power spectrum information and the cross spectrum information, the coherence between the M-channel reproduced signals and the picked-up signal of at least one channel; means for estimating from these, for each frequency band, the ratio of the echo component in the picked-up signal; means for calculating an echo suppression gain from this ratio; and means for multiplying the short-time spectrum of the picked-up signal by the echo suppression gain.

6. The echo reduction device according to claim 5, comprising: means for obtaining a first coherence from the short-time spectrum of the first-channel reproduced signal and the short-time spectrum of the picked-up signal; and means for obtaining an m-th coherence from the short-time spectrum of the signal in which the components correlated with the 1st to (m-1)th channel reproduced signals have been removed from the m-th channel reproduced signal (m ≧ 2) and the short-time spectrum of the signal in which the components correlated with the 1st to (m-1)th channel reproduced signals have been removed from the picked-up signal.

7. The echo reduction device according to claim 5, wherein the echo reduction is applied to a picked-up signal that has passed through echo cancellation means comprising: means for inputting the reproduced signal output to the speaker into a pseudo echo path to generate a pseudo echo signal; means for subtracting the pseudo echo signal from the picked-up signal from the microphone; and means for updating the pseudo echo path based on the residual signal.

8. An echo reduction program for executing the echo reduction method according to claim 1 by a computer.