JP6075783B2 - Echo canceling apparatus, echo canceling method and program

Info

Publication number: JP6075783B2
Application number: JP2013253804A
Authority: JP
Inventors: 江村　暁; 暁江村; 島内　末廣; 末廣島内; 仲大室
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2013-12-09
Filing date: 2013-12-09
Publication date: 2017-02-08
Anticipated expiration: 2033-12-09
Also published as: JP2015115624A

Description

The present invention relates to a technique for canceling acoustic echo in a multi-channel loudspeaker communication system.

Driven by faster, higher-capacity IP communication, multi-channel loudspeaker-based two-way teleconferencing systems that provide a more natural conversational environment have been developed in recent years. Multi-channel playback technology is likewise moving toward more channels, from stereo playback to 5.1-channel playback. However, the listening area in which sound is reproduced with a strong spatial impression is limited to a sweet spot, and outside it the spatial impression of the sound is greatly reduced.

For this reason, Wave Field Synthesis (hereinafter abbreviated as "WFS") has recently been studied as a multi-channel playback technique with a wide listening area (see Non-Patent Document 1). WFS is a technique that captures the acoustic wavefront at one location and resynthesizes it at another.

When WFS is applied to two-way audio-visual teleconferencing, a comfortable conversational environment requires that the signal components that travel acoustically from tens to hundreds of loudspeakers to tens to hundreds of microphones (hereinafter also called "echo") be canceled from the signals collected by the microphones. A wave number domain adaptive algorithm has been proposed as an acoustic echo canceller algorithm that performs this processing efficiently (see Non-Patent Document 2). This algorithm holds the filter coefficients of the adaptive filter in the wave number domain.

However, as noted in the description of the simulation results in Non-Patent Document 2, the amount of echo cancellation degrades sharply when the radiation direction of the wavefront reproduced from the loudspeaker array changes. This corresponds to the case in two-way communication where the talker at the remote site changes and the reproduced speech of the new talker is radiated in a direction different from that of the previous talker. The degradation occurs because a change in the radiation direction of the reproduced wavefront requires adaptive filter coefficients for different wave numbers, and those coefficients have hardly been trained.

To achieve comfortable loudspeaker conversation, residual echo must be reduced quickly, regardless of the conversation state, while echo path estimation and cancellation by the adaptive filter are still insufficient. In the double-talk state in particular, residual echo must be reduced without affecting the quality of the transmitted speech.

As such a method, Non-Patent Document 3 proposes estimating and canceling, in the wave number domain, the residual echo contained in the error signal.

Non-Patent Document 1: J. Berkhout, D. de Vries, and P. Vogel, "Acoustic control by wave field synthesis", Journal of the Acoustical Society of America, 1993, vol. 93, no. 5, pp. 2764-2778
Non-Patent Document 2: M. Schneider and W. Kellermann, "A wave-domain model for acoustic MIMO systems with reduced complexity", 2012, 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, pp. 133-138
Non-Patent Document 3: S. Emura et al., "Posterior residual echo cancellation and its complexity reduction in the wave domain", Proceedings of IWAENC 2012 (International Workshop on Acoustic Signal Enhancement), 2012

However, raising the playback volume to widen the listening area, or raising the microphone gain to widen the sound collection area, requires a further improvement in residual echo cancellation performance.

Residual echo includes a component caused by the direct wave, which does not involve reflection, and a component caused by reflected waves and other waves apart from the direct wave (diffuse residual echo). Because of the model on which it is based, the method of Non-Patent Document 3 addresses only the residual echo caused by the direct wave.

An object of the present invention is to provide an echo cancellation technique that reduces residual echo further than the conventional method by also addressing diffuse residual echo.

To solve the above problem, according to a first aspect of the present invention, an echo cancellation apparatus cancels the echo that, with P being an integer of 2 or more and P loudspeakers and P microphones placed in a common sound field, travels along echo paths to the microphones when received signals are reproduced from the loudspeakers. The echo cancellation apparatus includes a wave number domain diffuse residual echo estimation and cancellation unit that estimates the diffuse residual echo contained in the wave number domain collected sound signal, using a signal obtained by transforming the collected sound signal picked up by the microphones into the wave number domain and the wave number domain received signal, and cancels the estimated diffuse residual echo from the wave number domain collected sound signal. The wave number domain diffuse residual echo estimation and cancellation unit includes: a compressed input-output correlation coefficient calculation unit that calculates a power spectrum matrix, which is a P × P matrix, using a received signal vector X, which is a P-dimensional vector whose elements are the received signals for the respective wave numbers, and its complex conjugate transpose, and calculates a cross spectrum matrix, which is a P × P matrix, using a collected sound signal vector, which is a P-dimensional vector whose elements are the collected sound signals for the respective wave numbers, and the complex conjugate transpose of the received signal vector X; a compressed input-output transfer characteristic estimation unit that uses the power spectrum matrix and the cross spectrum matrix to obtain an input-output transfer characteristic matrix, which is a P × P matrix whose elements are estimates of the input-output transfer characteristics between the received signals and the collected sound signals; a diffuse residual echo estimation unit that multiplies the received signal vector X by the input-output transfer characteristic matrix to obtain a diffuse residual echo vector, which is a P-dimensional vector whose elements are the estimates of the diffuse residual echo for the respective wave numbers; and a subtraction unit that obtains the difference between the wave number domain collected sound signal and the estimate of the wave number domain diffuse residual echo.
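The processing of the first aspect can be summarized in matrix form for a single frequency. The following is only a sketch under stated assumptions: the symbols Y (collected sound signal vector), R_XX (power spectrum matrix), R_YX (cross spectrum matrix), G (input-output transfer characteristic matrix), and D̂ (diffuse residual echo vector), the averaging operator ⟨·⟩ over frames, and the least-squares form of G are introduced here for illustration. This passage states only that G is obtained from the two spectrum matrices, not how.

$$
R_{XX} = \langle X X^{H} \rangle,\qquad
R_{YX} = \langle Y X^{H} \rangle,\qquad
G = R_{YX}\, R_{XX}^{-1},\qquad
\hat{D} = G X,\qquad
Y - \hat{D}
$$

Under these assumptions R_XX, R_YX, and G are all P × P, D̂ is P-dimensional, and the last expression is the output of the subtraction unit.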

To solve the above problem, according to a second aspect of the present invention, an echo cancellation apparatus cancels the echo that, with P being an integer of 2 or more and P loudspeakers and P microphones placed in a common sound field, travels along echo paths to the microphones when received signals are reproduced from the loudspeakers. The echo cancellation apparatus includes a wave number domain diffuse residual echo estimation and cancellation unit that estimates the diffuse residual echo contained in the wave number domain collected sound signal, using a signal obtained by transforming the collected sound signal picked up by the microphones into the wave number domain and the wave number domain received signal, and cancels the estimated diffuse residual echo from the wave number domain collected sound signal. The wave number domain diffuse residual echo estimation and cancellation unit includes: an input dimension compression unit that, with P′ < P, uses a compression matrix W, which is a P′ × P matrix, to compress a received signal vector X, which is a P-dimensional vector whose elements are the received signals for the respective wave numbers, into a P′-dimensional compressed vector Z; a dimension compression matrix update unit that updates the compression matrix W so as to minimize the difference between the received signal vector X and the P-dimensional vector obtained by expanding the compressed vector Z with the complex conjugate transpose of the compression matrix W; a compressed input-output correlation coefficient calculation unit that calculates a power spectrum matrix, which is a P′ × P′ matrix, using the compressed vector Z and its complex conjugate transpose, and calculates a cross spectrum matrix, which is a P × P′ matrix, using a collected sound signal vector, which is a P-dimensional vector whose elements are the collected sound signals for the respective wave numbers, and the complex conjugate transpose of the compressed vector Z; a compressed input-output transfer characteristic estimation unit that uses the power spectrum matrix and the cross spectrum matrix to obtain an input-output transfer characteristic matrix, which is a P × P′ matrix whose elements are estimates of the input-output transfer characteristics between the received signals and the collected sound signals; a diffuse residual echo estimation unit that multiplies the compressed vector Z by the input-output transfer characteristic matrix to obtain a diffuse residual echo vector, which is a P-dimensional vector whose elements are the estimates of the diffuse residual echo for the respective wave numbers; and a subtraction unit that obtains the difference between the wave number domain collected sound signal and the estimate of the wave number domain diffuse residual echo.
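The compression and expansion steps of the second aspect can be illustrated with a minimal Python sketch. The helper names (`conj_transpose`, `mat_vec`, `compress`, `expand`, `reconstruction_error`) and the example values are hypothetical. The sketch only evaluates the reconstruction error that the update of W is meant to minimize; it does not implement an update rule, because this passage does not specify one.

```python
import math

def conj_transpose(A):
    """Return the complex conjugate transpose A^H of a matrix given as a list of rows."""
    rows, cols = len(A), len(A[0])
    return [[complex(A[r][c]).conjugate() for r in range(rows)] for c in range(cols)]

def mat_vec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def compress(W, X):
    """Z = W X: map the P-dimensional X to the P'-dimensional Z (W is P' x P)."""
    return mat_vec(W, X)

def expand(W, Z):
    """W^H Z: map the P'-dimensional Z back to a P-dimensional approximation of X."""
    return mat_vec(conj_transpose(W), Z)

def reconstruction_error(W, X):
    """Euclidean norm of X - W^H W X, the quantity the update of W should make small."""
    X_hat = expand(W, compress(W, X))
    return math.sqrt(sum(abs(x - xh) ** 2 for x, xh in zip(X, X_hat)))

# Illustrative values (assumed): P = 4, P' = 2, and W simply keeps the first two components.
W = [[1, 0, 0, 0],
     [0, 1, 0, 0]]
X = [1 + 1j, 2, 3j, 4]
Z = compress(W, X)
err = reconstruction_error(W, X)
```

With this W, the components of X that are discarded by the compression are exactly what remains in the error, which is why a W adapted to the dominant components of X keeps the error small.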

To solve the above problem, according to a third aspect of the present invention, an echo cancellation method cancels the echo that, with P being an integer of 2 or more and P loudspeakers and P microphones placed in a common sound field, travels along echo paths to the microphones when received signals are reproduced from the loudspeakers. The echo cancellation method includes a wave number domain diffuse residual echo estimation and cancellation step of estimating the diffuse residual echo contained in the wave number domain collected sound signal, using a signal obtained by transforming the collected sound signal picked up by the microphones into the wave number domain and the wave number domain received signal, and canceling the estimated diffuse residual echo from the wave number domain collected sound signal. The wave number domain diffuse residual echo estimation and cancellation step includes: a compressed input-output correlation coefficient calculation step of calculating a power spectrum matrix, which is a P × P matrix, using a received signal vector X, which is a P-dimensional vector whose elements are the received signals for the respective wave numbers, and its complex conjugate transpose, and calculating a cross spectrum matrix, which is a P × P matrix, using a collected sound signal vector, which is a P-dimensional vector whose elements are the collected sound signals for the respective wave numbers, and the complex conjugate transpose of the received signal vector X; a compressed input-output transfer characteristic estimation step of using the power spectrum matrix and the cross spectrum matrix to obtain an input-output transfer characteristic matrix, which is a P × P matrix whose elements are estimates of the input-output transfer characteristics between the received signals and the collected sound signals; a diffuse residual echo estimation step of multiplying the received signal vector X by the input-output transfer characteristic matrix to obtain a diffuse residual echo vector, which is a P-dimensional vector whose elements are the estimates of the diffuse residual echo for the respective wave numbers; and a subtraction step of obtaining the difference between the wave number domain collected sound signal and the estimate of the wave number domain diffuse residual echo.
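The four steps of this method can be sketched in Python for the smallest case, P = 2. This is an illustration under assumptions rather than the claimed procedure: the function names are hypothetical, the spectrum matrices are accumulated as plain sums over frames, and the transfer characteristic matrix is taken to be the least-squares solution G = R_YX R_XX⁻¹, a form this passage does not state.

```python
def outer(a, b):
    """Return a b^H for complex vectors a and b."""
    return [[ai * bj.conjugate() for bj in b] for ai in a]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    return [[sum(A[r][m] * B[m][c] for m in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def mat_vec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def inv2(M):
    """Inverse of a 2 x 2 matrix (this sketch is limited to P = 2)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def estimate_and_cancel(X_frames, Y_frames):
    """Return (G, residuals) for lists of 2-dimensional frames X and Y."""
    R_xx = [[0j, 0j], [0j, 0j]]  # power spectrum matrix, sum of X X^H
    R_yx = [[0j, 0j], [0j, 0j]]  # cross spectrum matrix, sum of Y X^H
    for X, Y in zip(X_frames, Y_frames):
        R_xx = mat_add(R_xx, outer(X, X))
        R_yx = mat_add(R_yx, outer(Y, X))
    G = mat_mul(R_yx, inv2(R_xx))  # assumed least-squares estimate
    residuals = [[y - d for y, d in zip(Y, mat_vec(G, X))]  # Y minus G X
                 for X, Y in zip(X_frames, Y_frames)]
    return G, residuals

# Illustrative data (assumed): Y is produced by a known matrix so the estimate can be checked.
G_true = [[0.5, 0.1j], [-0.2, 0.3]]
X_frames = [[1, 0], [0, 1], [1 + 1j, 2 - 1j]]
Y_frames = [mat_vec(G_true, X) for X in X_frames]
G, residuals = estimate_and_cancel(X_frames, Y_frames)
```

In this noise-free example G reproduces G_true and the residuals vanish. With real signals the collected sound also contains components uncorrelated with X, and accumulating over frames is what keeps those components from dominating the estimate.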

To solve the above problem, according to a fourth aspect of the present invention, an echo cancellation method cancels the echo that, with P being an integer of 2 or more and P loudspeakers and P microphones placed in a common sound field, travels along echo paths to the microphones when received signals are reproduced from the loudspeakers. The echo cancellation method includes a wave number domain diffuse residual echo estimation and cancellation step of estimating the diffuse residual echo contained in the wave number domain collected sound signal, using a signal obtained by transforming the collected sound signal picked up by the microphones into the wave number domain and the wave number domain received signal, and canceling the estimated diffuse residual echo from the wave number domain collected sound signal. The wave number domain diffuse residual echo estimation and cancellation step includes: an input dimension compression step of, with P′ < P, using a compression matrix W, which is a P′ × P matrix, to compress a received signal vector X, which is a P-dimensional vector whose elements are the received signals for the respective wave numbers, into a P′-dimensional compressed vector Z; a dimension compression matrix update step of updating the compression matrix W so as to minimize the difference between the received signal vector X and the P-dimensional vector obtained by expanding the compressed vector Z with the complex conjugate transpose of the compression matrix W; a compressed input-output correlation coefficient calculation step of calculating a power spectrum matrix, which is a P′ × P′ matrix, using the compressed vector Z and its complex conjugate transpose, and calculating a cross spectrum matrix, which is a P × P′ matrix, using a collected sound signal vector, which is a P-dimensional vector whose elements are the collected sound signals for the respective wave numbers, and the complex conjugate transpose of the compressed vector Z; a compressed input-output transfer characteristic estimation step of using the power spectrum matrix and the cross spectrum matrix to obtain an input-output transfer characteristic matrix, which is a P × P′ matrix whose elements are estimates of the input-output transfer characteristics between the received signals and the collected sound signals; a diffuse residual echo estimation step of multiplying the compressed vector Z by the input-output transfer characteristic matrix to obtain a diffuse residual echo vector, which is a P-dimensional vector whose elements are the estimates of the diffuse residual echo for the respective wave numbers; and a subtraction step of obtaining the difference between the wave number domain collected sound signal and the estimate of the wave number domain diffuse residual echo.

The present invention has the effect of reducing residual echo further than the conventional method.

Diagram showing an example arrangement of the echo cancellation apparatus in a multi-channel teleconferencing system.
Functional block diagram of the echo cancellation apparatus 100.
Diagram showing the processing flow of the echo cancellation apparatus 100.
Functional block diagram of the wave number domain echo replica generation unit.
Diagram for explaining frame synthesis.
Functional block diagram of the residual echo cancellation unit.
Diagram showing the processing flow of the residual echo cancellation unit.
Functional block diagram of the wave number domain residual echo estimation and cancellation unit.
Diagram showing the processing flow of the wave number domain residual echo estimation and cancellation unit.
Functional block diagram of the wave number domain diffuse residual echo estimation and cancellation unit.
Diagram showing the processing flow of the wave number domain diffuse residual echo estimation and cancellation unit.
Functional block diagram of the residual echo cancellation unit when the wave number domain diffuse residual echo estimation and cancellation unit is used alone.
Functional block diagram of the wave number domain diffuse residual echo estimation and cancellation unit when it is used alone.
Diagram for explaining the processing results of the conventional method.
Diagram for explaining the processing results of a modification of the first embodiment.

Embodiments of the present invention are described below. In the drawings used in the following description, components having the same function and steps performing the same processing are given the same reference numerals, and redundant description is omitted. In the following description, symbols such as "^" used in the text should properly be written directly above the preceding character, but because of the limitations of text notation they are written immediately after that character. In equations, these symbols are written in their proper positions. Unless otherwise noted, processing performed on each element of a vector or matrix is applied to all elements of that vector or matrix.

<Points of the first embodiment>
The first embodiment comprises means for estimating, in the wave number domain, the transfer characteristic from the received signal to the diffuse residual echo with high accuracy and a low amount of computation, and means for subtracting the diffuse residual echo from the error signal in the wave number domain. Estimating the transfer characteristic from the wave number domain received signal to the wave number domain error signal as a matrix improves the accuracy of the diffuse residual echo estimate. Compressing the wave number domain received signal before using it for estimation greatly reduces the amount of computation. Using the correlation between this compressed received signal and the error signal suppresses estimation fluctuations caused by signals other than the residual echo.

<Echo cancellation apparatus 100 according to the first embodiment>
FIG. 1 shows an example arrangement of the echo cancellation apparatus 100 in a multi-channel teleconferencing system, FIG. 2 shows a functional block diagram of the echo cancellation apparatus 100, and FIG. 3 shows its processing flow.
The multi-channel teleconferencing system including the echo cancellation apparatus 100 consists of a P-channel playback system and a P-channel sound collection system, where P ≥ 2. In this system, P loudspeakers 2_p and P microphones 3_p are placed in a common sound field. The P-channel received signals x(p, n) are reproduced as acoustic signals by the loudspeakers 2_p and travel along acoustic echo paths to each of the P microphones 3_p. These signal components are the echo described above. Here p = 1, 2, …, P, and n is an index representing time.

The echo cancellation apparatus 100 receives the received signals x(p, n) via the P receiving terminals 1_p and receives the collected sound signals y(p, n) picked up by the P microphones 3_p. It then cancels the echo from each of the P collected sound signals y(p, n) to generate transmission signals e^(3)(p, n), which it outputs to the transmitting terminals 4_p.

The echo cancellation apparatus 100 includes a frequency domain transform unit 11, a wave number transform unit 12, a wave number domain echo replica generation unit 21, an inverse wave number transform unit 31, a time domain transform unit 32, a frame synthesis unit 34, P subtraction units 33_p, an error frequency domain transform unit 41, and an error wave number transform unit 42. The echo cancellation apparatus 100 implements the wave number domain adaptive algorithm using existing technology (see, for example, Non-Patent Document 2).
The echo cancellation apparatus 100 further includes a residual echo cancellation unit 120 that estimates the residual echo from the received signal and the error signal in the wave number domain and subtracts the residual echo from the error signal. Each unit is described in detail below.

<Frequency domain transform unit 11>
The frequency domain transform unit 11 receives the P-channel time domain received signals x(p, n), transforms them for each channel p into frequency domain received signals X_f(p, i) (s1), and outputs the P × 2F frequency domain received signals X_f(p, i) to the wave number transform unit 12. Here i is the frame number, 2F is the number of samples in one frame, and f is the frequency index, f = 0, 1, …, 2F−1. With f_S denoting the sampling frequency of the signal, X_f(p, i) represents the component of the received signal of channel p in frame i at frequency f_S·f/(2F) [Hz]. The frequency domain transform may be, for example, a fast Fourier transform (hereinafter abbreviated as "FFT").
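As a worked example of the mapping from frequency index to frequency, assume a sampling frequency f_S = 16 kHz and F = 256. These values are chosen here purely for illustration and do not come from this document. Each frame then holds 2F = 512 samples, and the frequency index f = 32 corresponds to

$$
\frac{f_S \, f}{2F} = \frac{16000 \times 32}{512} = 1000\ \text{Hz},
$$

with adjacent indices spaced f_S/(2F) = 31.25 Hz apart.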

First, each time it receives F/D samples of the received signal x(p, n) (in other words, each time n = iF/D holds), the frequency domain transform unit 11 blocks the 2F received signal samples x(p, n−2F+1), x(p, n−2F+2), …, x(p, n) into one frame to obtain the frame-wise received signal x(p, i). Here F is a natural number and D is a natural number that divides F. For example,

$$
x(p,i)=\left[x(p,(iF/D)-2F+1),\ x(p,(iF/D)-2F+2),\ \ldots,\ x(p,iF/D)\right]^{T} \qquad (1)
$$

where ^T denotes transposition. Hereinafter, each signal is blocked with one frame = 2F samples and a shift of F/D samples. To simplify and speed up the FFT computation, F is often chosen to be a power of 2. The case D ≥ 2 is described below.
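The blocking of equation (1) can be sketched in Python for a single channel. The function name `frame_signal` is hypothetical, and two assumptions are made that the text does not state: the time index n is taken as 0-based, and a frame is produced only when all of its 2F samples are available (no zero padding at the start of the signal).

```python
def frame_signal(x, F, D):
    """Block one channel x into frames of 2F samples with a shift of F/D samples.

    Returns a dict mapping frame number i to the list
    [x[iF/D - 2F + 1], ..., x[iF/D]], following equation (1).
    """
    if F % D != 0:
        raise ValueError("D must divide F")
    shift = F // D
    frame_len = 2 * F
    frames = {}
    i = 0
    while i * shift < len(x):
        end = i * shift                # last sample of frame i
        start = end - frame_len + 1    # first sample of frame i
        if start >= 0:                 # assumed: skip frames that would need samples before n = 0
            frames[i] = x[start:end + 1]
        i += 1
    return frames

# Illustrative values (assumed): F = 2, D = 2, so frames hold 4 samples and advance by 1.
frames = frame_signal(list(range(12)), F=2, D=2)
```

Consecutive frames overlap by 2F − F/D samples, so a larger D gives more frequent updates at the cost of more transforms per second.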

Next, the frequency domain transform unit 11 transforms the frame-wise received signal x(p, i) into the frequency domain received signal X(p, i) as follows:

$$
X(p,i)=\mathrm{FFT}(x(p,i))=\left[X_{0}(p,i)\ \cdots\ X_{f}(p,i)\ \cdots\ X_{2F-1}(p,i)\right] \qquad (2)
$$

Each frequency domain signal, including the received signal X(p, i), is represented by a short-time spectrum.

<Wave number transform unit 12>
The wave number transform unit 12 receives the P × 2F frequency domain received signals X_f(p, i), transforms them for each frequency f into wave number domain received signals X^(W)_f(k, i) using equation (3) or (4) below (s3), and outputs the P × 2F wave number domain received signals X^(W)_f(k, i) to the wave number domain echo replica generation unit 21 and the residual echo cancellation unit 120. Here k is the wave number index and K is a natural number: when the number of channels P is even with P = 2K, k = −K+1, −K+2, …, −1, 0, 1, …, K; when P is odd with P = 2K+1, k = −K, −K+1, …, −1, 0, 1, …, K.

(1) When the number of channels P is even with P = 2K,

$$
\begin{aligned}
X^{(W)}_{f}(i) &= \mathrm{FFT}\left(\left[X_{f}(1,i)\ \ X_{f}(2,i)\ \cdots\ X_{f}(P,i)\right]\right)\\
&= \left[X^{(W)}_{f}(0,i)\ \cdots\ X^{(W)}_{f}(k,i)\ \cdots\ X^{(W)}_{f}(K,i)\ \ X^{(W)}_{f}(-K+1,i)\ \cdots\ X^{(W)}_{f}(-1,i)\right]
\end{aligned}
\qquad (3)
$$

(2) When the number of channels P is odd with P = 2K+1,

$$
\begin{aligned}
X^{(W)}_{f}(i) &= \mathrm{FFT}\left(\left[X_{f}(1,i)\ \ X_{f}(2,i)\ \cdots\ X_{f}(P,i)\right]\right)\\
&= \left[X^{(W)}_{f}(0,i)\ \cdots\ X^{(W)}_{f}(k,i)\ \cdots\ X^{(W)}_{f}(K,i)\ \ X^{(W)}_{f}(-K,i)\ \cdots\ X^{(W)}_{f}(-1,i)\right]
\end{aligned}
\qquad (4)
$$

Because the transform to the wave number domain is performed at high speed with an FFT whose number of points is a power of 2, the following description assumes an even number of channels (P = 2K). Each wave number domain signal, including the received signal X^(W)_f(k, i), is represented by a short-time spectrum.
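The ordering of wave number indices in equations (3) and (4) can be made concrete with a short Python sketch. The function names are hypothetical, a direct DFT is used in place of an FFT to keep the example self-contained, and the negative-exponent sign convention of the forward transform is an assumption, since the text says only "FFT".

```python
import cmath

def dft(values):
    """Direct discrete Fourier transform (forward, negative exponent)."""
    N = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * m * n / N) for n, v in enumerate(values))
            for m in range(N)]

def wavenumber_indices(P):
    """Wave number index k for each transform output position, per equations (3) and (4)."""
    K = P // 2
    if P % 2 == 0:
        return list(range(0, K + 1)) + list(range(-K + 1, 0))  # 0..K, -K+1..-1
    return list(range(0, K + 1)) + list(range(-K, 0))          # 0..K, -K..-1

def wavenumber_transform(X_f):
    """Transform [X_f(1,i), ..., X_f(P,i)] across channels; return {k: X^(W)_f(k,i)}."""
    return dict(zip(wavenumber_indices(len(X_f)), dft(X_f)))

# Illustrative input (assumed): the same value on all P = 4 channels at one frequency.
out = wavenumber_transform([1, 1, 1, 1])
```

A signal that is identical on every channel has no variation along the array, so all of its energy lands in the k = 0 component.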

<Wave number domain echo replica generation unit 21>
The wave number domain echo replica generation unit 21 receives the P × 2F wave-number-domain received signals X^(W)_f(k,i) and the P × 2F wave-number-domain error signals E^(W)_f(k,i) (described in detail later). Using these values it generates, for f ≤ F, P × (F+1) wave-number-domain echo replicas Ŷ^(W)_f(k,i) and outputs them to the inverse wave number converter 31. An echo replica imitates the echo contained in the collected sound signal; it is an estimate of the echo.

FIG. 4 shows a functional block diagram of the wave number domain echo replica generation unit 21. The wave number domain echo replica generation unit 21 includes a correction amount calculation unit 211, a filter coefficient unit 213, and a multiplication unit 215.

(Multiplication unit 215)
The multiplication unit 215 of the wave number domain echo replica generation unit 21 receives the P × 2F wave-number-domain received signals X^(W)_f(k,i). It also receives P × (F+1) × (2δ+1) wave-number-domain filter coefficients H^(W)_f(k,k+dk,i) (for f ≤ F) from the filter coefficient unit 213 described later, where dk = −δ, −δ+1, …, −1, 0, 1, …, δ−1, δ. Non-Patent Document 2 recommends δ = 1 or 2. For f ≤ F, the multiplication unit 215 multiplies the received signal X^(W)_f(k,i) by the filter coefficients H^(W)_f(k,k+dk,i) as in the following equation to generate the wave-number-domain echo replica Ŷ^(W)_f(k,i) (s5), and outputs it to the inverse wave number converter 31.

Generating the wave-number-domain echo replica Ŷ^(W)_f(k,i) in this way allows adjacent spatial frequency components to be taken into account. When adjacent spatial frequency components need not be included, δ may be set to 0 and the wave-number-domain echo replica Ŷ^(W)_f(k,i) generated by the following equation.
Ŷ^(W)_f(k,i) = H^(W)_f(k,k,i) X^(W)_f(k,i)   (6)
The processing of the correction amount calculation unit 211 and the filter coefficient unit 213 is described later.
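The multiplication step can be sketched as follows for a single frequency bin. This is a sketch under stated assumptions: each replica is taken to be the sum over dk of H^(W)_f(k,k+dk,i) X^(W)_f(k+dk,i), and neighbouring wave numbers are assumed to wrap around circularly in FFT slot order. Neither the summation form nor the wrap-around is confirmed by the text above; only the δ = 0 case is given explicitly, as equation (6). The name `echo_replica` and the coefficient layout are hypothetical.

```python
import numpy as np

def echo_replica(H, X, delta):
    """Wave-number-domain echo replica for one frequency bin.

    H: (P, 2*delta+1) coefficients; H[s, dk+delta] weights the neighbour
       at slot (s + dk) % P (assumed layout and assumed circular wrap).
    X: (P,) wave-number-domain received signal in FFT slot order.
    """
    Y = np.zeros(X.shape, dtype=complex)
    for dk in range(-delta, delta + 1):
        # np.roll(X, -dk)[s] == X[(s + dk) % P]
        Y += H[:, dk + delta] * np.roll(X, -dk)
    return Y
```

Calling `echo_replica(H, X, 0)` reduces to the element-wise product of equation (6).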

<Inverse wave number converter 31>
The inverse wave number converter 31 receives the P × (F+1) wave-number-domain echo replicas Ŷ^(W)_f(k,i) (for f ≤ F) and converts them for each frequency f into frequency-domain echo replicas Ŷ_f(p,i) as in the following equation (s9).
[Ŷ_f(1,i) Ŷ_f(2,i) … Ŷ_f(P,i)]
= IFFT([Ŷ^(W)_f(0,i) … Ŷ^(W)_f(k,i) … Ŷ^(W)_f(K,i) Ŷ^(W)_f(-K+1,i) … Ŷ^(W)_f(-1,i)])   (7)
For frequencies f > F, the frequency-domain echo replica Ŷ_f(p,i) is obtained from the symmetry of the FFT of a real signal by the following equation.
Ŷ_f(p,i) = conj(Ŷ_{2F-f}(p,i))   (8)
Here conj(·) denotes the complex conjugate of ·. The total of P × 2F frequency-domain echo replicas Ŷ_f(p,i) obtained in this way are output to the time domain conversion unit 32. Any inverse wave number conversion method that corresponds to the wave number conversion method used in the wave number converter 12 may be used.
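Equations (7) and (8) can be sketched together as follows. The function name `to_frequency_domain` and the array layout (wave number slot on axis 0, frequency bin on axis 1) are assumptions made for this example.

```python
import numpy as np

def to_frequency_domain(Y_w_half):
    """Y_w_half: (P, F+1) wave-number-domain echo replicas for bins f = 0..F.

    Returns the (P, 2F) frequency-domain echo replicas for all bins.
    """
    P, n_half = Y_w_half.shape
    F = n_half - 1
    Y_half = np.fft.ifft(Y_w_half, axis=0)      # eq. (7), applied to every bin f
    Y = np.empty((P, 2 * F), dtype=complex)
    Y[:, :F + 1] = Y_half
    f = np.arange(F + 1, 2 * F)
    Y[:, f] = np.conj(Y_half[:, 2 * F - f])     # eq. (8), bins f > F
    return Y
```

Processing only the F+1 lower bins and mirroring the rest roughly halves the number of spatial inverse FFTs per frame.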

<Time domain conversion unit 32>
The time domain conversion unit 32 receives the P × 2F frequency-domain echo replicas Ŷ_f(p,i) and, for each channel p, applies an inverse FFT to them as in the following equation, converting them into a time-domain echo replica signal vector ŷ(p,i) with F elements (s9).
ŷ(p,i) = [I_F 0_F] IFFT([Ŷ_0(p,i) … Ŷ_f(p,i) … Ŷ_{2F-1}(p,i)])   (9)
Here 0_F is the F × F zero matrix and I_F is the F × F identity matrix. The P time-domain echo replica signal vectors ŷ(p,i) are output to the frame synthesis unit 34. Any time domain conversion method that corresponds to the frequency domain conversion method used in the frequency domain conversion unit 11 may be used.
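A minimal sketch of equation (9) follows. The name `to_time_domain` is hypothetical, and taking the real part assumes the input spectrum is conjugate-symmetric, as it is after equation (8).

```python
import numpy as np

def to_time_domain(Y):
    """Y: (P, 2F) frequency-domain echo replicas, one row per channel.

    Returns (P, F) real echo replica signal vectors.
    """
    F = Y.shape[1] // 2
    y_full = np.fft.ifft(Y, axis=1)   # 2F-point inverse FFT per channel
    # [I_F 0_F] keeps the first F samples; the imaginary part is assumed to
    # be numerical noise only because the spectrum is conjugate-symmetric.
    return y_full[:, :F].real
```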

<Frame synthesis unit 34>
The frame synthesis unit 34 receives the P time-domain echo replica signal vectors ŷ(p,i). When the received signal x(p,n) has been framed with D ≥ 2 in the frequency domain conversion unit 11, the frame synthesis unit 34 applies a window to the echo replica signal vector ŷ(p,i) obtained for frame i and to the echo replica signal vector ŷ(p,i−1) obtained for the preceding frame i−1, combines them (s13), and outputs the P combined time-domain echo replica signal vectors ŷ'(p,i) to the respective P subtraction units 33_p.

For D = 2, with W_H denoting a Hanning window of length F/D, the combined echo replica signal vector ŷ'(p,i) of length F/D is calculated by the following equation. FIG. 5 illustrates this combination.
ŷ'(p,i-1) = [0_{F/D} I_{F/D}] diag(W_H) ŷ(p,i-1) + [I_{F/D} 0_{F/D}] diag(W_H) ŷ(p,i)   (10)
Here 0_{F/D} is the (F/D) × (F/D) zero matrix, I_{F/D} is the (F/D) × (F/D) identity matrix, and diag(·) is the matrix that has · on its diagonal and zeros elsewhere.
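The overlap-add of equation (10) can be sketched as follows for D = 2. Two assumptions are made so that the matrix dimensions agree: the window is applied to the full F-sample vector before the selection matrices pick out one half, and a periodic Hanning window is used so that the two overlapping halves sum to one. The name `synthesize_frame` is hypothetical.

```python
import numpy as np

def synthesize_frame(y_prev, y_cur):
    """Combine two consecutive F-sample echo replica vectors (D = 2).

    y_prev: vector for frame i-1, y_cur: vector for frame i (last axis = time).
    Returns the F/2 combined samples of the overlap region.
    """
    F = y_cur.shape[-1]
    n = np.arange(F)
    w = 0.5 - 0.5 * np.cos(2 * np.pi * n / F)   # periodic Hanning window (assumed)
    half = F // 2
    # second half of the windowed previous frame + first half of the current one
    return (w * y_prev)[..., half:] + (w * y_cur)[..., :half]
```

When both frames are cut from the same underlying signal with a shift of F/2 samples, the combined output reproduces that signal over the overlap region.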

<Subtraction unit 33_p>
The subtraction unit 33_p receives the time-domain echo replica signal vector ŷ'(p,i−1) and the collected sound signal y(p,n) picked up by the microphone 3_p. The echo replica signal is delayed by F/D samples because of frame synthesis. Taking this delay into account, the collected sound signal y(p,n) is divided into blocks, with one frame = F samples and a shift of F/D samples, as
y(p,i-1) = [y(p,((i-1)F/D)-F+1), y(p,((i-1)F/D)-F+2), …, y(p,(i-1)F/D)]^T
to give the collected sound signal vector y(p,i−1). The subtraction unit 33_p subtracts the time-domain echo replica signal vector ŷ'(p,i−1) from the time-domain collected sound signal vector y(p,i−1) as in the following equation (s11) to obtain the time-domain error signal vector e(p,i) (F elements), and outputs it to the residual echo cancellation unit 120 and the error frequency domain conversion unit 41.
e(p,i) = y(p,i-1) - ŷ'(p,i-1)   (11)
With this configuration, the echo canceller 100 cancels the echo.

<Error frequency domain conversion unit 41>
The error frequency domain conversion unit 41 receives the P time-domain error signal vectors e(p,i), zero-pads the time-domain error signal vector e(p,i) of each channel p and converts it to the frequency domain as in the following equation (s15), and outputs the P × 2F frequency-domain error signals E_f(p,i) to the error wave number converter 42.

<Error wave number converter 42>
The error wave number converter 42 receives the P × 2F frequency-domain error signals E_f(p,i), converts them for each frequency f into wave-number-domain error signals E^(W)_f(k,i) by the following equation (s17), and outputs the P × 2F wave-number-domain error signals E^(W)_f(k,i) to the wave number domain echo replica generation unit 21.
E^(W)_f(i) = FFT([E_f(1,i) … E_f(P,i)])
= [E^(W)_f(0,i) … E^(W)_f(k,i) … E^(W)_f(K,i) E^(W)_f(-K+1,i) … E^(W)_f(-1,i)]   (13)

(Correction amount calculation unit 211)
The correction amount calculation unit 211 in the wave number domain echo replica generation unit 21 receives the P × 2F wave-number-domain received signals X^(W)_f(k,i) and the P × 2F wave-number-domain error signals E^(W)_f(k,i) (see FIG. 2 and FIG. 4). For each f (f ≤ F) and for −K+1 ≤ k ≤ K, it calculates the correction amounts dH^(W)_f(k,k+dk,i) (where −δ ≤ dk ≤ δ) for the filter coefficients of the wave-number-domain adaptive filter as in the following equation (s19), and outputs the P × (F+1) × (2δ+1) correction amounts dH^(W)_f(k,k+dk,i) to the filter coefficient unit 213.

Here ρ is a small positive constant that prevents the denominator from becoming zero. B^(W)_f(k,i) in the denominator on the right-hand side corrects the correction amount dH^(W)_f(k,k+dk,i) and is calculated by the following equation.

B^(W)_f(k,i) is the sum of the powers of the received signals X^(W)_f(k−δ,i) to X^(W)_f(k+δ,i), and β is a smoothing constant, taking a value between 0 and 1, used to form a short-time average in the power calculation.

(Filter coefficient unit 213)
The filter coefficient unit 213 in the wave number domain echo replica generation unit 21 receives the P × (F+1) × (2δ+1) correction amounts dH^(W)_f(k,k+dk,i) (for f ≤ F), updates the filter coefficients H^(W)_f(k,k+dk,i) by the following equation (s21), and outputs the P × (F+1) × (2δ+1) updated wave-number-domain filter coefficients H^(W)_f(k,k+dk,i+1) to the multiplication unit 215.
H^(W)_f(k,k+dk,i+1) = H^(W)_f(k,k+dk,i) + μ dH^(W)_f(k,k+dk,i)   (16)
Here μ is a step size taking a value between 0 and 1. The processing in the multiplication unit 215 is as described above.
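One adaptation step can be sketched as follows for the simplified case δ = 0 at a single frequency bin. Only the final update line is taken from equation (16). The recursive power average and the normalised correction `conj(X) * E / (rho + B)` are assumed forms chosen to match the verbal description of ρ, β and B^(W)_f(k,i). The name `update_filter` and the default parameter values are hypothetical.

```python
import numpy as np

def update_filter(H, X, E, B, mu=0.5, beta=0.9, rho=1e-6):
    """One adaptation step for delta = 0 at one frequency bin.

    H, X, E, B: arrays of shape (P,) indexed by wave number slot.
    Returns the updated coefficients and the updated smoothed power.
    """
    B = beta * B + (1.0 - beta) * np.abs(X) ** 2   # smoothed power (assumed form)
    dH = np.conj(X) * E / (rho + B)                # normalised correction (assumed form)
    return H + mu * dH, B                          # eq. (16)
```

Dividing by the smoothed power keeps the effective step size roughly independent of the received signal level, in the manner of a normalised LMS update.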

<Residual echo cancellation unit 120>
The residual echo cancellation unit 120 receives the P × 2F wave-number-domain received signals X^(W)_f(k,i) and the P time-domain error signal vectors e(p,i), estimates the residual echo contained in the wave-number-domain error signal, cancels the estimated residual echo from the wave-number-domain error signal (s23), and outputs P time-domain transmission signals e^(3)(p,n).

FIG. 6 shows a functional block diagram of the residual echo cancellation unit 120, and FIG. 7 shows its processing flow. The residual echo cancellation unit 120 includes a frequency domain conversion unit 121, a wave number converter 122, a wave number domain residual echo estimation and cancellation unit 1231, a wave number domain diffuse residual echo estimation and cancellation unit 1232, an inverse wave number converter 124, a time domain conversion unit 125, and a frame synthesis unit 126. The residual echo includes a component due to the direct wave, which does not involve reflection, and a component due to reflected waves and other waves than the direct wave (the diffuse residual echo). In the residual echo cancellation unit 120, the residual echo due to the direct wave is estimated and cancelled by the wave number domain residual echo estimation and cancellation unit 1231, and the diffuse residual echo by the wave number domain diffuse residual echo estimation and cancellation unit 1232. Details of the processing are described below.

(Frequency domain conversion unit 121)
The frequency domain conversion unit 121 receives the P time-domain error signal vectors e(p,i). For each channel p, it uses the error signal vector e(p,i) of frame i and the error signal vector e(p,i−1) of the preceding frame i−1 to obtain the frequency-domain error signals E^(1)_f(p,i) as in the following equation (s231), and outputs the P × 2F frequency-domain error signals E^(1)_f(p,i) to the wave number converter 122. For example, the conversion to the frequency domain uses the same method as the frequency domain conversion unit 11.
E^(1)(p,i) = FFT([e^T(p,i-1), e^T(p,i)]) = [E^(1)_0(p,i) … E^(1)_f(p,i) … E^(1)_{2F-1}(p,i)]   (17)
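Equation (17) amounts to a 2F-point FFT over the two most recent error frames. A minimal sketch, with the hypothetical name `error_spectrum` and the time axis assumed to be the last array axis:

```python
import numpy as np

def error_spectrum(e_prev, e_cur):
    """e_prev, e_cur: (..., F) error vectors of frames i-1 and i.

    Returns the (..., 2F) spectrum of the concatenated 2F-sample block.
    """
    block = np.concatenate([e_prev, e_cur], axis=-1)   # [e^T(p,i-1), e^T(p,i)]
    return np.fft.fft(block, axis=-1)                  # eq. (17)
```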

(Wave number converter 122)
The wave number converter 122 receives the P × 2F frequency-domain error signals E^(1)_f(p,i), converts them for each frequency f into wave-number-domain error signals E^(W1)_f(k,i) by the following equation (s232), and outputs the P × 2F wave-number-domain error signals E^(W1)_f(k,i) to the wave number domain residual echo estimation and cancellation unit 1231.
E^(W1)_f(i) = FFT([E^(1)_f(1,i) E^(1)_f(2,i) … E^(1)_f(P,i)])
= [E^(W1)_f(0,i) … E^(W1)_f(k,i) … E^(W1)_f(K,i) E^(W1)_f(-K+1,i) … E^(W1)_f(-1,i)]   (18)

(Wave number domain residual echo estimation and cancellation unit 1231)
The wave number domain residual echo estimation and cancellation unit 1231 receives the P × 2F wave-number-domain received signals X^(W)_f(k,i−1) and the P × 2F wave-number-domain error signals E^(W1)_f(k,i). Using these values it estimates, for f ≤ F, the residual echo due to the direct wave contained in the error signal E^(W1)_f(k,i), cancels the estimated direct-wave residual echo from the wave-number-domain error signal (s2331), and obtains P × (F+1) wave-number-domain error signals E^(W2)_f(k,i) from which the direct-wave residual echo has been cancelled. The preceding frame's X^(W)_f(k,i−1) is used as the received signal, rather than X^(W)_f(k,i), to account for the delay introduced when the echo replica signals are frame-synthesized.

Details of the processing are described below.
FIG. 8 shows a functional block diagram of the wave number domain residual echo estimation and cancellation unit 1231, and FIG. 9 shows its processing flow.
The wave number domain residual echo estimation and cancellation unit 1231 includes an input/output correlation coefficient calculation unit 12311, an input/output transfer characteristic estimation unit 12312, a residual echo estimation unit 12313, a residual echo correction unit 12314, and a subtraction unit 12315.

((Input/output correlation coefficient calculation unit 12311))
The input/output correlation coefficient calculation unit 12311 receives the P × 2F wave-number-domain received signals X^(W)_f(k,i−1) and the P × 2F wave-number-domain error signals E^(W1)_f(k,i). For f ≤ F, in order to estimate the transfer characteristic of the system whose output is the wave-number-domain residual echo signal, it calculates from the wave-number-domain received signal X^(W)_f(k,i−1) and the wave-number-domain error signal E^(W1)_f(k,i) at time n = iF/D
P_f(k,i) = E[X^(W)*_f(k,i-1) X^(W)_f(k,i-1)]
Q_f(k,i) = E[X^(W)*_f(k,i-1) E^(W1)_f(k,i)]   (19)
that is, the power spectrum P_f(k,i) of the received signal and the cross spectrum Q_f(k,i) between the received signal and the error signal (s2331a), and outputs them to the input/output transfer characteristic estimation unit 12312. Here i is the frame number, related to time n by n = iF/D, * denotes the complex conjugate, and E[·] denotes the average of ·. One example of the averaging is
E[X^(W)*_f(k,i-1) X^(W)_f(k,i-1)] = β E[X^(W)*_f(k,i-2) X^(W)_f(k,i-2)] + (1-β) X^(W)*_f(k,i-1) X^(W)_f(k,i-1)
which uses the result for the preceding frame and a smoothing constant β taking a value between 0 and 1. Another is to compute a statistical average over the past several to several tens of frames.
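The recursive averaging can be sketched as follows. The update for the power spectrum follows the example given above. Applying the same recursion to the cross spectrum is an assumption of this sketch, and the name `update_spectra` is hypothetical.

```python
import numpy as np

def update_spectra(P_prev, Q_prev, X, E, beta=0.9):
    """Recursive averages behind eq. (19) for one frequency bin.

    P_prev, Q_prev: previous power and cross spectra, shape (P,).
    X: received signal X^(W)_f(k, i-1); E: error signal E^(W1)_f(k, i).
    """
    P_new = beta * P_prev + (1.0 - beta) * (np.conj(X) * X).real
    Q_new = beta * Q_prev + (1.0 - beta) * np.conj(X) * E   # assumed analogous
    return P_new, Q_new
```

A value of β close to 1 averages over more frames and gives a smoother but slower-tracking estimate.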

((Input/output transfer characteristic estimation unit 12312))
The input/output transfer characteristic estimation unit 12312 receives the P × (F+1) power spectra P_f(k,i) and the P × (F+1) cross spectra Q_f(k,i). For each f (f ≤ F), it estimates the input/output transfer characteristic between the received signal and the error signal from the power spectrum P_f(k,i) and the cross spectrum Q_f(k,i) by the following equation (s2331b), and outputs the estimate G'_f(k,i) to the residual echo estimation unit 12313.

The estimate G'_f(k,i) may also be smoothed by the following equation, and the smoothed estimate G_f(k,i) output to the residual echo estimation unit 12313.

In the present embodiment, the smoothed estimate G_f(k,i) is output. Here β_2 is a constant for smoothing the estimate of the input/output transfer characteristic and takes a value between 0 and 1.

((Residual echo estimation unit 12313))
The residual echo estimation unit 12313 receives the P × (F+1) wave-number-domain received signals X^(W)_f(k,i−1) and the P × (F+1) estimates G_f(k,i). For each f (f ≤ F), it multiplies the received signal X^(W)_f(k,i−1) by the estimate G_f(k,i) as in the following equation to estimate the residual echo (s2331c), and outputs the estimate ΔE^(W1)_f(k,i) to the residual echo correction unit 12314.
ΔE^(W1)_f(k,i) = G_f(k,i) X^(W)_f(k,i-1)   (21)
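The chain from the spectra to the residual echo estimate can be sketched as follows. Only the final multiplication is equation (21). The raw estimate as the ratio Q/P and the first-order recursive smoothing with β_2 are assumed forms consistent with the description of the input/output transfer characteristic estimation unit 12312. The name `estimate_residual_echo` and the guard `eps` are hypothetical.

```python
import numpy as np

def estimate_residual_echo(P_spec, Q_spec, G_prev, X_prev, beta2=0.8, eps=1e-12):
    """Residual echo estimate for one frequency bin.

    P_spec, Q_spec: power and cross spectra, shape (P,).
    G_prev: smoothed transfer characteristic from the previous frame.
    X_prev: received signal X^(W)_f(k, i-1).
    """
    G_raw = Q_spec / (P_spec + eps)                 # assumed form of G'
    G = beta2 * G_prev + (1.0 - beta2) * G_raw      # assumed recursive smoothing
    return G, G * X_prev                            # eq. (21)
```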

((Residual echo correction unit 12314))
The residual echo correction unit 12314 receives the P × (F+1) estimates ΔE^(W1)_f(k,i) and the P × 2F wave-number-domain error signals E^(W1)_f(k,i). For each f (f ≤ F), it corrects the estimate by the following equation (s2331d) and outputs the corrected residual echo estimate ΔE^{II(W1)}_f(k,i) to the subtraction unit 12315.

Here S^(W)_f(k,i) in the equation is an estimate of the transmission signal and is calculated by the following equation.
S^(W)_f(k,i) = E^(W1)_f(k,i) - ΔE^(W1)_f(k,i)   (23)
T is the number of degrees of freedom of the estimate of each spectrum; it corresponds to the number of frames used when the input/output correlation coefficient calculation unit 12311 calculates the power spectrum P_f(k,i) and the cross spectrum Q_f(k,i). M is the number of input variables; in the case of equation (20), M = 1. F_{2M,T-2M,alpha} is the 100 × alpha percentage point of the F distribution with n_1 = 2M and n_2 = T−2M degrees of freedom.

The F distribution is a continuous probability distribution used in statistics. In analysis of variance, which is one method of statistical hypothesis testing, it is used to decompose the variation in observed data into error variation and the variation due to each factor, and to judge the effect and significance of each factor.

According to Reference 1, when M = 1 the confidence interval of the input/output transfer characteristic estimate G_f(k,i) obtained in the input/output transfer characteristic estimation unit 12312 has, expressed as a ratio to the true value, the width given by the following expression.

(Reference 1) J. S. Bendat and A. G. Piersol, "Statistical Processing of Random Data" (in Japanese), Baifukan, 1976, pp. 194-197.

In estimation by the input/output transfer characteristic estimation unit 12312 based on short-time spectra, the correlation between the transmitted speech and the residual echo tends to be estimated higher than it actually is, and the transfer characteristic therefore tends to be overestimated. For this reason, the correction described above takes the lower end of the confidence interval of the residual echo as the corrected residual echo value.

((Subtraction unit 12315))
The subtraction unit 12315 receives the P × 2F wave-number-domain error signals E^(W1)_f(k,i) and the P × (F+1) corrected wave-number-domain residual echo estimates ΔE^{II(W1)}_f(k,i). For each f (f ≤ F), it subtracts the residual echo estimate ΔE^{II(W1)}_f(k,i) from the error signal E^(W1)_f(k,i) in the wave number domain as in the following equation (s2331e) to obtain the difference E^(W2)_f(k,i), and outputs it to the wave number domain diffuse residual echo estimation and cancellation unit 1232.
E^(W2)_f(k,i) = E^(W1)_f(k,i) - ΔE^{II(W1)}_f(k,i)   (25)
The difference E^(W2)_f(k,i) is the error signal E^(W1)_f(k,i) with the residual echo due to the direct wave cancelled, and is also referred to as the error signal E^(W2)_f(k,i).

(Wave number domain diffuse residual echo estimation and cancellation unit 1232)
The wave number domain diffuse residual echo estimation and cancellation unit 1232 receives the P × 2F wave-number-domain received signals X^(W)_f(k,i−2) and the P × (F+1) wave-number-domain error signals E^(W2)_f(k,i). Using these values it estimates, for f ≤ F, the diffuse residual echo contained in the error signal E^(W2)_f(k,i), cancels the estimated diffuse residual echo from the wave-number-domain error signal E^(W2)_f(k,i) to obtain P × (F+1) wave-number-domain transmission signals E^(W3)_f(k,i) (s2332), and outputs them to the inverse wave number converter 124.

The wave number domain diffuse residual echo estimation and cancellation unit 1232 targets the diffuse residual echo, that is, echo reflected and diffused by walls and similar surfaces, by (1) using the received signal X^(W)_f(k,i−2), which is one frame earlier than the one used by the wave number domain residual echo estimation and cancellation unit 1231, and (2) treating the received signals X^(W)_f(k,i−2) as a vector (hereinafter also called the wave-number-domain received signal vector), defined as
X^(W)_f(i-2) = [X^(W)_f(0,i-2) … X^(W)_f(k,i-2) … X^(W)_f(K,i-2) X^(W)_f(-K+1,i-2) … X^(W)_f(-1,i-2)]
Details of the processing are described below.

FIG. 10 shows a functional block diagram of the wave number domain diffuse residual echo estimation and cancellation unit 1232, and FIG. 11 shows its processing flow.

The wave number domain diffuse residual echo estimation and cancellation unit 1232 includes an input dimension compression unit 12320, a dimension compression matrix update unit 12326, a compressed input/output correlation coefficient calculation unit 12321, a compressed input/output transfer characteristic estimation unit 12322, a diffuse residual echo estimation unit 12323, a diffuse residual echo correction unit 12324, and a subtraction unit 12325.

((Input dimension compression unit 12320))
The input dimension compression unit 12320 receives the (F+1) compression matrices W_f(i−1), each of size P' × P, updated by the dimension compression matrix update unit 12326 described later, and the P × 2F wave-number-domain received signals X^(W)_f(k,i−2). The P × 2F wave-number-domain received signals X^(W)_f(k,i−2) are treated as 2F wave-number-domain received signal vectors X^(W)_f(i−2). For f ≤ F, the input dimension compression unit 12320 uses the compression matrix W_f(i−1) to compress the wave-number-domain received signal vector X^(W)_f(i−2) into a P'-dimensional wave-number-domain compressed vector Z^(W)_f(i−2) (s2332a), and outputs it to the compressed input/output correlation coefficient calculation unit 12321 and the dimension compression matrix update unit 12326.

Z^(W)_f(i-2) = W_f(i-1) X^(W)_f(i-2)
Here P' < P. The value of P' may be set as appropriate for the environment (for example, the size of the room and the degree of reverberation); for example, it can be set to about 1/5 to 1/10 of P.

((Dimension compression matrix update unit 12326))
The dimension compression matrix update unit 12326 receives the (F+1) wave number domain compressed vectors Z^(W)_f(i-2) and the P×2F wave number domain received signals X^(W)_f(k, i-2). The P×2F wave number domain received signals X^(W)_f(k, i-2) are treated as 2F wave number domain received signal vectors X^(W)_f(i-2). For each f ≤ F, the dimension compression matrix update unit 12326 expands the wave number domain compressed vector Z^(W)_f(i-2) with the complex conjugate transpose W^H_f(i-1) of the compression matrix W_f(i-1), and obtains the difference dX^(W)_f(i-2) from the wave number domain received signal vector X^(W)_f(i-2). Here (·)^H denotes the complex conjugate transpose.
dX^(W)_f(i-2) = X^(W)_f(i-2) - W^H_f(i-1) Z^(W)_f(i-2)
= X^(W)_f(i-2) - W^H_f(i-1) W_f(i-1) X^(W)_f(i-2)
The compression matrix W_f(i-1) is then updated so that the magnitude of the difference dX^(W)_f(i-2) is minimized (s2332g), and the updated compression matrix W_f(i) is output to the input dimension compression unit 12320.
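The difference dX can be read as the part of the received signal vector that the compressed representation fails to reproduce. The following Python/NumPy sketch illustrates this under the assumption that the compression matrix has orthonormal rows; the sizes, the random data, and the helper name `reconstruction_error` are hypothetical.

```python
import numpy as np

def reconstruction_error(W, X):
    """dX = X - W^H (W X): what is lost when X is compressed and expanded again."""
    Z = W @ X                      # compress to P' dimensions
    return X - W.conj().T @ Z      # expand with W^H and compare with X

rng = np.random.default_rng(1)
P, P_c = 6, 2

# Hypothetical compression matrix with orthonormal rows (P' x P).
Q_basis, _ = np.linalg.qr(rng.standard_normal((P, P_c))
                          + 1j * rng.standard_normal((P, P_c)))
W = Q_basis.conj().T

# A vector inside the subspace spanned by the columns of W^H is reproduced exactly.
X_in = W.conj().T @ np.array([1.0 + 2.0j, -0.5j])
# A generic vector is only approximated.
X_any = rng.standard_normal(P) + 1j * rng.standard_normal(P)

print(np.linalg.norm(reconstruction_error(W, X_in)))    # close to 0
print(np.linalg.norm(reconstruction_error(W, X_any)))   # generally larger than 0
```

Minimizing the magnitude of dX therefore steers the rows of the compression matrix toward the subspace in which the received signal vectors are concentrated.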

For this update, a subspace tracking method can be used, for example. As one example, the details of using OPSA1 from Reference 2 are given below.

The inverse matrix R^-1_ZZ(i-2) of the autocorrelation matrix R_ZZ(i-2) of the wave number domain compressed vector Z^(W)_f(i-2) is estimated recursively from the initial value R^-1_ZZ(0) = δ_0^-1 I. Here δ_0 is a positive constant that prevents division by zero when the recursive estimation is executed for the first time, and I is the P'×P' identity matrix. k(i) is a P'-dimensional intermediate vector, and V(i) is a P-dimensional intermediate vector. λ is a forgetting factor that takes a value between 0 and 1 and determines the estimation speed. The compression matrix W_f(i) can be updated as follows.
k(i) = R^-1_ZZ(i-3) Z^(W)(i-2) / {λ + Z^(W)H(i-3) R^-1_ZZ(i-3) Z^(W)(i-2)}
R^-1_ZZ(i-2) = (1/λ) {R^-1_ZZ(i-3) - k(i) Z^(W)H(i-2) R^-1_ZZ(i-3)}
V(i) = dX^(W)_f(i-2) - 0.5 ||dX^(W)_f(i-2)||² W^H_f(i-1) k(i)
W_f(i) = W_f(i-1) + k(i) V^H(i) / {1 + 0.25 ||dX^(W)_f(i-2)||² ||k(i)||²}
(Reference 2) S. C. Douglas and X. Sun, "Designing orthonormal subspace tracking algorithms", Thirty-Fourth Asilomar Conference on Signals, Systems and Computers, 2000, vol. 2, pp. 1441-1445.
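One iteration of the four update equations above can be sketched as follows in Python with NumPy. This is only an illustration: the sketch uses the same compressed vector in the numerator and the denominator of the gain k(i) (the usual recursive least-squares form, whereas the printed equation carries the index i-3 on the conjugated vector in the denominator), and the function name `opsa1_step`, the values λ = 0.99 and δ_0 = 1, and the random input data are assumptions.

```python
import numpy as np

def opsa1_step(W, R_inv, X, lam=0.99):
    """One update of the compression matrix W (P' x P) and of R^-1_ZZ (P' x P')."""
    Z = W @ X                                    # compressed vector
    dX = X - W.conj().T @ Z                      # reconstruction error
    Rz = R_inv @ Z
    k = Rz / (lam + np.vdot(Z, Rz))              # gain; vdot conjugates its first argument
    R_inv = (R_inv - np.outer(k, Z.conj()) @ R_inv) / lam
    e2 = np.vdot(dX, dX).real                    # ||dX||^2
    V = dX - 0.5 * e2 * (W.conj().T @ k)
    W = W + np.outer(k, V.conj()) / (1.0 + 0.25 * e2 * np.vdot(k, k).real)
    return W, R_inv

rng = np.random.default_rng(0)
P, P_c = 8, 2
Q0, _ = np.linalg.qr(rng.standard_normal((P, P_c))
                     + 1j * rng.standard_normal((P, P_c)))
W = Q0.conj().T                                  # initial W with orthonormal rows
R_inv = np.eye(P_c, dtype=complex) / 1.0         # R^-1_ZZ(0) = (1/delta_0) I, delta_0 = 1

for _ in range(200):
    X = rng.standard_normal(P) + 1j * rng.standard_normal(P)
    W, R_inv = opsa1_step(W, R_inv, X)

# The rows of W stay orthonormal, up to rounding, after every update.
print(np.allclose(W @ W.conj().T, np.eye(P_c), atol=1e-6))
```

Substituting V(i) into the last equation shows that W_f(i) W^H_f(i) remains the identity whenever W_f(i-1) W^H_f(i-1) is the identity, which is what the final check exercises.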

((Compressed input/output correlation coefficient calculation unit 12321))
The compressed input/output correlation coefficient calculation unit 12321 receives the (F+1) wave number domain compressed vectors Z^(W)_f(i-2) and the P×(F+1) wave number domain error signals E^(W2)_f(k, i). The P×(F+1) wave number domain error signals E^(W2)_f(k, i) are treated as (F+1) wave number domain error signal vectors E^(W2)_f(i), where E^(W2)_f(i) = [E^(W2)_f(0, i) … E^(W2)_f(k, i) … E^(W2)_f(K, i) E^(W2)_f(-K+1, i) … E^(W2)_f(-1, i)] and f ≤ F. For each f ≤ F, the compressed input/output correlation coefficient calculation unit 12321 calculates, from the wave number domain compressed vector Z^(W)_f(i-2) and the wave number domain error signal vector E^(W2)_f(i), the power spectrum matrix P^(2)_f(i) of the compressed received signal and the cross spectrum matrix Q^(2)_f(i) between the compressed received signal and the error signal by the following equations (s2332b), and outputs them to the compressed input/output transfer characteristic estimation unit 12322.
P^(2)_f(i) = E[Z^(W)_f(i-2) Z^(W)H_f(i-2)]
Q^(2)_f(i) = E[E^(W2)_f(i) Z^(W)H_f(i-2)]
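The expectation E[·] in the two equations above has to be approximated from the frames observed so far. The sketch below assumes a recursive (exponentially weighted) average with a smoothing constant β; the value β = 0.98 is the one quoted for correlation calculation in the simulation section, but the recursion itself, the function name `update_spectra`, and the test data are assumptions of this illustration.

```python
import numpy as np

def update_spectra(P2, Q2, Z, E, beta=0.98):
    """Recursive estimates of P = E[Z Z^H] (P' x P') and Q = E[E Z^H] (P x P')."""
    P2 = beta * P2 + (1.0 - beta) * np.outer(Z, Z.conj())
    Q2 = beta * Q2 + (1.0 - beta) * np.outer(E, Z.conj())
    return P2, Q2

rng = np.random.default_rng(0)
P, P_c = 6, 2
P2 = np.zeros((P_c, P_c), dtype=complex)
Q2 = np.zeros((P, P_c), dtype=complex)

for _ in range(50):
    Z = rng.standard_normal(P_c) + 1j * rng.standard_normal(P_c)   # compressed vector
    E = rng.standard_normal(P) + 1j * rng.standard_normal(P)       # error signal vector
    P2, Q2 = update_spectra(P2, Q2, Z, E)

print(P2.shape, Q2.shape)   # (2, 2) (6, 2)
```

One such pair of matrices would be maintained for each of the (F+1) frequency bins.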

((Compressed input/output transfer characteristic estimation unit 12322))
The compressed input/output transfer characteristic estimation unit 12322 receives the power spectrum matrix P^(2)_f(i), which is a P'×P' matrix, and the cross spectrum matrix Q^(2)_f(i), which is a P×P' matrix. There are (F+1) of each matrix. For each f (f ≤ F), the compressed input/output transfer characteristic estimation unit 12322 obtains the input/output transfer characteristic matrix G'_f(i) from the power spectrum matrix P^(2)_f(i) and the cross spectrum matrix Q^(2)_f(i) by the following equation (s2332c), and outputs it to the diffuse residual echo estimation unit 12323.

[Equation not reproduced in this text.]

The input/output transfer characteristic matrix G'_f(i) is a P×P' matrix whose elements are estimated values of the input/output transfer characteristics between the compressed received signal and the error signal. In compressing the received signal, the wave number domain received signal vector (whose elements are the individual wave number components) is decomposed into, and approximated by, its principal components (principal patterns), in a manner similar to principal component analysis. The correspondence between each principal component and each wave number component of the residual echo is described by the input/output transfer characteristic matrix G'_f(i).

The estimated matrix G'_f(i) may also be smoothed by the following equation, and the smoothed input/output transfer characteristic matrix G_f(i) may be output to the diffuse residual echo estimation unit 12323.

[Equation not reproduced in this text.]

In this embodiment, the smoothed input/output transfer characteristic matrix G_f(i) is output. Here β_2 is a constant for smoothing the estimated input/output transfer characteristics and takes a value between 0 and 1.
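Because the estimation and smoothing equations do not survive in this text, the following Python/NumPy sketch rests on two explicit assumptions: that the transfer characteristic matrix takes the least-squares form G' = Q P^-1 (which is consistent with the stated matrix sizes and with the later remark that the estimation requires a matrix inverse), and that the smoothing is a first-order recursion G(i) = β_2 G(i-1) + (1 - β_2) G'(i). The function name `estimate_transfer`, the small regularization term `eps`, and the test data are hypothetical.

```python
import numpy as np

def estimate_transfer(P2, Q2, G_prev, beta2=0.1, eps=1e-8):
    """Assumed forms: G' = Q P^-1, then G = beta2 * G_prev + (1 - beta2) * G'."""
    n = P2.shape[0]
    A = P2 + eps * np.eye(n)                      # guard against a singular P
    # Solve G' A = Q without forming A^-1:  A^H G'^H = Q^H.
    G_raw = np.linalg.solve(A.conj().T, Q2.conj().T).conj().T
    G = beta2 * G_prev + (1.0 - beta2) * G_raw
    return G_raw, G

rng = np.random.default_rng(0)
P, P_c, T = 6, 2, 200
G_true = rng.standard_normal((P, P_c)) + 1j * rng.standard_normal((P, P_c))
Zs = rng.standard_normal((T, P_c)) + 1j * rng.standard_normal((T, P_c))
Es = Zs @ G_true.T                                # row t holds G_true @ Zs[t]

P2 = Zs.T @ Zs.conj() / T                         # average of Z Z^H over T frames
Q2 = Es.T @ Zs.conj() / T                         # average of E Z^H over T frames

G_raw, G = estimate_transfer(P2, Q2, np.zeros((P, P_c), dtype=complex))
print(np.allclose(G_raw, G_true, atol=1e-6))      # True
```

With noise-free data generated from a known matrix, the assumed estimator recovers that matrix, which is the behavior one would expect of any estimator of the input/output transfer characteristics.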

((Diffuse residual echo estimation unit 12323))
The diffuse residual echo estimation unit 12323 receives the (F+1) wave number domain compressed vectors Z^(W)_f(i-2) and the (F+1) input/output transfer characteristic matrices G_f(i). For each f (f ≤ F), it multiplies the compressed vector Z^(W)_f(i-2) by the input/output transfer characteristic matrix G_f(i) as in the following equation to obtain the diffuse residual echo vector ΔE^(W2)_f(i) (s2332d), and outputs it to the diffuse residual echo correction unit 12324.
ΔE^(W2)_f(i) = G_f(i) Z^(W)_f(i-2)
The diffuse residual echo vector ΔE^(W2)_f(i) is a P-dimensional vector whose elements are the estimated values of the diffuse residual echo for each wave number.

((Diffuse residual echo correction unit 12324))
The diffuse residual echo correction unit 12324 receives the (F+1) diffuse residual echo vectors ΔE^(W2)_f(i) and the P×(F+1) wave number domain error signals E^(W2)_f(k, i). For each f (f ≤ F), it corrects each element ΔE^(W2)_f(k, i) of the diffuse residual echo vector ΔE^(W2)_f(i) by the following equation (s2332e), and outputs the corrected estimated value ΔE^II(W2)_f(k, i) of the diffuse residual echo to the subtraction unit 12325.

[Equation not reproduced in this text.]

Here S^(W2)_f(k, i) in the equation is the estimated value of the transmission signal and is calculated by the following equation.
S^(W2)_f(k, i) = E^(W2)_f(k, i) - ΔE^(W2)_f(k, i)
T is the number of degrees of freedom in the estimation of each spectrum and corresponds to the number of frames used when the compressed input/output correlation coefficient calculation unit 12321 calculates the power spectrum matrix P^(2)_f(i) and the cross spectrum matrix Q^(2)_f(i). M is the number of input variables; in the case of Equation (30), M = 1. F_{2M, T-2M, alpha} is the 100×alpha percentage point of the F distribution with degrees of freedom n_1 = 2M and n_2 = T-2M.

((Subtraction unit 12325))
The subtraction unit 12325 receives the P×(F+1) wave number domain error signals E^(W2)_f(k, i) and the P×(F+1) corrected estimated values ΔE^II(W2)_f(k, i) of the wave number domain diffuse residual echo. For each f (f ≤ F), it subtracts the estimated value ΔE^II(W2)_f(k, i) of the diffuse residual echo from the error signal E^(W2)_f(k, i) in the wave number domain as in the following equation (s2332f), obtains the difference as the wave number domain transmission signal E^(W3)_f(k, i), and outputs it to the inverse wave number conversion unit 124.
E^(W3)_f(k, i) = E^(W2)_f(k, i) - ΔE^II(W2)_f(k, i)
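The estimation and subtraction steps can be sketched together over all (F+1) bins in Python with NumPy. The sketch uses the uncorrected estimate ΔE = G Z, that is, the modification in which the diffuse residual echo correction unit 12324 is removed; the array layout, the function name `cancel_diffuse_echo`, and the synthetic data are assumptions.

```python
import numpy as np

def cancel_diffuse_echo(E2, G, Z):
    """E2: (F+1, P) error signals, G: (F+1, P, P') matrices, Z: (F+1, P') vectors."""
    dE = np.einsum("fpq,fq->fp", G, Z)   # dE_f = G_f Z_f for every bin f
    return E2 - dE, dE                   # E^(W3) = E^(W2) - dE

def crandn(rng, *shape):
    """Complex Gaussian test data (hypothetical helper)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

rng = np.random.default_rng(0)
F, P, P_c = 4, 6, 2
G = crandn(rng, F + 1, P, P_c)           # transfer characteristic matrices
Z = crandn(rng, F + 1, P_c)              # compressed received signal vectors
S = crandn(rng, F + 1, P)                # stand-in for the near-end transmission signal

E2 = S + np.einsum("fpq,fq->fp", G, Z)   # error signal = near-end signal + diffuse echo
E3, dE = cancel_diffuse_echo(E2, G, Z)
print(np.allclose(E3, S))                # True
```

When the transfer characteristic matrices are exact, the subtraction leaves only the near-end component; in practice the quality of the result depends on how well G_f(i) has been estimated.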

(Inverse wave number conversion unit 124)
The inverse wave number conversion unit 124 receives the P×(F+1) wave number domain transmission signals E^(W3)_f(k, i) (see FIG. 6) and, for each f (f ≤ F), converts them into the frequency domain transmission signals E^(3)_f(p, i) for each frequency f as in the following equation (s234).
[E^(3)_f(1, i) E^(3)_f(2, i) … E^(3)_f(P, i)]
= IFFT([E^(W3)_f(0, i) … E^(W3)_f(k, i) … E^(W3)_f(K, i) E^(W3)_f(-K+1, i) … E^(W3)_f(-1, i)])
For frequencies f > F, the frequency domain transmission signals E^(3)_f(p, i) are obtained by the following equation, using the symmetry of the FFT of a real signal.
E^(3)_f(p, i) = conj(E^(3)_{2F-f}(p, i))
The total of P×2F frequency domain transmission signals E^(3)_f(p, i) obtained in this way are output to the time domain conversion unit 125. The inverse wave number conversion method may be any method that corresponds to the wave number domain conversion method used in the wave number conversion unit 122.
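The two steps above, the IFFT across wave numbers for f ≤ F and the conjugate-symmetric fill for f > F, can be sketched in Python with NumPy as follows. The array layout (one row per frequency bin, one column per wave number or channel) and the function name `inverse_wavenumber` are assumptions; the wave number ordering 0, …, K, -K+1, …, -1 coincides with the bin ordering that `numpy.fft` uses for an even length P = 2K.

```python
import numpy as np

def inverse_wavenumber(E_w3):
    """E_w3: (F+1, P) wave number domain signals -> (2F, P) frequency domain signals."""
    F = E_w3.shape[0] - 1
    E3_low = np.fft.ifft(E_w3, axis=1)        # bins f = 0..F, IFFT across wave numbers
    E3_high = np.conj(E3_low[F - 1:0:-1])     # bins f = F+1..2F-1 taken from bin 2F - f
    return np.concatenate([E3_low, E3_high], axis=0)

rng = np.random.default_rng(0)
F, P = 4, 6
E_w3 = rng.standard_normal((F + 1, P)) + 1j * rng.standard_normal((F + 1, P))

E3 = inverse_wavenumber(E_w3)
print(E3.shape)                               # (8, 6)
```

Only the (F+1) lower bins need to be transformed; the remaining F-1 bins follow from the symmetry, which roughly halves the work of this stage.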

(Time domain conversion unit 125)
The time domain conversion unit 125 receives the P×2F frequency domain transmission signals E^(3)_f(p, i), applies an inverse FFT to the frequency domain transmission signals E^(3)_f(p, i) for each channel p as in the following equation to convert them into the time domain transmission signal vector e^(3)(p, i) (2F elements) (s235), and outputs it to the frame synthesis unit 126.
e^(3)(p, i) = IFFT([E^(3)_0(p, i) … E^(3)_f(p, i) … E^(3)_{2F-1}(p, i)])
The time domain conversion method may be any method that corresponds to the frequency domain conversion method used in the frequency domain conversion unit 121.
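As an illustration of this step, the sketch below converts one frame back to the time domain with NumPy. It assumes that the frequency domain signals are stored as a 2F×P array and uses `numpy.fft.irfft`, which reads only bins 0 to F and relies on the conjugate symmetry of the remaining bins; the function name `to_time_domain` and the test frame are hypothetical.

```python
import numpy as np

def to_time_domain(E3):
    """E3: (2F, P) frequency domain signals -> (2F, P) real time-domain frames."""
    F = E3.shape[0] // 2
    # irfft uses bins 0..F and treats bins F+1..2F-1 as their complex conjugates.
    return np.fft.irfft(E3[:F + 1], n=2 * F, axis=0)

rng = np.random.default_rng(0)
F, P = 8, 4
e_true = rng.standard_normal((2 * F, P))      # real time-domain frame for each channel
E3 = np.fft.fft(e_true, axis=0)               # stand-in for the P x 2F frequency domain signals

e3 = to_time_domain(E3)
print(np.allclose(e3, e_true))                          # True
print(np.allclose(np.fft.ifft(E3, axis=0).real, e3))    # True
```

The second check confirms that, for a conjugate-symmetric spectrum, the full 2F-point inverse FFT in the equation above and the real-input shortcut give the same frame.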

(Frame synthesis unit 126)
The frame synthesis unit 126 receives the P time domain transmission signal vectors e^(3)(p, i). When the frequency domain conversion unit 121 has framed the received signal x(p, n) with D ≥ 2, the frame synthesis unit 126 applies windowing to the transmission signal e^(3)(p, i) obtained for frame i and the transmission signal e^(3)(p, i-1) obtained for the preceding frame i-1, combines them (s236), and sequentially outputs the elements e^(3)(p, n-F/D+1), e^(3)(p, n-F/D+2), …, e^(3)(p, n) of the combined transmission signal vector e^(3)'(p, i) (F/D elements) as the output values of the echo canceling device 100. Here n = iF/D. The processing is the same as that of the frame synthesis unit 34.

<Modification>
The residual echo canceling unit 120 can also be used on its own as an echo canceling device. That is, it can be used in a configuration in which the adaptive filter portion (also called the echo canceling unit) of FIG. 2, which consists of the frequency domain conversion unit 11, the wave number conversion unit 12, the wave number domain echo replica generation unit 21, the inverse wave number conversion unit 31, the time domain conversion unit 32, the frame synthesis unit 34, the P subtraction units 33_p, the error frequency domain conversion unit 41, and the error wave number conversion unit 42, is removed. In that case, the residual echo canceling unit 120 receives the collected sound signal y(p, n) in place of the error signal vector e(p, i), vectorizes it, and performs the same processing.

In the wave number domain residual echo estimation canceling unit 1231, the residual echo correction unit 12314 may be removed. Likewise, in the wave number domain diffuse residual echo estimation canceling unit 1232, the diffuse residual echo correction unit 12324 may be removed. In that case, each subtraction unit receives the uncorrected signal and performs the same processing.

The residual echo canceling unit 120 can also be used in a configuration in which the wave number domain residual echo estimation canceling unit 1231 is removed and the wave number domain diffuse residual echo estimation canceling unit 1232 is used alone. In this case, the inputs to the wave number domain diffuse residual echo estimation canceling unit 1232 change as shown in FIGS. 12 and 13. FIG. 12 is a functional block diagram of the residual echo canceling unit 120 when the wave number domain diffuse residual echo estimation canceling unit 1232 is used alone, and FIG. 13 is a functional block diagram of the wave number domain diffuse residual echo estimation canceling unit 1232. The receive-side signal changes from the P×2F wave number domain received signals X^(W)_f(k, i-2) to the P×2F wave number domain received signals X^(W)_f(k, i-1). In addition, because the wave number domain residual echo estimation canceling unit 1231 is absent, the error signal becomes the P×2F wave number domain error signals E^(W2)_f(k, i) = E^(W1)_f(k, i). This configuration is effective when the frame length has been made long, so that the direct component and the reflected components of the received signal X^(W)_f(k, i-1) are mixed in the error signal E^(W1)_f(k, i).

Furthermore, both the echo canceling unit and the wave number domain residual echo estimation canceling unit 1231 may be removed. In that case, the collected sound signal y(p, n) is received in place of the P×2F wave number domain error signals E^(W2)_f(k, i), converted into the wave number domain collected sound signal Y^(W)_f(k, i), and subjected to the same processing.

As the method for obtaining the echo replica in the wave number domain, existing techniques other than the method described above may be used. Existing techniques may also be used to obtain the echo replica in the frequency domain or the time domain. However, a configuration that subtracts a time domain echo replica from the time domain collected sound signal is known to give more accurate echo cancellation, so even when the echo replica is obtained in the frequency domain, it is desirable to convert it to the time domain and then subtract it from the time domain collected sound signal.

In the first embodiment, the case where the number of channels P is even has been described, but P may also be odd (P = 2K+1).

In this embodiment, the input dimension compression unit 12320 compresses the wave number domain received signal vector X^(W)_f(i-2) into the wave number domain compressed vector Z^(W)_f(i-2), but compression is not strictly necessary. Without compression, the processing that follows the input dimension compression unit 12320 uses the wave number domain received signal vector X^(W)_f(i-2) in place of the wave number domain compressed vector Z^(W)_f(i-2). For example, the compressed input/output correlation coefficient calculation unit 12321 obtains the power spectrum matrix P^(2)_f(i) and the cross spectrum matrix Q^(2)_f(i) by the following two equations, respectively.
P^(2)_f(i) = E[X^(W)_f(i-2) X^(W)H_f(i-2)]
Q^(2)_f(i) = E[E^(W2)_f(i) X^(W)H_f(i-2)]
In this case, the input dimension compression unit 12320 and the dimension compression matrix update unit 12326 may be removed. Alternatively, the processing of the dimension compression matrix update unit 12326 may be removed, and the input dimension compression unit 12320 may use the P×P identity matrix in place of the compression matrix W_f(i-1). Even with such a configuration, residual echo can be reduced more than with the conventional method by taking reflections from walls and the like into account.

<Effect>
In the conventional method, the transfer characteristic from the wave number domain received signal X^(W)_f(i) to the wave number domain error signal E^(W1)_f(i) is estimated as a diagonal matrix in order to cancel residual echo. This corresponds to estimating the residual echo while considering only the direct propagation of wavefronts.

In the present configuration, the transfer characteristic from the wave number domain received signal X^(W)_f(i) to the wave number domain error signal E^(W2)_f(i) is estimated as a matrix, the wave number domain diffuse residual echo vector is estimated, and this vector is subtracted from the wave number domain error signal vector E^(W1)_f(i). This corresponds to estimating the residual echo while taking into account the arrival of wavefronts reflected from the ceiling and walls.

As a result, even when echo path estimation and cancellation by the wave number domain adaptive filter are insufficient, residual echo can be reduced quickly, and by more than with the conventional method, regardless of the conversation state, because reflections from walls and the like are taken into account.

Furthermore, dimension compression of the received signal reduces the amount of memory and computation required for the residual echo estimation. The memory required to store the correlation matrix of the received signal is proportional to the square of the dimension, so compressing the input dimension by a factor of a (0 < a < 1) reduces the memory to a^2 times the original amount. The inverse matrix calculation in the residual echo transfer characteristic estimation requires computation proportional to the cube of the dimension, so compressing the input dimension by a factor of a (0 < a < 1) reduces this computation to a^3 times the original amount.
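The a^2 and a^3 scaling can be made concrete with a short calculation. The values P = 32 and P' = 8 below correspond to the 1/4 compression used in the simulation; the function name `compression_savings` and the dictionary keys are hypothetical, and the figures count only the correlation matrix entries and the cubic cost of inversion, not the overhead of the compression itself.

```python
def compression_savings(P, P_c):
    """Ratios of memory and inversion cost after compressing P dimensions to P'."""
    a = P_c / P
    return {
        "a": a,
        "correlation_entries": (P * P, P_c * P_c),   # before, after
        "memory_ratio": a ** 2,                      # matrix storage scales with dimension^2
        "inverse_cost_ratio": a ** 3,                # matrix inversion scales with dimension^3
    }

s = compression_savings(32, 8)
print(s["correlation_entries"])                      # (1024, 64)
print(s["memory_ratio"], s["inverse_cost_ratio"])    # 0.0625 0.015625
```

At a = 1/4, then, each correlation matrix shrinks from 1024 to 64 entries and the inversion cost falls to about 1/64 of its uncompressed value, per frequency bin.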

<Simulation results>
To verify the effect of residual echo cancellation, a simulation was performed on the configuration of the modification.
Only the residual echo canceling unit 120 was used as the configuration of the echo canceling device 100. Within it, the wave number domain residual echo estimation canceling unit 1231 was removed, and the diffuse residual echo correction unit 12324 was removed from the wave number domain diffuse residual echo estimation canceling unit 1232. The wave number domain diffuse residual echo estimation canceling unit 1232 was set to compress the received signal to 1/4. The smoothing constant for correlation calculation was β = 0.98, the forgetting factor for calculating the inverse of the correlation matrix of the compressed vector was λ = 0.1, and β_2 = 0.1 was used for smoothing the estimated input/output transfer characteristics.

The method proposed in Non-Patent Document 3 was used as the conventional method for comparison. In that configuration, only the residual echo canceling unit 120 was used as the configuration of the echo canceling device 100, and within it only the wave number domain residual echo estimation canceling unit 1231 was used. The residual echo correction unit 12314 was removed.

To generate the signals used in the simulation, a linear loudspeaker array (32 elements, 6 cm spacing) and a linear microphone array (32 elements, 6 cm spacing) were placed in parallel 50 cm apart in a room with a reverberation time of 150 ms (P = 32), and the impulse responses of all echo paths between the loudspeakers and microphones were measured. The sampling frequency fs was set to 8 kHz, and a frame length of 2F = 1024 was used. The received signals were generated by simulating a situation in which two sound sources placed at different positions reproduce white noise alternately, picked up by 32 microphones.

FIGS. 14 and 15 show the simulation results. FIG. 14 shows the result for the conventional method, and FIG. 15 shows the result for the modification of this embodiment. Both plot the amount of echo cancellation (ERLE) achieved by the residual echo cancellation processing for the odd-numbered channels among the 32 channels.

FIG. 14 shows that the ERLE of the conventional method averages only slightly over 10 dB, whereas FIG. 15 shows that the ERLE of the proposed method averages slightly over 20 dB. This indicates that the proposed method cancels the echo effectively.
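For reference, ERLE is commonly computed as the ratio, in decibels, of the echo power before cancellation to the residual power after cancellation. The text does not state the exact definition used for FIGS. 14 and 15, so the following Python sketch assumes that common definition; the function name `erle_db` and the sample values are hypothetical.

```python
import math

def erle_db(before, after):
    """10 log10(power before cancellation / power after cancellation), in dB."""
    p_before = sum(abs(v) ** 2 for v in before)
    p_after = sum(abs(v) ** 2 for v in after)
    return 10.0 * math.log10(p_before / p_after)

echo = [0.5, -1.0, 0.25, 0.8]            # samples before residual echo cancellation
residual = [0.1 * v for v in echo]       # amplitude reduced to one tenth

print(round(erle_db(echo, residual), 6)) # 20.0
```

On this scale, the step from roughly 10 dB to roughly 20 dB reported above corresponds to a further tenfold reduction in residual echo power.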

<Other variations>
The present invention is not limited to the embodiments and modifications described above. For example, the various processes described above need not be executed in time series in the order described; they may be executed in parallel or individually, depending on the processing capability of the device that executes them or as needed. Other changes may be made as appropriate without departing from the spirit of the present invention.

<Program and recording medium>
The various processing functions of each device described in the above embodiments and modifications may be realized by a computer. In that case, the processing content of the functions that each device should have is described by a program, and executing this program on a computer realizes the various processing functions of each device on the computer.

The program describing this processing content can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example, a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.

The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. The program may also be distributed by storing it in a storage device of a server computer and transferring it from the server computer to other computers via a network.

A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or the program transferred from the server computer in its own storage unit. When executing the processing, the computer reads the program stored in its own storage unit and executes processing according to the read program. As another form of execution, the computer may read the program directly from the portable recording medium and execute processing according to it. Furthermore, each time the program is transferred from the server computer to this computer, the computer may sequentially execute processing according to the received program. The above processing may also be executed by a so-called ASP (Application Service Provider) type service, in which the program is not transferred from the server computer to this computer and the processing functions are realized only through execution instructions and result acquisition. The program here includes information that is provided for processing by an electronic computer and is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the computer's processing).

Although each device has been described as being configured by executing a predetermined program on a computer, at least part of the processing content may be realized in hardware.

Claims

An echo canceling apparatus that, where P is an integer of 2 or more, P speakers and P microphones are arranged in a common sound field, and a received signal is reproduced from the speakers, cancels the echo that reaches the microphones via echo paths, the apparatus comprising:
a wave number domain diffuse residual echo estimation and cancellation unit that uses the wave number domain signal obtained by converting the sound signal collected by the microphones, together with the received signal in the wave number domain, to estimate the diffuse residual echo contained in the collected sound signal in the wave number domain, and cancels the estimated diffuse residual echo from the collected sound signal in the wave number domain,
wherein the wave number domain diffuse residual echo estimation and cancellation unit includes:
a compressed input/output correlation coefficient calculation unit that calculates a power spectrum matrix, which is a P × P matrix, from a received signal vector X, which is a P-dimensional vector whose elements are the received signals for the respective wave numbers, and the complex conjugate transpose of X, and calculates a cross spectrum matrix, which is a P × P matrix, from a collected sound signal vector, which is a P-dimensional vector whose elements are the collected sound signals for the respective wave numbers, and the complex conjugate transpose of the received signal vector X;
a compressed input/output transfer characteristic estimation unit that uses the power spectrum matrix and the cross spectrum matrix to obtain an input/output transfer characteristic matrix, which is a P × P matrix whose elements are estimates of the input/output transfer characteristics between the received signal and the collected sound signal;
a diffuse residual echo estimation unit that multiplies the received signal vector X by the input/output transfer characteristic matrix to obtain a diffuse residual echo vector, which is a P-dimensional vector whose elements are the estimates of the diffuse residual echo for the respective wave numbers; and
a subtraction unit that obtains the difference between the collected sound signal in the wave number domain and the estimate of the diffuse residual echo in the wave number domain.
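
The processing chain recited in the claim above (power spectrum matrix, cross spectrum matrix, transfer characteristic matrix, echo estimate, subtraction) can be illustrated with a minimal NumPy sketch for a single frequency bin. The claim does not state how the transfer characteristic matrix is derived from the two spectrum matrices, so the least-squares form H = Q P^-1 with a small diagonal loading term `reg`, the batch averaging over frames, and all function names are assumptions made for illustration only.

```python
import numpy as np

def spectrum_matrices(X, E):
    """Average x x^H and e x^H over frames for one frequency bin.

    X, E: complex arrays of shape (frames, P); row t holds the wave number
    domain received / collected sound vector of frame t.
    """
    n_frames = X.shape[0]
    power = X.T @ X.conj() / n_frames   # P x P power spectrum matrix
    cross = E.T @ X.conj() / n_frames   # P x P cross spectrum matrix
    return power, cross

def transfer_matrix(power, cross, reg=1e-8):
    """Assumed least-squares estimate H = Q P^-1 with diagonal loading."""
    loaded = power + reg * np.eye(power.shape[0])
    return cross @ np.linalg.inv(loaded)

def cancel_diffuse_residual_echo(X, E, reg=1e-8):
    """Return the estimated P x P transfer matrix and the echo-cancelled signal."""
    power, cross = spectrum_matrices(X, E)
    H = transfer_matrix(power, cross, reg)
    echo_estimate = X @ H.T             # row t is H @ X[t]
    return H, E - echo_estimate
```

When the collected sound signal is exactly a linear function of the received signal, this estimate recovers the underlying matrix up to the loading term; with near-end speech or noise present it yields only the part of the collected signal that is correlated with the received signal.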

An echo canceling apparatus that, where P is an integer of 2 or more, P speakers and P microphones are arranged in a common sound field, and a received signal is reproduced from the speakers, cancels the echo that reaches the microphones via echo paths, the apparatus comprising:
a wave number domain diffuse residual echo estimation and cancellation unit that uses the wave number domain signal obtained by converting the sound signal collected by the microphones, together with the received signal in the wave number domain, to estimate the diffuse residual echo contained in the collected sound signal in the wave number domain, and cancels the estimated diffuse residual echo from the collected sound signal in the wave number domain,
wherein the wave number domain diffuse residual echo estimation and cancellation unit includes:
an input dimension compression unit that, where P′ < P, uses a compression matrix W, which is a P′ × P matrix, to compress a received signal vector X, which is a P-dimensional vector whose elements are the received signals for the respective wave numbers, into a P′-dimensional compressed vector Z;
a dimension compression matrix update unit that updates the compression matrix W so as to minimize the difference between the received signal vector X and the P-dimensional vector obtained by expanding the compressed vector Z with the complex conjugate transpose of the compression matrix W;
a compressed input/output correlation coefficient calculation unit that calculates a power spectrum matrix, which is a P′ × P′ matrix, from the compressed vector Z and its complex conjugate transpose, and calculates a cross spectrum matrix, which is a P × P′ matrix, from a collected sound signal vector, which is a P-dimensional vector whose elements are the collected sound signals for the respective wave numbers, and the complex conjugate transpose of the compressed vector Z;
a compressed input/output transfer characteristic estimation unit that uses the power spectrum matrix and the cross spectrum matrix to obtain an input/output transfer characteristic matrix, which is a P × P′ matrix whose elements are estimates of the input/output transfer characteristics between the received signal and the collected sound signal;
a diffuse residual echo estimation unit that multiplies the compressed vector Z by the input/output transfer characteristic matrix to obtain a diffuse residual echo vector, which is a P-dimensional vector whose elements are the estimates of the diffuse residual echo for the respective wave numbers; and
a subtraction unit that obtains the difference between the collected sound signal in the wave number domain and the estimate of the diffuse residual echo in the wave number domain.
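
The dimension compression in the claim above can be sketched as follows. The claim only requires that W be updated so that the expansion of Z by the complex conjugate transpose of W stays close to X; it does not give the update rule. This sketch swaps in a batch eigendecomposition of the received signal covariance (principal component analysis), which minimizes that reconstruction error when the rows of W are orthonormal. That choice, the estimate G = Q P^-1 with diagonal loading `reg`, and all names are assumptions for illustration, not the update described in the patent.

```python
import numpy as np

def fit_compression_matrix(X, p_compressed):
    """Return W (P' x P) whose rows span the dominant subspace of X.

    X: complex array of shape (frames, P) for one frequency bin.
    """
    covariance = X.T @ X.conj() / X.shape[0]    # P x P, Hermitian
    _, eigvecs = np.linalg.eigh(covariance)     # eigenvalues in ascending order
    dominant = eigvecs[:, -p_compressed:]       # P x P', largest eigenvalues
    return dominant.conj().T

def compress(X, W):
    return X @ W.T          # row t is W @ X[t]

def expand(Z, W):
    return Z @ W.conj()     # row t is W^H @ Z[t]

def cancel_with_compression(X, E, p_compressed, reg=1e-8):
    """Estimate the P x P' transfer matrix from Z and subtract G @ z from e."""
    W = fit_compression_matrix(X, p_compressed)
    Z = compress(X, W)
    n_frames = Z.shape[0]
    power = Z.T @ Z.conj() / n_frames           # P' x P' power spectrum matrix
    cross = E.T @ Z.conj() / n_frames           # P x P' cross spectrum matrix
    G = cross @ np.linalg.inv(power + reg * np.eye(p_compressed))
    return W, G, E - Z @ G.T
```

The matrix to be inverted shrinks from P × P to P′ × P′, which is the practical benefit of compressing the input when the received signals occupy a low-dimensional subspace.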

The echo canceling apparatus according to claim 2, wherein,
with $P^{(2)}_f$ denoting the power spectrum matrix, $Q^{(2)}_f$ the cross spectrum matrix, $Z^{(W)}_f$ the compressed vector Z, $E^{(W2)}_f$ the collected sound signal vector, $^H$ the complex conjugate transpose, and $E[\cdot]$ the average of $\cdot$, the compressed input/output correlation coefficient calculation unit calculates the power spectrum matrix by
$$P^{(2)}_f = E\left[Z^{(W)}_f Z^{(W)H}_f\right]$$
and the cross spectrum matrix by
$$Q^{(2)}_f = E\left[E^{(W2)}_f Z^{(W)H}_f\right],$$
and, with $\beta_2$ denoting a constant for smoothing the estimate of the input/output transfer characteristic, the compressed input/output transfer characteristic estimation unit obtains the input/output transfer characteristic matrix by either of the following two equations:

Echo canceler.

The echo canceling apparatus according to any one of claims 1 to 3, further comprising
a wave number domain residual echo estimation and cancellation unit that uses the received signal in the wave number domain and the collected sound signal in the wave number domain to estimate the residual echo due to direct waves contained in the collected sound signal in the wave number domain, and cancels the estimated residual echo due to direct waves from the collected sound signal in the wave number domain,
wherein the wave number domain residual echo estimation and cancellation unit includes:
an input/output correlation coefficient calculation unit that uses the received signal in the wave number domain and the collected sound signal in the wave number domain to calculate the power spectrum of the received signal and the cross spectrum between the received signal and the collected sound signal;
an input/output transfer characteristic estimation unit that uses the power spectrum and the cross spectrum to estimate the input/output transfer characteristic between the received signal and the collected sound signal;
a residual echo estimation unit that multiplies the received signal in the wave number domain by the estimate of the input/output transfer characteristic to estimate the residual echo in the wave number domain; and
a second subtraction unit that obtains the difference between the collected sound signal in the wave number domain and the estimate of the residual echo in the wave number domain,
wherein the collected sound signal used in the wave number domain diffuse residual echo estimation and cancellation unit is the signal that has been processed by the wave number domain residual echo estimation and cancellation unit, and
the received signal in the wave number domain used in the wave number domain diffuse residual echo estimation and cancellation unit is the one from one frame before the received signal in the wave number domain used in the wave number domain residual echo estimation and cancellation unit.
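
The direct-wave stage in the claim above works on each wave number separately, so the transfer characteristic is a scalar per wave number rather than a matrix. A minimal sketch under stated assumptions: the spectra are batch averages over frames, the transfer characteristic is taken as cross spectrum divided by power spectrum, `eps` guards against division by zero, and the helper `previous_frame` (a hypothetical name) produces the one-frame-delayed received signal that the diffuse stage would use.

```python
import numpy as np

def cancel_direct_residual_echo(X, E, eps=1e-12):
    """Per wave number: h = cross spectrum / power spectrum, then subtract h * x.

    X, E: complex arrays of shape (frames, P) for one frequency bin.
    """
    power = np.mean(np.abs(X) ** 2, axis=0)     # (P,) power spectrum of received signal
    cross = np.mean(E * X.conj(), axis=0)       # (P,) cross spectrum with collected signal
    h = cross / (power + eps)                   # (P,) transfer characteristic estimate
    return h, E - h * X

def previous_frame(X):
    """Received vectors delayed by one frame (zeros before the first frame)."""
    delayed = np.zeros_like(X)
    delayed[1:] = X[:-1]
    return delayed
```

In a cascade, the second return value of `cancel_direct_residual_echo` would serve as the collected sound signal of the diffuse stage, paired with `previous_frame(X)` as its received signal.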

The echo canceling apparatus according to any one of claims 1 to 4, further comprising
an echo cancellation unit that uses the received signal in the time domain and the collected sound signal in the time domain to estimate and cancel the echo component contained in the collected sound signal in the time domain,
wherein the echo cancellation unit includes:
a first frequency domain transform unit that transforms the received signal in the time domain into a frequency domain signal;
a first wave number domain transform unit that transforms the received signal in the frequency domain into a wave number domain signal;
a multiplication unit that multiplies the received signal in the wave number domain by a filter coefficient in the wave number domain to generate an echo replica in the wave number domain;
an inverse wave number transform unit that transforms the echo replica in the wave number domain into an echo replica in the frequency domain;
a time domain transform unit that transforms the echo replica in the frequency domain into an echo replica in the time domain;
a third subtraction unit that subtracts the echo replica in the time domain from the collected sound signal in the time domain to obtain an error signal in the time domain;
a second frequency domain transform unit that transforms the error signal in the time domain into a frequency domain signal;
a second wave number domain transform unit that transforms the error signal in the frequency domain into a wave number domain signal;
a correction amount calculation unit that uses the received signal in the wave number domain and the error signal in the wave number domain to calculate a correction amount for the filter coefficient in the wave number domain; and
a filter coefficient unit that updates the filter coefficient using the correction amount,
wherein the collected sound signal used in the wave number domain residual echo estimation and cancellation unit or the wave number domain diffuse residual echo estimation and cancellation unit is the signal that has been processed by the echo cancellation unit and corresponds to the error signal.

The echo canceling apparatus according to any one of claims 1 to 5,
wherein the wave number domain diffuse residual echo estimation and cancellation unit further includes
a residual echo correction unit that corrects each element of the diffuse residual echo vector by multiplying it by a value based on the lower bound of the confidence interval of the estimate of the input/output transfer characteristic, and
the estimate of the diffuse residual echo used in the subtraction unit is the one that has been processed by the residual echo correction unit.

An echo canceling method that, where P is an integer of 2 or more, P speakers and P microphones are arranged in a common sound field, and a received signal is reproduced from the speakers, cancels the echo that reaches the microphones via echo paths, the method comprising:
a wave number domain diffuse residual echo estimation and cancellation step of using the wave number domain signal obtained by converting the sound signal collected by the microphones, together with the received signal in the wave number domain, to estimate the diffuse residual echo contained in the collected sound signal in the wave number domain, and canceling the estimated diffuse residual echo from the collected sound signal in the wave number domain,
wherein the wave number domain diffuse residual echo estimation and cancellation step includes:
a compressed input/output correlation coefficient calculation step of calculating a power spectrum matrix, which is a P × P matrix, from a received signal vector X, which is a P-dimensional vector whose elements are the received signals for the respective wave numbers, and the complex conjugate transpose of X, and calculating a cross spectrum matrix, which is a P × P matrix, from a collected sound signal vector, which is a P-dimensional vector whose elements are the collected sound signals for the respective wave numbers, and the complex conjugate transpose of the received signal vector X;
a compressed input/output transfer characteristic estimation step of using the power spectrum matrix and the cross spectrum matrix to obtain an input/output transfer characteristic matrix, which is a P × P matrix whose elements are estimates of the input/output transfer characteristics between the received signal and the collected sound signal;
a diffuse residual echo estimation step of multiplying the received signal vector X by the input/output transfer characteristic matrix to obtain a diffuse residual echo vector, which is a P-dimensional vector whose elements are the estimates of the diffuse residual echo for the respective wave numbers; and
a subtraction step of obtaining the difference between the collected sound signal in the wave number domain and the estimate of the diffuse residual echo in the wave number domain.

An echo canceling method that, where P is an integer of 2 or more, P speakers and P microphones are arranged in a common sound field, and a received signal is reproduced from the speakers, cancels the echo that reaches the microphones via echo paths, the method comprising:
a wave number domain diffuse residual echo estimation and cancellation step of using the wave number domain signal obtained by converting the sound signal collected by the microphones, together with the received signal in the wave number domain, to estimate the diffuse residual echo contained in the collected sound signal in the wave number domain, and canceling the estimated diffuse residual echo from the collected sound signal in the wave number domain,
wherein the wave number domain diffuse residual echo estimation and cancellation step includes:
an input dimension compression step of, where P′ < P, using a compression matrix W, which is a P′ × P matrix, to compress a received signal vector X, which is a P-dimensional vector whose elements are the received signals for the respective wave numbers, into a P′-dimensional compressed vector Z;
a dimension compression matrix update step of updating the compression matrix W so as to minimize the difference between the received signal vector X and the P-dimensional vector obtained by expanding the compressed vector Z with the complex conjugate transpose of the compression matrix W;
a compressed input/output correlation coefficient calculation step of calculating a power spectrum matrix, which is a P′ × P′ matrix, from the compressed vector Z and its complex conjugate transpose, and calculating a cross spectrum matrix, which is a P × P′ matrix, from a collected sound signal vector, which is a P-dimensional vector whose elements are the collected sound signals for the respective wave numbers, and the complex conjugate transpose of the compressed vector Z;
a compressed input/output transfer characteristic estimation step of using the power spectrum matrix and the cross spectrum matrix to obtain an input/output transfer characteristic matrix, which is a P × P′ matrix whose elements are estimates of the input/output transfer characteristics between the received signal and the collected sound signal;
a diffuse residual echo estimation step of multiplying the compressed vector Z by the input/output transfer characteristic matrix to obtain a diffuse residual echo vector, which is a P-dimensional vector whose elements are the estimates of the diffuse residual echo for the respective wave numbers; and
a subtraction step of obtaining the difference between the collected sound signal in the wave number domain and the estimate of the diffuse residual echo in the wave number domain.

A program for causing a computer to function as the echo canceling apparatus according to claim 1.