JP6356087B2

JP6356087B2 - Echo canceling apparatus, method and program

Info

Publication number: JP6356087B2
Application number: JP2015068888A
Authority: JP
Inventors: 江村　暁
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2015-03-30
Filing date: 2015-03-30
Publication date: 2018-07-11
Anticipated expiration: 2035-03-30
Also published as: JP2016189548A

Description

The present invention relates to a technique for canceling acoustic echo (hereinafter simply "echo"), that is, sound that is reproduced from speakers as a received signal and reaches a microphone via an echo path, in a setting where M speakers (M being an integer of 1 or more) and one or more microphones are placed in a common sound field. In particular, it relates to canceling echo in hands-free loudspeaker communication systems such as video conferencing systems.

When the received signal is reproduced by a speaker and the reproduced sound is picked up by a microphone, an echo arises. If the picked-up signal is transmitted as is, it causes problems such as disturbed conversation and discomfort. Moreover, when the speaker or microphone gain is high, howling occurs and conversation becomes impossible. These problems are especially pronounced in hands-free loudspeaker communication systems.

To address this problem, the prior art includes echo canceling apparatuses that cancel echo using an adaptive filter. Non-Patent Document 1 describes a known multi-channel echo cancellation method. A conventional multi-channel echo canceling apparatus 80 is described below with reference to FIG. 1.

Speakers 2_1, ..., 2_M and microphones 3_1, ..., 3_N are placed in a common sound field. When the received signals x_1(k), ..., x_M(k) are reproduced from speakers 2_1, ..., 2_M respectively, the echo canceling unit 8_n in the multi-channel echo canceling apparatus 80 cancels the reproduced sound that reaches microphone 3_n via the M echo paths h_mn(k). Here M and N are integers of 1 or more, m = 1, ..., M, and n = 1, ..., N. The multi-channel echo canceling apparatus 80 is connected to receiving terminals 1_1, ..., 1_M, transmitting terminals 4_1, ..., 4_N, and microphones 3_1, ..., 3_N. It takes the received signals x_1(k), ..., x_M(k) and the sound pickup signals y_1(k), ..., y_N(k) as input, and outputs the transmission signals u_1(k), ..., u_N(k) to the transmitting terminals 4_1, ..., 4_N respectively. The multi-channel echo canceling apparatus 80 includes N echo canceling units 8_1, ..., 8_N, and each echo canceling unit 8_n has an echo prediction unit 81, a subtraction unit 82, and an echo path estimation unit 83. In FIG. 1, y_n(k) is written as y(k), u_n(k) as u(k), and h_1n(k), ..., h_Mn(k) as h_1(k), ..., h_M(k). The sound pickup signals from the other microphones can be processed in the same way simply by arranging copies of the echo canceling unit 8_n of FIG. 1 in parallel, so the description below refers to FIG. 1.

In the echo canceling unit 8_n, the echo prediction unit 81 filters the received signals x_1(k), ..., x_M(k) with an adaptive filter to generate a predicted echo signal y'(k). The subtraction unit 82 computes the difference u(k) between the sound pickup signal y(k) and the predicted echo signal y'(k) (hereinafter the "error signal") and outputs it as the transmission signal. The echo path estimation unit 83 successively estimates the echo path from the error signal u(k) and the received signals x_1(k), ..., x_M(k), and copies the estimate (the filter coefficients h'(k) of the adaptive filter) to the echo prediction unit 81. Once the echo path has been estimated accurately, the echo component in the sound pickup signal y(k) and the predicted echo signal y'(k) are nearly equal, and the error signal u(k) contains almost no echo.
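The predict, subtract, and update loop of units 81 to 83 can be illustrated with a minimal single-channel sketch. The text only says "adaptive filter"; the NLMS update rule, the function names, and the parameters `taps`, `mu`, and `eps` below are assumptions chosen for illustration, not the method of the cited prior art.

```python
import random

def nlms_echo_canceller(x, y, taps=8, mu=0.5, eps=1e-8):
    """Return (u, h_est): error signal u(k) = y(k) - y'(k) and final filter.

    x: received (loudspeaker) signal, y: sound pickup (microphone) signal.
    NLMS is an assumed update rule; the source does not name one.
    """
    h_est = [0.0] * taps   # h'(k): current echo path estimate
    buf = [0.0] * taps     # most recent received samples, newest first
    u = []
    for xk, yk in zip(x, y):
        buf = [xk] + buf[:-1]
        y_pred = sum(h * b for h, b in zip(h_est, buf))  # echo prediction (unit 81)
        err = yk - y_pred                                # subtraction (unit 82)
        norm = sum(b * b for b in buf) + eps
        # echo path estimation (unit 83): normalized gradient step
        h_est = [h + mu * err * b / norm for h, b in zip(h_est, buf)]
        u.append(err)
    return u, h_est

def demo():
    """Synthetic test: white received signal through a short, fixed echo path."""
    rng = random.Random(0)
    true_path = [0.6, -0.3, 0.1]  # hypothetical echo path h(k)
    x = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
    y = [sum(true_path[i] * x[k - i] for i in range(len(true_path)) if k - i >= 0)
         for k in range(len(x))]
    u, h_est = nlms_echo_canceller(x, y)
    return u, h_est, true_path
```

In this noise-free, time-invariant setting the error signal decays toward zero. The situations discussed next (moving people, double talk, correlated channels) are exactly the ones where such a loop leaves residual echo.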

In actual use, however, a multi-channel echo canceling apparatus cannot always cancel the echo sufficiently, and residual echo can degrade speech quality. One reason is that the echo path changes constantly, for example as people move, and the adaptive filter cannot complete its estimate of the echo path instantaneously. Another is that echo path estimation can be slightly disturbed during double talk.

Furthermore, when the received signal has multiple channels, the correlation between the received signals is high, so the estimated echo path does not necessarily match the true echo path even while the echo is being canceled. Consequently, when the talker changes and the cross-correlation between the received signals changes, the residual echo can suddenly grow (see Non-Patent Document 1).

For comfortable hands-free conversation, residual echo must be reduced quickly whenever echo path estimation and cancellation by the adaptive filter are insufficient, regardless of the number of received-signal channels and the conversation state. Non-Patent Document 2 describes a known method for this purpose: it rapidly estimates the transfer characteristic from the received signal to the residual echo and subtracts the estimated residual echo from the error signal. In this method, the transfer characteristic is estimated from the correlation between the received signal and the error signal at each frequency, which speeds up the estimation and suppresses estimation fluctuations caused by signals other than the residual echo. Because both amplitude and phase are estimated for the transfer characteristic and the residual echo, the method applies regardless of the number of channels. And because the residual echo is removed by subtraction, distortion of the transmitted speech stays small even during double talk.

The method of Non-Patent Document 2 requires that the residual echo be estimated accurately. However, because the residual echo is estimated from received and error signals of limited duration (short-time segments), the estimate varies more than it would with a sufficiently long segment, and the residual echo may be overestimated.

To improve the quality of the transmitted speech, the accuracy of the residual echo estimate must be raised even in such situations. Patent Document 1 proposes a method of correcting the residual echo estimate for this purpose.

Patent Document 1: JP 2012-227566 A

Non-Patent Document 1: M. M. Sondhi, D. R. Morgan, and J. L. Hall, "Stereophonic Acoustic Echo Cancellation-An Overview of the Fundamental Problem", IEEE Signal Processing Letters, August 1995, vol. 2, no. 8, pp. 148-151
Non-Patent Document 2: Satoshi Emura and Yoichi Haneda, "Multi-channel echo cancellation using multi-stage echo estimation", Proc. of the Acoustical Society of Japan, 2010, pp. 717-719

However, even when the power of the estimated residual echo is comparable to the power of the error signal output by the adaptive filter, residual echo cancellation may have little effect. This happens when the phase error of the corrected residual echo estimate is not small, so that subtracting the estimate from the adaptive filter output does not reduce the power of the difference signal. One reason is that the reverberation time assumed by the residual echo cancellation model is a few tens of milliseconds, considerably shorter than the reverberation time of a real room (a few hundred milliseconds). Another is that, as the adaptive filter converges and cancels the part of the echo corresponding to those few tens of milliseconds, the effect of this mismatch in assumed reverberation time becomes more pronounced.
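The effect of phase error on subtraction can be checked with a small calculation. If the estimate has exactly the right magnitude but its phase is off by θ, the remaining power is |Y - Y^|^2 = 2|Y|^2 (1 - cos θ), so a 60-degree error already leaves the power unchanged and larger errors increase it. The function name below is hypothetical and the sketch considers a single frequency bin only.

```python
import cmath
import math

def residual_power_after_subtraction(y, phase_error_deg):
    """Power of y - y_hat when |y_hat| = |y| but the phase is off by the given angle."""
    y_hat = y * cmath.exp(1j * math.radians(phase_error_deg))
    return abs(y - y_hat) ** 2

# With y = 1 (power 1): 0 degrees removes everything, 60 degrees leaves the
# power at 1, and 180 degrees quadruples it.
```

This is why matching the power of the estimate alone does not guarantee that subtraction helps.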

The object of the present invention is to provide an echo cancellation technique that reduces residual echo more than conventional techniques in the situation described above, in which residual echo cancellation has little effect because the phase error of the residual echo estimate is not small. In the present invention, "residual echo" refers to echo components contained in the sound pickup signal in general: not only the echo component remaining in the error signal after echo cancellation by an adaptive filter has been applied to the sound pickup signal, but also the entire echo component contained in the sound pickup signal when no echo cancellation by an adaptive filter is performed.

To solve the above problem, according to one aspect of the present invention, an echo canceling apparatus cancels the echo that reaches a microphone via an echo path when a received signal is reproduced from a speaker, where one or more speakers and one or more microphones are placed in a common sound field. The echo canceling apparatus includes a residual echo estimation unit and a residual echo cancellation/suppression unit. The residual echo estimation unit obtains a residual echo estimate, taking into account the phase and amplitude of the residual echo contained in a frequency-domain sound pickup signal, using the frequency-domain sound pickup signal (a signal obtained from the first sound pickup signal picked up by the microphone) and a frequency-domain received signal (a frequency-domain signal obtained from the received signal). The residual echo cancellation/suppression unit cancels and suppresses the residual echo estimate in the frequency-domain sound pickup signal: it subtracts the residual echo estimate from the frequency-domain sound pickup signal to obtain a difference, and the smaller that difference, the larger the share of residual echo it suppresses and the smaller the share it cancels.

To solve the above problem, according to another aspect of the present invention, an echo canceling method cancels the echo that reaches a microphone via an echo path when a received signal is reproduced from a speaker, where one or more speakers and one or more microphones are placed in a common sound field. The echo canceling method includes a residual echo estimation step and a residual echo cancellation/suppression step. The residual echo estimation step obtains a residual echo estimate, taking into account the phase and amplitude of the residual echo contained in a frequency-domain sound pickup signal, using the frequency-domain sound pickup signal (a signal obtained from the first sound pickup signal picked up by the microphone) and a frequency-domain received signal (a frequency-domain signal obtained from the received signal). The residual echo cancellation/suppression step cancels and suppresses the residual echo estimate in the frequency-domain sound pickup signal: it subtracts the residual echo estimate from the frequency-domain sound pickup signal to obtain a difference, and the smaller that difference, the larger the share of residual echo it suppresses and the smaller the share it cancels.

The echo cancellation technique according to the present invention reduces residual echo more than conventional techniques in situations where residual echo cancellation has little effect because the phase error of the residual echo estimate is not small.

FIG. 1 is a diagram showing a configuration example of the conventional multi-channel echo canceling apparatus 80.
FIG. 2 is a diagram showing the allocation between cancellation and suppression of residual echo.
FIG. 3 is a diagram showing a configuration example of the echo canceling apparatus 100.
FIG. 4 is a diagram showing an example of the processing flow of the echo canceling apparatus 100.
FIG. 5 is a diagram showing a configuration example of the residual echo estimation unit 16A.
FIG. 6 is a diagram showing a configuration example of the residual echo cancellation/suppression unit 169.
FIG. 7 is a diagram showing an example of the processing flow of the residual echo estimation unit 16A.
FIG. 8 is a diagram showing a configuration example of the input/output correlation coefficient calculation unit 163.
FIG. 9 is a diagram showing an example of the processing flow of the residual echo cancellation/suppression unit 169.
FIG. 10 is a diagram showing a configuration example of the echo canceling apparatus 200.
FIG. 11 is a diagram showing an example of the processing flow of the echo canceling apparatus 200.
FIG. 12 is a diagram showing a configuration example of the echo canceling unit 28_n.
FIG. 13 is a diagram showing an example of the processing flow of the echo canceling unit 28_n.

Embodiments of the present invention are described below. In the drawings used in the following description, components having the same function and steps performing the same processing are given the same reference numerals, and redundant description is omitted. In the following description, symbols such as "^" used in the text should properly be written directly above the preceding character, but because of the limitations of text notation they are written immediately after that character. In the equations these symbols are written in their proper positions. Unless otherwise noted, processing described for an individual element of a vector or matrix is applied to all elements of that vector or matrix.

<Points of the first embodiment>
The residual echo cancellation of Patent Document 1 estimates the phase and amplitude of the residual echo and removes the residual echo by subtraction. There is also a technique called echo suppression, which estimates only the amplitude of the residual echo and attenuates the signal at each frequency by multiplication, by an amount corresponding to that amplitude (Reference Document 1).
(Reference Document 1) JP H11-331046 A

In this embodiment, residual echo cancellation by subtraction is combined with echo suppression by multiplication. Specifically, when the difference between the power of the sound pickup signal and the power of the residual echo estimate is small, the apparatus enters a mixed mode of echo cancellation and suppression. As noted above, when the power difference is small, the phase error is likely not small, and when the phase error is not small, subtracting the residual echo estimate often hardly reduces the residual echo. On the other hand, when the power difference is small, double talk is unlikely, so echo suppression is unlikely to distort the transmitted speech. Therefore, in the mixed mode, the smaller the power difference, the lower the share given to echo cancellation and the higher the share given to echo suppression, which reduces residual echo more than before.

An example of this processing is described with reference to FIG. 2. The horizontal axis of FIG. 2 is the difference between the power |Y(f,j)|^2 of the frequency-domain sound pickup signal Y(f,j) and the power |Y^_2(f,j)|^2 of the residual echo estimate Y^_2(f,j), divided by the power |Y(f,j)|^2 of the frequency-domain sound pickup signal. The vertical axis is the share R_mxc(f,j) assigned to residual echo cancellation. The situation in which the difference between |Y(f,j)|^2 and |Y^_2(f,j)|^2 is small is detected by

[equation missing in this text]

where

[equation missing in this text]

Here p_hyb_range_upper is the detection threshold and is set in the range -20 to -10 dB. MAX(a,b) is a function that returns the larger of a and b; when the residual echo is overestimated (that is, when the power |Y^_2(f,j)|^2 of the residual echo estimate is greater than or equal to the power |Y(f,j)|^2 of the frequency-domain sound pickup signal), the function J returns 0.

When the difference is small, the share R_mxc(f,j) of residual echo cancellation is computed by

[equation missing in this text]

Here the parameter p_cancel_allotted_min is the ratio at which the share of residual echo cancellation is smallest, and is set to a value in the range 0 to 1.

The remaining share (1 - R_mxc(f,j)) is handled by residual echo suppression. In amplitude terms, the amount of suppression is

[equation missing in this text]

and, for example,

[equation missing in this text]

can be used as the echo suppression gain G_mxs(f,j).
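Since the expressions for J and R_mxc are not available in this text, the following is only one reading that is consistent with the surrounding prose, not the patent's own formula. It assumes that J is the clipped, normalized power difference plotted on the horizontal axis of FIG. 2, that the dB threshold is compared with J as a power ratio, and that R_mxc rises linearly from p_cancel_allotted_min at J = 0 to 1 at the threshold. The default parameter values are arbitrary picks from the stated ranges. The suppression gain G_mxs is deliberately left out because its form cannot be inferred from the text.

```python
def normalized_power_gap(Y, Y_hat):
    """Assumed J: (|Y|^2 - |Y_hat|^2) / |Y|^2, clipped to 0 when the echo is overestimated."""
    p_y = abs(Y) ** 2
    if p_y == 0.0:
        return 0.0
    return max(p_y - abs(Y_hat) ** 2, 0.0) / p_y

def cancellation_share(J, p_hyb_range_upper_db=-15.0, p_cancel_allotted_min=0.3):
    """Assumed R_mxc: share of the residual echo removed by subtraction.

    Outside the mixed mode (J at or above the threshold) cancellation does
    all the work; inside it, the share shrinks linearly as J approaches 0.
    """
    upper = 10.0 ** (p_hyb_range_upper_db / 10.0)  # dB threshold as a power ratio
    if J >= upper:
        return 1.0
    return p_cancel_allotted_min + (1.0 - p_cancel_allotted_min) * J / upper
```

For a bin with Y = 2 and Y^_2 = 1 the gap is 0.75, well above the threshold, so the sketch assigns everything to cancellation; as the two powers approach each other the share falls toward p_cancel_allotted_min and suppression takes over the rest.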

The configuration for implementing the above processing is described below.

<First embodiment>
<Echo canceling apparatus 100>
FIG. 3 shows an example of a functional block diagram of the echo canceling apparatus 100 according to the first embodiment, and FIG. 4 shows its processing flow. The echo canceling apparatus 100 according to the first embodiment is described with reference to FIGS. 3 and 4. M speakers 2_1, ..., 2_M and N microphones 3_1, ..., 3_N are placed in a common sound field. When the received signals x_1(k), ..., x_M(k) are reproduced from speakers 2_1, ..., 2_M respectively, the echo canceling apparatus 100 cancels the reproduced sound (echo) that reaches the microphones via the M×N echo paths h_mn(k). More specifically, the residual echo canceling unit 16_n in the echo canceling apparatus 100 cancels the reproduced sound (echo) that reaches microphone 3_n via the M echo paths h_mn(k). The echo canceling apparatus 100 is connected to the receiving terminals 1_1, ..., 1_M of all M channels on the receiving side, the transmitting terminals 4_1, ..., 4_N of all N channels on the transmitting side, and the microphones 3_1, ..., 3_N. It takes the received signals x_1(k), ..., x_M(k) and the sound pickup signals y_1(k), ..., y_N(k) as input, and outputs the transmission signals v_1(k), ..., v_N(k) to the transmitting terminals 4_1, ..., 4_N respectively.

The echo canceling apparatus 100 includes N residual echo canceling units 16_1, ..., 16_N.

<Residual echo canceling unit 16_n>
The residual echo canceling unit 16_n is connected to the receiving terminals 1_1, ..., 1_M of all M channels on the receiving side, the transmitting terminal 4_n of one channel on the transmitting side, and the microphone 3_n. It takes the M-channel received signals x_1(k), ..., x_M(k) and the one-channel sound pickup signal y_n(k) as input, and outputs the one-channel transmission signal v_n(k) to the transmitting terminal 4_n. In the figures, y_n(k) is written as y(k), v_n(k) as v(k), and h_1n(k), ..., h_Mn(k) as h_1(k), ..., h_M(k). The figures show only the processing unit for the n-th channel. The sound pickup signals from the other microphones can be processed in the same way simply by arranging copies of the n-th channel processing unit in parallel, so their description is omitted.

The residual echo canceling unit 16_n includes M frequency domain transform units 161_1, ..., 161_M, a frequency domain transform unit 162, a residual echo estimation unit 16A, a residual echo cancellation/suppression unit 169, and a time domain transform unit 168.

The residual echo estimation unit 16A includes an input/output correlation coefficient calculation unit 163, an input/output transfer characteristic estimation unit 164, a residual echo prediction unit 165, and a residual echo correction unit 166 (see FIG. 5).

The residual echo cancellation/suppression unit 169 includes a cancellation/suppression allocation control unit 1691, a cancellation allocation setting unit 1692, a subtraction unit 1693, and a suppression unit 1694 (see FIG. 6).

<Frequency domain transform units 161_1, ..., 161_M and frequency domain transform unit 162>
As shown in FIGS. 3 and 4, the frequency domain transform units 161_1, ..., 161_M take the received signals x_1(k), ..., x_M(k) as input, transform them for each short-time segment into the frequency-domain received signals X_1(f,j), ..., X_M(f,j), and output the result (s161). Similarly, the frequency domain transform unit 162 takes the sound pickup signal y(k) picked up by microphone 3_n as input, transforms it for each short-time segment into the frequency-domain sound pickup signal Y(f,j), and outputs the result (s162). In the following, the sound pickup signal y(k) is also called the first sound pickup signal y(k) to distinguish it from the second sound pickup signal u(k) described later.

The following describes the case where each signal is divided into blocks of L/D samples and frames of 2L samples are formed, each frame shifted by L/D samples from the previous one. Here L is an integer of 1 or more, D is an integer that divides L, j is the frame number, and the time is k = jL/D. f is the frequency number. For example, f corresponds to the discrete points (frequency bins) obtained by dividing half the sampling frequency f_s into L equal parts, with f = 0, 1, ..., L-1: f = 0 corresponds to frequency 0, f = 1 to frequency (1/L)f_s/2, ..., and f = L-1 to frequency ((L-1)/L)f_s/2.

The transform to the frequency domain is performed by, for example, an FFT (fast Fourier transform) or a DFT (discrete Fourier transform), and L is preferably a power of 2 to simplify and speed up the computation. For example, L = 64 to 1024 and D = 2 to 8. The frame length (the number of samples in one frame) may be set to correspond to 10 ms to 20 ms.
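The framing described above can be sketched as follows: frames of 2L samples with a hop of L/D samples, a 2L-point transform, and the first L bins kept so that bin f sits at (f/L)·f_s/2. The function names are hypothetical, no analysis window is applied because the text does not mention one, and a naive DFT stands in for the FFT so that the sketch needs only the standard library.

```python
import cmath

def frames(signal, L, D):
    """Yield (j, frame): frame j has 2L samples and starts at sample j * L/D."""
    if L % D != 0:
        raise ValueError("D must divide L")
    hop = L // D
    j = 0
    while j * hop + 2 * L <= len(signal):
        yield j, signal[j * hop : j * hop + 2 * L]
        j += 1

def dft_bins(frame, L):
    """Naive 2L-point DFT keeping bins f = 0..L-1 (an FFT would be used in practice)."""
    n = 2 * L
    if len(frame) != n:
        raise ValueError("frame must contain 2L samples")
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(L)]

def bin_frequency(f, L, fs):
    """Centre frequency of bin f: (f / L) * fs / 2."""
    return (f / L) * fs / 2.0
```

With L = 4, D = 2 and a 16-sample signal this yields five overlapping 8-sample frames, and at f_s = 16 kHz bin 1 lies at 2 kHz.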

<Residual echo estimation unit 16A>
The residual echo estimation unit 16A receives the frequency-domain sound pickup signal Y(f,j) and the frequency-domain received signals X_1(f,j), ..., X_M(f,j). Using these values, and taking into account the phase and amplitude of the residual echo contained in the frequency-domain sound pickup signal Y(f,j), it obtains an estimate of the residual echo (hereinafter also called the residual echo estimate) Y^_2(f,j) (s16A) and outputs it. FIG. 7 shows an example of the processing flow of the residual echo estimation unit 16A. The processing of the residual echo estimation unit 16A is described with reference to FIGS. 5 and 7.

<Input/output correlation coefficient calculation unit 163>
The input/output correlation coefficient calculation unit 163 takes the frequency-domain received signals X_1(f,j), ..., X_M(f,j) and the frequency-domain sound pickup signal Y(f,j) as input. Using these values, it obtains and outputs the power spectrum P_mm(f,j) of the m-th channel frequency-domain received signal X_m(f,j), the cross spectrum P_m'm(f,j) between the m-th channel frequency-domain received signal X_m(f,j) and the m'-th channel frequency-domain received signal X_m'(f,j) (where m' = 1, ..., M and m ≠ m'), and the cross spectrum Q_m'(f,j) between the m'-th channel frequency-domain received signal X_m'(f,j) and the frequency-domain sound pickup signal Y(f,j) (s163).

Each cross spectrum and power spectrum is a value at time k = jL/D. The power spectrum P_mm(f,j) represents the autocorrelation coefficient of an input signal (the m-th channel frequency-domain received signal X_m(f,j)), and the cross spectrum P_m'm(f,j) represents the correlation coefficient between input signals (the m-th channel frequency-domain received signal X_m(f,j) and the m'-th channel frequency-domain received signal X_m'(f,j)). The matrix formed from the power spectra P_mm(f,j) and the cross spectra P_m'm(f,j) is denoted as the correlation coefficient P(f,j) of the input signals, as follows.

[equation missing in this text]

On the other hand, the cross spectrum Q_m'(f,j) represents the correlation coefficient between an input signal (the m'-th channel frequency domain received signal X_m'(f,j)) and the output signal (the frequency domain sound pickup signal Y(f,j)). The input/output correlation coefficients are collectively denoted Q(f,j), whose m'-th element is Q_m'(f,j).

The input/output correlation coefficient calculation unit 163 is described with reference to FIG. 8. For example, the input/output correlation coefficient calculation unit 163 includes a power spectrum calculation unit 163a, an inter-received-signal cross spectrum calculation unit 163b, and an input/output signal cross spectrum calculation unit 163c.

The power spectrum calculation unit 163a calculates the power spectrum P_mm(f,j) using the m-th channel frequency domain received signal X_m(f,j).

The inter-received-signal cross spectrum calculation unit 163b uses the M frequency domain received signals X₁(f,j),…,X_M(f,j) to calculate the cross spectrum P_m'm(f,j) between the m-th channel frequency domain received signal X_m(f,j) and the m'-th channel frequency domain received signal X_m'(f,j).

The input/output signal cross spectrum calculation unit 163c uses the M frequency domain received signals X₁(f,j),…,X_M(f,j) and the frequency domain sound pickup signal Y(f,j) to calculate the cross spectrum Q_m'(f,j) between each received signal and the frequency domain sound pickup signal Y(f,j).

For example, P_mm(f,j), P_m'm(f,j), and Q_m'(f,j) are calculated from the frequency domain received signals X_m(f,j) and the frequency domain sound pickup signal Y(f,j) at time k=jL/D by Equations (3), (4), and (5), respectively.

Here X^* denotes the complex conjugate of X, and E[ ] denotes averaging. Examples of the averaging process include a method that combines the result for the previous frame with a smoothing constant β taking a value from 0 to 1, and a method that multiplies several past frames by time constants. The example is stated for P_m'm(f,j); P_mm(f,j) and Q_m'(f,j) can be obtained by the same method.
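The smoothing-constant form of the averaging can be sketched as below for a single frequency bin. This is a minimal sketch, not the patent's Equations (3) to (6), which are not reproduced in this text: the function name `update_spectra`, the default β, and the placement of the complex conjugate on the first factor are all assumptions made for illustration.

```python
def update_spectra(P, Q, X, Y, beta=0.9):
    """One-frame recursive average of the spectra for one frequency bin.

    P: M x M nested list, P[a][b] approximates E[conj(X_a) * X_b]
    Q: length-M list, Q[a] approximates E[conj(X_a) * Y]
    X: length-M list of complex received-signal spectra for the current frame
    Y: complex sound pickup spectrum for the current frame
    The conjugate placement is an assumed convention, not taken from the source.
    """
    M = len(X)
    # Blend the previous estimate with the instantaneous product of this frame.
    P_new = [[beta * P[a][b] + (1.0 - beta) * X[a].conjugate() * X[b]
              for b in range(M)] for a in range(M)]
    Q_new = [beta * Q[a] + (1.0 - beta) * X[a].conjugate() * Y
             for a in range(M)]
    return P_new, Q_new
```

The diagonal entries of the returned matrix play the role of the power spectra P_mm(f,j), and the off-diagonal entries play the role of the cross spectra P_m'm(f,j). A β closer to 1 averages over more frames and reacts more slowly to changes.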

<Input/output transfer characteristic estimation unit 164>
The input/output transfer characteristic estimation unit 164 receives the power spectrum P_mm(f,j) and the cross spectra P_m'm(f,j) and Q_m'(f,j) as inputs. Using these values, it estimates, for each frequency, the input/output transfer characteristic G(f,j)=[G₁(f,j),…,G_M(f,j)]^T between the M frequency domain received signals X₁(f,j),…,X_M(f,j) and the frequency domain sound pickup signal Y(f,j), and outputs the estimate (s164).

For example, the input/output transfer characteristic estimation unit 164 estimates G(f,j) by the following Equation (7).

To stabilize the inverse matrix calculation, a small constant δ may be added to the diagonal components of the matrix composed of the power spectra and cross spectra, giving the variant referred to below as Equation (7').
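The stabilized estimate can be illustrated for one frequency bin as follows. This is a sketch under stated assumptions: it takes the estimate to be the solution of (P + δI)G = Q, which is the usual least-squares form implied by the description, and the function name `estimate_transfer` and the default δ are hypothetical. It solves the system by Gaussian elimination instead of forming the inverse explicitly.

```python
def estimate_transfer(P, Q, delta=1e-6):
    """Solve (P + delta*I) G = Q for one frequency bin.

    P: M x M nested list of complex spectra, Q: length-M list of complex values.
    delta is the small diagonal constant that keeps the system well conditioned.
    """
    M = len(Q)
    # Augmented matrix [P + delta*I | Q].
    A = [[complex(P[r][c]) + (delta if r == c else 0.0) for c in range(M)]
         + [complex(Q[r])] for r in range(M)]
    for col in range(M):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, M), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, M):
            factor = A[r][col] / A[col][col]
            for c in range(col, M + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    G = [0j] * M
    for r in reversed(range(M)):
        acc = A[r][M] - sum(A[r][c] * G[c] for c in range(r + 1, M))
        G[r] = acc / A[r][r]
    return G
```

When the received channels are strongly correlated, as is common for stereo signals, the matrix is close to singular, and a nonzero δ keeps the pivots away from zero at the cost of a small bias in the estimate.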

<Residual echo prediction unit 165>
The residual echo prediction unit 165 receives the M frequency domain received signals X₁(f,j),…,X_M(f,j) and the estimated input/output transfer characteristic G(f,j) as inputs. From these values it predicts the residual echo component contained in the frequency domain sound pickup signal Y(f,j) and outputs the estimate Y^(f,j) (s165).

For example, the residual echo estimate Y^(f,j) is predicted as follows.
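A per-frame prediction consistent with the description can be sketched as below. The prediction equation itself is not reproduced in this text, so the form used here, a sum over channels of G_m(f,j)X_m(f,j) without conjugation, is an assumption, as is the function name `predict_residual_echo`.

```python
def predict_residual_echo(G_bins, X_bins):
    """Predict the residual echo spectrum for every frequency bin of one frame.

    G_bins[f]: length-M list of transfer characteristic estimates at bin f
    X_bins[f]: length-M list of received-signal spectra at bin f
    Returns a list with one complex estimate per bin.
    Assumed form: Y_hat(f) = sum over m of G_m(f) * X_m(f).
    """
    return [sum(g * x for g, x in zip(G, X)) for G, X in zip(G_bins, X_bins)]
```

Because both factors are complex, the predicted value carries a phase as well as an amplitude.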

Note that the phase of the residual echo component is taken into account when the input/output correlation coefficients P(f,j) and Q(f,j) are obtained by Equations (3) to (5). Furthermore, both the phase and the amplitude of the residual echo component are taken into account when the estimate G(f,j) of the input/output transfer characteristic is obtained from P(f,j) and Q(f,j) by Equation (7) or (7'). The residual echo estimation unit 16A can therefore be said to obtain the residual echo estimate Y^(f,j) in consideration of the phase and amplitude of the residual echo.

<Residual echo correction unit 166>
The residual echo correction unit 166 receives the frequency domain sound pickup signal Y(f,j) and the residual echo estimate Y^(f,j) as inputs, corrects the residual echo estimate Y^(f,j) using them, and obtains and outputs the corrected residual echo estimate Y^₂(f,j) (s166). The corrected residual echo estimate Y^₂(f,j) can be obtained, for example, by the following equations.

Here T is the number of degrees of freedom in the estimation of each spectrum. It corresponds to the number of frames used when the input/output correlation coefficient calculation unit 163 calculates the power spectrum P_mm(f,j) and the cross spectra P_m'm(f,j) and Q_m'(f,j). An appropriate value is set for T, either before use or after the number M of received signal channels has been set, so that T-2M>0. If Equation (11) yields a ratio η(f,j)<0, η(f)=0 is used instead in Equation (12).

The correspondence between the coherence estimate γ^²(f) and the ratio η(f) defined by Equation (11) may be stored in a storage unit (not shown). This configuration shortens the computation time of Equation (11). That is, the residual echo correction unit 166 computes Equations (9) and (10) using the frequency domain sound pickup signal Y(f,j) and the residual echo estimate Y^(f,j) to obtain the coherence estimate γ^²(f), retrieves from the storage unit the ratio η(f) corresponding to that estimate, multiplies the residual echo estimate Y^(f,j) by it (see Equation (12)), and obtains and outputs the corrected residual echo estimate Y^₂(f,j). Put differently, M and T are constants known in advance, so the ratio η(f) can be regarded as a function of the estimate γ^²(f), which takes a value between 0 and 1. The ratio can therefore be computed in advance as a function of γ^²(f) and stored as a table. In actual signal processing, η(f) is obtained by looking it up in this table, which yields η(f) efficiently without computing a square root.
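The table lookup described above can be sketched as follows. Equation (11) is not reproduced in this text, so the sketch takes the mapping from γ^² to η as a caller-supplied function; the names `build_eta_table` and `lookup_eta`, the table size, and the square-root placeholder in the usage lines are hypothetical and are not the patent's formula.

```python
import math

def build_eta_table(eta_of_gamma2, size=256):
    """Precompute eta on a uniform grid of coherence values in [0, 1].

    eta_of_gamma2 stands in for Equation (11) with M and T already fixed.
    Negative results are replaced by 0, as the description requires.
    """
    return [max(0.0, eta_of_gamma2(i / (size - 1))) for i in range(size)]

def lookup_eta(table, gamma2):
    """Return the precomputed ratio for the grid point nearest to gamma2."""
    clipped = min(1.0, max(0.0, gamma2))
    return table[round(clipped * (len(table) - 1))]

# Usage with a placeholder mapping (an assumption, NOT Equation (11)).
table = build_eta_table(lambda g2: math.sqrt(g2))
eta = lookup_eta(table, 0.64)
```

The table is built once for given M and T, so the per-frame cost per frequency is a clamp, a multiplication, and an index. The grid resolution trades memory against the quantization error of η.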

<Residual echo cancellation suppression unit 169>
The residual echo cancellation suppression unit 169 receives the corrected residual echo estimate Y^₂(f,j) and the frequency domain sound pickup signal Y(f,j), cancels and suppresses the corrected residual echo estimate Y^₂(f,j) in the frequency domain sound pickup signal Y(f,j) (s169), and obtains and outputs the frequency domain transmission signal V(f,j). It subtracts the corrected residual echo estimate Y^₂(f,j) from the frequency domain sound pickup signal Y(f,j) to obtain a difference; the smaller this difference, the larger the proportion of the residual echo that is suppressed and the smaller the proportion that is cancelled. For example, with the echo suppression gain denoted G_mxs(f,j) and the proportion of the residual echo that is cancelled denoted by the sharing ratio R_mxc(f,j), the transmission signal V(f,j) is obtained as

V(f,j)=G_mxs(f,j){Y(f,j)-Y^₂(f,j)R_mxc(f,j)}

With this configuration, the distribution between echo cancellation and echo suppression can be set appropriately for removing the residual echo. Examples of setting the echo suppression gain G_mxs(f,j) and the sharing ratio R_mxc(f,j) are given in the description of the erasure suppression distribution control unit 1691 below. FIG. 9 shows an example of the processing flow of the residual echo cancellation suppression unit 169, whose processing is described with reference to FIGS. 6 and 9.

<Erasure suppression distribution control unit 1691>
When the difference between the power of the frequency domain sound pickup signal Y(f,j) and the power of the residual echo estimate Y^₂(f,j) is small, a mixed mode of echo cancellation and suppression is entered. In the mixed mode, the smaller the difference, the lower the allocation to echo cancellation and the higher the allocation to echo suppression.

An example of this processing is described with reference to FIG. 2. The situation in which the difference between the power |Y(f,j)|² of the frequency domain sound pickup signal Y(f,j) and the power |Y^₂(f,j)|² of the residual echo estimate Y^₂(f,j) is small is detected by comparing a function J(Y(f,j),Y^₂(f,j)) of the two signals with a threshold.

Here p_hyb_range_upper is the detection threshold, set to a value in the range of -20 to -10 dB. MAX(a,b) is a function that returns the larger of a and b; when the residual echo is estimated to be larger than the sound pickup signal, the function J returns 0.

The sharing ratio R_mxc(f,j) of residual echo cancellation is then calculated and set in the erasure distribution setting unit 1692. The remaining (1-R_mxc(f,j)) is the portion handled by residual echo suppression, so the amount of suppression is |Y^₂(f,j)|(1-R_mxc(f,j)) in terms of amplitude. As an example of the echo suppression gain, Equation (13) can be used, and the resulting gain is set in the suppression unit 1694.

The parameter p_cancel_allotted_min is the minimum proportion of the residual echo that is cancelled, and is set to a value in the range of 0 to 1.

When the difference between the sound pickup signal power and the residual echo signal power is not small, that is, when p_cancel_allotted_min≦J(Y(f,j),Y^₂(f,j)), R_mxc(f,j)=1 is set. In this case G_mxs(f,j)=1, and only residual echo cancellation is active.

That is, the erasure suppression distribution control unit 1691 receives the frequency domain sound pickup signal Y(f,j) and the residual echo estimate Y^₂(f,j), obtains the power of each, and subtracts the power |Y^₂(f,j)|² of the residual echo estimate from the power |Y(f,j)|² of the sound pickup signal to obtain the difference (|Y(f,j)|²-|Y^₂(f,j)|²). When this difference is smaller than the predetermined threshold p_hyb_range_upper, the sharing ratio R_mxc(f,j) is set to the mixed-mode value; when the difference is equal to or greater than p_hyb_range_upper, R_mxc(f,j) is set to 1. The sharing ratio R_mxc(f,j) is output to the erasure distribution setting unit 1692 (s1691). Further, the erasure suppression distribution control unit 1691 obtains the echo suppression gain G_mxs(f,j) from the frequency domain sound pickup signal Y(f,j), the residual echo estimate Y^₂(f,j), and the sharing ratio R_mxc(f,j) by Equation (13), and outputs it to the suppression unit 1694.
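The behavior of the distribution control can be illustrated with the sketch below. The patent's expressions for J and for the mixed-mode sharing ratio are not reproduced in this text, so both formulas here are assumptions chosen only to match the described behavior: J is taken to be the power difference normalized by the pickup power and clipped at 0, and the ratio is taken to rise linearly from p_cancel_allotted_min to 1 as J approaches the threshold. The function names and default values are hypothetical.

```python
def relative_power_gap(Y, Y2_hat):
    """Assumed form of J: MAX(|Y|^2 - |Y2_hat|^2, 0) / |Y|^2."""
    p_y = abs(Y) ** 2
    if p_y == 0.0:
        return 0.0
    return max(p_y - abs(Y2_hat) ** 2, 0.0) / p_y

def sharing_ratio(Y, Y2_hat, p_hyb_range_upper=0.05, p_cancel_allotted_min=0.2):
    """Share of the residual echo handled by cancellation (R_mxc).

    p_hyb_range_upper is given as a linear power ratio here; 0.05 is about
    -13 dB, inside the -20 to -10 dB range mentioned in the description.
    The linear ramp is an illustrative assumption, not the patent's formula.
    """
    j = relative_power_gap(Y, Y2_hat)
    if j >= p_hyb_range_upper:
        return 1.0  # difference not small: cancellation only
    # Mixed mode: a smaller gap shifts more of the work to suppression.
    return p_cancel_allotted_min + (1.0 - p_cancel_allotted_min) * j / p_hyb_range_upper
```

Whatever the exact formulas, the properties the description requires are that the ratio equals 1 outside the mixed mode, never falls below p_cancel_allotted_min, and decreases as the two powers approach each other.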

<Erasure distribution setting unit 1692>
The erasure distribution setting unit 1692 receives the residual echo estimate Y^₂(f,j) and the sharing ratio R_mxc(f,j), obtains their product Y^₂(f,j)R_mxc(f,j) (s1692), and outputs it. This processing corresponds to setting the proportion of residual echo cancellation.

<Subtraction unit 1693>
The subtraction unit 1693 receives the frequency domain sound pickup signal Y(f,j) and the product Y^₂(f,j)R_mxc(f,j), subtracts the product from the frequency domain sound pickup signal Y(f,j) to obtain the difference {Y(f,j)-Y^₂(f,j)R_mxc(f,j)} (s1693), and outputs it. This processing corresponds to the residual echo cancellation processing.

<Suppression unit 1694>
The suppression unit 1694 receives the difference {Y(f,j)-Y^₂(f,j)R_mxc(f,j)} and the echo suppression gain G_mxs(f,j), obtains the product G_mxs(f,j){Y(f,j)-Y^₂(f,j)R_mxc(f,j)} (s1694), and outputs this product as the frequency domain transmission signal V(f,j). This processing corresponds to the residual echo suppression processing.

Accordingly, the residual echo cancellation suppression unit 169 obtains the transmission signal V(f,j), for example, by the following Equation (14).

V(f,j)=G_mxs(f,j){Y(f,j)-Y^₂(f,j)R_mxc(f,j)} (14)
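The three stages described for units 1692, 1693, and 1694 combine into a single expression per frequency bin, which can be sketched as below. The function name `cancel_and_suppress` is hypothetical; the arithmetic follows the product and difference stated in the unit descriptions.

```python
def cancel_and_suppress(Y, Y2_hat, R_mxc, G_mxs):
    """Compute V = G_mxs * (Y - Y2_hat * R_mxc) for one frequency bin.

    Y:      complex sound pickup spectrum
    Y2_hat: complex corrected residual echo estimate
    R_mxc:  share of the residual echo removed by subtraction (0 to 1)
    G_mxs:  real gain applied afterwards for suppression
    """
    allotted = Y2_hat * R_mxc      # erasure distribution setting (unit 1692)
    difference = Y - allotted      # residual echo cancellation (unit 1693)
    return G_mxs * difference      # residual echo suppression (unit 1694)
```

With R_mxc=1 and G_mxs=1 this reduces to plain subtraction of the estimate, which matches the case described as residual echo cancellation only.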

<Time domain conversion unit 168>
As shown in FIGS. 3 and 4, the time domain conversion unit 168 receives the frequency domain transmission signal V(f,j), converts it into the time domain signal v(k), and outputs this as the output value of the echo canceling apparatus 100 (s168). The time domain conversion unit 168 may use the time domain conversion method that corresponds to the frequency domain conversion method used in the frequency domain conversion units 161_m and 162.

<Effect>
With this configuration, the residual echo can be suppressed better than before even when the residual echo power and the sound pickup signal power are comparable but residual echo cancellation has little effect because the phase error of the corrected residual echo estimate is not small.

<Modification>
The first embodiment mainly describes the case of M>1, but M=1 is also possible. In that case, the input/output correlation coefficient calculation unit 163 no longer needs to obtain the cross spectrum P_m'm(f,j) between the m-th channel frequency domain received signal X_m(f,j) and the m'-th channel frequency domain received signal X_m'(f,j). The input/output transfer characteristic estimation unit 164 uses the power spectrum P₁₁(f,j) and the cross spectrum Q₁(f,j) to estimate, for each frequency, the input/output transfer characteristic G(f,j) between the frequency domain received signal X₁(f,j) and the frequency domain sound pickup signal Y(f,j), and outputs the estimate.

The residual echo estimation unit 16A may have any other configuration as long as it obtains the residual echo estimate in consideration of the phase and amplitude of the residual echo contained in the frequency domain sound pickup signal Y(f,j). For example, the residual echo estimation unit 16A may omit the residual echo correction unit 166 and use the output value of the residual echo prediction unit 165 (the estimate Y^(f,j)) as its own output value.

The point of this embodiment is that the residual echo is suppressed by combining residual echo cancellation with echo suppression, even when residual echo cancellation has little effect because the phase error of the residual echo estimate is not small. The residual echo canceling unit 16_n therefore only needs to include at least the residual echo estimation unit 16A and the residual echo cancellation suppression unit 169; the other components (for example, the M frequency domain conversion units 161_1,…,161_M, the frequency domain conversion unit 162, and the time domain conversion unit 168) are not necessarily required.

<Second embodiment>
The description focuses on the differences from the first embodiment.

<Echo canceling apparatus 200>
The echo canceling apparatus 200 according to the second embodiment is described with reference to FIGS. 10 and 11. The echo canceling apparatus 200 includes N echo canceling units 28_1,…,28_N and N residual echo canceling units 26_1,…,26_N, with each echo canceling unit 28_n placed in front of the corresponding residual echo canceling unit 26_n.

<Echo canceling unit 28_n>
The echo canceling unit 28_n is connected to the receiving terminals 1_1,…,1_M, the residual echo canceling unit 26_n, and the microphone 3_n. It receives the received signals x₁(k),…,x_M(k) and the first sound pickup signal y_n(k), and outputs a one-channel second sound pickup signal u_n(k) to the residual echo canceling unit 26_n. For convenience, the error signal obtained by canceling the echo component from the first sound pickup signal is called the second sound pickup signal.

The echo canceling unit 28_n filters the received signals x₁(k),…,x_M(k) with an adaptive filter to generate the predicted echo signal y'(k), obtains the difference between the first sound pickup signal y(k) picked up by the microphone 3_n and the predicted echo signal y'(k) as the second sound pickup signal u(k), and updates the filter coefficients h'(k) of the adaptive filter based on the second sound pickup signal u(k) and the received signals x₁(k),…,x_M(k) (s28).

Details are described below with reference to FIGS. 12 and 13. The echo canceling unit 28_n includes an echo prediction unit 281, a subtraction unit 282, and an echo path estimation unit 283.

To explain the processing of the echo canceling unit 28_n, the relationship between the received signals and the first sound pickup signal is described first. Let h₁(k),…,h_M(k) be the impulse responses of the echo paths from the speakers 2_1,…,2_M to the microphone 3_n, and let L₁ be their length. The received signals x₁(k),…,x_M(k) and the first sound pickup signal y(k) then have the following relationship.

Writing the impulse response h_m and the received signal x_m of the m-th channel as the vectors
h_m=[h_m(0)…h_m(L₁-1)]^T (22)
x_m(k)=[x_m(k)…x_m(k-L₁+1)]^T (23)
the relationship between the received signals x₁(k),…,x_M(k) and the first sound pickup signal y(k) is described as follows.

y(k)=h₁^Tx₁(k)+…+h_M^Tx_M(k) (24)
Here T denotes transposition.
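The relationship in Equation (24) can be checked numerically with a short sketch. It assumes that the vector x_m(k) holds the L₁ most recent samples of channel m, newest first, and that samples before time 0 are zero; the function name `pickup_sample` is hypothetical.

```python
def pickup_sample(h, x, k):
    """Evaluate y(k) = sum over m of h_m^T x_m(k).

    h: list of M impulse responses, each a list of L1 taps
    x: list of M received signals, each a list of samples indexed by time
    k: time index of the output sample
    Samples with a negative time index are treated as zero (an assumption).
    """
    y = 0.0
    for h_m, x_m in zip(h, x):
        for i, tap in enumerate(h_m):
            if k - i >= 0:
                # Tap i weights the sample that is i steps in the past.
                y += tap * x_m[k - i]
    return y
```

Each channel contributes an ordinary FIR convolution, and the microphone observes the sum over channels. This is why the M echo paths cannot be identified separately when the received channels are identical.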

<Echo prediction unit 281>
The echo prediction unit 281 inputs the received signals x₁(k),…,x_M(k) to the predicted echo path formed by the adaptive filter, and generates and outputs the predicted echo signal y'(k) (s281). The echo prediction unit 281 consists of an adaptive filter whose characteristics are controlled by the echo path estimation unit 283, described later, so that the error signal of the subtraction unit 282 in the receiving state is minimized.

For example, with the filter coefficients of the m-th channel adaptive filter written as
h'_m=[h'_m(0)…h'_m(L_E-1)]^T (25)
the predicted echo signal
y'(k)=h'₁^Tx₁(k)+…+h'_M^Tx_M(k) (26)
is generated, where L_E denotes the tap length of the adaptive filter. The echo prediction unit 281 outputs the generated predicted echo signal y'(k) to the subtraction unit 282. The tap length of the adaptive filter is often set to about 100 to 300 ms.

<Subtraction unit 282>
The subtraction unit 282 receives the first sound pickup signal y(k) and the predicted echo signal y'(k), and subtracts the predicted echo signal y'(k) from the first sound pickup signal y(k) to obtain the second sound pickup signal u(k) (s282).

u(k)=y(k)-y'(k) (27)
The obtained second sound pickup signal u(k) is output to the echo path estimation unit 283 and to the frequency domain conversion unit 262 in the residual echo canceling unit 26_n.

<Echo path estimation unit 283>
The echo path estimation unit 283 receives the second sound pickup signal u(k) and the received signals x₁(k),…,x_M(k), uses them to update the filter coefficients h'(k) of the adaptive filter, and outputs the result (s283). When the Normalized Least Mean Square algorithm (NLMS algorithm) is used as the coefficient correction method of the adaptive filter, the filter coefficients are updated by the following Equation (28).

h'_m(k+1)=h'_m(k)+μu(k)x_m(k) (28)
Here μ is the step size, determined by the following expression.

Here μ₀ is a parameter that is controlled based on the power of the input signal and is preset to a value from 0 to 1 so that the estimation remains stable. The echo path estimation unit 283 copies the updated filter coefficients h'(k+1) and outputs them to the echo prediction unit 281. The method of updating the filter coefficients is not limited to the one described above, and other update methods may be used.
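One iteration covering Equations (26) to (28) can be sketched as below. The expression for the step size is not reproduced in this text, so the normalization used here, μ = μ₀ divided by the total input power plus a small ε, is the textbook NLMS form and is an assumption; the function name `nlms_step` and the parameter `eps` are hypothetical.

```python
def nlms_step(h_est, x_vecs, y, mu0=0.5, eps=1e-8):
    """Run one NLMS iteration for an M-channel adaptive filter.

    h_est:  list of M coefficient lists h'_m, each with L_E taps
    x_vecs: list of M input vectors x_m(k), newest sample first
    y:      current first sound pickup sample y(k)
    Returns the updated coefficients and the error sample u(k).
    """
    # Equation (26): predicted echo as the sum of the channel filter outputs.
    y_pred = sum(t * s for h_m, x_m in zip(h_est, x_vecs)
                 for t, s in zip(h_m, x_m))
    # Equation (27): second sound pickup signal (error signal).
    u = y - y_pred
    # Assumed normalization: mu0 divided by the total input power.
    power = sum(s * s for x_m in x_vecs for s in x_m)
    mu = mu0 / (power + eps)
    # Equation (28): move each coefficient along the error-weighted input.
    new_h = [[t + mu * u * s for t, s in zip(h_m, x_m)]
             for h_m, x_m in zip(h_est, x_vecs)]
    return new_h, u
```

In practice the update is usually frozen or slowed while the near-end talker is active, since the error signal then contains speech as well as echo. The source does not describe such control in this passage, so the sketch leaves it out.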

<Residual echo canceling unit 26_n>
The processing that the residual echo canceling unit 16_n of the first embodiment performs using the first sound pickup signal y_n(k) is performed by the residual echo canceling unit 26_n using the second sound pickup signal u_n(k) described above. For example, the frequency domain conversion unit 262 converts the second sound pickup signal u(k) into the frequency domain signal U(f,j), and the residual echo estimation unit 26A and the residual echo cancellation suppression unit 269 perform their processing using this signal. The processing in the residual echo estimation unit 26A is the same as in the first embodiment, except that the residual echo estimate U^₂(f,j) it produces is an estimate of the residual echo contained in the second sound pickup signal u_n(k), not in the first sound pickup signal y_n(k). The residual echo canceling unit 26_n thus cancels the residual echo component contained in the second sound pickup signal u_n(k) rather than the one contained in the first sound pickup signal y_n(k).

<Effect>
This configuration provides the same effect as the first embodiment. When the echo path does not fluctuate greatly, the echo canceling unit 28_n in the preceding stage can estimate the echo path with high accuracy, which improves transmission quality. When the echo path fluctuates greatly, the residual echo canceling unit 26_n in the subsequent stage can cancel the residual echo component until the echo path estimation in the echo canceling unit 28_n stabilizes. Compared with an apparatus that performs echo cancellation using only an adaptive filter (for example, the multi-channel echo canceling apparatus 80 in FIG. 1), high transmission quality can therefore be maintained both while the echo path is stable and while it is changing.

<Modification>
In this embodiment, the adaptive filter is updated using time domain signals (the received signals x₁(k),…,x_M(k) and the second sound pickup signal u_n(k)), but it may instead be updated using frequency domain or wavenumber domain signals (see Reference 2).
(Reference 2) JP 2013-255155 A

In that case, the frequency domain signals (X₁(f,j),…,X_M(f,j) and U_n(f,j)) obtained in the course of computation in the echo canceling unit 28_n may be output to the residual echo canceling unit 26_n as they are, without being converted to time domain signals. The residual echo canceling unit 26_n then need not include the frequency domain conversion units 161_1,…,161_M and the frequency domain conversion unit 262. In addition, because the computational cost of the adaptive filter is high, the echo cancellation processing of the echo canceling unit 28_n may be performed only at some frequencies (for example, perceptually important frequencies such as 300 Hz to 3.4 kHz or 100 Hz to 7 kHz), while the residual echo cancellation processing of the residual echo canceling unit 26_n is performed at all frequencies. This configuration improves transmission quality efficiently.

<Program and recording medium>
The various processing functions of each apparatus described in the above embodiments and modifications may be realized by a computer. In that case, the processing contents of the functions that each apparatus should have are described by a program. By executing this program on a computer, the various processing functions of each apparatus are realized on the computer.

The program describing these processing contents can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example, a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.

The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. The program may also be distributed by storing it in a storage device of a server computer and transferring it from the server computer to other computers via a network.

A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or the program transferred from the server computer in its own storage unit. When executing a process, the computer reads the program stored in its own storage unit and executes the process according to the read program. As another mode of executing the program, the computer may read the program directly from the portable recording medium and execute a process according to it. Furthermore, each time a program is transferred from the server computer to the computer, the computer may sequentially execute a process according to the received program. The above processing may also be executed by a so-called ASP (Application Service Provider) type service, in which the program is not transferred from the server computer to the computer and the processing functions are realized only through execution instructions and result acquisition. Note that the program includes information that is provided for processing by an electronic computer and is equivalent to a program (for example, data that is not a direct command to the computer but has the property of defining the computer's processing).

Although each apparatus is described as being configured by executing a predetermined program on a computer, at least part of the processing contents may be realized by hardware.

<Other modifications>
The present invention is not limited to the above embodiments and modifications. For example, the various processes described above may be executed not only in time series in the order described but also in parallel or individually, depending on the processing capability of the apparatus that executes them or as necessary. Other changes may be made as appropriate without departing from the spirit of the present invention.

The frequency-domain collected signal in the claims is a frequency-domain signal obtained from the first collected signal picked up by the microphone. The concept covers both the first collected signal itself and the second collected signal obtained as the difference between the first collected signal and the predicted echo signal. When the first or second collected signal is a time-domain signal, the frequency-domain collected signal is that signal converted into the frequency domain. It may also be a first or second collected signal to which processing that changes the cross-correlation of the multi-channel received signals has been applied (for example, noise addition, half-wave rectification, delay variation, or level variation), or a second collected signal obtained as the difference between a first collected signal so processed and the predicted echo signal.
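The conversion of a time-domain collected signal into a frequency-domain collected signal indexed by frequency f and frame j can be sketched as a short-time DFT. The frame length, hop size, Hann window, and the name to_frequency_domain are assumptions made for this sketch; the text only requires that the signal be converted into the frequency domain.

```python
import cmath
import math


def to_frequency_domain(signal, frame_len=64, hop=32):
    """Return Y[j][f]: the one-sided DFT of the j-th windowed frame at frequency index f.

    The signal is split into overlapping frames of frame_len samples taken every
    hop samples, and each frame is multiplied by a Hann window (an assumed choice).
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = [
            sum(segment[n] * cmath.exp(-2j * math.pi * f * n / frame_len)
                for n in range(frame_len))
            for f in range(frame_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames
```

A direct DFT is used here only to keep the sketch self-contained; a practical implementation would normally use an FFT routine.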

Claims

An echo canceling apparatus that cancels an echo that enters a microphone via an echo path when a received signal is reproduced from a speaker, one or more speakers and one or more microphones being arranged in a common sound field, the apparatus comprising:
a residual echo estimation unit that obtains a residual echo estimate, taking into account the phase and amplitude of the residual echo contained in a frequency-domain collected signal, by using the frequency-domain collected signal, which is a frequency-domain signal obtained from a first collected signal picked up by the microphone, and a frequency-domain received signal, which is a frequency-domain signal obtained from the received signal; and
a residual echo cancellation and suppression unit that cancels and suppresses the residual echo estimate from the frequency-domain collected signal,
wherein the residual echo cancellation and suppression unit obtains a difference by subtracting the residual echo estimate from the frequency-domain collected signal and, the smaller the difference, increases the proportion of the residual echo that is suppressed from the frequency-domain collected signal and decreases the proportion that is canceled.

The echo canceling apparatus of claim 1,
wherein the residual echo cancellation and suppression unit includes a cancellation and suppression distribution control unit that, where f is the frequency index, j is the frame index, Y(f, j) is the frequency-domain collected signal, Ŷ₂(f, j) is the residual echo estimate, and p_cancel_allotted_min is the minimum ratio at which the residual echo is canceled, when the difference obtained by subtracting the power |Ŷ₂(f, j)|² of the residual echo estimate from the power |Y(f, j)|² of the frequency-domain collected signal is smaller than a predetermined threshold p_hyb_range_upper, sets the sharing ratio R_mxc(f, j) by the following expression:

and, when the difference is equal to or greater than the predetermined threshold p_hyb_range_upper, sets the sharing ratio R_mxc(f, j) to 1.

The echo canceling apparatus of claim 1 or claim 2,
wherein the residual echo cancellation and suppression unit, where f is the frequency index, j is the frame index, G_mxs(f, j) is the echo suppression gain, Y(f, j) is the frequency-domain collected signal, Ŷ₂(f, j) is the residual echo estimate, and the sharing ratio R_mxc(f, j) is the ratio at which the residual echo is canceled, obtains the transmission signal V(f, j) by the following expression:

The echo canceling apparatus of claim 3,
wherein the echo suppression gain G_mxs(f, j) is given by the following expression:

The echo canceling apparatus according to any one of claims 1 to 4, further comprising
an echo canceling unit that filters the received signal with an adaptive filter to generate a predicted echo signal, obtains the difference between the first collected signal picked up by the microphone and the predicted echo signal as a second collected signal, and updates the filter coefficients of the adaptive filter based on the second collected signal and the received signal,
wherein the second collected signal in the frequency domain is used as the frequency-domain collected signal.

An echo canceling method for canceling an echo that enters a microphone via an echo path when a received signal is reproduced from a speaker, one or more speakers and one or more microphones being arranged in a common sound field, the method comprising:
a residual echo estimation step of obtaining a residual echo estimate, taking into account the phase and amplitude of the residual echo contained in a frequency-domain collected signal, by using the frequency-domain collected signal, which is a frequency-domain signal obtained from a first collected signal picked up by the microphone, and a frequency-domain received signal, which is a frequency-domain signal obtained from the received signal; and
a residual echo cancellation and suppression step of canceling and suppressing the residual echo estimate from the frequency-domain collected signal,
wherein the residual echo cancellation and suppression step obtains a difference by subtracting the residual echo estimate from the frequency-domain collected signal and, the smaller the difference, increases the proportion of the residual echo that is suppressed from the frequency-domain collected signal and decreases the proportion that is canceled.

The echo canceling method of claim 6, further comprising
an echo canceling step of filtering the received signal with an adaptive filter to generate a predicted echo signal, obtaining the difference between the first collected signal picked up by the microphone and the predicted echo signal as a second collected signal, and updating the filter coefficients of the adaptive filter based on the second collected signal and the received signal,
wherein the second collected signal in the frequency domain is used as the frequency-domain collected signal.

A program for causing a computer to function as the echo canceling apparatus according to any one of claims 1 to 5.