JP2005318571A

JP2005318571A - Echo canceler

Info

Publication number: JP2005318571A
Application number: JP2005103882A
Authority: JP
Inventors: Kensaku Fujii; 健作藤井; Jakugu Cho; 若愚張
Original assignee: Toa Corp
Current assignee: Toa Corp
Priority date: 2004-03-31
Filing date: 2005-03-31
Publication date: 2005-11-10

Abstract

PROBLEM TO BE SOLVED: To apply an echo canceler to a system in which both double talk and echo path variation occur, the two must be distinguished using a predetermined identification parameter. With conventional identification parameters, however, the distributions produced by double talk and by echo path variation overlap too much, so identification is delayed, securing the echo cancellation amount is delayed, and operation of the echo canceler becomes unstable.

SOLUTION: In the adaptation algorithm of the echo canceler, the product of an error signal and the input signal to each coefficient tap within a predetermined search range of an FIR filter 212 is calculated, and a representative tap is selected from the coefficient taps within the search range based on a first value determined from that product. Whether a double-talk state has occurred is then determined based on a second value, determined from the product of the error signal and the input signal to the representative tap, and a third value, determined from the power of the error signal.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to an echo canceller that keeps the amount of echo cancellation at or above a predetermined value even during double talk (the state in which the near-end speaker and the far-end speaker talk simultaneously), which is known as one of the factors that degrade echo canceller performance (the amount of echo cancellation), and that quickly restores the amount of echo cancellation when the echo path changes.

One concrete application of techniques for distinguishing double talk from echo path variation is the hands-free communication device. Such a device uses a microphone and a loudspeaker for the call and has the advantage that no handset is needed. FIG. 2 shows an example of a circuit that realizes hands-free calling by introducing an acoustic echo canceller 210. In FIG. 2, the adaptive filter 212 uses the far-end speaker's voice $x_j$ to synthesize a pseudo echo $u_j$ that approximates the echo $d_j$ leaking from the speaker 201 into the microphone 200. The filter coefficients

$$H_j(i), \quad i = 1, 2, \ldots, I \tag{1}$$

are updated so that the output of the subtractor 211

$$e_j = (d_j - u_j) + n_j, \qquad u_j = \sum_{i=1}^{I} H_j(i)\, x_j(i) \tag{2}$$

is minimized. Here $I$ is the number of taps of the adaptive filter, $n_j$ is the near-end speaker's voice, and $x_j(i)$ is the far-end sample held at the $i$-th tap at time $j$. As a result of this coefficient update, the echo $d_j$ is cancelled, the amount sent back to the far-end speaker is suppressed, and smooth conversation becomes possible.

The problem is that the near-end speaker's voice $n_j$ is mixed into the microphone output $y_j$ as

$$y_j = d_j + n_j. \tag{3}$$

The adaptive filter takes the echo $d_j$ as its target and synthesizes a pseudo echo $u_j$ that approximates it. The near-end speaker's voice $n_j$ then acts as a disturbance to that synthesis. Its effect has already been clarified for the case where the coefficient update algorithm of the adaptive filter is the most common one, the learning identification method (normalized LMS):

$$H_{j+1}(i) = H_j(i) + \mu\, \frac{e_j\, x_j(i)}{\|x_j\|^2}. \tag{4}$$

Here $\|x_j\|^2$ is the squared norm of the tap inputs, calculated as

$$\|x_j\|^2 = \sum_{i=1}^{I} x_j^2(i), \tag{5}$$

and $\mu$ is a constant called the step size, chosen in the range

$$0 < \mu < 2. \tag{6}$$

In the learning identification method, the amount of echo cancellation corresponds to the estimation error, that is, to the difference between the impulse response of the acoustic system from the speaker 201 to the microphone 200,

$$h(i), \quad i = 1, 2, \ldots, I, \tag{7}$$

and the coefficients of the adaptive filter,

$$\Delta_j(i) = h(i) - H_j(i). \tag{8}$$

The magnitude of the estimation error $\Delta_j$ has been formulated in terms of its mean square $\sigma_d^2$ as

$$\sigma_d^2 = \frac{\mu}{2 - \mu} \cdot \frac{\sigma_n^2}{\sigma_x^2}. \tag{9}$$

As this shows, speech from the near-end speaker increases $\sigma_n^2$, and the resulting increase in estimation error degrades echo cancellation performance. Here $\sigma_x^2$ is the average power of the far-end speaker's voice and $\sigma_n^2$ is the average power of the near-end speaker's voice.
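The learning identification method described above can be sketched as follows. This is a minimal illustration, assuming the standard normalized-LMS form of the update; the function names, the small regularizer `eps`, and the four-tap echo path used in the demo are hypothetical and do not come from the patent.

```python
import random


def nlms_step(coeffs, x_taps, y, mu=1.0, eps=1e-12):
    """One normalized-LMS (learning identification) update.

    coeffs : current adaptive filter coefficients (length I)
    x_taps : far-end samples currently held in the tap delay line (length I)
    y      : microphone sample (echo plus any near-end speech)
    Returns (new_coeffs, error).
    """
    u = sum(h * x for h, x in zip(coeffs, x_taps))   # pseudo echo
    e = y - u                                        # subtractor output
    # Squared norm of the tap inputs; eps is an assumption added here
    # only to avoid division by zero when the delay line is empty.
    norm = sum(x * x for x in x_taps) + eps
    new_coeffs = [h + mu * e * x / norm for h, x in zip(coeffs, x_taps)]
    return new_coeffs, e


def run_demo(num_samples=2000, seed=0):
    """Identify a hypothetical 4-tap echo path from white noise, no near-end speech."""
    rng = random.Random(seed)
    true_h = [0.5, -0.3, 0.2, 0.1]                   # hypothetical echo path
    coeffs = [0.0] * len(true_h)
    taps = [0.0] * len(true_h)
    for _ in range(num_samples):
        taps = [rng.gauss(0.0, 1.0)] + taps[:-1]     # shift in a new far-end sample
        d = sum(h * x for h, x in zip(true_h, taps)) # echo seen at the microphone
        coeffs, _ = nlms_step(coeffs, taps, d)
    return true_h, coeffs
```

With no near-end speech the coefficients approach the echo path; adding a disturbance to `y` would leave a residual estimation error, which is the effect the text attributes to double talk.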

Equation (9) shows that an effective countermeasure against this degradation is to monitor for double talk and, when it is observed, either stop updating the adaptive filter coefficients or reduce the step size. The difficulty with this control is that when the echo path, that is, $h$, changes, the coefficient update must on the contrary be continued. The echo canceller thus needs opposite control for echo path variation and for double talk. For stable operation of a hands-free communication device it is therefore essential to distinguish the two without error and to apply the operation suited to each. The problem is how to make that distinction.

When the double-talk problem first became known, what was studied was not this discrimination but the detection of double talk alone (see, for example, Non-Patent Document 1). The reasoning was simply that coefficient updating could be treated as the normal state and stopped only during double talk. However, if the goal is to secure, say, 30 dB of echo cancellation, an estimation error of $\sigma_d^2 = 0.001$ must be secured. How hard this is can be seen from equation (9): with the step size $\mu$ set to 1, the near-end speech power that must be detected is about 1/1000 of the far-end speech power. In other words, near-end speech buried in the echo has to be detected, and detecting such buried speech is clearly difficult in practice.
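The arithmetic behind the 30 dB example can be checked with a short sketch. It assumes that equation (9) has the standard steady-state normalized-LMS form $\sigma_d^2 = \frac{\mu}{2-\mu}\,\sigma_n^2/\sigma_x^2$; that form and the function names below are assumptions made for illustration.

```python
def target_estimation_error(cancellation_db):
    """Estimation-error power that corresponds to a desired echo cancellation in dB."""
    return 10.0 ** (-cancellation_db / 10.0)


def tolerable_near_end_ratio(sigma_d2, mu):
    """Largest sigma_n^2 / sigma_x^2 that still meets the target sigma_d^2.

    Assumes sigma_d^2 = mu / (2 - mu) * sigma_n^2 / sigma_x^2,
    solved here for the power ratio.
    """
    if not 0.0 < mu < 2.0:
        raise ValueError("mu must lie in (0, 2)")
    return sigma_d2 * (2.0 - mu) / mu
```

Under this assumption, `tolerable_near_end_ratio(target_estimation_error(30), 1.0)` gives a power ratio of about 1/1000, matching the figure in the text, while a smaller step size such as 0.5 tolerates about three times as much near-end power.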

A method was therefore proposed next in which an increase in the subtractor output $e_j$ is taken to indicate double talk (see, for example, Non-Patent Document 2). Because the echo has been subtracted from $e_j$, near-end speech buried in the echo becomes detectable. However, $e_j$ increases when the echo path changes just as it does during double talk, so a method based on $e_j$ must distinguish echo path variation from double talk. That is, to keep the amount of echo cancellation high during double talk, near-end speech buried in the echo must be detected, which forces reliance on changes in the subtractor output $e_j$; and to use those changes for double-talk detection, an accurate way of distinguishing echo path variation from double talk is required.

This discrimination amounts to knowing the proportions of the components that make up the microphone output $y_j$ or the subtractor output $e_j$. For example, in equation (2), an increase in $(d_j - u_j)$ indicates echo path variation, and an increase in $n_j$ indicates double talk. The problem is how to determine correctly which of the two has increased.

Correlation is one way to tell these increases apart. The signals available for computing the correlation are the far-end speaker's voice $x_j$ and the pseudo echo $u_j$. Taking the pseudo echo as an example, $u_j$ and the near-end speaker's voice $n_j$ can be assumed to have low correlation. Therefore, when the echo path has not changed and the echo is sufficiently cancelled, so that the relation

$$d_j \approx u_j$$

holds, the expected value

$$E[e_j u_j]$$

is zero when double talk occurs. That is, during double talk the correlation $e_j u_j$ is distributed around zero. Here $E[\cdot]$ denotes the expectation operation. Conversely, under echo path variation part of the echo $d_j$ remains uncancelled, so the expected value $E[e_j u_j]$ is no longer zero. For example, in the case of a complete echo path change, in which the pseudo echo $u_j$ and the echo $d_j$ have no correlation at all, the correlation $e_j u_j$ is distributed around the negative of the average power of the pseudo echo,

$$-E[u_j^2].$$

Using this principle, one can, for example, take a value between $-E[u_j^2]$ and zero as a threshold and judge double talk when $E[e_j u_j]$ is larger than the threshold and echo path variation when it is smaller.
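The decision rule based on the correlation between $e_j$ and $u_j$ can be sketched as below. The expectation is replaced by a sample average, and the threshold is taken as a fraction of the negative pseudo-echo power; the parameter `threshold_fraction` and its default of 0.5 are assumptions for illustration, since the text only requires a threshold between $-E[u_j^2]$ and zero.

```python
def mean(values):
    return sum(values) / len(values)


def classify_by_eu_correlation(e, u, threshold_fraction=0.5):
    """Classify a segment from the averaged product of e_j and u_j.

    e, u : equal-length sequences of subtractor output and pseudo echo.
    Returns "double_talk" when the averaged correlation lies above
    -threshold_fraction * mean(u^2), otherwise "echo_path_change".
    """
    corr = mean([ej * uj for ej, uj in zip(e, u)])
    pseudo_echo_power = mean([uj * uj for uj in u])
    threshold = -threshold_fraction * pseudo_echo_power   # hypothetical choice
    return "double_talk" if corr > threshold else "echo_path_change"
```

With short averaging windows the two distributions overlap, so a rule of this kind misclassifies some segments; that limitation is what the following paragraphs of the description discuss.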

The idea of using the correlation with the pseudo echo $u_j$ can be applied in the same way to the microphone output

$$y_j = d_j + n_j.$$

Consider the correlation between the microphone output $y_j$ and the adaptive filter output $u_j$,

$$E[y_j u_j].$$

When the pseudo echo $u_j$ simulates the echo $d_j$ well, $E[y_j u_j]$ is distributed around the average power of the pseudo echo,

$$E[u_j^2],$$

and under echo path variation, because the correlation between the pseudo echo $u_j$ and the echo $d_j$ is low, $E[y_j u_j]$ is distributed around zero. Therefore, when

[decision statistic: equation image not reproduced in the source text]

exceeds the threshold

[threshold: equation image not reproduced in the source text]

echo path variation can be judged, and when it falls below the threshold, double talk can be judged. The parameters used for this decision, the correlations $y_j u_j$ and $e_j u_j$, may also be normalized by the power of $u_j$ or $y_j$. A detailed description of this discrimination method is available in the literature (see, for example, Non-Patent Document 3).

As an example of how effective this correlation is for discrimination, FIG. 3 shows the distributions, computed for echo path variation and for double talk, of the statistic that Non-Patent Document 3 reports as giving relatively high separation performance: the correlation between the subtractor output $e_j$ and the pseudo echo $u_j$, normalized by the power of the microphone output $y_j$ and accumulated over $M$ samples,

$$\frac{\sum_{m=0}^{M-1} e_{j-m}\, u_{j-m}}{\sum_{m=0}^{M-1} y_{j-m}^2}.$$

The following conditions were set for this computation.
(1) M = 128
(2) Sampling frequency: 8 kHz
(3) The speech shown in FIG. 4 is used as the far-end speaker's voice and the near-end speaker's voice.
(4) Double talk is generated with the adaptive filter coefficients fixed at the point where the estimation error has reached -30 dB.
(5) Echo path variation is produced by changing the impulse response of the unknown system to one for which the estimation error becomes 0 dB.
(6) The tap length of the adaptive filter and the impulse response length of the unknown system are both 512.
(7) The impulse response of the unknown system is given by exponentially decaying normal random numbers.
(8) Average power ratio of far-end speech to ambient noise and of near-end speech to ambient noise: 30 dB
The results in FIG. 3 show that the distributions produced by double talk and by echo path variation overlap heavily. This overlap means that discrimination errors occur. To reduce these errors, the variance of each distribution must be reduced, which requires a larger number of samples (M). That, however, increases the discrimination delay, which in turn delays securing the echo cancellation amount and makes the operation of the echo canceller unstable.

Another problem with the method using the correlation between the subtractor output $e_j$ and the pseudo echo $u_j$ is that when the echo path variation is small, the correlation between $y_j$ and $u_j$ becomes strong, so the distribution produced by double talk and that produced by echo path variation move closer together and discrimination becomes even harder. Determining such a small echo path variation accurately requires still more samples, which means the response to echo path variation is delayed further. In addition, the correlation between the pseudo echo $u_j$ and the near-end speaker's voice $n_j$ was assumed to be zero. In practice, however, the two are correlated, because phonemes can coincide even when different speakers are talking. In that case, increasing M does little to reduce the overlap of the distributions.

Another way to exploit correlation with the subtractor output $e_j$ is to use its correlation with the far-end speaker's voice $x_j$ (see, for example, Non-Patent Document 4); its effectiveness has been confirmed for the case where white noise is used as the reference signal. However, the correlation between $e_j$ and the pseudo echo $u_j$ discussed above ultimately also relies on the correlation between $e_j$ and $x_j$. This approach therefore has the same problems as the method using the correlation between $e_j$ and $u_j$.

A method that can cope with echo path variation and double talk without distinguishing them was therefore studied next (see, for example, Non-Patent Document 5). It takes as the coefficient update algorithm of the adaptive filter the block-execution learning identification method

[Equation (14): image not reproduced in the source text]

which absorbs power fluctuations in the far-end speaker's voice and updates the coefficients continuously, and it computes

[Equation (15): image not reproduced in the source text]

so that the step size $\rho_n$ is varied block by block as

[Equation (16): image not reproduced in the source text]

A block-execution adaptive algorithm is an update algorithm in which the numerator vector of the update vector and the normalizing denominator are each accumulated over a time block. In equation (4), the second term on the right-hand side is the update vector; equation (14) is the block-execution form obtained by accumulating the numerator (numerator vector) and the denominator (normalizing denominator) of that term over a time block. In the above equations, $J$ is the block length, which is extended until

[accumulated quantity $P_n$: image not reproduced in the source text]

reaches at least a constant computed in advance from

[expression for $P_0$: image not reproduced in the source text]

and the adaptive filter coefficients are updated at the point where $P_n \geq P_0$. Further, $\rho_0$ is the maximum step size set by this control, $C_0$ is the estimation error corresponding to the echo cancellation amount to be secured, and $Q_0$ is the average power of stationary ambient noise. When double talk occurs, $Q_n$ increases, the step size decreases according to equation (16), and the estimation error is held at the prescribed level.

Equation (16) is the control law for the case where the far-end speaker's voice is white noise. When speech, which has strong autocorrelation, is used as the reference signal, operation is more stable if the step size is made smaller by using

[modified step size expression containing the constant $a$: image not reproduced in the source text]

with $a = 10$ (see, for example, Non-Patent Document 6).

With this step size control, it has been confirmed that small echo path variations, for which the estimation error rises to about -20 dB, need not be distinguished from double talk (see Non-Patent Document 5). If this control is adopted, it therefore suffices to find a method that speeds up the discrimination for echo path variations in which the estimation error rises above -20 dB.

Non-Patent Document 1: 来山征士, 田村潤三, 山本誠一, 石上彦一, "Multiple Echo Canceller with a Common Adaptive Control Unit," IEICE Technical Report, CS78-23 (1978-05)
Non-Patent Document 2: 古屋宣二, 福士雄三, 伊藤栄紀, 田辺淳二, 萩原幸雄, "A Method of Detecting Superimposed Speech in an Adaptive Echo Canceller and Experimental Results," 1984 IEICE General Conference, 2343 (1984-03)
Non-Patent Document 3: 藤井健作, 大賀寿郎, "Discrimination of Double Talk and Echo Path Variation," IEICE Technical Report, EA94-15 (1994-05)
Non-Patent Document 4: 藤井健作, 大賀寿郎, "Reduction of Convergence Time by Block Length Control in the Summed Normalized LMS Method," IEICE Transactions (Japanese Edition) A, vol. J80-A, no. 1, pp. 27-35 (1997-01)
Non-Patent Document 5: 藤井健作, 大賀寿郎, "A Method of Keeping the Estimation Error at a Required Value for Acoustic Echo Cancellers," IEICE Transactions (Japanese Edition) A, vol. J83-A, no. 02, pp. 141-151 (2000-02)
Non-Patent Document 6: 張若愚, 藤井健作, 棟安実治, "A Study on Detection of Double Talk and Echo Path Variation," 18th Digital Signal Processing Symposium, B3-6 (2003-11)

The object of the present invention is to realize a device that quickly and accurately distinguishes double talk from echo path variations in which the estimation error rises above -20 dB, so that coefficient convergence is accelerated for echo path variation and a stable amount of echo cancellation is maintained during double talk.

To solve the above problem, an echo canceller according to the present invention removes, from the output signal of a microphone, the echo signal produced when a received signal supplied to a speaker leaks from the speaker into the microphone. It comprises: an FIR filter with updatable coefficients that generates a pseudo echo signal based on the received signal; a subtractor that generates an error signal by subtracting the pseudo echo signal generated by the FIR filter from the output signal of the microphone; and a coefficient updating unit that, based on the received signal and the error signal, updates the coefficients of the FIR filter by an adaptive algorithm so that the error signal is minimized. The adaptive algorithm calculates the product of the error signal and the input signal to each coefficient tap within a predetermined search range of the FIR filter; selects a representative tap from the coefficient taps within the search range based on a first value determined from that product; and determines whether a double-talk state has occurred based on a second value, determined from the product of the error signal and the input signal to the representative tap, and a third value, determined from the power of the error signal.

In the above echo canceller, the first value may be the product of the error signal and the input signal to each coefficient tap, the second value may be the product of the error signal and the input signal to the representative tap, and the third value may be the power of the error signal.

In the above echo canceller, the adaptive algorithm may be a block-execution adaptive algorithm, in which case the first value may be the cumulative sum over a time block of the product of the error signal and the input signal to each coefficient tap, the second value may be the cumulative sum over the time block of the product of the error signal and the input signal to the representative tap, and the third value may be the cumulative sum over the time block of the power of the error signal.

In the above echo canceller, a double-talk state may be judged to have occurred when the absolute value of the ratio of the second value to the third value is less than a predetermined threshold.
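The block-accumulated first, second, and third values and the ratio test can be sketched as follows. This is an illustrative reading of the text, not the patented implementation: the representative tap is chosen as the tap with the largest absolute first value (one of the options the description allows), and the function names, the data layout, and the handling of a zero third value are assumptions.

```python
def accumulate_block(errors, tap_inputs, search_range):
    """Accumulate per-tap correlations and error power over one time block.

    errors       : error samples e_j for the block
    tap_inputs   : for each sample, the list of inputs to every coefficient tap
    search_range : iterable of tap indices to examine
    Returns (first_values, third_value), where first_values maps each tap
    index to the block sum of e_j times that tap's input, and third_value
    is the block sum of e_j squared.
    """
    first_values = {
        tap: sum(e * x[tap] for e, x in zip(errors, tap_inputs))
        for tap in search_range
    }
    third_value = sum(e * e for e in errors)
    return first_values, third_value


def select_representative_tap(first_values):
    """Pick the tap whose first value has the largest magnitude."""
    return max(first_values, key=lambda tap: abs(first_values[tap]))


def is_double_talk(second_value, third_value, threshold):
    """Ratio test: double talk when |second / third| is below the threshold."""
    if third_value == 0.0:
        return False   # assumption: with no error power there is nothing to classify
    return abs(second_value / third_value) < threshold
```

In this sketch the second value is simply the first value at the representative tap for the same block, for example `first_values[select_representative_tap(first_values)]`.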

In the above echo canceller, a double-talk state may be judged to have occurred when the absolute value of the ratio of the second value to the third value is equal to or greater than the predetermined threshold and the signs of two consecutive second values for the representative tap selected this time or earlier differ.

In the above echo canceller, a double-talk state may be judged to have occurred when the absolute value of the ratio of the second value to the third value is equal to or greater than the predetermined threshold and the sign of the second value determined from the product of the current input signal to the representative tap selected this time and the current error signal differs from the sign of the second value determined from the product of the previous input signal to the representative tap selected this time and the previous error signal.

In the above echo canceller, a double-talk state may be judged to have occurred when the absolute value of the ratio of the second value to the third value is equal to or greater than the predetermined threshold and the sign of the second value determined from the product of the current input signal to the representative tap selected last time and the current error signal differs from the sign of the second value determined from the product of the previous input signal to the representative tap selected last time and the previous error signal.
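The combination of the ratio test with the sign comparison can be sketched as below. The function names are hypothetical, and treating a zero second value as "no sign difference" and skipping the ratio test when the third value is zero are assumptions made only to keep the example well defined.

```python
def signs_differ(a, b):
    """True when a and b have strictly opposite signs (zero counts as no difference)."""
    return a * b < 0.0


def is_double_talk_with_sign_check(second_now, second_prev, third_now, threshold):
    """Ratio test followed by the sign comparison of two consecutive second values.

    second_now, second_prev : second values for the same representative tap,
                              from the current and the previous evaluation
    third_now               : current third value (error power)
    """
    if third_now > 0.0 and abs(second_now / third_now) < threshold:
        return True                       # ratio below threshold: double talk
    # Ratio at or above the threshold: fall back on the sign comparison.
    return signs_differ(second_now, second_prev)
```

Whether `second_prev` is computed for the tap selected this time or for the tap selected last time is left to the caller, matching the two variants described above.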

In the above echo canceller, the predetermined threshold may be revised to a larger value when the third value is equal to or less than a predetermined value.

To solve the above problem, another echo canceller according to the present invention likewise removes, from the output signal of a microphone, the echo signal produced when a received signal supplied to a speaker leaks from the speaker into the microphone. It comprises: an FIR filter with updatable coefficients that generates a pseudo echo signal based on the received signal; a subtractor that generates an error signal by subtracting the pseudo echo signal generated by the FIR filter from the output signal of the microphone; and a coefficient updating unit that, based on the received signal and the error signal, updates the coefficients of the FIR filter by an adaptive algorithm so that the error signal is minimized. The adaptive algorithm calculates the product of the error signal and the input signal to each coefficient tap within a predetermined search range of the FIR filter; selects a representative tap from the coefficient taps within the search range based on a first value determined from that product; and judges that a double-talk state has occurred when the signs of two consecutive second values for the representative tap selected this time or earlier differ. Here the second value may be a value determined from the product of the error signal and the input signal to the representative tap.

In the above echo canceller, a double-talk state may be judged to have occurred when the sign of the second value determined from the product of the current input signal to the representative tap selected this time and the current error signal differs from the sign of the second value determined from the product of the previous input signal to the representative tap selected this time and the previous error signal.

In the above echo canceller, a double-talk state may be judged to have occurred when the sign of the second value determined from the product of the current input signal to the representative tap selected last time and the current error signal differs from the sign of the second value determined from the product of the previous input signal to the representative tap selected last time and the previous error signal.

In the above echo canceller, the first value may be the product of the error signal and the input signal to each coefficient tap, and the second value may be the product of the error signal and the input signal to the representative tap.

In the above echo canceller, the adaptive algorithm may be a block-execution adaptive algorithm, in which case the first value may be the cumulative sum over a time block of the product of the error signal and the input signal to each coefficient tap, and the second value may be the cumulative sum over the time block of the product of the error signal and the input signal to the representative tap.

In the above echo canceller, the coefficient tap with the largest absolute first value among the coefficient taps within the search range may be selected as the representative tap.

In the above echo canceller, the search range may be all coefficient taps of the FIR filter.

In the above echo canceller, the search range may be a subset of the coefficient taps of the FIR filter.

In the above echo canceller, the search range may be the consecutive coefficient taps from the first stage to an intermediate stage of the FIR filter.

In the above echo canceller, when the adaptive algorithm judges that a double-talk state has occurred, it may reduce the update amount of the FIR filter coefficients or stop updating them.

In the above echo canceller, the adaptive algorithm may judge that an echo-path-change state has occurred when a double-talk state has not been detected for a predetermined number of consecutive judgments.

In the above echo canceller, when the adaptive algorithm judges that an echo-path-change state has occurred, it may increase the update amount of the FIR filter coefficients.
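The three behaviours just described (reduce or stop the update during double talk, enlarge it after an echo path change) can be pictured as a small step-size policy. The Python sketch below is illustrative only: the state names, the function `adjust_step_size`, and the scaling factors 0.1 and 4.0 are assumptions and do not come from this application.

```python
from enum import Enum


class State(Enum):
    NORMAL = 0
    DOUBLE_TALK = 1
    ECHO_PATH_CHANGE = 2


def adjust_step_size(base_step: float, state: State,
                     freeze_on_double_talk: bool = True,
                     boost: float = 4.0) -> float:
    """Return the step size to use for the next coefficient update.

    The factors 0.1 and `boost` are hypothetical placeholders.
    """
    if state is State.DOUBLE_TALK:
        # Either stop updating entirely or update with a reduced amount.
        return 0.0 if freeze_on_double_talk else base_step * 0.1
    if state is State.ECHO_PATH_CHANGE:
        # Enlarge the update so that cancellation recovers quickly.
        return base_step * boost
    return base_step
```

Freezing versus shrinking the update is exposed as a flag because the text allows either choice.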

According to the echo canceller of the present application, double talk and echo path changes are distinguished with little delay, and stable operation of the system is ensured.

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1A shows the schematic configuration of the echo canceller according to this embodiment as applied to a communication device, and FIG. 1B is a block diagram showing the functions of the echo canceller according to this embodiment.

Referring to FIG. 1A, the communication device has a microphone 200 and a speaker 201. The echo canceller consists mainly of an FIR filter 212a with updatable coefficients, a coefficient update unit 212b, and a subtractor 211. The FIR filter 212a and the coefficient update unit 212b together form the adaptive filter 212.

The signal x_j received by the communication device (the received signal) is supplied to the speaker 201. Sound radiated from the speaker 201 leaks into the microphone 200. The output signal y_j of the microphone 200 therefore contains the echo signal produced by this leakage.

The signal x_j is also input to the FIR filter 212a. The FIR filter 212a has I coefficient taps, whose coefficients are H(1), H(2), H(3), ..., H(I). The FIR filter 212a outputs the signal u_j.

The signals y_j and u_j are input to the subtractor 211, which subtracts u_j from y_j. The result, the signal e_j, is sent from the communication device as the transmitted signal.

The signal e_j is input to the coefficient update unit 212b as the error signal, together with the signal x_j. An adaptive algorithm (coefficient update algorithm) runs in the coefficient update unit 212b. Based on x_j and e_j, it updates the coefficients H(1), H(2), H(3), ..., H(I) of the coefficient taps of the FIR filter so that e_j is minimized.
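The signal flow of FIG. 1A (FIR filter output u_j, subtractor output e_j = y_j − u_j) can be sketched in plain Python as follows. This is a minimal illustration, not the implementation of this application: it assumes a conventional tapped delay line in which tap m sees the input delayed by m − 1 samples, and the function names are hypothetical. The coefficient update itself is deliberately left out.

```python
from typing import List, Sequence


def tap_inputs(x: Sequence[float], j: int, num_taps: int) -> List[float]:
    """Inputs x_j(1..I) seen by the taps at time j.

    Assumption: tap m holds the sample delayed by m - 1, and samples
    before the start of the signal are zero.
    """
    return [x[j - m] if j - m >= 0 else 0.0 for m in range(num_taps)]


def cancel_echo(h: Sequence[float], x: Sequence[float],
                y: Sequence[float]) -> List[float]:
    """Return the error signal e_j = y_j - u_j for fixed coefficients h."""
    e = []
    for j in range(len(y)):
        taps = tap_inputs(x, j, len(h))
        u_j = sum(hm * xm for hm, xm in zip(h, taps))  # FIR filter output
        e.append(y[j] - u_j)                           # subtractor output
    return e
```

When the coefficients equal the echo path exactly, the error signal is zero, which is the state the coefficient update unit tries to reach.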

This embodiment also uses the block-implemented learning identification method as the coefficient update algorithm, and it handles echo path changes that raise the estimation error to −20 dB or less with the step-size control described above. For echo path changes that raise the estimation error above −20 dB, this embodiment distinguishes them from the occurrence of double talk and applies a large step size so that the echo cancellation amount is secured quickly.

To simplify the explanation, equation (14) is rewritten as an equation computed for each element m = 1, ..., I of the adaptive-filter coefficient H_n:

This is equation (20). The numerator of the second term on its right-hand side is the cumulative sum, over the time block, of the product of the error signal e_j and the input signal x_j(m) to the m-th coefficient tap of the FIR filter 212a. This embodiment focuses on this numerator,

Let the estimation error in the n-th block be

and suppose that A_n(m) given by equation (21) takes its maximum at m = k. The k-th coefficient tap is then selected as the representative tap. The maximum A_n(k) can be written as

and, for an echo path change, a distribution with mean

is obtained. For double talk, moreover,

holds approximately, so A_n(k) is then distributed around zero. With A_n(k) as the parameter, double talk and echo path changes can therefore be expected to be distinguishable using Δ_n(k) J σ_x² / 2 as the threshold. For a reason given later, however, the search range for A_n(k) is k = 1 to I/16. The search range may be all I coefficient taps of the FIR filter or only some of them; here a subset is used, namely the contiguous coefficient taps from the first stage (stage 1) to an intermediate stage (stage I/16) of the FIR filter.
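As a hedged illustration of the selection step, the sketch below reads A_n(m) as the sum over one block of e_j · x_j(m), which is how the text describes the numerator, and picks the tap in the search range 1 to I/16 whose A_n(m) has the largest absolute value (the absolute-value criterion is the one stated in the summary of the invention). The function names and the list-of-lists layout of the tap inputs are assumptions.

```python
from typing import List, Sequence, Tuple


def block_correlations(e_block: Sequence[float],
                       x_taps_block: Sequence[Sequence[float]],
                       search_len: int) -> List[float]:
    """A_n(m) for m = 1..search_len: block sum of e_j * x_j(m).

    x_taps_block[j][m - 1] is the input to tap m at sample j of the block.
    """
    a = [0.0] * search_len
    for e_j, taps in zip(e_block, x_taps_block):
        for m in range(search_len):
            a[m] += e_j * taps[m]
    return a


def select_representative_tap(a: Sequence[float]) -> Tuple[int, float]:
    """Return (k, A_n(k)) for the 1-based tap k with the largest |A_n(m)|."""
    k0 = max(range(len(a)), key=lambda m: abs(a[m]))
    return k0 + 1, a[k0]


# Example: a filter with I = 512 taps searches only the first I / 16 = 32.
NUM_TAPS = 512
SEARCH_LEN = NUM_TAPS // 16
```

Restricting `search_len` to I/16 also cuts the per-block cost of the search by the same factor.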

Because A_n(k) is a function of the power of the far-end speech, FIG. 5 expresses it after division by

as the distribution of

(maxA_n in the figure is A_n(k) in the text). Clearly, the distribution of a_n(k) has mean Δ_n(k) for an echo path change and mean zero for double talk. As in FIG. 3, however, the distributions in FIG. 5 overlap. This shows that, even with a_n(k), identification by simply splitting the distribution in two is difficult. The distributions in FIG. 5 were computed with block-length control applied, extending the block length J until

is satisfied and then computing a_n(k). In addition to the conditions of FIG. 3, the simulation of FIG. 5 uses the following settings:
(11) Average power of stationary ambient noise: Q_0 = 0.001
(12) Minimum block length: J_0 = 128
(13) Target estimation error: −30 dB, that is, C_0 = 0.001
(14) a = 10.0
(15) ρ_0 = 12.8
(16) P_0 = 64I

For the method that uses a_n(k) as the identification parameter, the simplest way to reduce its variance is to lengthen the block length J. That, however, increases the identification delay. Lengthening the block also slows convergence in the block-implemented learning identification method when the reference signal is speech with strong autocorrelation. The method using A_n(k) therefore also needs a measure that reduces the overlap without lengthening the block length J.

Note that, in double talk, a_n(k) becomes large and overlaps the echo-path-change distribution when the variance of A_n(k) becomes large, and that this variance becomes large when the power of the near-end speech becomes large. In that case, equation (15) gives

so Q_n becomes large at the same time. For an echo path change, on the other hand, Q_n becomes large when a large echo path change occurs, and A_n(k) then becomes large as well. In other words, the double-talk distribution in FIG. 5 widens to the right when Q_n becomes large, and at that time the mean of the echo-path-change distribution also increases, so that distribution moves to the right. If the echo-path-change distribution is shifted to the right in accordance with Q_n, the overlap that arises when strong double talk widens the distribution can therefore be expected to shrink. This is equivalent to adding Q_n as a parameter and expressing the distribution of A_n(k) as a two-dimensional distribution.

FIG. 6 shows the distribution obtained for an echo path change, with JQ_n on the horizontal axis and A_n(k) on the vertical axis. The echo path change here is one that brings the estimation error to 0 dB. In this case, the expected value of JQ_n on the horizontal axis follows from equation (15) as

This shows that JQ_n is proportional to the sum of two terms: the product of the mean-square estimation error σ_d² and the average power σ_x² of the far-end speech, and the average power σ_n² of the near-end speech. With JQ_n on the horizontal axis, when the increase in e_j is caused by an echo path change alone, the approximation

holds, so

is distributed, on average, along a straight line with slope

as can also be confirmed from the distribution in FIG. 6. Because the sign of A_n(k) carries no meaning in FIG. 6, A_n(k) is plotted there as an absolute value. The conditions under which this distribution was computed are the same as for FIG. 5; only the presentation differs.

The numerator on the right-hand side of equation (29) corresponds to the cumulative sum, over the time block, of the product of the error signal e_j and the input signal x_j to the representative tap. The denominator on the right-hand side of equation (29) corresponds to the cumulative sum over the time block of the squared error signal e_j, that is, of the power of e_j (see equation (15)).
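Read this way, the slope r_n is a ratio of two block sums. The sketch below is one hedged reading of that description: it takes the absolute value of the numerator (FIG. 6 plots |A_n(k)|) and returns 0.0 for an all-zero error block. Both choices, and the function name `slope_r`, are assumptions.

```python
from typing import Sequence


def slope_r(a_k: float, e_block: Sequence[float]) -> float:
    """r_n = |A_n(k)| / sum of e_j^2 over the block.

    a_k is the block sum of e_j * x_j(k) at the representative tap k.
    """
    energy = sum(e * e for e in e_block)  # block sum of error power
    if energy == 0.0:
        return 0.0  # assumed convention for a silent block
    return abs(a_k) / energy
```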

The effect of JQ_n on the distribution in FIG. 6 is now examined in detail. First, suppose P_0 and the maximum step size ρ_0 are set so that the block length takes its minimum value J_0 when the reference-signal power is at its maximum and the specified estimation error is secured. Then P_n never exceeds P_0, and the coefficients are updated with P_n = P_0 every time. For an echo path change that brings the estimation error to 0 dB, σ_d² = 1 (the power gain of the unknown system is 1 in this simulation), so JQ_n is concentrated near

If the minimum block length is chosen in this way, however, the block length becomes J_0 or longer more often, and coefficient convergence is delayed when speech with strong autocorrelation is used as the reference signal. When the reference signal has strong autocorrelation, it is better to set the maximum step size ρ_0 small, so that the specified estimation error is secured even with a small P_0, that is, with short blocks, and thereby to raise the update frequency. The simulation of FIG. 6 takes this into account: ρ_0 is chosen small so that the coefficients are updated frequently, with P_0 = 64I as given in simulation condition (16). In this case, when the reference-signal power is at its minimum,

takes its minimum value of 64, and JQ_n > 64 when the reference-signal power is larger. FIG. 7 shows how P_n varied during the computation of the distribution in FIG. 6. Clearly, the minimum of P_n in FIG. 7 is P_0, and P_n seldom equals P_0. The distribution in FIG. 6 reflects this result.

When double talk occurs alone, on the other hand, equation (21) becomes

so A_n(k) is distributed around zero. Its variance is proportional to the power of the near-end speech, so the distribution widens as JQ_n increases, as shown in FIG. 8. The distributions in FIG. 8 and FIG. 6 show that strong double talk, in which the near-end speech power is large, is well separated. Compared with FIG. 5, where the distributions overlap heavily under strong double talk, the two-dimensional representation separates them and removes that drawback. FIG. 9 shows FIG. 8 and FIG. 6 superimposed.

FIG. 6 shows that no event with r_n < 0.6 occurs for an echo path change, so a block with r_n < 0.6 can be judged as double talk. An echo path change with r_n < 0.6 cannot be ruled out entirely. Even if such a block is judged as double talk, however, the next block is very likely to be judged as an echo path change, so this causes no problem. The problem is that, as FIG. 8 shows, r_n ≥ 0.6 can also occur in double talk, so double talk may be mistaken for an echo path change when r_n = 0.6 is used as the threshold. When this error occurs, the step size is set large and the estimation error rises sharply.

FIG. 10 shows, for the distribution in FIG. 9, the computed relationship between JQ_n and the probability that double talk gives r_n ≥ 0.6 and is judged as an echo path change. Clearly, the probability of this misjudgment is higher the smaller JQ_n is and lower the larger JQ_n is, as can also be anticipated from FIG. 8. The opposite misjudgment, taking an echo path change for double talk, does not occur. In other words, this identification method reduces identification errors for strong double talk, which is hard to separate from echo path changes in the one-dimensional distribution of FIG. 5.

In this result, moreover, more than 80% of double-talk events with JQ_n = 0 to 50 are judged as echo path changes. The misjudgment rate is this high because, under the present simulation condition of an echo path change that brings the estimation error to 0 dB, the average minimum of JQ_n is 64. JQ_n < 50 is therefore very unlikely for an echo path change, as FIG. 6 also shows; that is, almost all events with JQ_n < 50 are double talk. In any case, the simple method that uses r_n = 0.6 as the decision threshold is likely to mistake double talk for an echo path change.

As a measure to reduce this error, this embodiment first takes into account that room reverberation is concentrated in its early part (Shoji Makino and Nobuo Koizumi, "Improvement of adaptive characteristics of echo canceller in room sound field," IEICE Transactions (A) (Japanese edition), vol. J71-A, no. 12, pp. 2212-2214 (1988-12)) and sets the search range for A_n(k) to k = 1 to I/16. This also reduces the amount of search processing. For an echo path change, the maximum is likely to fall within this range, whereas for double talk it appears with equal probability over the whole range. Restricting the search to this range therefore makes A_n(k) more likely to be small in double talk and reduces the probability that r_n > 0.6. FIG. 11 shows the relationship between JQ_n and the probability that double talk is judged as an echo path change when the search range is narrowed in this way. The probability is lower than in FIG. 10.

Next, attention is paid to the sign change of A_n(k). For an echo path change, A_n(k) is distributed with mean

so, for an echo path change, A_n(k) is likely to have the same sign in adjacent blocks. For double talk, conversely, as seen from the result of applying

to equation (23), the correlation between adjacent blocks is low and the signs in consecutive blocks are likely to differ. Differing signs can therefore be judged as double talk.
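The sign test amounts to comparing the sign of the block sum at one fixed tap across two consecutive blocks. In the minimal sketch below, the tap used is the representative tap k_{n-1} chosen in the previous block, which is the variant the embodiment evaluates in FIG. 12; treating a zero value as "no sign change" is an assumption, as is the function name.

```python
from typing import Sequence


def sign_flipped(a_prev: Sequence[float], a_curr: Sequence[float],
                 k_prev: int) -> bool:
    """True if A_{n-1}(k_prev) and A_n(k_prev) have opposite signs.

    k_prev is the 1-based representative tap selected in the previous block.
    A product of exactly zero is treated as "no flip" (assumed convention).
    """
    return a_prev[k_prev - 1] * a_curr[k_prev - 1] < 0.0
```

A flip suggests double talk, because the block sums are only weakly correlated from block to block in that state.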

The above describes an example in which double talk is judged by whether the sign of A_n(k) for the representative tap selected in the current time block differs between the current and previous blocks. The judgment may instead be based on whether the sign of A_n(k) for the representative tap selected in the previous time block differs between the current and previous blocks, or on the same comparison for a representative tap selected earlier still. Double talk may also be judged by whether the signs of A_n(k) in two consecutive past blocks differ, for a representative tap selected in the current block or earlier.

The above also describes an example in which, when r_n is 0.6 or more, double talk is judged by whether the signs of A_n(k) in two consecutive blocks differ. Double talk may instead be judged solely by that sign comparison, regardless of the value of r_n.

As explained above, double talk can be assumed when the signs of A_n(k) differ in consecutive blocks. Even for an echo path change, however, differing signs cannot be ruled out because of the second term of equation (23). Even if an echo path change is mistaken for double talk as a result, the next block is very likely to be judged as an echo path change. FIG. 12 shows, for echo path changes and for double talk, whether the tap k_{n-1} that gave the maximum A_{n-1}(k_{n-1}) in the previous block gives A_n(k_{n-1}) the same sign or a different sign in the current block. The top panel of FIG. 12 is an echo path change that brings the estimation error to 0 dB, the middle panel is one that brings it to −10 dB, and the bottom panel is double talk; 1 denotes matching signs and −1 denotes differing signs. In FIG. 12, the probability that the signs differ is 1.5% for the 0 dB echo path change and 5% for the −10 dB echo path change.
As these results show, two consecutive blocks with differing signs are unlikely during an echo path change; in most cases the next block has the same sign and is judged as an echo path change. If an echo path change is mistaken for double talk, the coefficient update for double talk is carried out; provided the next block is correctly judged as an echo path change, the echo-path-change processing is merely delayed by one block, which causes no serious problem.

FIG. 12 also shows that the probability of differing signs rises as the increase in estimation error becomes smaller. One option, therefore, is not to use the sign change for echo path changes whose increase in estimation error is less than −10 dB. Whether the increase is less than −10 dB can be determined from the step size. That is, applying

to equation (19) and substituting C_0 = 0.001 and a = 10.0 gives

This is equation (32). Because an echo path change that brings the estimation error to −10 dB is being considered, σ_d² = 0.1; Q_n is dominated by the residual echo, so σ_n² ≪ σ_x² σ_d²; and the far-end speech power σ_x² is given as converted to the minimum block length J_0 = 128. Equation (32) shows that the step size ρ exceeds 0.256 when the estimation error is less than −10 dB. If the sign change is to be used for echo path changes with an estimation error of −10 dB or more, the sign test is therefore performed when ρ ≤ 0.256. That is:
(1) Search for A_n(k) only over k = 1 to I/16, and judge double talk if r_n < 0.6.
(2) If ρ ≤ 0.256, perform the sign test, and judge double talk if A_n(k_{n-1}) in the current block has a different sign at the tap k_{n-1} that gave the maximum in the previous block.
(3) If ρ > 0.256, do not perform the sign test.
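Rules (1) to (3) combine into a single per-block decision. The sketch below is a hedged illustration: the thresholds 0.6 and 0.256 are the values given in the text, while the function name, argument layout, and the handling of a zero product are assumptions.

```python
R_THRESHOLD = 0.6        # slope threshold from rule (1)
RHO_SIGN_LIMIT = 0.256   # step-size limit from rules (2) and (3)


def is_double_talk(r_n: float, rho: float,
                   a_prev_at_kprev: float, a_curr_at_kprev: float) -> bool:
    """Per-block double-talk decision following rules (1)-(3).

    a_prev_at_kprev: A_{n-1}(k_{n-1}), the previous block's maximum.
    a_curr_at_kprev: A_n(k_{n-1}), the current block sum at the same tap.
    """
    if r_n < R_THRESHOLD:
        return True                                     # rule (1)
    if rho <= RHO_SIGN_LIMIT:
        return a_prev_at_kprev * a_curr_at_kprev < 0.0  # rule (2)
    return False                                        # rule (3): no sign test
```

Checking the slope first means the sign test only ever runs on blocks that already look like an echo path change by rule (1).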

FIG. 13 shows the probability of error (judging double talk as an echo path change) when the two methods above are used together. The probability is reduced further.

Next, consider observing r_n over K consecutive blocks. That is:
(4) Repeat the above procedure K times, and judge an echo path change if none of the K blocks is judged as double talk.
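Rule (4) is a run-length counter over the per-block decisions. A minimal sketch, assuming a hypothetical class name and assuming that the detector keeps reporting an echo path change until the next double-talk block resets it:

```python
class EchoPathChangeDetector:
    """Report an echo path change after K consecutive non-double-talk blocks."""

    def __init__(self, k_required: int) -> None:
        self.k_required = k_required
        self.run = 0  # current run of blocks not judged as double talk

    def update(self, double_talk: bool) -> bool:
        """Feed one per-block decision; return True once the run reaches K."""
        if double_talk:
            self.run = 0
            return False
        self.run += 1
        return self.run >= self.k_required
```

For example, with K = 3 the sequence of decisions no, no, yes, no, no, no triggers only on the last block, because the single double-talk block restarts the count.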

The probability of misjudging double talk as an echo path change from a single block is as shown in FIG. 13. If this probability is p, the probability that the error occurs K times in succession is p^K. Therefore, to make that probability p_0, the number K of consecutive blocks to observe must satisfy

from which

is obtained.
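The relation can be worked through numerically. Requiring p^K ≤ p_0 and taking logarithms gives K ≥ log p_0 / log p (the inequality flips because log p is negative for 0 < p < 1). The sketch below assumes that K is rounded up to the next integer; the rounding convention used for FIG. 14 is not stated in the text, and the function name is hypothetical.

```python
import math


def blocks_needed(p: float, p0: float) -> int:
    """Smallest integer K with p**K <= p0, for 0 < p < 1 and 0 < p0 < 1."""
    if not (0.0 < p < 1.0 and 0.0 < p0 < 1.0):
        raise ValueError("p and p0 must lie strictly between 0 and 1")
    return math.ceil(math.log(p0) / math.log(p))
```

With a per-block error probability of 0.15 and a target of 10^-4, this yields five blocks; a per-block probability of 0.5 would need fourteen.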

FIG. 14 shows the number of blocks K, calculated from this relation, that is needed for an error probability of p_0 = 10^-4. The result shows that, for JQ_n ≥ 50, an echo path change is detected within five consecutive blocks at an error rate of 10^-4. The detection delay is 128 × 5 / 8 = 80 ms. FIG. 14 also shows that JQ_n < 50 requires twice as many observations, namely 10 consecutive blocks. As already explained, however, JQ_n < 50 arises when the reference-signal power is at its minimum, and FIG. 7 shows that this minimum power occurs very rarely. JQ_n < 50 therefore seldom occurs many times in succession; in practice JQ_n > 50 usually follows, and detection is likely with fewer consecutive blocks. Moreover, even if an echo path change were detected while the reference-signal power is at its minimum, the estimation error would decrease only slightly, and the echo sent to the far end would also have low power, so there is little need for fast detection. In this case the decision delay is large, but processing the event as double talk causes little harm.
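The delay figure follows from the block length and the number of blocks. The helper below reproduces the arithmetic; the 8 kHz sampling rate is an inference from the divisor 8 (samples per millisecond) in the expression 128 × 5 / 8 and is not stated explicitly in the text.

```python
def detection_delay_ms(block_len: int, num_blocks: int,
                       sample_rate_hz: int = 8000) -> float:
    """Delay in milliseconds for num_blocks blocks of block_len samples.

    sample_rate_hz defaults to 8000, an assumed value (see lead-in).
    """
    return 1000.0 * block_len * num_blocks / sample_rate_hz
```

With the minimum block length J_0 = 128, five blocks correspond to 80 ms and ten blocks to 160 ms; longer blocks under block-length control would lengthen the delay proportionally.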

Next, FIG. 15 and FIG. 16 show the distributions of A_n(k) for echo path changes that bring the estimation error to −10 dB and −20 dB. For echo path changes of this size, equations (24) and (29) show that the numerator of the slope r_n is given by the first power of the estimation error and the denominator by its square, so r_n is larger the smaller the estimation error of the echo path change. For example, the echo path change in FIG. 15, which raises the estimation error to −10 dB, lies at r_n > 1.8, and the −20 dB case in FIG. 16 lies at r_n > 5.33. In no case, therefore, does an echo path change give r_n < 0.6. A block with r_n < 0.6 can thus be judged as double talk regardless of the size of the echo path change.

The problem is that the double-talk distribution does not depend on the size of the echo path change. Moreover, JQ_n for echo path changes that bring the estimation error to −10 dB and −20 dB is concentrated at 50 or below, so with a decision threshold of 0.6 the number of blocks needed for a decision is 10. In a real system, echo path changes of this size are expected to occur frequently, so a way of reducing the decision delay in this case is needed.

Note that, for an echo path change with an estimation error of −10 dB, σ_d² becomes 1/10, so JQ_n < 200 in almost all cases, as FIG. 15 confirms. Furthermore, for JQ_n ≥ 200, three consecutive blocks suffice even with the threshold at 0.6. Measures to reduce the number of blocks needed when an echo path change with an estimation error of −10 dB or less occurs therefore need only be considered for JQ_n < 200. An echo path change with an estimation error of 0 dB may of course also occur in this range, but the measures above are sufficient for it and no additional measure is needed.

To find such a measure, we calculate the number of consecutive blocks required when the threshold is set to r_n = 1.8 for an echo path variation with an estimation error of −10 dB. FIG. 17 shows the result. According to this result, the number of consecutive blocks required for 100 ≤ JQ_n is zero. This means that the probability of double talk giving r_n > 1.8 with 100 ≤ JQ_n < 200 is 10^-4 or less (note that double talk does not depend on the magnitude of the echo path variation). FIG. 8 also confirms this. That is,
(5) If JQ_n ≥ 100 when r_n > 1.8 is calculated, it can immediately be determined to be double talk.

Similarly, from FIG. 17:
(6) When r_n > 1.8 and 50 ≤ JQ_n < 100, the number of consecutive blocks required is 3.
(7) When r_n > 1.8 and 10 ≤ JQ_n < 50, the number of consecutive blocks required is 5.
(8) When r_n > 1.8 and JQ_n < 10, the number of consecutive blocks required is 8.
With these counts the two states can be distinguished. That is, when JQ_n < 200 and r_n exceeds 1.8 in the current block, raising the threshold to 1.8 reduces the number of consecutive blocks needed for the decision.
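Items (5) to (8) amount to a lookup from JQ_n to a required count of consecutive blocks, valid when r_n > 1.8. A minimal sketch follows; the function name `required_consecutive_blocks` is hypothetical, and the values are the ones quoted from FIG. 17 above.

```python
def required_consecutive_blocks(jq_n: float) -> int:
    """Consecutive blocks with r_n > 1.8 required, per items (5) to (8).

    Only meaningful for blocks in which r_n > 1.8 was calculated.
    """
    if jq_n >= 100:
        return 0  # item (5): the decision is immediate
    if jq_n >= 50:
        return 3  # item (6)
    if jq_n >= 10:
        return 5  # item (7)
    return 8      # item (8)


for jq in (150.0, 75.0, 20.0, 5.0):
    print(jq, required_consecutive_blocks(jq))
```

The smaller JQ_n is, the more consecutive blocks are required before the decision is accepted.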

Similarly, when JQ_n < 20, FIG. 16 shows that the threshold can be raised to 5.33. However, for an echo path variation with an estimation error of −20 dB, step-size control makes it unnecessary to distinguish double talk from echo path variation, so in practice the threshold need not be changed.

FIG. 22A and FIG. 22B show the control flow used when the present invention is applied to a practical device. Here, k₁ to k₆ are parameters that count consecutive blocks; they are set to 0 when r_n < 0.6 and at the start of control. A_n(m) is the same as A_k in the text, and m in this control flow is the same as k in the text. A point to note in this control flow is that, when consecutive blocks are counted, the case r_n ≤ 1.8 with 200 ≤ JQ_n < 700, for example, includes the case r_n ≤ 1.8 with JQ_n < 200. Thus, if r_n ≤ 1.8 with 200 ≤ JQ_n < 700 occurs first and r_n ≤ 1.8 with JQ_n < 200 occurs next, the consecutive block count k₂ is 2. In addition, JQ_n < 10 with r_n > 5.33 indicates that the estimation error has fallen to −20 dB or below, after which echo path variation and double talk no longer need to be distinguished for control. The flow therefore includes control that sets the counting parameters to 0.
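The inclusion rule for the consecutive-block counters can be sketched as follows. This is not the flow of FIG. 22A and FIG. 22B, which this section does not reproduce; it is a simplified illustration with one hypothetical function, `consecutive_blocks`, that counts the current run for a class defined by an upper bound on JQ_n. The reset at r_n < 0.6 comes from the text. Treating a block with JQ_n at or above the bound as ending the run is an assumption, and the mapping of k₁ to k₆ onto particular bounds is not specified here.

```python
from typing import Iterable, Tuple

R_RESET = 0.6  # counters return to 0 when r_n < 0.6 (from the text)


def consecutive_blocks(blocks: Iterable[Tuple[float, float]],
                       jq_upper: float) -> int:
    """Length of the current run of blocks counted for the class JQ_n < jq_upper.

    Each block is a pair (r_n, JQ_n). A block with a smaller JQ_n also counts
    toward a class with a larger upper bound (the inclusion rule in the text).
    Assumption: a block with JQ_n >= jq_upper ends the run.
    """
    count = 0
    for r_n, jq_n in blocks:
        if r_n < R_RESET or jq_n >= jq_upper:
            count = 0
        else:
            count += 1
    return count


# First a block with 200 <= JQ_n < 700, then a block with JQ_n < 200.
history = [(1.0, 500.0), (1.2, 150.0)]
print(consecutive_blocks(history, jq_upper=700.0))  # both blocks are counted
print(consecutive_blocks(history, jq_upper=200.0))  # only the second block
```

With the bound at 700 the two blocks form a run of length 2, matching the example in the text in which the count becomes 2.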

FIG. 18 shows the ERLE reduction characteristic obtained when the present invention detects the occurrence of an echo path variation that gives an estimation error of 0 dB. The result shows that, with the present invention, the echo decreases rapidly after the echo path variation occurs. FIG. 19 shows the result for the ideal case in which the echo path variation is detected with zero delay. The comparison confirms the effectiveness of the present invention. FIG. 20 shows how the estimation error decreases. The broken line is the result for ideal detection with zero detection delay. Although the effect of the 75 ms detection delay of the present invention remains, the estimation error decreases at the same rate as in the ideal case. FIG. 21 shows the ERLE that the present invention gives during double talk. ERLE is maintained stably during double talk.

As described above, the present invention distinguishes double talk from echo path variation with little delay and ensures stable operation of the system. The description above used a block execution type learning identification method as the adaptive algorithm. Note, however, that the present invention is also applicable to adaptive algorithms other than the block execution type learning identification method.

For example, a learning identification method that is not of the block execution type can be used, and so can an adaptive algorithm other than the learning identification method. A block execution type adaptive algorithm other than the learning identification method can also be used. As the time block of a block execution type adaptive algorithm is shortened, the algorithm eventually becomes identical to an adaptive algorithm that is not of the block execution type. The effects of the present invention described above can therefore be expected to arise in adaptive algorithms that are not of the block execution type as well.
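The limiting argument, that a block execution type algorithm with an ever shorter block becomes a sample-by-sample algorithm, can be checked numerically. The sketch below uses NLMS, the usual English name for the learning identification method. The block formulation shown (one update per block, with error-input products and input power accumulated over the block) is an assumption for illustration and may differ from the formulation in the description; the function names, step size and test signal are hypothetical.

```python
import random
from typing import List


def block_nlms(x: List[float], d: List[float], taps: int, block_len: int,
               mu: float = 0.5, eps: float = 1e-12) -> List[float]:
    """Block-wise NLMS sketch: coefficients change only at block boundaries."""
    h = [0.0] * taps
    grad = [0.0] * taps  # running sum of e_j * x_j(i) over the block
    power = 0.0          # running sum of the input vector power over the block
    for j in range(len(x)):
        # Input vector [x[j], x[j-1], ...], zero before the start of the signal.
        xv = [x[j - i] if j - i >= 0 else 0.0 for i in range(taps)]
        e = d[j] - sum(hi * xi for hi, xi in zip(h, xv))
        for i in range(taps):
            grad[i] += e * xv[i]
        power += sum(xi * xi for xi in xv)
        if (j + 1) % block_len == 0:
            h = [hi + mu * gi / (power + eps) for hi, gi in zip(h, grad)]
            grad = [0.0] * taps
            power = 0.0
    return h


def sample_nlms(x: List[float], d: List[float], taps: int,
                mu: float = 0.5, eps: float = 1e-12) -> List[float]:
    """Ordinary NLMS: coefficients are updated at every sample."""
    h = [0.0] * taps
    for j in range(len(x)):
        xv = [x[j - i] if j - i >= 0 else 0.0 for i in range(taps)]
        e = d[j] - sum(hi * xi for hi, xi in zip(h, xv))
        norm = sum(xi * xi for xi in xv) + eps
        h = [hi + mu * (e * xi) / norm for hi, xi in zip(h, xv)]
    return h


random.seed(0)
h_true = [0.5, -0.3, 0.1]  # hypothetical echo path
x = [random.uniform(-1.0, 1.0) for _ in range(400)]
d = [sum(h_true[i] * x[j - i] for i in range(3) if j - i >= 0)
     for j in range(len(x))]

print(block_nlms(x, d, taps=3, block_len=1))
print(sample_nlms(x, d, taps=3))
```

With `block_len=1` the block version performs the same update as the sample-by-sample version at every step, so the two coefficient vectors agree.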

The echo canceller of the present application can distinguish double talk from echo path variation with little delay, and can therefore be used in electroacoustic applications such as hands-free telephone devices.

A diagram showing the schematic configuration of the present invention (echo canceller) applied to a telephone device.
A block diagram showing the functions of the present invention (echo canceller).
A block diagram of a hands-free telephone device using an acoustic echo canceller.
A frequency distribution of the identification parameter R_Ey(n).
A diagram showing the speech used in the simulation.
A frequency distribution of a_n(k).
A distribution of A_k obtained for an echo path variation that raises the estimation error to 0 dB.
A diagram showing the change in P_n.
A distribution of A_k during double talk.
A distribution of A_k for an echo path variation that raises the estimation error to 0 dB and for double talk.
A diagram showing the probability of mistaking double talk for echo path variation when r_n = 0.6 is used as the identification threshold.
A diagram showing the error probability when the search range is 1 to I/16.
A diagram showing how often the signs of A_n(k) in adjacent blocks match or differ.
A diagram showing the probability of mistaking double talk for echo path variation when sign determination is used in combination.
A diagram showing the number of consecutive blocks required for the probability of mistaking double talk for echo path variation to be 10^-4 or less.
A distribution of A_n(k) for an echo path variation that gives an estimation error of −10 dB and for double talk.
A distribution of A_n(k) for an echo path variation that gives an estimation error of −20 dB and for double talk.
A diagram showing the number of blocks required to detect an echo path variation that gives an estimation error of −10 dB.
An ERLE characteristic obtained when echo path variation is detected by the present invention.
An ERLE characteristic when echo path variation is detected with zero delay.
A comparison of the estimation error reduction characteristics.
An ERLE characteristic given by the present invention during double talk.
A control flowchart for applying the present invention to a practical device.
A control flowchart for applying the present invention to a practical device.

Explanation of symbols

200 Microphone
201 Speaker
212 Adaptive filter

Claims

An echo canceller that removes, from the output signal of a microphone, an echo signal produced when a received signal supplied to a speaker travels from the speaker to the microphone, the echo canceller comprising:
a coefficient-updatable FIR filter that generates a pseudo echo signal based on the received signal;
a subtractor that generates an error signal by subtracting the pseudo echo signal generated by the FIR filter from the output signal of the microphone; and
a coefficient updating unit that updates the coefficients of the FIR filter by an adaptive algorithm, based on the received signal and the error signal, so that the error signal is minimized,
wherein the adaptive algorithm calculates the product of the error signal and the input signal to each coefficient tap within a predetermined search range of the FIR filter, selects a representative tap from the coefficient taps within the search range based on a first value determined from the product, and
determines whether a double-talk state has occurred based on a second value determined from the product of the error signal and the input signal to the representative tap and a third value determined from the power value of the error signal.

The echo canceller according to claim 1, wherein
the first value is the product of the error signal and the input signal to each coefficient tap,
the second value is the product of the error signal and the input signal to the representative tap, and
the third value is the power value of the error signal.

The echo canceller according to claim 1, wherein
the adaptive algorithm is a block execution type adaptive algorithm,
the first value is the cumulative sum, over a time block, of the product of the error signal and the input signal to each coefficient tap,
the second value is the cumulative sum, over a time block, of the product of the error signal and the input signal to the representative tap, and
the third value is the cumulative sum, over a time block, of the power value of the error signal.

The echo canceller according to any one of claims 1 to 3, wherein a double-talk state is determined to have occurred when the absolute value of the ratio of the second value to the third value is less than a predetermined threshold.

The echo canceller according to claim 4, wherein a double-talk state is determined to have occurred when the absolute value of the ratio of the second value to the third value is greater than or equal to the predetermined threshold, and
the signs of the second values for two consecutive times, for a representative tap selected this time or earlier, differ.

The echo canceller according to claim 4, wherein a double-talk state is determined to have occurred when the absolute value of the ratio of the second value to the third value is greater than or equal to the predetermined threshold, and
the sign of the second value determined from the product of the current input signal to the representative tap selected this time and the current error signal differs from the sign of the second value determined from the product of the previous input signal to the representative tap selected this time and the previous error signal.

The echo canceller according to claim 4, wherein a double-talk state is determined to have occurred when the absolute value of the ratio of the second value to the third value is greater than or equal to the predetermined threshold, and
the sign of the second value determined from the product of the current input signal to the representative tap selected last time and the current error signal differs from the sign of the second value determined from the product of the previous input signal to the representative tap selected last time and the previous error signal.

The echo canceller according to any one of claims 4 to 7, wherein when the third value is equal to or smaller than a predetermined value, the predetermined threshold is corrected to a larger value.

An echo canceller that removes, from the output signal of a microphone, an echo signal produced when a received signal supplied to a speaker travels from the speaker to the microphone, the echo canceller comprising:
a coefficient-updatable FIR filter that generates a pseudo echo signal based on the received signal;
a subtractor that generates an error signal by subtracting the pseudo echo signal generated by the FIR filter from the output signal of the microphone; and
a coefficient updating unit that updates the coefficients of the FIR filter by an adaptive algorithm, based on the received signal and the error signal, so that the error signal is minimized,
wherein the adaptive algorithm calculates the product of the error signal and the input signal to each coefficient tap within a predetermined search range of the FIR filter, selects a representative tap from the coefficient taps within the search range based on a first value determined from the product, and
determines that a double-talk state has occurred when the signs of the second values for two consecutive times, for a representative tap selected this time or earlier, differ,
the second value being a value determined from the product of the error signal and the input signal to the representative tap.

The echo canceller according to claim 9, wherein a double-talk state is determined to have occurred when the sign of the second value determined from the product of the current input signal to the representative tap selected this time and the current error signal differs from the sign of the second value determined from the product of the previous input signal to the representative tap selected this time and the previous error signal.

The echo canceller according to claim 9, wherein a double-talk state is determined to have occurred when the sign of the second value determined from the product of the current input signal to the representative tap selected last time and the current error signal differs from the sign of the second value determined from the product of the previous input signal to the representative tap selected last time and the previous error signal.

The echo canceller according to any one of claims 9 to 11, wherein
the first value is the product of the error signal and the input signal to each coefficient tap, and
the second value is the product of the error signal and the input signal to the representative tap.

The echo canceller according to any one of claims 9 to 11, wherein
the adaptive algorithm is a block execution type adaptive algorithm,
the first value is the cumulative sum, over a time block, of the product of the error signal and the input signal to each coefficient tap, and
the second value is the cumulative sum, over a time block, of the product of the error signal and the input signal to the representative tap.

The echo canceller according to any one of claims 1 to 13, wherein, among coefficient taps within the search range, a coefficient tap having the largest absolute value of the first value is selected as a representative tap.

The echo canceller according to any one of claims 1 to 14, wherein the search range is all of the coefficient taps of the FIR filter.

The echo canceller according to any one of claims 1 to 14, wherein the search range is a part of the coefficient taps of the FIR filter.

The echo canceller according to any one of claims 1 to 14, wherein the search range is the consecutive coefficient taps from the first stage to an intermediate stage of the FIR filter.

The echo canceller described in the item, wherein the adaptive algorithm reduces the coefficient update amount of the FIR filter, or stops updating the coefficients, when it determines that a double-talk state has occurred.

The echo canceller described in the section, wherein the adaptive algorithm determines that an echo path variation state has occurred when a double-talk state has not been determined a predetermined number of consecutive times.

The echo canceller according to claim 19, wherein the adaptive algorithm increases the coefficient update amount of the FIR filter when it is determined that an echo path fluctuation state has occurred.
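
The detection steps recited in claims 1, 3, 4 and 14 can be sketched for a single time block as follows. This is an illustrative reading of the claim language, not the embodiment: the function name `detect_double_talk`, the data layout and the example signals are hypothetical, and the ratio is left unnormalized, so the threshold passed in is only a placeholder and is not necessarily comparable to the r_n thresholds in the description.

```python
from typing import Sequence, Tuple


def detect_double_talk(errors: Sequence[float],
                       tap_inputs: Sequence[Sequence[float]],
                       search_range: range,
                       threshold: float) -> Tuple[int, float, bool]:
    """Return (representative tap, |second / third|, double-talk flag).

    errors[j]        : error signal at sample j of the block
    tap_inputs[j][i] : input signal to coefficient tap i at sample j
    """
    # First value per tap: sum over the block of error * tap input (claim 3).
    first = {i: sum(e * x[i] for e, x in zip(errors, tap_inputs))
             for i in search_range}
    # Representative tap: largest absolute first value (claim 14).
    rep = max(search_range, key=lambda i: abs(first[i]))
    second = first[rep]                 # the same sum at the representative tap
    third = sum(e * e for e in errors)  # cumulative power of the error signal
    if third == 0.0:
        raise ValueError("error signal has no energy in this block")
    ratio = abs(second) / third
    # Claim 4: a ratio below the threshold is taken as double talk.
    return rep, ratio, ratio < threshold


inputs = [[1.0, -1.0, 0.5],
          [-0.5, 1.0, 1.0],
          [1.0, 1.0, -1.0],
          [0.5, -1.0, -0.5]]
# Error that follows the input at tap 1 (residual echo dominates).
echo_like = [0.5 * row[1] for row in inputs]
# Error that is only weakly related to the tap inputs.
other = [1.0, 1.0, 1.0, -1.0]

print(detect_double_talk(echo_like, inputs, range(3), threshold=0.6))
print(detect_double_talk(other, inputs, range(3), threshold=0.6))
```

In the first call the error is strongly correlated with one tap input, so the ratio is large and no double talk is flagged; in the second the correlation is weak relative to the error power, so the ratio falls below the threshold.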