JP2000252884A

JP2000252884A - Adaptive filter learning system

Info

Publication number: JP2000252884A
Application number: JP5147099A
Authority: JP
Inventors: Yuriko Tsukahara; 由利子塚原
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1999-02-26
Filing date: 1999-02-26
Publication date: 2000-09-14

Abstract

PROBLEM TO BE SOLVED: To prevent per-frequency degradation of echo cancellation in a frequency-domain echo canceller that has an adaptive filter for each frequency, by deciding independently for each frequency whether learning is permissible. SOLUTION: The frequency-domain echo canceller is provided with a double-talk detection section 17. The double-talk detection section 17 performs learning of the adaptive filter 12 for each frame on the basis of the reception signal, the transmission signal, and the residual error signal between the pseudo echo signal and the transmission signal. In particular, it decides independently for each frequency whether learning is permissible, and uses a group average for some of the reference values in that decision. This increases the opportunities for per-frequency learning and reduces erroneous learning, which improves per-frequency double-talk detection and prevents the echo cancellation amount from degrading at individual frequencies.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to an echo canceller that removes the echo component contained in a transmission signal in a voice communication device such as a digital car telephone, and more particularly to an adaptive filter learning method used in a frequency-domain echo canceller provided with an adaptive filter for each frequency.

[0002]

2. Description of the Related Art In recent years, digital signal processing (DSP) has attracted attention in the fields of computing and communications and has been applied in many areas. Because digital signal processing readily implements complex processing that was difficult with analog processing, such as arbitrary changes to characteristic constants and adaptive processing, it is used as a general-purpose technique, particularly in audio processing and image processing.

[0003] For example, in a hands-free telephone of the kind used as a car telephone, the received voice from the loudspeaker may leak into the microphone during a hands-free call and be sent back to the other party, producing an acoustic echo. To cancel the acoustic echo caused by this leakage of the received voice and maintain call quality, an echo cancelling device, generally called an echo canceller, is used.

[0004] The echo that is a problem in voice communication, that is, the feedback signal from the loudspeaker to the microphone, can be cancelled by using the correlation between the reception signal (the sound supplied to the loudspeaker) and the transmission signal (the sound picked up by the microphone). Specifically, a method that regards the feedback path between the loudspeaker and the microphone as a single FIR (finite impulse response) filter and estimates the coefficients of this filter has been put to practical use.

[0005] Practical filter coefficient estimation uses a time-sequential adaptive filter (Shigeo Tsujii, "Fundamentals of Digital Signal Processing", published by IEICE). However, because such a time-domain echo canceller requires a large amount of computation for learning of the adaptive filter, frequency-domain echo cancellers, which divide the signal into frames and perform learning of the adaptive filter once per frame, have been studied in recent years ("Unconstrained Frequency-Domain Adaptive Filter", IEEE Trans., Vol. ASSP-30, No. 5, Oct. 1982).

[0006] In outline, a frequency-domain echo canceller works as follows. The reception signal and the transmission signal are each divided into frames and transformed by an FFT (fast Fourier transform). A separate complex first-order adaptive filter is provided for each frequency component and learns once per frame, and the echo is cancelled for each frequency component. Inverse-transforming the residual components yields the signal after echo removal.

[0007] Learning of the adaptive filter cannot be performed while a signal other than the echo, namely the voice of the talker on the transmitting side, is mixed into the transmission signal. The function that determines whether learning is possible is called double-talk detection. Double talk is determined by comparing the powers of the reception signal, the transmission signal, and the echo residual component with thresholds. Double-talk detection is normally performed for each sample in a time-domain echo canceller and for each frame in a frequency-domain echo canceller.

[0008]

PROBLEMS TO BE SOLVED BY THE INVENTION Conventionally, a frequency-domain echo canceller has decided whether learning is permissible on a frame-by-frame basis. However, the adaptive filters used in the frequency-domain type are separate for each frequency, such as the low, middle, and high bands, so learning is not necessarily possible at every frequency. For example, if learning is performed when the power of a frequency component of the reception signal is small, the echo cancellation amount may degrade at that frequency alone. For this reason, it is desirable to decide individually for each frequency whether learning is permissible.

[0009] The present invention has been made in view of the above points. Its object is to provide an adaptive filter learning method for a frequency-domain echo canceller provided with an adaptive filter for each frequency, in which the learning decision is made independently for each frequency so that degradation of the echo cancellation amount at individual frequencies is prevented.

[0010]

MEANS FOR SOLVING THE PROBLEMS The adaptive filter learning method of the present invention is used in a frequency-domain echo canceller that frequency-transforms a reception signal frame by frame, generates a pseudo echo signal from the reception signal using a different adaptive filter for each frequency, and removes the echo component contained in a transmission signal by inverse-transforming the pseudo echo signal and subtracting it from the transmission signal. In this method, learning of the adaptive filter for the frame is performed on the basis of the reception signal, the transmission signal, and the residual error signal between the pseudo echo signal and the transmission signal, and whether learning is permissible is determined independently for each frequency.

[0011] In this case, it may first be determined, from the whole-frame powers of the reception signal, the transmission signal, and the residual error signal, whether the frame is suitable for learning. If the frame is suitable for learning, whether each frequency may be learned is then determined from the frequency-component powers of the reception signal, the transmission signal, and the residual error signal at that frequency.

[0012] Alternatively, the frequency components of the frame may be divided into several groups, and whether the adaptive filter may learn from the frame may be determined using the group-wise average of the power of the frequency components of the residual error signal, a threshold set for each group, and the power of the frequency components of the reception signal.

[0013] Further, when the number of frequencies judged learnable is larger than a certain value, learning may be performed for all frequencies of the frame, and when the number of frequencies judged learnable is smaller than a certain value, learning may be skipped for all frequencies of the frame.
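The frame-wide override described in this paragraph can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function name `frame_vote` is hypothetical, treating M1 as the upper count and M0 as the lower count (with M1 > M0, as in the symbol list for the flowcharts) is an assumption, and keeping the per-frequency flags unchanged when the count falls between the two is also an assumption, since the text does not say what happens in that range.

```python
def frame_vote(mode, m1, m0):
    """Return per-frequency learning flags after the frame-wide override.

    mode: list of 0/1 flags, one per frequency (1 = learning permitted).
    m1, m0: hypothetical upper and lower counts, with m1 > m0.
    """
    ca = sum(1 for m in mode if m == 1)  # Ca: number of learnable frequencies
    if ca > m1:
        return [1] * len(mode)  # most frequencies learnable: learn them all
    if ca < m0:
        return [0] * len(mode)  # few frequencies learnable: learn none
    return list(mode)           # assumption: otherwise keep individual flags
```

For instance, `frame_vote([1, 1, 1, 0], m1=2, m0=1)` enables all four frequencies, because three learnable frequencies exceed the upper count.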

[0014] Furthermore, the frequency components of the frame may be divided into several groups, and the learning decision may be made identical within each group.

[0015] By deciding independently for each frequency whether learning is permissible in this way, optimal learning processing suited to each frequency can be performed, and degradation of the echo cancellation amount at individual frequencies can be prevented.

[0016] When the large majority of frequencies are judged learnable, the whole frame is learned in order to speed up learning overall. Conversely, when only a few frequencies are judged learnable, the likelihood of erroneous learning is high, so learning is disabled for the whole frame. In this way, learning can be performed for the frame as a whole while taking the correlation among the frequencies into account.

[0017]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS An embodiment of the present invention will be described below with reference to the drawings.

[0018] FIG. 1 is a block diagram showing the configuration of a frequency-domain echo canceller according to an embodiment of the present invention.

[0019] This echo canceller comprises a frame creation unit 10 for the reception signal, a Fourier transform unit (FFT) 11, an adaptive filter unit (AF) 12, an inverse Fourier transform unit (IFFT) 13, a frame creation unit 14 for the transmission signal, a frame synthesis unit 15, a Fourier transform unit 16, and a double-talk detection unit (DTD) 17.

[0020] The characteristic part of the present invention is the double-talk detection unit 17, which determines whether the adaptive filter 12 (filter coefficient) provided for each frequency may learn. The other components of the echo canceller (10 to 16) may use existing methods, so a detailed description of them is omitted. This embodiment is described using, as an example, the FFT-EC of the document cited above as the conventional method.

[0021] The reception signal, which is the voice signal from the other party, is first divided by the frame creation unit 10 into frames of length 2N, xk(0), ..., xk(2N-1). Here k is the frame number and 2N is the transform order of the FFT. Frames of the reception signal are created so that they overlap by N samples.

[0022] Next, this frame is frequency-transformed by the Fourier transform unit 11 to obtain the frequency components Xk(0), ..., Xk(N-1) of the reception signal. Each Xk(i) is a complex number, except that the real part of Xk(0) holds the DC component and its imaginary part holds the high-frequency component with a period of two samples. The frequency components Xk(i) of the reception signal are multiplied by the filter coefficients in the adaptive filter unit 12. The product of Xk(i) and the filter coefficient gives the pseudo echo component X'k(i). The pseudo echo components X'k(i) are inverse-transformed by the inverse Fourier transform unit 13 to produce the pseudo echo x'k(0), ..., x'k(2N-1).

[0023] Meanwhile, the transmission signal, which is the voice signal of the local party, is divided by the frame creation unit 14 into frames of length N, yk(0), ..., yk(N-1). The frame synthesis unit 15 then builds a frame whose first N samples are 0 and whose last N samples are the transmission signal minus the second half of the pseudo echo, x'k(N), ..., x'k(2N-1). This frame is transformed by the FFT unit 16 to obtain the frequency components Ek(0), ..., Ek(N-1) of the residual error.

[0024] Here, Ek(i) is a complex number; the real part of Ek(0) is the DC component and its imaginary part is the high-frequency component with a period of two samples.

[0025] Finally, for each frequency component i, the i-th filter coefficient of the adaptive filter 12 is updated using the frequency component Ek(i) of the residual error and the frequency component Xk(i) of the reception signal. The filter coefficient for frequency i is updated only when the learning flag mode(i) indicates that learning is permitted (= 1). The filter coefficient update algorithm is well known from the literature, so its description is omitted here.

[0026] The flags mode(i) (i = 0, ..., N) are determined by the double-talk detection unit 17. In echo cancellation that generates a pseudo echo from the reception signal and subtracts it from the transmission signal, where, as described above, the reception signal and the transmission signal are frequency-transformed (FFT) and the pseudo echo components are generated using a different adaptive filter (filter coefficient) for each frequency component, the double-talk detection unit 17 is characterized in that it decides independently for each frequency whether the adaptive filter of that frequency may learn from the frame.

[0027] The processing operation of the double-talk detection unit 17 is described below.

[0028] FIGS. 2 to 4 are flowcharts showing the algorithm of the double-talk detection unit 17.

[0029] Here, for convenience, mode(0) is used as the flag for the DC component (the real part of frequency 0), and mode(N) is used as the flag for the frequency component with a period of two samples (the imaginary part of frequency 0). The other frequencies correspond to their frequency numbers.

[0030] In FIGS. 2 to 4, the symbols are defined as follows.

[0031]
k: frame number
N: frame length of the transmission signal, and 1/2 of the FFT order (for example, 128)
Xk(i): frequency component i of frame k of the reception signal (complex number)
Ek(i): frequency component i of frame k of the residual error signal (complex number)
Note that i = 0, ..., N-1; when i = 0, the real part of the frequency component is the DC component and the imaginary part is the highest frequency.
yk(i): sample value (real number) of frame k of the transmission signal, i = 0, ..., N-1
pe: whole-frame power of the residual error signal
px: whole-frame power of the reception signal
py: whole-frame power of the transmission signal
tmp: S/N ratio
t: variable threshold of the frame
flag: state flag of the frame
c: number of frames
V, a, b: constants (for example, V = 1024 * N, a = 1, b = 1)
α, β: constants (for example, α = 0.95, β = 0.25)
L: update time (for example, 100)
Pe(i): group-wise average power of the frequency components of the reception signal
Px(i): group-wise average power of the frequency components of the residual error signal
B, BN: number of frequency groups, and number of frequencies contained in each group. For example, B = 4 and BN = N/4; with 4 groups and 128 frequencies, each group contains 128/4 = 32 frequencies.
Ca: number of learnable frequencies
M1, M0: constants (M1 > M0)
T(i): variable threshold for each frequency group, i = 0, ..., B-1
mode(i): learning flag for each frequency group, i = 0, ..., N
A0, A1, A2, VB: constants (for example, A0 = 2, A1 = 1, A2 = 1)
R() and I() denote the real part and the imaginary part, respectively.

[0032] In the figures, section1 is the process that determines whether the frame may be learned; section2 is the process that disables frame learning when the adaptive filter is abnormal; section3 is the process that decides, from the per-frequency learning decisions, whether to enable or disable learning for all frequencies of the frame; section4 is the process that updates the variable thresholds for each frequency group; and section5 is the process that determines per frequency whether learning is permissible.

[0033] As shown in the flowcharts of FIGS. 2 to 4, the double-talk detection unit 17 first clears each parameter required for processing to 0 in step S1, and then, in steps S2 to S16, executes the frame learning decision (section1) as follows.

[0034] In step S2, the whole-frame power pe of the residual error signal, the whole-frame power px of the reception signal, and the whole-frame power py of the transmission signal are calculated according to predetermined formulas, and the S/N ratio tmp between the residual error signal and the reception signal is calculated as follows.

[0035] tmp = log10(pe / px)
Here, the states in which learning is not possible are double talk and the period in which the talker is speaking. The state in which learning is possible is the period in which the receiver (the other party) is speaking.

[0036] Accordingly, after step S2 has obtained the whole-frame power pe of the residual error signal, the whole-frame power px of the reception signal, the whole-frame power py of the transmission signal, and the S/N ratio tmp between the residual error signal and the reception signal, the next step S3 compares the whole-frame power px of the reception signal with the constant V to judge whether the receiver (the other party) is speaking. If px is less than or equal to V (px ≤ V), it is judged that the receiver (the other party) is not speaking, and the learning flag of the frame is set to "0" in step S4. flag = 0 indicates that the frame cannot be learned.

[0037] On the other hand, if px is higher than V (px > V), that is, if the receiver (the other party) is judged to be speaking, the next step S5 compares the whole-frame power py of the transmission signal with the constant V/2 to judge whether the talker is speaking at that time. If py is higher than V/2 (py > V/2), it is judged that the talker is speaking. The constant V/2 is used here on the assumption that the echo attenuation is 6 dB; it can be changed, for example to 3V/4 if the attenuation is 12 dB.

[0038] If the talker is also speaking, the state is double talk and learning is not possible. In that case, step S6 checks the S/N ratio tmp between the residual error signal and the reception signal and the S/N ratio between the residual error signal and the transmission signal, log10(pe/py). If both exceed the constant b, that is, if tmp > b and log10(pe/py) > b, it is judged that the adaptive filter 12 is not operating normally and is in an abnormal state (malfunction). When the adaptive filter 12 is in an abnormal state, it not only fails to cancel the echo but may actually amplify it. Therefore, when the adaptive filter 12 is judged to be abnormal, the learning flag of the frame is set to "-1" in step S7. flag = -1 indicates that learning is not possible because of a filter abnormality.

[0039] If, in step S6, there is no problem with the S/N ratio tmp between the residual error signal and the reception signal or with the S/N ratio log10(pe/py) between the residual error signal and the transmission signal, the state is judged to be double talk (both the transmitting side and the receiving side are speaking at the same time). In the case of double talk, the learning flag of the frame is set to "0" in step S4 to indicate that learning is not possible.

[0040] On the other hand, if the whole-frame power px of the reception signal is higher than V in step S3 (px > V) and the whole-frame power py of the transmission signal is less than or equal to V/2 in step S5 (py ≤ V/2), it is judged that only the receiver (the other party) is speaking, and learning of the frame is permitted.

[0041] At that time, the S/N ratio tmp between the residual error signal and the reception signal obtained in step S2 is checked. If tmp is greater than or equal to the variable frame threshold t plus the constant a (tmp ≥ t + a), the learning flag of the frame is set to "2" in step S9. If tmp is lower than t + a (tmp < t + a), the learning flag of the frame is set to "1" in step S10. flag = 1 and flag = 2 both indicate that learning is possible. flag = 2 is regarded as a state in which learning the whole frame is risky but learning only some of the frequencies may be acceptable.

[0042] Next, in steps S11 to S16, the variable frame threshold t is updated.

[0043] In this update, the currently set variable threshold t is compared with the S/N ratio tmp between the residual error signal and the reception signal obtained in step S2. When tmp < t, the variable threshold t is updated in step S12 according to a predetermined formula. When tmp is lower than 0 (tmp < 0), the variable threshold t is updated according to a predetermined formula once the update time L has elapsed, as shown in steps S14 to S16.

[0044] Once it has been determined in this way, from the power px of the reception signal, the power pe of the residual error (= the sum of the powers of the frequency components), and the power py of the transmission signal, whether frame k is suitable for learning, the per-frequency learning decision is performed next.

[0045] First, in step S17, the value n, which indicates the frequency currently subject to learning, is cleared to 0. Then, while n is incremented from 1 to B (B: number of frequency groups), the group-wise average power Pe(n) of the frequency components of the reception signal and the group-wise average power Px(n) of the frequency components of the residual error signal are obtained in step S10.

[0046] When the group-wise average powers Pe(n) and Px(n) of the frequency components of the reception signal and the residual error signal have been obtained, the frame-level flag is checked in the next step S20. If flag is lower than "1" (flag < 1), that is, if flag = 0 or flag = -1, learning is not possible for frequency group n, so the per-frequency learning flag mode(n) is set to "0" in step S21.

[0047] Further, if it is confirmed in step S22 that flag = -1, the adaptive filter 12 is regarded as being in an abnormal state. In step S23, the variable frame threshold t and the variable thresholds T(n) of the frequency groups are each cleared to 0, and the coefficients of the adaptive filter 12 are also cleared to 0, to deal with the abnormal state of the adaptive filter 12 (section2).

[0048] On the other hand, if the frame-level flag is "1" or higher in step S20 (flag ≥ 1), that is, if flag = 1 or flag = 2, the per-frequency learning decision is performed in the next steps S24 to S36 (section5). To be on the safe side, the section5 processing may be performed only when flag = 1.

[0049] In this case, a 256-point FFT, for example, yields 256 values. The first of these is the DC component, which is not a complex number but a single real value. In this embodiment, as described above, the DC component and the frequency component with a period of two samples are therefore handled separately for convenience: steps S24 to S27 process the DC component, steps S28 to S32 process the ordinary frequency components, and steps S33 to S36 make the learning decision for the frequency component with a period of two samples. With an FFT, the frequency component with a period of two samples is likewise a real number rather than a complex number, which is why the processing shown in steps S33 to S36 is needed.
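The reason the DC component and the component with a period of two samples need separate handling can be checked numerically. The sketch below uses a naive DFT written with the standard library only (the function name `dft` is hypothetical, and a short 8-point signal stands in for the 256-point case); for real-valued input, bin 0 and bin N/2 come out purely real, while the remaining bins are complex and occur in conjugate pairs.

```python
import cmath

def dft(x):
    """Naive O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

# Example: for a real 8-point frame, X[0] (DC) and X[4] (period of two
# samples) have zero imaginary part up to rounding error.
frame = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]
spectrum = dft(frame)
```

In practice an FFT routine would replace the naive transform; the property of the two real bins is the same.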

[0050] The per-frequency learning decision made here is basically the same as the frame-level learning decision described above.

[0051] That is, the S/N ratio tmp is calculated by a predetermined formula from the group-wise average power Pe(n) of the frequency components of the received signal and the group-wise average power Px(n) of the frequency components of the residual error signal (steps S24, S28, S33). If tmp is higher than a predetermined threshold, the frequency is judged unsuitable for learning, and the per-frequency learning enable flag mode(n) is set to 0 (steps S26, S30, S35).

[0052] If tmp is lower than the predetermined threshold, the frequency is judged suitable for learning, and the per-frequency learning enable flag mode(n) is set to 1 (steps S27, S36). In step S31, whenever mode(n) is set to 1, the number Ca of frequencies for which learning is enabled is counted.
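The per-frequency decision and the count Ca can be sketched as below. Several points are assumptions rather than details given in the text: the "predetermined formula" is taken to be the log10 power ratio that the description gives for the threshold update, the threshold is taken to be one value per entry, all powers are assumed positive, and the case where tmp equals the threshold (which the text leaves open) is treated as "do not learn". The function name `decide_learning` is hypothetical.

```python
import math

def decide_learning(pe, px, thresholds):
    """Return (modes, ca): per-frequency enable flags and the enabled count.

    pe, px     -- average powers written Pe(n) and Px(n) in the text (positive)
    thresholds -- threshold applied to tmp for each entry (assumed layout)
    """
    modes = []
    for pe_n, px_n, t_n in zip(pe, px, thresholds):
        tmp = math.log10(pe_n / px_n)         # assumed form of the S/N ratio
        modes.append(1 if tmp < t_n else 0)   # lower than threshold -> learn
    ca = sum(modes)                           # number of learnable frequencies
    return modes, ca
```

For example, with `pe = [100.0, 1.0]`, `px = [1.0, 100.0]` and thresholds of 0, tmp is 2 and -2, so only the second entry is enabled and Ca is 1.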

[0053] Once the per-frequency learning decisions have been made in this way, steps S37 to S39 decide whether to learn for all frequencies of the frame, based on the number Ca of frequencies for which learning is enabled (section 3).

[0054] That is, in step S37, the number Ca of learnable frequencies is compared with a constant M1. If Ca is larger than M1 (Ca > M1), it is decided in step S38 that learning is performed for all frequencies of the frame, in order to speed up learning as a whole, and the per-frequency learning enable flag mode(i) is set to 1.

[0055] If, on the other hand, Ca is at most M1 (Ca ≤ M1), Ca is further compared with a constant M0 in step S39. If Ca is smaller than M0 (Ca < M0), the risk of erroneous learning is high, so it is decided in step S40 that learning is not performed for any frequency of the frame, and the per-frequency learning enable flag mode(i) is set to 0.

[0056] Here M1 > M0. For example, with 128 frequencies, values such as M1 = 100 and M0 = 10 are used. If more than 100 frequencies are learnable, learning is turned ON for all frequencies of the frame; conversely, if fewer than 10 frequencies are learnable, learning is turned OFF for all frequencies of the frame. If the number of learnable frequencies lies between 10 and 100, the preceding learning decision is left as it is.
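The frame-wide override with the two constants can be sketched as follows. The function name `apply_frame_override` is hypothetical, the defaults are the example values M1 = 100 and M0 = 10 from the text, and "left as it is" is read here as keeping the per-frequency flags unchanged, which is an interpretation.

```python
def apply_frame_override(modes, m1=100, m0=10):
    """Force all per-frequency flags ON or OFF depending on the count Ca.

    modes -- list of 0/1 learning enable flags, one per frequency
    m1    -- upper constant M1 (Ca > M1 turns learning ON everywhere)
    m0    -- lower constant M0 (Ca < M0 turns learning OFF everywhere)
    """
    ca = sum(modes)
    if ca > m1:
        return [1] * len(modes)   # many learnable bins: learn on all
    if ca < m0:
        return [0] * len(modes)   # few learnable bins: learn on none
    return list(modes)            # in between: keep individual decisions
```

With 128 frequencies of which 101 are enabled, all 128 flags become 1; with 9 enabled, all become 0; with 50 enabled, the flags are returned unchanged.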

[0057] Thus, when the number of frequencies for which learning was decided for the frame exceeds a certain value, learning is performed for all frequencies, and when that number is below a certain value, learning is performed for none of them. This allows the learning process to treat the frame as a whole, taking the correlation between frequencies into account. The processing of steps S37 to S39 (section 3) is provided for exceptional cases in which the learning decision for a particular frequency differs from the decisions for the neighboring frequencies, and it is not strictly necessary.

[0058] Finally, the threshold T(n) for each frequency group is updated.

[0059] That is, in step S41 the value of n indicating the frequency to be learned is cleared to 0, and in step S42 the S/N ratio tmp is calculated from the group-wise average power Pe(n) of the frequency components of the received signal and the group-wise average power Px(n) of the frequency components of the residual error signal, using the following formula.

[0060]
tmp = log10(Pe(n) / Px(n))
If the value of tmp obtained in this way is lower than the currently set threshold T(n), and the group-wise average power Px(n) of the frequency components of the residual error signal is higher than a constant VB (tmp < T(n) and Px(n) > VB), the variable threshold T(n) is updated in step S44 according to a predetermined formula.

[0061] Next, in steps S45 to S48, the threshold T(n) is updated on a per-frequency-group basis (section 4).

[0062] First, in step S45, the S/N ratio tmp between the group-wise average power Pe(n) of the frequency components of the received signal and the group-wise average power Px(n) of the frequency components of the residual error signal is checked, together with the state of the frame's learning flag. If tmp is lower than 0 and flag is 1 (tmp < 0 and flag = 1), the variable threshold T(n) is updated according to a predetermined formula once the update time L has elapsed, as shown in steps S46 to S48.

[0063] Thereafter, in step S49, the update processing is repeated for the predetermined number of groups B while n, the index of the frequency being processed, is updated. When step S50 confirms that n = B, the process returns to step S2 and the same processing is performed again.

[0064] To summarize the above processing, the double-talk detection unit 17 first judges whether frame k is suitable for learning, based on the power px of the whole frame of the received signal, the power pe of the whole frame of the residual error, and the power py of the whole frame of the transmission signal (steps S1 to S16).

[0065] Next, the frequency components of the frame are divided into a plurality of groups, and the power of the frequency components of the residual error signal is averaged for each group (steps S17 to S19). For each frequency component n, whether learning is possible at frequency n is then judged from the group-average power of the residual error signal and the power of the n-th component of the received signal (steps S20 to S36).

[0066] Then, depending on the number of frequencies judged learnable, learning is enabled for all frequencies if the number is at or above a certain value, and disabled for all frequencies if it is below a certain value (steps S37 to S40). Finally, the variable thresholds are updated (steps S41 to S50).

[0067] In this way, by deciding independently for each frequency whether learning is possible, and by using a group average as part of the reference values for the decision, learning opportunities per frequency are increased and erroneous learning is reduced, which improves the performance of per-frequency double-talk detection.

[0068] The frame-level suitability judgment (section 1) may be omitted. In that case, the subsequent processing is performed regardless of the value of the frame learning flag.

[0069] When flag = -1, that is, when the filter is abnormal, the variable thresholds and the adaptive filter coefficients are cleared to 0, but this processing (section 2) is not strictly necessary.

[0070] The processing that changes mode according to the number of learnable frequencies (section 3) may also be omitted. Likewise, the final processing that updates the variable threshold using the minimum power ratio between the received signal and the residual error (section 4) is not strictly necessary.

[0071] In section 5 above, the formula for tmp may in every case be replaced by tmp = log10(Pe(n/BN) / Px(n/BN)). In this case, the enable flags mode(n) within the same group take the same value. That is, when the frequency components of the frame are divided into several groups, the decision on whether the adaptive filter of each frequency may learn is the same throughout a group. Depending on the number of groups, this can make erroneous learning less likely.
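The grouped variant can be illustrated with a short sketch. The meaning of BN is not defined in this passage; the sketch assumes BN is the number of frequencies per group and that n/BN denotes integer division, so that every frequency in a group reads the same pair of averages. The function name `grouped_modes`, the per-group threshold layout, and the "lower than threshold means learn" comparison are likewise assumptions carried over from the per-frequency decision.

```python
import math

def grouped_modes(pe_group, px_group, thresholds, bn):
    """Per-frequency flags when tmp depends only on the group index n // bn.

    pe_group, px_group -- group-wise average powers (positive values)
    thresholds         -- one threshold per group (assumed layout)
    bn                 -- assumed number of frequencies per group
    """
    modes = []
    for n in range(len(pe_group) * bn):
        g = n // bn                                   # group index of bin n
        tmp = math.log10(pe_group[g] / px_group[g])   # same tmp within a group
        modes.append(1 if tmp < thresholds[g] else 0)
    return modes
```

Because tmp no longer varies inside a group, the flags come out in runs of BN identical values, matching the statement that mode(n) is the same within a group.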

[0072] The above description followed the literature in assuming that the FFT order is twice the transmission frame length, but other lengths can be used in the same way. For example, the FFT order may be 256 and the transmission frame length 80 samples. In that case, the same applies if the overlap length of the reception frames is set to 176.
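The figure of 176 follows from overlap = FFT order - frame length = 256 - 80. A minimal framing sketch under that relation is shown below; the function name `overlapped_frames` is hypothetical, and starting from an all-zero history is an assumption, since the text does not say how the first frame is filled.

```python
def overlapped_frames(samples, fft_order=256, frame_len=80):
    """Build FFT-length reception frames that overlap by fft_order - frame_len.

    Each frame holds the previous `overlap` samples followed by `frame_len`
    new samples, so its length always equals fft_order.
    """
    overlap = fft_order - frame_len          # 256 - 80 = 176 in the example
    history = [0.0] * overlap                # assumed zero initial history
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        block = history + list(samples[start:start + frame_len])
        frames.append(block)
        history = block[frame_len:]          # keep the newest `overlap` samples
    return overlap, frames
```

With the original setting (FFT order twice the frame length), the same function gives an overlap equal to the frame length.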

[0073]

[Effects of the Invention] As described above, according to the present invention, in a frequency-domain echo canceller in which learning for a frame of the adaptive filter is performed based on the received signal, the transmission signal, and the residual error signal between the pseudo echo signal and the transmission signal, whether learning is possible is decided independently for each frequency, and a group average is used as part of the reference values for the decision. This increases learning opportunities per frequency and reduces erroneous learning, which improves per-frequency double-talk detection performance and prevents degradation of the echo cancellation amount on a per-frequency basis.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a frequency-domain echo canceller according to an embodiment of the present invention.

FIG. 2 is a flowchart (part 1) showing the algorithm of the double-talk detection unit used in the frequency-domain echo canceller.

FIG. 3 is a flowchart (part 2) showing the algorithm of the double-talk detection unit used in the frequency-domain echo canceller.

FIG. 4 is a flowchart (part 3) showing the algorithm of the double-talk detection unit used in the frequency-domain echo canceller.

[Description of Reference Signs]
10: frame creation unit
11: Fourier transform unit (FFT)
12: adaptive filter (AF)
13: inverse Fourier transform unit (IFFT)
14: frame creation unit
15: frame synthesis unit
16: Fourier transform unit (FFT)
17: double-talk detection unit (DTD)
x: received signal
y: transmission signal
z: residual error signal
px: power of the whole frame of the received signal
py: power of the whole frame of the transmission signal
pz: power of the whole frame of the residual error signal

Claims


1. An adaptive filter learning method for use in a frequency-domain echo canceller that frequency-transforms a received signal frame by frame, generates a pseudo echo signal from the received signal using a separate adaptive filter for each frequency, inverse-transforms the pseudo echo signal, and subtracts it from a transmission signal to remove an echo component contained in the transmission signal, wherein, when learning processing for the frame is performed on the adaptive filter based on the received signal, the transmission signal, and a residual error signal between the pseudo echo signal and the transmission signal, whether the learning is possible is decided independently for each frequency.

2. The adaptive filter learning method according to claim 1, wherein whether the frame is suitable for learning is first judged from the power of the whole frame of each of the received signal, the transmission signal, and the residual error signal, and, if the frame is suitable for learning, whether learning is possible at each frequency is then judged from the frequency-component power of the received signal, the transmission signal, and the residual error signal at that frequency.

3. The adaptive filter learning method according to claim 1, wherein the frequency components of the frame are divided into several groups, and whether the adaptive filter can learn for the frame is decided using a value obtained by averaging the power of the frequency components of the residual error signal for each group, a threshold set for each group, and the power of the frequency components of the received signal.

4. The adaptive filter learning method according to claim 1, wherein learning is performed for all frequencies of the frame when the number of frequencies judged learnable is larger than a certain value, and learning is performed for none of the frequencies of the frame when that number is smaller than a certain value.

5. The adaptive filter learning method according to claim 1, wherein the frequency components of the frame are divided into several groups, and the determination result as to whether or not learning is possible is the same in each group.