JP3228940B2

JP3228940B2 - Method and apparatus for reducing residual far-end echo in voice communication networks

Info

Publication number: JP3228940B2
Application number: JP52683696A
Authority: JP
Inventors: Patrick Michael Velardo, Jr.; Woodson Dale Wynn
Original assignee: AT&T Corp
Current assignee: AT&T Corp
Priority date: 1995-03-03
Filing date: 1996-01-11
Publication date: 2001-11-12
Anticipated expiration: 2016-01-11
Also published as: CA2189489C; JPH09512980A; WO1996027951A1; CA2189489A1; US5587998A; CN1149945A; EP0759235A1; PL317068A1; BR9605921A; PL179971B1

Description

FIELD OF THE INVENTION

The present invention relates to techniques for processing speech signals in communication networks, and more particularly to processing techniques for suppressing far-end echo.

BACKGROUND OF THE INVENTION

It has long been recognized that, in many voice communication networks, the far end has an annoying tendency to return to the near-end speaker a delayed replica of his or her own voice transmission. Such far-end echo is especially annoying when it arrives with a delay of about 40 ms or more, because the near-end speaker then tends to perceive it distinctly as a distracting noise. Far-end echo is therefore a significant problem for those types of networks whose operation entails such relatively long delays. These include satellite communication networks, as well as at least some networks that perform speech coding and compression.

In practice, devices are available that allow a party to suppress or cancel the near-speech component that would otherwise be returned unintentionally to the near end. However, the far-end party may not be using such a device. Moreover, even when such an echo suppressor or echo canceller is in use at the far end, it may not be completely effective in removing the echo. Consequently, in many cases at least a residual echo is returned to the near end.

As a result, it is often desirable for the near-end party to be able to act to reduce those near-speech components that return to the near-end speaker after a round trip through the remote communication network.

An early nonlinear processor for reducing echo is described in O. M. Mracek Mitchell and D. A. Berkley, "A Full-Duplex Echo Suppressor Using Center-Clipping", Bell System Technical Journal 50 (1971), pp. 1619-1630. At the time this paper was published, echo cancellers were not yet in use. The paper describes a subband center clipper intended for use as a stand-alone device at the far end (i.e., the receiving end), in place of the echo suppressors that were conventional at the time of publication. This center clipper is not suited to situations involving large echo delays.

U.S. Pat. No. 5,274,705, issued to Younce et al., describes a more recent proposal for suppressing residual echo using a device at the far end (receiving end). There, echo that a conventional echo canceller has failed to remove completely is further reduced by a nonlinear processor. In this nonlinear processor, an estimate of the background noise is used to set a full-band noise-transparency threshold. Transmissions that fall below this threshold are passed through, which conceals residual echo and prevents unnatural-sounding gaps in the background noise. The technique also uses the energy in an echo replica to set a time-varying threshold for full-band center clipping, based on an estimated gain for the echo path.

In some cases, Younce's technique fails to achieve a satisfactory degree of echo control. For example, the residual echo that survives the center-clipping process is spread over the entire frequency range, and is therefore still perceived as speech even at very low signal-to-noise ratios (and so cannot be separated out). Furthermore, full-band noise transparency has a drawback: narrowband noise, such as power-line hum, tends to raise the noise-transparency threshold across the whole frequency band. As a result, echo is passed even though the noise masks it only within a limited frequency range.

Practitioners in the art have recognized that equipment installed at the near end (the transmitting end) can also be used to reduce far-end echo when it compensates for the delay incurred as the echo makes its round trip between the local and remote networks. For example, International Patent Application No. PCT/AU93/00626 by J. Porteli (International Publication No. WO94/14248) describes the use of a conventional echo canceller at the near end (transmitting end). Because there is a long delay between the transmission of near speech and the arrival of the echo to be cancelled, this echo canceller is operated together with a delay device that is programmed, before installation, to provide a fixed compensating delay. In this echo canceller, a full-band adaptive transversal filter generates a subtractive replica of the echo. However, certain factors prevent this system from providing fully satisfactory compensation. For example, the accuracy of the echo replica is limited by line noise, which reduces the effectiveness of the echo canceller. In addition, circuit multiplexing or compression between the local and remote networks corrupts part of the echo signal, so that complete cancellation is not possible. The performance of this system is also degraded by phase roll (arising, e.g., from analog transmission facilities) and by the quantization noise and nonlinearities introduced by speech coders in digital transmission systems.

Thus, practitioners in the field of echo control have so far failed to provide a fully satisfactory method, usable within the local network, for reducing residual far-end echo.

SUMMARY OF THE INVENTION

The present invention provides an improved nonlinear processing apparatus and method that can be implemented within a local network. The method is highly effective at removing residual echo returned from a remote communication network, even when the echo returns with a very long transmission delay. The method is robust against line noise and against distortion introduced in the remote network by remote nonlinear processing. It is also relatively insensitive to phase roll and to the various other problems that degrade the convergence of conventional echo cancellers.

In a broad sense, the invention reduces echo in voice communications that are transmitted into a network from a far location and received from the network at a near location. (The terms "far" and "near" are not limiting; they simply denote opposite ends of a path for two-way communication.)

According to the invention, in a broad sense, a signal transmitted into the network at the near location is received by a suitable signal processor as a "near incoming signal". A signal transmitted into the network from the far location is received by the same processor as a "far incoming signal". The near and far incoming signals are compared to produce a value, EPD, of a quantity referred to as the "echo path delay". The EPD is a measure of the relative time delay between those portions of the near and far incoming signals that contain similar information.

The near incoming signal is subjected to a delay equal to EPD, so that the near and far incoming signals are temporally aligned. The near and far incoming signals are then each decomposed, separately, into a plurality of subband components.

A modulus signal is then derived from each subband component of the near incoming signal. That is, the absolute value of each of these subband signals is smoothed, yielding a waveform proportional to the RMS energy envelope of the subband signal. Each of these waveforms is then attenuated according to an echo-loss estimate. The resulting waveform, referred to herein as a "template", represents the envelope of the expected echo waveform.

Each subband component of the far incoming signal is then subjected to a center-clipping operation whose purpose is to remove weak signals presumed to be echo. The template serves as the threshold for discriminating these weak signals (it is referred to herein as the "upper" threshold, for reasons explained below). That is, each subband component of the far incoming signal is transmitted, at least in part, when it exceeds the concurrent value of its respective template.

After center clipping, the subband components of the far incoming signal are recombined to produce a synthesized, full-band output signal.

Preferred embodiments of the invention include a second threshold, referred to as the "lower" threshold. The lower threshold is useful for suppressing an annoying background effect often called "noise pumping". This occurs when line noise or other background noise from the far end is modulated by near-end speech, producing an intermittent sound resembling a reciprocating pump. It is well known to mask this effect by injecting a controlled amount of noise energy after the clipping operation. However, the injected noise generally matches the frequency content of the actual background noise poorly, so fully effective masking is difficult to achieve.

By contrast, in preferred embodiments of the invention, the center clipper is arranged to pass those subband components that lie below the lower threshold, which represents the noise floor. Because the lower threshold is determined separately for each subband component, a good match to the actual noise spectrum can be achieved even in the presence of narrowband line noise.

Each lower threshold is derived from the corresponding far-end subband component. The absolute value of the far incoming subband signal is smoothed using a smoother that rises slowly and decays rapidly. This step yields an estimate of the subband noise floor, and the lower threshold is set equal to it. Those portions of the corresponding far incoming subband signal that lie below the lower threshold are passed by the center clipper and combined into the full-band output signal.
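The two-threshold behaviour summarized above can be sketched for a single subband sample. This is an illustration under assumptions, not the transfer function of FIG. 5: the name `center_clip` is hypothetical, and the sketch takes the simplest reading, in which samples above the upper threshold or below the lower threshold pass unchanged and everything in between is zeroed. An actual implementation might, for example, pass only part of a sample that exceeds the upper threshold.

```python
def center_clip(x, upper, lower):
    """Illustrative two-threshold center clipper for one subband sample.

    x     : far incoming subband sample
    upper : template value (expected echo envelope) at this instant
    lower : noise-floor estimate at this instant (assumed <= upper)
    """
    mag = abs(x)
    if mag > upper:
        # Stronger than the expected echo: treat as far speech and pass it.
        return x
    if mag < lower:
        # Below the noise floor: pass it so the background noise stays continuous.
        return x
    # Between the two thresholds: presumed residual echo, so remove it.
    return 0.0


samples = [0.80, 0.10, 0.01, -0.30]
clipped = [center_clip(s, upper=0.25, lower=0.02) for s in samples]
print(clipped)
```

In this example only the second sample, which lies between the noise floor and the template, is removed.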

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram showing general architectural features of a communication network, including the use of conventional devices for echo control.

FIG. 2 shows, in a broad sense, the use within a communication network of a system for residual far-end echo control (RFEC).

FIG. 3 is a schematic diagram of a system for echo control according to one embodiment of the invention.

FIG. 4 is a schematic diagram of the functions performed by the subband signal-processing blocks of FIG. 3.

FIG. 5 illustrates the transfer function of a center clipper according to one embodiment of the invention.

FIG. 6 is a schematic diagram of a procedure for measuring echo path delay according to one embodiment of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The communication network of FIG. 1 includes a local network 10, a remote network 20, and internetwork trunks 30. Each network 10, 20 typically includes a telephone hybrid 32 and one or more switches or exchanges 34. The internetwork trunks include communication links between national and international networks, and include links to and from communication satellites. Communication networks for long-distance communication also typically include a circuit multiplexing system 40, which reduces transmission bandwidth by speech coding or other speech-compression processes. The local and remote networks also include conventional echo control systems 50, 55. In the remote network, for example, system 55 is used to reduce near-end speech (directed into the local system) that would otherwise be recirculated through the remote network and returned to the near-end speaker as an echo of his or her own voice.

In at least some cases, however, such a system 55 is absent or fails to reduce the echo sufficiently. In those cases, it is desirable for the near-end party to employ a system for residual far-end echo control (RFEC) installed in the local network. Such an RFEC system 60, shown in FIG. 2, is useful for further reducing the echo returned from the far end to the near-end speaker.

FIG. 3 shows an RFEC system that operates on a full-band near-end speech signal y[n] and a full-band far-end speech signal x[n] (the variable n denotes discrete time). The system is suitably implemented on a digital signal processor.

In block 100 of FIG. 3, the system computes EPD[n], an estimate of the echo path delay between the transmitted near-end signal and the returned near-end signal. As explained below, an intermediate step in deriving EPD[n] involves computing the full-band, averaged spectral energy of the near-end and far-end signals. An optional measure of the loss between the transmitted and returned signals is readily derived from the ratio of the far-end spectral energy to the near-end spectral energy, where the near-end spectral energy is delayed by the estimated echo path delay.
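As a rough illustration of the optional loss measure, the sketch below forms the ratio of far-end energy to delayed near-end energy and expresses it in decibels. The function name `echo_loss_db`, the use of per-frame energy lists, the delay expressed in frames, and the averaging of the ratio over frames are all assumptions made for the example; the text specifies only that the loss follows from the energy ratio with the near-end energy delayed by the echo path delay.

```python
import math


def echo_loss_db(near_energy, far_energy, epd_frames):
    """Estimate echo loss in dB from averaged energies (illustrative only).

    near_energy, far_energy : averaged energies, one value per frame
    epd_frames              : echo path delay in frames (assumed unit)
    """
    ratios = []
    for n in range(epd_frames, len(far_energy)):
        delayed_near = near_energy[n - epd_frames]
        if delayed_near > 0.0:
            # Far-end energy relative to the near-end energy that caused it.
            ratios.append(far_energy[n] / delayed_near)
    if not ratios:
        return None  # no frame with usable near-end energy
    mean_ratio = sum(ratios) / len(ratios)
    if mean_ratio <= 0.0:
        return None  # nothing came back, so the loss is unbounded
    return -10.0 * math.log10(mean_ratio)


near = [1.0, 4.0, 1.0, 0.0, 0.0]
far = [0.0, 0.0, 0.1, 0.4, 0.1]  # near energy returned 2 frames later, 10 dB down
print(echo_loss_db(near, far, epd_frames=2))
```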

This optional loss measurement is best illustrated by block 425 of FIG. 6. The loss measure is useful for adjusting the amount of attenuation applied to the template (see below), and it is also used as a control signal for deciding when to enable the subband signal processing in block 130 of FIG. 3.

In block 110, a tapped-off portion of the outgoing near speech is delayed by EPD[n] to produce a delayed, full-band near-speech signal y[n-EPD]. As noted above, this delayed signal is used to generate the template, which represents the expected echo envelope after attenuation.

In block 120, the delayed near-speech signal is decomposed into a plurality of frequency subbands, numbered from 1 to M. Each subband signal, for example the k-th subband signal ya_k[n], undergoes subband signal processing separately. As shown in the figure, each subband signal is processed in a respective processing block 130. In the preferred embodiment, the processor represented by frequency analysis block 120 is a polyphase analysis filter bank with sample-rate reduction, which produces decimated subband signals.

Polyphase filter banks are particularly attractive because they offer relatively high computational efficiency. These filter banks are well known in the art and need not be described here in detail. A useful reference is P. P. Vaidyanathan, "Multirate Systems and Filter Banks", Prentice Hall, Chapter 8.

The inventors' own studies have used a cosine-modulated filter bank implemented in a computationally efficient polyphase structure. This choice gave a simple design, relatively low computational requirements, and excellent frequency-response characteristics that limit the distortion introduced when the full-band signal is reconstructed. A useful reference is K. Nayebi et al., "On the Design of FIR Analysis-Synthesis Filterbanks with High Computational Efficiency", IEEE Trans. Signal Processing 42 (April 1994).

As a general matter, it is believed that selectively processing individual frequency subbands improves operational stability and speech quality relative to conventional full-band nonlinear processors for echo reduction. Moreover, subband processing is of greater benefit to full-duplex connections, because the most active frequency bands for the far-end talker differ from those of the local talker's echo. In addition, noise pumping tends to be less pronounced with subband processing than with full-band processing, even without the sub-threshold noise-transparency feature described above.

Also processed in the respective blocks 130 are the M subband signals obtained by decomposing the far input signal x[n] in block 140. In the preferred embodiment, the processor represented by block 140 is likewise a polyphase analysis filter bank with sample-rate reduction, producing decimated subband signals. For each value of k (k being an integer from 1 to M), the k-th far-end subband signal xa_k[n] is subjected, in block 130, to a center-clipping operation that depends on a comparison between the far-end subband signal and the concurrent value of the template.

The output of each subband processing block 130 is a processed subband signal xe_k[n]. The M processed subband signals are recombined in frequency synthesis block 150 to produce a full-band output signal x_po[n]. In the preferred embodiment, the processor of block 150 is a polyphase synthesis filter bank. Filter banks of this kind are described, for example, in the Vaidyanathan and Nayebi references cited above.

In block 135, a full-band speech detector is optionally used to disable the subband processing of block 130 when far speech is detected, and to enable the subband processing at other times. These enabling and disabling functions are achieved, for example, by appropriately setting a flag that has an enabled state and a disabled state. The full-band estimate of echo loss is useful here for deciding whether the energy in the input x[n] is actual far speech rather than an echo of near speech. That is, x[n] is classified as far speech, and not as echo, if its energy envelope represents a larger fraction of the delayed energy envelope of y[n] than would be predicted on the basis of echo loss alone. In the figure, block 135 is shown as having an input for a signal representing such an echo-loss estimate. A suitable estimate is provided by block 425 of FIG. 6.
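The classification rule in this paragraph can be sketched as a comparison between the far-end energy envelope and the echo energy expected from the delayed near-end envelope. The function name `is_far_speech` and the safety margin `margin_db` are assumptions introduced for the example; the text states only that the far input counts as speech when it is larger than echo loss alone would predict.

```python
def is_far_speech(far_energy, delayed_near_energy, loss_db, margin_db=3.0):
    """Return True when the far input looks like far speech rather than echo.

    far_energy          : energy envelope of x[n] at this instant
    delayed_near_energy : energy envelope of y[n], delayed by the echo path delay
    loss_db             : full-band echo-loss estimate in dB
    margin_db           : hypothetical safety margin, not taken from the text
    """
    # Energy that pure echo would be expected to have after the path loss.
    expected_echo = delayed_near_energy * 10.0 ** (-loss_db / 10.0)
    # Require the far input to exceed that expectation by the margin.
    return far_energy > expected_echo * 10.0 ** (margin_db / 10.0)


print(is_far_speech(0.5, 1.0, loss_db=10.0))  # well above the expected echo
print(is_far_speech(0.1, 1.0, loss_db=10.0))  # no more than the expected echo
```

A flag driven by this decision could then enable or disable the subband processing, as the paragraph describes.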

A preferred speech detector for this purpose is described in D. K. Freeman et al., "The Voice Activity Detector for the Pan-European Digital Cellular Mobile Telephone Service", IEEE Conf. ICASSP, 1989, §S7.6, pp. 369-372, and is available from the GSM 06.32 VAD standard. This speech detector is preferred because it is known to operate reliably in the presence of noise. However, other speech detectors well known in the art can equally be used for this purpose.

According to an embodiment of the invention, the processing in block 130 of the decimated k-th subband signals ya_k[n] and xa_k[n] is now described in detail with reference to FIG. 4.

In block 200, the absolute value of the near-end signal waveform ya_k[n] is determined and passed to block 210. Similarly, in block 220, the absolute value of the far-end signal waveform xa_k[n] is determined and passed to block 230. Blocks 210 and 230 each represent a peak-preserving smoothing operation with a relatively fast rise time and a slow decay. In block 210, at least, it is desirable for the decay to approximate the expected echo reverberation tail.

For example, the smoothed output yb_k[n] of block 210 is represented by a recursive average:

[Equation not reproduced in this copy.]

Here, A2 is chosen close to unity to ensure a fast rise time, and A3 is chosen to give a decay on the order of 40-50 ms.
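The equation for this recursion is missing from the text as reproduced here. As a hedged illustration only, the sketch below assumes a first-order recursion of the form yb[n] = (1 - A) yb[n-1] + A |ya[n]|, with A = A2 while the input magnitude is rising above the current envelope and A = A3 otherwise. That form is consistent with A2 near unity giving a fast rise, but it is an assumption, and the name `smooth_envelope` is hypothetical.

```python
def smooth_envelope(ya, a_rise, a_decay):
    """Fast-rise, slow-decay envelope follower (assumed form, illustrative).

    ya      : subband samples
    a_rise  : weight on the new sample while rising (plays the role of A2)
    a_decay : weight on the new sample while decaying (plays the role of A3)
    """
    env = []
    prev = 0.0
    for sample in ya:
        mag = abs(sample)
        # Pick the coefficient according to whether the magnitude is rising.
        a = a_rise if mag > prev else a_decay
        prev = (1.0 - a) * prev + a * mag
        env.append(prev)
    return env


# A single impulse: the envelope jumps up at once, then decays slowly.
print(smooth_envelope([1.0, 0.0, 0.0, 0.0], a_rise=0.9, a_decay=0.1))
```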

The inventors have found that, in this system, the effect of errors in estimating the echo path delay can be reduced by holding over the peaks of yb_k[n] for a predetermined hold-over period. This hold-over period is preferably set to the expected delay through the remote network, and is typically 20-40 ms. In the preferred embodiment, the hold-over is applied according to the following rules: (i) if the rise condition is met, update yb_k[n] and start a hold-over period; (ii) if the decay condition is met, update yb_k[n] only if the last hold-over period has expired.
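Rules (i) and (ii) can be sketched by adding a hold-over counter to an envelope follower. The recursion itself, the function name `smooth_with_holdover`, and the expression of the hold-over period as a number of subband samples are assumptions made for the example.

```python
def smooth_with_holdover(ya, a_rise, a_decay, hold_samples):
    """Envelope follower that holds peaks before letting them decay.

    hold_samples : hold-over period in samples (20-40 ms in the text;
                   the conversion to samples depends on the subband rate)
    """
    env = []
    prev = 0.0
    hold = 0  # samples remaining in the current hold-over period
    for sample in ya:
        mag = abs(sample)
        if mag > prev:
            # Rule (i): rise condition met, so update and restart the hold-over.
            prev = (1.0 - a_rise) * prev + a_rise * mag
            hold = hold_samples
        elif hold > 0:
            # Rule (ii): decay condition met but hold-over still running,
            # so the envelope is left unchanged.
            hold -= 1
        else:
            # Rule (ii): hold-over expired, so the envelope may decay.
            prev = (1.0 - a_decay) * prev + a_decay * mag
        env.append(prev)
    return env


print(smooth_with_holdover([1.0, 0.0, 0.0, 0.0, 0.0],
                           a_rise=1.0, a_decay=0.5, hold_samples=2))
```

With a two-sample hold-over, the peak is kept for two samples after the impulse before the decay begins.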

An optional adjustment to the expected echo path loss EPL_k[n] is made in block 240. In a conventional center clipper, only a fixed value of the minimum expected loss is predetermined. For the control of residual echo within a communication network, this value is typically about 18 dB. However, it is advantageous to adjust this expected-loss figure if, for example, the energy level of the template tends to exceed the actual energy level of the received echo signal.

In the present embodiment, the fixed minimum expected loss, common to all subbands, is typically in the range of 10-12 dB, and EPL is set to this value. This loss value is readily determined from network measurements, for example by monitoring the internetwork trunks for a suitable length of time.

In at least some cases, however, it may be desirable to use a different fixed value EPL_k for each frequency band k. This makes it possible to shape the loss values according to, for example, perceptual criteria or the results of network measurements.

Another alternative is to determine EPL[n] adaptively, either across all subbands or individually within each subband. In this alternative, the predetermined minimum expected loss serves as a lower bound on EPL, while EPL itself is adjusted according to the result of a loss computation. A suitable full-band computation is described above.
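The lower-bound behaviour of the adaptive alternative can be sketched as a clamp followed by a conversion from decibels to a linear scale factor. The names `template_gain` and `min_loss_db` are hypothetical, and the use of a 20 log10 amplitude convention assumes that the envelope being scaled is a smoothed absolute value rather than an energy.

```python
def template_gain(measured_loss_db, min_loss_db=10.0):
    """Linear factor applied to the near-end envelope (illustrative only).

    measured_loss_db : adaptively measured echo loss in dB
    min_loss_db      : predetermined minimum expected loss, used as a floor
    """
    # The measured loss may raise the expected loss but never lower it
    # below the predetermined minimum.
    loss_db = max(measured_loss_db, min_loss_db)
    # Convert the loss in dB to an amplitude scale factor below unity.
    return 10.0 ** (-loss_db / 20.0)


print(template_gain(6.0))   # measurement below the floor: clamped to 10 dB
print(template_gain(20.0))  # measurement above the floor: used as measured
```

Clamping in this direction keeps the template from growing larger than the minimum expected loss allows, whatever the loss measurement reports.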

In yet another alternative, the loss is determined by actively probing the remote network with a known signal and analyzing the returned echo.

In block 250, the near-end envelope from block 210 is multiplied by the loss estimate to produce the threshold waveform CL1_k[n]:

CL1_k[n] = EPL[n] × yb_k[n]

In block 230, the far input is smoothed in the same manner as the near input is smoothed in block 210. The smoothed far input signal is useful for performing the optional loss adjustment in block 240 and, as described below, for the noise-floor estimation in blocks 260 and 265.

The smoothed output xb_k[n] of block 230 is represented, for example, by a recursive average:

[Equation not reproduced in this copy.]

Here, A4 is chosen close to unity to ensure a fast rise time, and A5 is chosen to give a decay on the order of 40-50 ms.

The output of block 230, which represents the smoothed far-end envelope, is processed in block 260 to produce an estimate xc_k[n] of the noise level from the remote network. For example, the output xb_k[n] of block 230 is subjected to the recursive average defined below.

Here, A6 is chosen relatively small to ensure a slow rise time, and A7 is chosen to give a short decay of 1-5 ms.

From the far-end noise estimate xc_k[n], a lower threshold waveform (i.e., a noise floor) CL2_k[n] is derived, as shown in block 265 of FIG. 4. For example, this threshold is derived by multiplying the noise estimate by an optional scale factor NFAC_k[n], typically between 0.5 and 1.5. In addition, the threshold CL2_k[n] is preferably constrained never to exceed the expected echo level. The lower threshold is thus defined, for example, by:

CL2_k[n] = min(NFAC_k[n] × xc_k[n], CL1_k[n])

The inventors have found that the noise floor estimate can be further improved if the smoothing of xa_k[n] and xb_k[n] is performed only when the signal contains noise alone and no speech. The far-end speech detector of block 135 in FIG. 3 is used to distinguish readily between situations in which speech (or echo) is present and situations in which only noise is present. Accordingly, the noise floor estimation is disabled in the former situation and enabled in the latter.
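The lower-threshold rule and the speech-gated noise update can be sketched in Python as follows. Only the min(...) rule is taken from the text; the function names and the particular slow-rise, immediate-decay update are illustrative assumptions.

```python
def lower_threshold(noise_est, cl1, nfac=1.0):
    """CL2 = min(NFAC * xc, CL1): the noise floor never exceeds the expected echo level."""
    return min(nfac * noise_est, cl1)


def update_noise_estimate(prev_est, envelope, speech_present, rise=0.01):
    """Hypothetical gated update: hold the estimate while speech (or echo) is present."""
    if speech_present:
        return prev_est  # estimation disabled
    if envelope > prev_est:
        return prev_est + rise * (envelope - prev_est)  # slow rise
    return envelope  # assumed immediate decay, for simplicity
```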

In block 270, the far-end subband input signal xa_k[n] is center-clipped. According to a preferred embodiment of the invention, the input signal is attenuated whenever its absolute value lies between the thresholds CL2_k[n] and CL1_k[n] + CL2_k[n], but is passed without attenuation if it (1) exceeds CL1_k[n] + CL2_k[n] or (2) falls below CL2_k[n].

The transfer function of a preferred center clipper is illustrated in FIG. 5. As is apparent from the figure, this clipper passes the input signal substantially without attenuation when the absolute value of the signal is less than the lower threshold CL2 or greater than the upper threshold CL1 + CL2. (In the figure, the subscript k and the explicit dependence on the quantized time n are omitted for simplicity.) In the region between these thresholds, however, the input signal is clipped to a flat output level of CL2.
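The transfer function described above can be written as a short Python sketch. The function name is hypothetical, and preserving the sign of the input in the clipped region is an assumption, since the text specifies only the output level CL2.

```python
import math


def center_clip(x, cl1, cl2):
    """Pass x unchanged below CL2 or above CL1 + CL2; otherwise clip its magnitude to CL2."""
    mag = abs(x)
    if mag < cl2 or mag > cl1 + cl2:
        return x
    return math.copysign(cl2, x)  # flat output level CL2, sign of the input kept (assumed)
```

For example, with CL1 = 1.0 and CL2 = 0.1, an input of 0.5 lies in the echo region and is clipped to 0.1, while inputs of 0.05 (noise) and 2.0 (far-end speech) pass unchanged.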

The inventors have observed that when the noise within a given subband k is relatively high, an echo that has been somewhat reduced and distorted by the center clipper is transmitted within that subband. To mask such echo components, it has been found useful to mix a white noise component (i.e., a noise component having a flat spectrum within the given subband k) into the transmitted subband signal. In a preferred procedure, the subband signal at level (1 - FFAC) × xa_k is mixed with white noise at level FFAC × CL2[n]. The value of FFAC is typically chosen between 25% and 50%. Because the added white noise is flat only within each subband, the resulting synthesized full-band output approximates the full-band noise spectrum.
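A minimal Python sketch of the mixing step, under stated assumptions: the function name is hypothetical, and uniformly distributed samples from the standard `random` module stand in for the white noise source, which the text does not specify.

```python
import random


def mix_masking_noise(subband, cl2, ffac=0.25, rng=None):
    """Mix (1 - FFAC) * signal with white noise scaled to FFAC * CL2 (assumed noise source)."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    return [(1.0 - ffac) * x + ffac * cl2 * rng.uniform(-1.0, 1.0) for x in subband]
```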

In block 275, an optional post-smoothing function removes spurious spikes from the output of clipper 270. According to one post-smoothing procedure, which resembles median filtering, a determination is made whether the current sample of the signal xd_k[n] occurred during far-end speech. As described above, this determination is based on the output of speech detector 320 in combination with the loss measurement. If no far-end speech is present, and the current signal block contains isolated peaks bounded by clipped samples of the signal, the entire block is clipped. If, on the other hand, far-end speech is detected, the clipped values are restored throughout the block. For this purpose, a block size of about 10-20 ms is preferred.

In addition, block 275 further attenuates those portions of the clipped far-end signal that contain only noise.

As noted above, a full-band estimate EPD[n] of the echo path delay is computed in block 100 of FIG. 3. A preferred method for estimating this delay is described with reference to FIG. 6. The method is based on the computation of a frequency-domain coherence metric. This metric is estimated from periodogram estimates of the autospectra of the near-end and far-end signals, respectively, and from a periodogram estimate of their cross-spectrum. Methods of this kind are described generally in G. Clifford Carter, ed., Coherence and Time Delay Estimation, IEEE Press, 1993. Unlike conventional methods, however, the present method evaluates the coherence metric and terminates with a normalized energy metric before performing the inverse FFT that would transform from the frequency domain back to the time domain. This modification yields a less precise time estimate than the full estimation method described in Carter, but it reduces the computational and memory requirements and is sufficient for the purposes of the invention.

The near-end input y[n] and the far-end input x[n] are each received in real time, and in blocks 300 and 310 of the figure, respectively, these input signals are segmented into overlapping blocks. A preferred block size is 240 samples with an overlap of 33%, i.e., 80 samples.
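The segmentation can be illustrated with a short Python sketch. The function name is hypothetical; the block length and overlap are the values given above, so successive blocks start 160 samples apart.

```python
def overlapping_blocks(signal, block_len=240, overlap=80):
    """Split a sequence into blocks of block_len samples that overlap by `overlap` samples."""
    hop = block_len - overlap  # 160 samples between block starts
    return [signal[i:i + block_len]
            for i in range(0, len(signal) - block_len + 1, hop)]
```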

The delay computation is intended to operate only on near-end speech, since it is that portion of the returned far-end signal which is presumed to contain echoes of the near-end speech. The delay computation is therefore initiated only when a near-end speech signal is detected. For this purpose, speech detector 320 provides a "forward" signal when it determines that the near-end party is speaking. The inventors have used a speech detector that employs a single energy measurement to identify speech activity from the near end. Speech detectors of this kind are well known in the art and need not be described in detail.

It is desirable to avoid unnecessary computation during intervals when no echo can be expected. Any echo following the onset of near-end speech will occur within a certain time period. To represent this time period, a period T₂ is chosen, typically about 1000 ms. In addition, the first echo will occur only after some minimum propagation delay. This delay is chosen as the period T₁. T₁ may optionally be set to 0, but it is preferable to use a nonzero (finite) value, typically about 150 ms.

The periods T₁ and T₂ are stored in timer 330. This timer limits the processing of far-end signals to those far-end blocks that arrive with a delay between T₁ and T₂ relative to the near-end block currently being processed.

When speech detector 320 determines that the speech energy of the k-th block of the near-end signal exceeds a preset threshold, it issues the forward signal. In response, the near-end signal block is zero-padded and transformed into a frequency-domain signal Y(f) using a fast Fourier transform (FFT), as shown in block 340 of the figure. For example, an FFT of length 256 points is used, which requires padding with 16 zeros. The autospectrum of the near-end signal is obtained by forming the squared modulus of Y(f), i.e., |Y(f)|², as shown in block 350 of the figure.
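The zero-padding and autospectrum steps can be sketched in Python using only the standard library. The recursive radix-2 FFT and the function names are illustrative assumptions (a practical implementation would use an optimized FFT routine); the 240-sample block and 256-point transform are the values given above.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def autospectrum(block, fft_len=256):
    """Zero-pad a block to fft_len and return |Y(f)|^2 for every FFT bin."""
    padded = list(block) + [0.0] * (fft_len - len(block))  # 16 zeros for a 240-sample block
    return [abs(v) ** 2 for v in fft(padded)]
```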

Similarly, those far-end signal blocks received between T₁ and T₂ milliseconds after the detection of near-end speech are zero-padded and passed to FFT 360, which has the same size as FFT 340. This far-end frequency-domain signal, however, is computed at each of a plurality of discrete values of a variable time delay τ lying within the interval between T₁ and T₂. Successive values of τ are separated by, for example, 160 samples (2/3 of the block length). The resulting frequency-domain signal is denoted X(τ, f). The far-end autospectrum (for each of the discrete delays τ) is formed by taking the squared modulus |X(τ, f)|², as shown in block 370 of the figure.
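As a rough illustration of the size of the delay grid, the following Python sketch lists the candidate delays τ in samples. The function name and the 8 kHz sample rate are assumptions (the rate is not stated in this passage, although it is consistent with telephone speech); T₁, T₂, and the 160-sample spacing are the typical values given in the text.

```python
def candidate_delays(t1_ms=150, t2_ms=1000, step=160, sample_rate_hz=8000):
    """Discrete delays tau, in samples, between T1 and T2 at the given spacing."""
    start = t1_ms * sample_rate_hz // 1000
    stop = t2_ms * sample_rate_hz // 1000
    return list(range(start, stop + 1, step))
```

Under these assumptions the grid runs from 1200 to 7920 samples, so 43 far-end spectra are evaluated for each near-end block.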

A cross-spectrum is formed for each of the delayed blocks between T₁ and T₂, as shown in block 380 of the figure. The cross-spectrum is the product of the near-end frequency-domain signal and the complex conjugate of the far-end frequency-domain signal. Like the far-end autospectrum, this cross-spectrum YX*(τ, f) depends on the delay τ.

The set of all spectra Y(f), X(τ, f), and YX*(τ, f) is updated continuously. According to a preferred procedure, a smoothed periodogram estimate is formed once for every J detected blocks of near-end speech, with J set equal to 25. Each of the resulting periodic periodogram estimates is an average, typically a simple average, of the autospectra and cross-spectra over the J detected blocks. The resulting averaged spectra are denoted below as SY(f), SX(τ, f), and SYX(τ, f), respectively.

The averaging of the near-end autospectrum is shown at block 390 of the figure, the averaging of the far-end autospectrum at block 400, and the averaging of the cross-spectrum at block 410.

To increase the speed of this procedure and reduce its memory requirements, it is preferable to decimate the frequency pickets of the autospectra and cross-spectra. The permissible degree of decimation depends on the expected smoothness of the near-end speech spectrum. The inventors have used spectral decimation by a factor of 2 and a speech band of 187-3187 Hz, although a speech band of 187-2000 Hz is believed to be sufficient.

At the end of each sequence of J near-end speech blocks, a squared coherence metric is formed at each value of the delay τ, as shown at block 420 of the figure. This metric is expressed by the following equation.
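The equation does not survive in this text. As a hedged reconstruction, the standard magnitude-squared coherence, written with the averaged spectra SY(f), SX(τ, f), and SYX(τ, f) defined above, has the following form. This is an assumption based on the surrounding definitions, not a transcription of the original equation.

```latex
C(\tau, f) = \frac{\left| S_{YX}(\tau, f) \right|^{2}}{S_{Y}(f)\, S_{X}(\tau, f)}
```

In this form the metric lies between 0 and 1 at each frequency, which is what makes it a normalized quantity.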

This normalized, squared coherence metric is summed over the decimated spectral band, which is 187-3187 Hz for applications involving telephone speech, to yield a coherence energy function C(τ) that depends on the discrete time delay τ. The frequency summation is shown at block 430 of the figure.

As shown at block 440 of the figure, C(τ) is subjected to a procedure for finding the peak values of the function. This procedure identifies the echo path delay, denoted EPD, as the discrete value of τ at which C(τ) has a local peak. Each time another signal block is received, the squared coherence metric is recomputed. The estimated echo path delay is thereby tracked over the duration of the call. If more than one EPD exists, each is detected and tracked from a local peak of C(τ) that lies above the detection threshold described above.
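The summation and peak search can be sketched in Python as follows. The function names, the standard magnitude-squared coherence used inside the sum, and the bin range in the comment (which assumes a 256-point FFT at an 8 kHz sample rate) are illustrative assumptions rather than details confirmed by the text.

```python
def coherence_energy(sy, sx, syx, bins):
    """Sum |SYX|^2 / (SY * SX) over the selected FFT bins for one delay tau."""
    total = 0.0
    for b in bins:
        denom = sy[b] * sx[b]
        if denom > 0.0:  # skip empty bins to avoid division by zero
            total += abs(syx[b]) ** 2 / denom
    return total


def local_peaks(delays, c_values, threshold):
    """Return the delays at which C(tau) has an interior local peak above the threshold."""
    peaks = []
    for i in range(1, len(c_values) - 1):
        if (c_values[i] > threshold
                and c_values[i] >= c_values[i - 1]
                and c_values[i] > c_values[i + 1]):
            peaks.append(delays[i])
    return peaks


# Assumed bin selection: every second bin from 6 to 102, i.e. about 187.5-3187.5 Hz.
SPEECH_BINS = range(6, 103, 2)
```

Each delay returned by `local_peaks` would be a candidate EPD; an empty result corresponds to no echo being detected.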

If greater precision is required in the delay estimate EPD, the function C(τ) is inverse Fourier transformed, and the resulting autocorrelation estimate is searched for the time position of its maximum within each discrete τ subinterval. For the block and overlap sizes used above, however, it is not necessary to carry out this final transform step in order to obtain sufficient delay accuracy in the EPD. The summed C(τ) is a sufficient test metric for detecting the EPD.

The determination that at least one local peak of C(τ) exists is itself an indication that an echo is present. The present echo delay measurement technique can therefore serve by itself as the basis for an echo detector in a communication system.

The invention is useful in a variety of communication systems in which echoes arrive with some delay. This delay usually includes a component due to propagation time over the echo path. In certain applications, however, there is an additional, and dominant, component due to signal processing. Delays of this kind include the coding delays in cellular communication systems and teleconferencing systems. The invention is useful in these applications as well.

In particular, the invention is useful in connections involving conference communication equipment at the far end, such as speakerphones and teleconferencing systems. In such cases, the invention is useful for removing the residual echo left by insufficient echo cancellation in the conference communication equipment.

When the invention is used to reduce echo in international telephone calls, the preferred location for the signal processing described herein is an international switching center, preferably on an international trunk line at a point beyond (on the international side of) the gateway exchange. This places the processing equipment at a specific transmission point for all telephone calls passing through that trunk line.

When the invention is used to reduce echo in domestic cellular telephone calls, one preferred way to locate the processing equipment is to connect it to the trunk that links to the cellular station.

When the invention is used to reduce echo on domestic satellite links, it is preferable to connect the processing equipment to the receive channel from the satellite.

As an example, a prototype of the invention runs on an Analog Devices ADSP-21220. According to the invention, signal processors with substantially less computing power can also be employed as the host machine.

Continuation of the front page
(72) Inventor: Wynn, Woodson Dale, 56 Juniper Way, Basking Ridge, New Jersey 07920, United States
(56) References: JP-A-58-1337 (JP, A); JP-A-62-107533 (JP, A); JP-A-64-65936 (JP, A)
(58) Fields searched (Int. Cl.⁷, DB name): H04B 3/00 - 3/44

Claims

(57) [Claims]

1. A method for processing an incoming signal from a remote location, referred to as a FAR-IN signal, received by a FIRST network from a SECOND network, so as to attenuate an energy component due to echoes, returned by the SECOND network, of an incoming signal from a nearby location, referred to as a NEAR-IN signal, that is placed on the FIRST network for transmission to the SECOND network, the method comprising: a) measuring the delay between the NEAR-IN signal and the arrival of the corresponding echo in the FAR-IN signal; b) processing a copy of the NEAR-IN signal to generate a time-varying signal, referred to as TEMPLATE, that represents the smoothed energy component of the NEAR-IN signal delayed by the measured delay and attenuated by an estimated transmission loss for the echo; c) in a non-linear processor, passing the FAR-IN signal substantially without attenuation if the FAR-IN signal exceeds a threshold derived at least in part from TEMPLATE; and d) in the non-linear processor, attenuating the FAR-IN signal if the FAR-IN signal lies within a defined range below the threshold.

2. The method of claim 1, wherein the step of measuring the delay comprises: evaluating a frequency-domain coherence metric C(τ, f) between the NEAR-IN signal and the FAR-IN signal, the metric being a function of the frequency f and of the relative delay τ between the two signals; summing the metric C(τ, f) over a frequency band of interest, thereby obtaining a coherence-energy function C(τ); and identifying a local peak of the function C(τ).

3. The method of claim 2, wherein the metric C(τ, f) is given by an expression in which f is the frequency, SY(f) is the averaged autospectrum of the NEAR-IN signal, SX(τ, f) is the averaged autospectrum of the FAR-IN signal, and SXY(τ, f) is the averaged cross-spectrum of the NEAR-IN signal and the FAR-IN signal.

4. The method of claim 1, wherein said threshold is equal to TEMPLATE.

5. The method of claim 1, wherein said threshold is derived by adding to TEMPLATE a value derived from an estimate of the noise level received in the corresponding subband from the SECOND network.

6. A method for processing an incoming communication signal from a remote location, referred to as a FAR-IN signal, received by a local network from a remote network, so as to reduce an energy component due to echoes, returned by the remote network, of an incoming signal from a nearby location, referred to as a NEAR-IN signal, that is placed on the local network for transmission to the remote network, the method comprising: a) measuring the delay between the NEAR-IN signal and the arrival of the corresponding echo in the FAR-IN signal; b) decomposing the FAR-IN signal into a plurality of frequency subband components, referred to as FAR-IN subband signals, delaying the NEAR-IN signal by the measured delay, and decomposing the delayed NEAR-IN signal into a plurality of frequency subband components, referred to as NEAR-IN subband signals; c) processing a copy of each of the NEAR-IN subband signals to generate a time-varying signal, referred to as TEMPLATE, that represents the smoothed energy component of that NEAR-IN subband signal, delayed by the measured delay and attenuated by an estimated transmission loss for the echo; d) in a non-linear processor, passing each of the FAR-IN subband signals substantially without attenuation if it exceeds a threshold derived at least in part from the corresponding TEMPLATE; e) in the non-linear processor, attenuating each of the FAR-IN subband signals if it lies within a defined range below the corresponding threshold; and f) combining the non-linearly processed FAR-IN subband signals to form an echo-attenuated full-band FAR-IN signal.

7. The method of claim 6, wherein the step of measuring the delay comprises: evaluating a frequency-domain coherence metric C(τ, f) between the NEAR-IN signal and the FAR-IN signal, the metric being a function of the frequency f and of the relative delay τ between the signals; summing the metric C(τ, f) over a frequency band of interest, thereby obtaining a coherence-energy function C(τ); and identifying a local peak of the function C(τ).

8. The method of claim 7, wherein the metric C(τ, f) is given by an expression in which f is the frequency, SY(f) is the averaged autospectrum of the NEAR-IN signal, SX(τ, f) is the averaged autospectrum of the FAR-IN signal, and SXY(τ, f) is the averaged cross-spectrum of the NEAR-IN signal and the FAR-IN signal.

9. The method of claim 6, further comprising, for each of the FAR-IN subband signals, setting a noise level, referred to as NOISE LEVEL, that is less than or equal to the corresponding TEMPLATE at each time of interest, wherein, for each of the FAR-IN subband signals, steps (d) and (e) are carried out such that the FAR-IN subband signal is passed without attenuation if it is less than NOISE LEVEL.

10. The method of claim 9, wherein, for each of the FAR-IN subband signals, the step of setting the corresponding NOISE LEVEL comprises obtaining an energy envelope of the FAR-IN subband signal and smoothing the envelope in an averaging procedure.

11. The method of claim 10, further comprising testing for the presence of energy in the FAR-IN signal, wherein the step of obtaining the energy envelope of each of the FAR-IN subband signals is performed only when no signal energy is detected.

12. The method of claim 9 wherein said step of attenuating comprises clipping the FAR-IN subband signal to a predetermined level.

13. The method of claim 12, wherein the predetermined level is substantially equal to NOISE LEVEL.

14. The method of claim 12, wherein the attenuating step further comprises mixing the clipped FAR-IN subband signal with a noise component having a substantially flat frequency spectrum within the associated subband, the mixing being performed such that the level of the mixed signal is substantially equal to NOISE LEVEL.

15. The method according to claim 6, wherein each of said thresholds is equal to a corresponding TEMPLATE.

16. The method of claim 6, wherein each of said thresholds is derived by adding to the corresponding TEMPLATE a value derived from an estimate of the noise level received in the corresponding subband from the remote network.

17. A method for use in a communication system comprising a local network and a remote network, in which a talker at a remote location places a far-end speech signal, referred to as a FAR SPEECH signal, on the remote network for transmission to the local network, and the local network receives the FAR SPEECH signal as an incoming signal from the remote location, referred to as a FAR-IN signal, the method processing the FAR-IN signal so as to reduce an energy component due to echoes, returned by the remote network, of an incoming signal from a nearby location, referred to as a NEAR-IN signal, that is placed on the local network for transmission to the remote network, the method comprising: a) measuring the delay between the NEAR-IN signal and the arrival of the corresponding echo in the FAR-IN signal; b) testing for energy in the FAR-IN signal due to FAR SPEECH, setting a flag to a DENY state when such energy is detected, and setting the flag to a PERMIT state when such energy is not detected; c) decomposing the FAR-IN signal into a plurality of frequency subband components, referred to as FAR-IN subband signals, delaying the NEAR-IN signal by the measured delay, and decomposing the delayed NEAR-IN signal into a plurality of frequency subband components, referred to as NEAR-IN subband signals; d) processing a copy of each of the NEAR-IN subband signals to generate a time-varying signal, referred to as TEMPLATE, that represents the smoothed energy component of that NEAR-IN subband signal, delayed by the measured delay and attenuated by an estimated transmission loss for the echo; e) passing each of the FAR-IN subband signals substantially without attenuation if it exceeds a threshold derived at least in part from the corresponding TEMPLATE; f) attenuating each of the FAR-IN subband signals if it lies within a defined range below the corresponding threshold; and g) combining the FAR-IN subband signals so processed to form an echo-attenuated full-band FAR-IN signal, wherein steps (c)-(g) are performed only when the flag is in the PERMIT state.

18. The method of claim 17, wherein each of the FAR-IN signals, after passing through the non-linear processor, is divided into a plurality of blocks, each block having a duration in the range of 10-20 ms and consisting of a plurality of signal samples, the method further comprising, after the non-linear processing: if the FAR SPEECH signal is detected during the period corresponding to a given block, restoring all samples in that block to their amplitudes before attenuation; and if the FAR SPEECH signal is not detected during the period corresponding to a given block, attenuating all samples of that block that represent local peaks in signal amplitude.

19. A method for attenuating, in a signal received by a local telephone user from a remote conference communication device, referred to as a FAR-IN signal, an energy component due to echoes of the local user's voice that are returned to the local telephone user because of imperfect echo cancellation at the conference communication device, the method comprising: a) measuring the delay between the signal transmitted into the telephone network by the local user, referred to as a NEAR-IN signal, and the arrival of the corresponding echo in the FAR-IN signal; b) processing a copy of the NEAR-IN signal to generate a time-varying signal, referred to as TEMPLATE, that represents the smoothed energy component of the NEAR-IN signal delayed by the measured delay and attenuated by an estimated transmission loss for the echo; c) in a non-linear processor, passing the FAR-IN signal substantially without attenuation if the FAR-IN signal exceeds a threshold derived at least in part from TEMPLATE; and d) in the non-linear processor, attenuating the FAR-IN signal if the FAR-IN signal lies within a defined range below the threshold.

20. Apparatus for processing an incoming signal from a remote location, referred to as a FAR-IN signal, received by a local network from a remote network, so as to attenuate an energy component due to echoes, returned by the remote network, of a NEAR-IN signal placed on the local network for transmission to the remote network, the apparatus comprising: a) means for measuring the delay between the NEAR-IN signal and the arrival of the corresponding echo in the FAR-IN signal; b) means for decomposing the FAR-IN signal into a plurality of frequency subband components, referred to as FAR-IN subband signals, for delaying the NEAR-IN signal by the measured delay, and for decomposing the delayed NEAR-IN signal into a plurality of frequency subband components, referred to as NEAR-IN subband signals; c) means for receiving a copy of each of the NEAR-IN subband signals and processing each copy to generate a time-varying signal, referred to as TEMPLATE, that represents the smoothed energy component of the NEAR-IN subband signal delayed by the measured delay and attenuated by an estimated transmission loss for the echo; d) a non-linear processor that passes each FAR-IN subband signal substantially without attenuation if the FAR-IN subband signal exceeds a threshold derived at least in part from the corresponding TEMPLATE, and attenuates the FAR-IN subband signal if it lies within a defined range below the corresponding threshold; and e) means for combining the non-linearly processed FAR-IN subband signals to form an echo-attenuated full-band FAR-IN signal.

21. The apparatus of claim 20, further comprising means for setting, for each of the FAR-IN subband signals, a noise level, referred to as NOISE LEVEL, that is less than or equal to the corresponding TEMPLATE at each time of interest, wherein the non-linear processor further passes each of the FAR-IN subband signals substantially without attenuation if that FAR-IN subband signal is below its NOISE LEVEL.

22. The apparatus according to claim 21, wherein said non-linear processor attenuates the FAR-IN sub-band signal by clipping it to a predetermined level.

23. The apparatus of claim 22, wherein the predetermined level is substantially equal to NOISE LEVEL.

24. The apparatus of claim 22, further comprising means for mixing the clipped FAR-IN subband signal with a noise component having a substantially flat frequency spectrum within the associated subband, such that the level of the mixed signal is substantially equal to NOISE LEVEL.

25. An apparatus for reducing, in a signal called a FAR-IN signal received by a local telephone user from a conference communication device at a remote location, energy components due to echoes of the local user's voice returned to the local user by incomplete echo cancellation of the conference communication device, comprising: a) means for measuring the delay between a signal called a NEAR-IN signal, transmitted by the local user to a telephone network, and the arrival of a corresponding echo in the FAR-IN signal; b) means for receiving a copy of the NEAR-IN signal and generating a time-variable output signal called TEMPLATE, which represents the smoothed energy component of the NEAR-IN signal delayed by the measured delay and attenuated by the estimated transmission loss of the echo; and c) a non-linear processor for passing the FAR-IN signal substantially unattenuated if the FAR-IN signal exceeds a threshold at least partially derived from TEMPLATE, and for attenuating the FAR-IN signal when the FAR-IN signal is within a defined range below the threshold.

26. A communication system comprising a FIRST network and a SECOND network connected via a communication medium, in which a NEAR-IN communication signal is placed on the FIRST network for transmission to the SECOND network and a FAR-IN communication signal is received by the FIRST network from the SECOND network, the communication system further comprising a device that processes the FAR-IN signal to reduce the energy component due to the echo of the NEAR-IN signal returned from the SECOND network, the device comprising: a) means for measuring the delay between the NEAR-IN signal and the arrival of the corresponding echo in the FAR-IN signal; b) means for receiving a copy of the NEAR-IN signal and processing the copy to generate a time-variable output signal, referred to as TEMPLATE, representing a smoothed energy component of the NEAR-IN signal delayed by the measured delay and attenuated by the estimated transmission loss for the echo; and c) means for passing the FAR-IN signal substantially without attenuation if the FAR-IN signal exceeds a threshold at least partially derived from TEMPLATE, and for attenuating the FAR-IN signal if the FAR-IN signal is within a defined range below the threshold.

27. The communication system according to claim 26, wherein the communication signals are telephone signals and the FIRST and SECOND networks are telephone networks.

28. The communication system according to claim 27, wherein at least the FIRST telephone network is a cellular telephone network.

29. The communication system according to claim 27, wherein at least the SECOND telephone network is a cellular telephone network.

30. The communication system according to claim 27, wherein said FIRST and SECOND networks are interconnected by satellite links.

31. The communication system according to claim 1, wherein the FIRST and SECOND networks are interconnected by international trunk lines.

32. The communication system according to claim 26, wherein the delay measuring means comprises: means for evaluating a frequency-domain coherence distance C(τ, f) between the NEAR-IN and FAR-IN signals, where the distance is a function of the frequency f and the relative delay τ between the two signals; means for summing the distance C(τ, f) over the frequency band of interest to obtain a coherence energy function C(τ); and means for identifying the local peak value of the function C(τ).

33. A method for detecting an echo of a NEAR-IN signal returned to a FIRST network by a SECOND network in a communication system comprising the FIRST and SECOND networks connected by a transmission medium, wherein a NEAR-IN communication signal is placed on the FIRST network for transmission to the SECOND network and a FAR-IN communication signal is received by the FIRST network from the SECOND network, the method comprising: evaluating a frequency-domain coherence distance C(τ, f) of the NEAR-IN signal and the FAR-IN signal, where the distance C is a function of the frequency f and the relative delay τ between the two signals; summing the distance C(τ, f) over the frequency band of interest to obtain the coherence energy function C(τ); and identifying the local peak value of the function C(τ).

34. The method according to claim 33, wherein the distance C(τ, f) is given by: C(τ, f) = |SYX(τ, f)|² / [SY(f) × SX(τ, f)], where f represents frequency, SY(f) is the averaged auto-spectrum of the NEAR-IN signal, SX(τ, f) is the averaged auto-spectrum of the FAR-IN signal, and SYX(τ, f) is the averaged cross-spectrum of the NEAR-IN signal and the FAR-IN signal.
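
The coherence computation of claims 33 and 34 can be sketched numerically as follows. This is a minimal illustration, not the patented procedure. It assumes that the spectra are averaged over non-overlapping rectangular segments, that the relative delay τ is an integer number of samples, that the band of interest is all bins from 0 to the Nyquist bin, and that the global maximum of C(τ) stands in for the local peak. The function names and the seg_len parameter are hypothetical.

```python
import cmath


def half_dft(frame):
    """DFT bins 0..N/2 of a real frame (direct O(N^2) sum, for short frames)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def coherence_energy(near, far, tau, seg_len, n_seg):
    """C(tau): sum over frequency of |SYX|^2 / (SY * SX) for one delay tau."""
    bins = seg_len // 2 + 1
    s_y = [0.0] * bins   # accumulated auto-spectrum of NEAR-IN
    s_x = [0.0] * bins   # accumulated auto-spectrum of FAR-IN shifted by tau
    s_yx = [0j] * bins   # accumulated cross-spectrum
    for i in range(n_seg):
        start = i * seg_len
        y = half_dft(near[start:start + seg_len])
        x = half_dft(far[start + tau:start + tau + seg_len])
        for k in range(bins):
            s_y[k] += abs(y[k]) ** 2
            s_x[k] += abs(x[k]) ** 2
            s_yx[k] += y[k] * x[k].conjugate()
    # Sums are used instead of means: the 1/n_seg factors cancel in the ratio.
    total = 0.0
    for k in range(bins):
        denom = s_y[k] * s_x[k]
        if denom > 0.0:
            total += abs(s_yx[k]) ** 2 / denom
    return total


def estimate_delay(near, far, max_tau, seg_len=16):
    """Return (best_tau, scores), scanning tau = 0..max_tau in samples."""
    n_seg = min(len(near), len(far) - max_tau) // seg_len
    scores = [
        coherence_energy(near, far, tau, seg_len, n_seg)
        for tau in range(max_tau + 1)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

Averaging over several segments matters here: with a single segment the ratio equals 1 at every frequency regardless of τ, so C(τ) would carry no delay information.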

35. An apparatus for detecting echoes in a communication system consisting of FIRST and SECOND networks connected by a transmission medium, wherein a NEAR-IN communication signal is placed on the FIRST network for transmission to the SECOND network, a FAR-IN communication signal is received by the FIRST network from the SECOND network, and the echo is an echo of the NEAR-IN signal returned to the FIRST network by the SECOND network, the apparatus comprising: means for estimating the frequency-domain coherence distance C(τ, f) of the NEAR-IN signal and the FAR-IN signal, the distance being a function of the frequency f and the relative delay τ between the signals; means for summing the distance C(τ, f) over the frequency band of interest to obtain a coherence energy function C(τ); and means for identifying a local peak value of the function C(τ).