KR101042479B1

KR101042479B1 - Apparatus and its method for providing echo cancellation using delay prediction

Info

Publication number: KR101042479B1
Application number: KR1020030099744A
Authority: KR
Inventors: 신우철; 배한수; 장세헌
Original assignee: 주식회사 케이티
Priority date: 2003-12-30
Filing date: 2003-12-30
Publication date: 2011-06-16
Also published as: KR20050070338A

Abstract

The present invention relates to a speech coding/decoding apparatus having an echo cancellation function based on delay time prediction, and a method thereof. When a speech signal is output in a voice telephone or the like, a pilot signal (P₁(n)) outside the audible range is emitted through the voice output unit, and the delay time is accurately predicted from the difference between that signal and the pilot signal (P₂(n)) subsequently received through the voice input unit, so that the echo signal can be removed effectively.

To this end, the speech coding/decoding apparatus of the present invention comprises: speech decoding means for decoding a speech code received through a transmission channel into a speech signal, adding a first pilot signal to the speech signal, and outputting the result through voice output means; delay-time-prediction echo cancellation means for predicting a delay time by comparing the first pilot signal received from the speech decoding means with a second pilot signal extracted from the speech signal received through voice input means, and for removing an echo signal from the speech signal received through the voice input means according to the predicted delay time; and speech encoding means for receiving and encoding the speech signal from which the echo signal has been removed by the delay-time-prediction echo cancellation means. The invention is used in Internet voice telephone (VoIP) terminals and the like.

Voice over IP (VoIP), echo cancellation, adaptive filter, pilot signal, delay time prediction, speech coding/decoding apparatus

Description

Speech coding/decoding apparatus having an echo cancellation function based on delay time prediction, and method thereof {APPARATUS AND ITS METHOD FOR PROVIDING ECHO CANCELLATION USING DELAY PREDICTION}

FIG. 1 is a block diagram of one embodiment of a voice codec in a conventional Internet voice telephone (VoIP) terminal.

FIG. 2 is a block diagram of another embodiment of a voice codec in a conventional Internet voice telephone (VoIP) terminal.

FIG. 3 is a block diagram of one embodiment of a speech coding/decoding apparatus having an echo cancellation function based on delay time prediction according to the present invention.

FIG. 4 is a detailed block diagram of one embodiment of the delay time predictor in the speech coding/decoding apparatus having an echo cancellation function based on delay time prediction according to the present invention.

FIG. 5 is a detailed block diagram of one embodiment of the echo canceller in the speech coding/decoding apparatus having an echo cancellation function based on delay time prediction according to the present invention.

* Explanation of reference numerals for the main parts of the drawings

31: speech decoder
32: speech encoder
33: echo cancellation unit
331: delay time predictor
332: echo canceller
34: speaker
35: microphone

The present invention relates to a speech coding/decoding apparatus having an echo cancellation function based on delay time prediction, and a method thereof. More specifically, when a speech signal is output in a voice telephone or the like, a pilot signal (P₁(n)) outside the audible range is emitted through the voice output unit, and the delay time is accurately predicted from the difference between that signal and the pilot signal (P₂(n)) subsequently received through the voice input unit, so that the echo signal can be removed effectively.

The following description uses Internet voice telephony (VoIP) as an example. The present invention is not, however, limited to Internet voice telephony (VoIP).

As shown in FIG. 1, the speech coding/decoding apparatus (codec) of a conventional Internet voice telephone (VoIP) terminal includes a speech decoder (11) that decodes a speech code received through a transmission channel and outputs it to a speaker (13), and a speech encoder (12) that encodes, compresses, and transmits the talker's speech signal received through a microphone (14). In this process, part of the decoded speech signal (Vd(n), where n is an integer denoting the time index of the quantized and sampled digital speech signal) re-enters through the microphone (14), so an echo is heard at the other party's terminal. The same kind of echo occurs in the reverse direction.

To solve this problem, the conventional approach has been to suppress echo generation by physically blocking the reflected signal, that is, by separating the microphone and the speaker by at least a certain distance. Examples include using earphones or a highly directional microphone.

However, this approach requires additional equipment (e.g., earphones) and is therefore inconvenient for the user. Moreover, in a multi-party call using a speakerphone, a physical echo cancellation method that separates the microphone and the speaker by at least a certain distance cannot remove the echo completely. The speech coding/decoding apparatus (codec) with an adaptive echo cancellation device shown in FIG. 2 was developed to address this.

As shown in FIG. 2, in the speech coding/decoding apparatus with an adaptive echo cancellation device, the adaptive echo cancellation device (23) delays the speech signal (Vd(n)) output from the speech decoder (21) by a preset time, multiplies it by a weight, and subtracts the result from the speech signal (Vi(m)) input from the microphone (25).

In this conventional method, however, the delay time is preset to a value calculated from the propagation speed of sound in air (Vs meters/sec), on the assumption that the distance between the speaker and the microphone lies within a fixed range (e.g., 0.5 m).

When the VoIP terminal is, for example, a personal computer (PC) equipped with a speaker and a microphone, the positions of the speaker and the microphone differ from user to user and from session to session, so the conventional method with its preset delay time has difficulty removing the echo signal.

In other words, the conventional method cannot effectively remove the echo signal in a communication environment where the positions of the speaker and the microphone change.

The present invention has been proposed to solve the above problem. Its object is to provide a speech coding/decoding apparatus having an echo cancellation function based on delay time prediction, and a method thereof, in which a pilot signal (P₁(n)) outside the audible range is emitted through the voice output unit when a speech signal is output in a voice telephone or the like, and the delay time is accurately predicted from the difference between that signal and the pilot signal (P₂(n)) subsequently received through the voice input unit, so that the echo signal can be removed effectively.

To achieve the above object, the apparatus of the present invention is a speech coding/decoding apparatus comprising: speech decoding means for decoding a speech code received through a transmission channel into a speech signal, adding a first pilot signal to the speech signal, and outputting the result through voice output means; delay-time-prediction echo cancellation means for predicting a delay time by comparing the first pilot signal received from the speech decoding means with a second pilot signal extracted from the speech signal received through voice input means, and for removing an echo signal from the speech signal received through the voice input means according to the predicted delay time; and speech encoding means for receiving and encoding the speech signal from which the echo signal has been removed by the delay-time-prediction echo cancellation means.

The method of the present invention is a speech coding/decoding method comprising: a speech decoding step in which a speech decoder decodes a speech code received through a transmission channel into a speech signal, adds a first pilot signal to the speech signal, and outputs the result through voice output means; a delay time prediction step in which a delay time predictor predicts a delay time by comparing the first pilot signal received from the speech decoder with a second pilot signal extracted from the speech signal received through voice input means; an echo cancellation step in which an echo canceller removes an echo signal from the speech signal received through the voice input means according to the delay time predicted in the delay time prediction step; and a speech encoding step in which a speech encoder receives and encodes the speech signal from which the echo signal was removed in the echo cancellation step.

The above objects, features, and advantages will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. A preferred embodiment of the present invention is described in detail below with reference to the accompanying drawings.

FIG. 3 is a block diagram of one embodiment of a speech coding/decoding apparatus having an echo cancellation function based on delay time prediction according to the present invention.

As shown in FIG. 3, the speech coding/decoding apparatus having an echo cancellation function based on delay time prediction according to the present invention includes: a speech decoder (31) that decodes a speech code received through a transmission channel into a speech signal (Vd(n)), adds a pilot signal (P₁(n)) to the speech signal, and outputs the result to a speaker (34); an echo cancellation unit (33) that predicts the delay time by comparing the pilot signal (P₁(n)) received from the speech decoder (31) with the pilot signal (P₂(n)) extracted from the speech signal (Vi(n)) received from a microphone (35), and outputs a speech signal (V(n)) obtained by removing the echo signal from Vi(n) according to the predicted delay time; and a speech encoder (32) that receives the echo-free speech signal (V(n)) from the echo cancellation unit (33), encodes it into a speech code, and outputs it.

Here, the speech decoder (31) generates the pilot signal (P₁(n)) when a speech signal is output and drives the speaker (34) with it. The speaker drive signal Vp(n) is therefore P₁(n) during the initial N-sample interval and equal to Vd(n) thereafter. N typically corresponds to about 0.1 second of speech, so it can be chosen as the speech sampling frequency multiplied by 0.1, taken as an integer.
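
The composition of Vp(n) described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `pilot_prefix_length` and `build_playout` are hypothetical, signals are plain Python lists, and truncation with `int()` is one reading of "taken as an integer".

```python
def pilot_prefix_length(sample_rate_hz, pilot_seconds=0.1):
    """N: number of samples covered by the pilot (about 0.1 s of audio)."""
    return int(sample_rate_hz * pilot_seconds)


def build_playout(pilot, decoded, n_pilot):
    """Vp(n) = P1(n) for n < N, and Vd(n) for n >= N."""
    return list(pilot[:n_pilot]) + list(decoded[n_pilot:])
```

At an 8000 samples/sec sampling rate this gives N = 800, the figure implied by the 0.1 second interval in the text.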

Meanwhile, the pilot signal (P₁(n)) is a signal that a person cannot perceive. For example, a sinusoidal signal in a part of the audio frequency band that the human ear does not easily detect is used, and several sinusoidal signals may be used together to reduce the prediction error of the delay time. The amplitude of the sinusoid is chosen, by measuring the level of the signal Vi(n) input through the microphone (35), so that the delay time (Td) is easy to predict. In general, a sinusoid generated at a level of about 30 decibels makes an effective pilot signal that the human ear does not perceive. If the amplitude is not chosen so that Td is easy to predict, the computational load of the echo canceller (332) becomes too large for real-time echo cancellation to be practical. A suitable amplitude (e.g., 30 decibels) must therefore be chosen so that Td is predicted accurately with little computation.
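
A pilot built from one or more sinusoids might be generated as in the sketch below. The function name `make_pilot`, the linear `amplitude` parameter, and the equal weighting of the tones are assumptions made for illustration; the text does not state the reference level for the 30 decibel figure, so no decibel conversion is attempted here.

```python
import math


def make_pilot(num_samples, sample_rate_hz, freqs_hz, amplitude):
    """Sum of equal-weight sinusoids; the peak never exceeds `amplitude`."""
    scale = amplitude / len(freqs_hz)
    return [
        scale * sum(math.sin(2.0 * math.pi * f * n / sample_rate_hz)
                    for f in freqs_hz)
        for n in range(num_samples)
    ]
```

For example, `make_pilot(800, 8000, [3500.0, 3700.0], 0.01)` produces a 0.1 second two-tone pilot. The second tone is a hypothetical choice; using more than one tone shortens the range of delays over which a single sinusoid would repeat itself.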

The echo cancellation unit (33) includes a delay time predictor (331) that predicts the delay time by comparing the pilot signal (P₁(n)) from the speech decoder (31) with the pilot signal (P₂(n)) extracted from the speech signal from the microphone (35), and an echo canceller (332) that removes the echo signal from the speech signal received from the microphone (35) according to the delay time received from the delay time predictor (331). The detailed operation of the delay time predictor (331) and the echo canceller (332) is described with reference to FIG. 4 and FIG. 5, respectively.

FIG. 4 is a detailed block diagram of one embodiment of the delay time predictor in the speech coding/decoding apparatus having an echo cancellation function based on delay time prediction according to the present invention.

As shown in FIG. 4, the delay time predictor (331) according to the present invention includes: a band-pass filter (41) that extracts the pilot signal (P₂(n)) from the speech signal (Vi(n)) received through the microphone (35); an N-sample delay unit (42) that, under the control of a controller (46), delays the pilot signal (P₁(n)) received from the speech decoder (31) in units of samples; a subtractor (43) that calculates the difference between the pilot signal (P₂(n)) received from the band-pass filter (41) and the pilot signal (P₁(n)) received from the N-sample delay unit (42); an average calculator (44) that receives the difference signal (E(n)) from the subtractor (43) and calculates the mean absolute error; a minimum value predictor (45) that extracts the delay time (Td) at which the mean absolute error received from the average calculator (44) is smallest and passes it to the echo canceller (332) and the controller (46); and the controller (46), which controls the range of sample delays over which the N-sample delay unit (42) operates according to either the entire preset sample interval or the delay time (Td) from the minimum value predictor (45).

If there were no attenuation through the speaker (34) and the microphone (35), the pilot signal P₂(n) extracted by the band-pass filter (41) would simply be a delayed copy of P₁(n). In practice, however, the signal undergoes attenuation or amplification and travels over various reflection paths, so the average calculator (44) computes the mean absolute error over a fixed number of samples (e.g., 100 samples) of P₁(n) and P₂(n), and the delay time that minimizes this value is determined. Specifically, the N-sample delay unit (42) delays P₁(n) one sample at a time, the average calculator (44) computes the mean absolute error for each delay, and the minimum value predictor (45) selects the delay that gives the smallest error.
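
The search performed by blocks 42 to 45 can be sketched as a loop over candidate delays. This is a minimal illustration, not the patented implementation: the names `mean_abs_error` and `estimate_delay` are hypothetical, `ref` stands for P₁(n), `mic` stands for the band-pass-filtered P₂(n), and the 100-sample window follows the example in the text.

```python
def mean_abs_error(ref, mic, delay, window):
    """Average |P2(n + delay) - P1(n)| over `window` samples."""
    total = 0.0
    for n in range(window):
        total += abs(mic[n + delay] - ref[n])
    return total / window


def estimate_delay(ref, mic, candidates, window=100):
    """Return (delay, error) for the candidate with the smallest error."""
    best_delay, best_err = None, float("inf")
    for d in candidates:
        # Skip candidates that would read outside either signal.
        if d < 0 or d + window > len(mic) or window > len(ref):
            continue
        err = mean_abs_error(ref, mic, d, window)
        if err < best_err:
            best_delay, best_err = d, err
    return best_delay, best_err
```

With a single-tone pilot the error repeats every period of the tone, so the candidate range should stay shorter than one period or the pilot should contain several tones, as the description suggests.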

The controller (46) controls the N-sample delay unit (42) to operate over the entire preset sample interval at initial system start-up (program start or PC boot). During actual voice transmission and reception, it controls the N-sample delay unit (42) to operate over a sample interval within a preset range (±) around the delay time previously extracted by the minimum value predictor (45).
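
The controller's two modes can be expressed as a choice of candidate range. The sketch below is an assumption-laden illustration: `search_range`, `max_delay`, and `margin` are hypothetical names, and the text does not give a value for the preset ± range.

```python
def search_range(prev_delay, max_delay, margin):
    """Candidate delays (in samples) for the N-sample delay unit.

    prev_delay is None at start-up, which selects the full preset interval.
    Otherwise the search is limited to prev_delay +/- margin.
    """
    if prev_delay is None:
        return range(0, max_delay + 1)
    low = max(0, prev_delay - margin)
    high = min(max_delay, prev_delay + margin)
    return range(low, high + 1)
```

With `max_delay=235` and `margin=4`, a call after start-up examines at most 9 candidates instead of 236, which is the computational saving the description aims for.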

The output (Td) of the delay time predictor (331) is input to the echo canceller (332), which may be implemented as an adaptive digital filter similar to that of the prior art. In the present invention, however, the delay time is predicted accurately, so echo cancellation performance can be improved even with a low-order adaptive filter (few taps, e.g., 3).

FIG. 5 is a detailed block diagram of one embodiment of the echo canceller in the speech coding/decoding apparatus having an echo cancellation function based on delay time prediction according to the present invention.

As shown in FIG. 5, the echo canceller (332) according to the present invention uses the delay time (Td) received from the delay time predictor (331) and passes the output signal (Vd(n)) of the speech decoder (31) through a third-order linear digital filter to generate the echo signal component (Ve(n)). That is, the output signal (Vd(n)) of the speech decoder (31) is passed through a time delay unit and then multiplied by the weights (W(-1), W(0), W(1)) of the linear digital filter to generate Ve(n). This component is then subtracted from the microphone input signal, and the resulting difference signal is output.

The weights (W(-1), W(0), W(1)) of the linear digital filter can be set to the values that produce the minimum error, as in a conventional adaptive digital filter. To reduce computation, the weights may instead be obtained in advance while the pilot signal is being generated and then reused. Values obtained with the existing LMS (Least Mean Square) algorithm may also be stored in a register and used.
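
A three-tap canceller centered on the predicted delay might look like the sketch below. Several details are assumptions rather than statements from the patent: the function name `cancel_echo`, the mapping of W(-1), W(0), W(1) to delays Td-1, Td, Td+1, and the plain LMS update with step size `mu` (with `mu=0` the weights stay fixed, matching the "obtained in advance" option).

```python
def cancel_echo(decoded, mic, delay, weights=(0.0, 1.0, 0.0), mu=0.0):
    """Return (residual signal V(n), final weights [W(-1), W(0), W(1)])."""
    w = list(weights)
    out = []
    for n in range(len(mic)):
        # Taps Vd(n - Td - k) for k = -1, 0, 1; zero outside the signal.
        taps = []
        for k in (-1, 0, 1):
            idx = n - delay - k
            taps.append(decoded[idx] if 0 <= idx < len(decoded) else 0.0)
        echo = sum(wk * x for wk, x in zip(w, taps))  # Ve(n)
        residual = mic[n] - echo                      # V(n) = Vi(n) - Ve(n)
        if mu:
            # Standard LMS step: move the weights to shrink the residual.
            w = [wk + mu * residual * x for wk, x in zip(w, taps)]
        out.append(residual)
    return out, w
```

Because the delay is supplied externally, the filter only has to model gain and a one-sample spread around Td, which is why three taps can be enough in this arrangement.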

Thus, unlike the prior art, the present invention adds the pilot signal (P₁(n)) to the output signal of the speech decoder (31) to drive the speaker (34), and the delay time predictor (331) analyzes the pilot signal (P₂(n)) input from the microphone (35) to predict the delay time accurately. As explained above, the pilot signal used in the present invention is at an inaudible frequency far from the actual speech signal (e.g., a 3.5 kHz sinusoid), and because the delay time predictor (331) knows this frequency, it can easily extract the pilot component with the band-pass filter (41).

That is, the conventional approach assumes that the distance between the speaker and the microphone (Ds meters) lies within a fixed range (e.g., 0.5 m) and calculates the delay time (Td) using [Equation 1] below, taking into account the propagation speed of sound in air (Vs meters/sec).

Td = Ds * Fs / Vs    [Equation 1]

Because the conventional approach assumes a constant speaker-to-microphone distance (Ds) (e.g., 0.5 meters), a speech sampling frequency (Fs) of 8000 samples/sec gives Td = 0.5 * 8000 / 340 = 11 samples. In Internet voice telephony on a personal computer (PC), however, the distance between the speaker and the microphone varies widely with the usage environment (e.g., 0.1 to 10 meters).

Accurately predicting the delay time can therefore require a large amount of computation. For a distance of 10 meters, [Equation 1] gives 10 * 8000 / 340 = 235 samples, and every delay candidate over that span must be compared. For wideband speech the sampling frequency (Fs) is higher (e.g., 16000 Hz), so the complexity of the delay calculation rises in linear proportion (479 samples).
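
The arithmetic of [Equation 1] can be checked with a few lines. The function name `fixed_delay_samples` is hypothetical, and truncating to an integer is an assumption chosen because it reproduces the 11-sample and 235-sample figures in the text. With Vs = 340 m/s the 16000 Hz case evaluates to 470 samples, so the 479 quoted above presumably reflects a slightly different assumed speed of sound or rounding.

```python
def fixed_delay_samples(distance_m, sample_rate_hz, sound_speed_mps=340.0):
    """Td = Ds * Fs / Vs, truncated to whole samples."""
    return int(distance_m * sample_rate_hz / sound_speed_mps)


for ds, fs in [(0.5, 8000), (10.0, 8000), (10.0, 16000)]:
    print(ds, fs, fixed_delay_samples(ds, fs))
```

The number of candidates to compare grows linearly with both the maximum distance and the sampling frequency.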

In contrast, to make real-time echo cancellation practical, the present invention performs delay time prediction over the entire preset sample interval only at initial system start-up (program start or PC boot). During actual voice transmission and reception, it reuses the initially predicted delay time and performs delay time prediction only over a sample interval within a preset range (±) around it.

The method of the present invention described above may be implemented as a program and stored on a computer-readable recording medium (CD-ROM, RAM, ROM, floppy disk, hard disk, magneto-optical disk, etc.). Because a person of ordinary skill in the art to which the present invention pertains can readily carry this out, it is not described in further detail.

The present invention described above is not limited by the foregoing embodiment and the accompanying drawings, since a person of ordinary skill in the art to which the present invention pertains can make various substitutions, modifications, and changes without departing from its technical spirit.

As described above, the present invention emits a pilot signal (P₁(n)) outside the audible range through the voice output unit when a speech signal is output in a voice telephone or the like, accurately predicts the delay time from the time difference between that signal and the pilot signal (P₂(n)) subsequently received through the voice input unit, and removes the echo signal accordingly. It can therefore provide echo cancellation optimized in real time even when the distance between the voice output unit and the voice input unit is not fixed.

Claims

A speech coding/decoding apparatus having an echo cancellation function based on delay time prediction, comprising:

speech decoding means for decoding a speech code received through a transmission channel into a speech signal, adding a first pilot signal to the speech signal, and outputting the result through voice output means;

delay-time-prediction echo cancellation means for predicting a delay time by comparing the first pilot signal received from the speech decoding means with a second pilot signal extracted from the speech signal received through voice input means, and for removing an echo signal from the speech signal received through the voice input means according to the predicted delay time; and

speech encoding means for receiving and encoding the speech signal from which the echo signal has been removed by the delay-time-prediction echo cancellation means.

The apparatus of claim 1,

wherein the delay-time-prediction echo cancellation means comprises:

delay time predicting means for predicting the delay time by comparing the first pilot signal received from the speech decoding means with the second pilot signal extracted from the speech signal received through the voice input means; and

echo cancellation means for generating an echo signal component by delaying the speech signal from the speech decoding means according to the delay time predicted by the delay time predicting means, and for removing the echo signal from the speech signal received through the voice input means by using the generated echo signal component.

The apparatus of claim 2,

wherein the delay time predicting means comprises:

pilot signal extracting means for extracting the second pilot signal from the speech signal received through the voice input means;

sample delay means for delaying the first pilot signal received from the speech decoding means in units of samples under the control of control means;

subtraction means for calculating the difference between the second pilot signal extracted by the pilot signal extracting means and the first pilot signal delayed by the sample delay means;

mean absolute error calculating means for receiving the difference signal from the subtraction means and calculating a mean absolute error value;

minimum value predicting means for extracting the delay time at which the mean absolute error value received from the mean absolute error calculating means is smallest and passing it to the echo cancellation means; and

the control means, for controlling the range of sample delays over which the sample delay means operates.

The apparatus of claim 3,

wherein the control means

controls the sample delay means to operate over the entire preset sample interval at initial system start-up, and, during actual voice transmission and reception, controls the sample delay means to operate over a sample interval within a preset range by reusing the delay time extracted by the minimum value predicting means.

The apparatus according to any one of claims 1 to 4,

wherein the first and second pilot signals

are signals outside the audible frequency band.

A speech coding/decoding method having an echo cancellation function based on delay time prediction, comprising:

a speech decoding step in which a speech decoder decodes a speech code received through a transmission channel into a speech signal, adds a first pilot signal to the speech signal, and outputs the result through voice output means;

a delay time prediction step in which a delay time predictor predicts a delay time by comparing the first pilot signal received from the speech decoder with a second pilot signal extracted from the speech signal received through voice input means;

an echo cancellation step in which an echo canceller removes an echo signal from the speech signal received through the voice input means according to the delay time predicted in the delay time prediction step; and

a speech encoding step in which a speech encoder receives and encodes the speech signal from which the echo signal was removed in the echo cancellation step.

The method of claim 6,

wherein, in the echo cancellation step,

the echo canceller generates an echo signal component by delaying the speech signal from the speech decoder according to the delay time predicted in the delay time prediction step, and removes the echo signal from the speech signal received through the voice input means by using the generated echo signal component.

8. The method according to claim 6 or 7,

wherein the delay time prediction step comprises:

extracting, by a band-pass filter, the second pilot signal from the speech signal received through the voice input means;

delaying, by an N-sample delay unit, the first pilot signal received from the speech decoder in units of samples under the control of a controller;

calculating, by a subtractor, the difference between the second pilot signal extracted by the band-pass filter and the first pilot signal delayed by the N-sample delay unit;

receiving, by an average calculator, the difference signal from the subtractor and calculating a mean absolute error value; and

extracting, by a minimum value predictor, the delay time at which the mean absolute error value calculated by the average calculator is smallest, and passing it to the echo canceller;