KR100647826B1

KR100647826B1 - The blind dereverberation models considering measured noises and the deriving method thereof

Info

Publication number: KR100647826B1
Application number: KR1020050047259A
Authority: KR
Inventors: 이종환; 이수영; 오상훈
Original assignee: 한국과학기술원
Priority date: 2005-06-02
Filing date: 2005-06-02
Publication date: 2006-11-23

Abstract

A blind echo cancellation model that takes measured noise into account, and a method of deriving it, are provided. By deriving the learning rule of each model through independent component analysis, the invention improves convergence speed and the quality of the speech signal obtained as the processing result. The derivation method includes: a first step of defining, in the frequency domain, a blind least squares cost function for each estimated independent signal source in a blind signal processing structure having a signal observed at the input stage and independent signal sources estimated at the output stage; a second step of defining the blind least squares cost function of the whole output stage as the sum of the cost functions defined for the individual independent signal sources; a third step of differentiating the blind least squares cost function with respect to each adaptive filter; and a fourth step of applying the natural gradient method in the frequency domain to the differentiated cost function.

Description

The blind dereverberation models considering measured noises and the deriving method thereof

FIG. 1 is a structural diagram of a conventional echo model and a blind echo cancellation model.

FIG. 2 is a structural diagram of a conventional speech synthesis channel and speech analysis channel.

FIG. 3 is a diagram showing an echo channel model that includes a conventional speech synthesis channel.

FIG. 4 is a diagram showing the learning model of a pure blind echo cancellation filter to which a conventional whitening filter is applied.

FIG. 5 is a diagram showing a conventional observation signal model in which white Gaussian measurement noise is added to the reverberant signal during actual recording.

FIG. 6 is a diagram showing a blind echo cancellation model of the present invention that learns two independent output signals using one microphone.

FIG. 7 is a diagram showing a blind echo cancellation model of the present invention that learns two independent output signals using two microphones.

FIG. 8 is a diagram showing a blind echo cancellation model of the present invention that learns three independent output signals using two microphones.

The present invention relates to a blind echo cancellation model that takes measured noise into account and to a method of deriving it. More particularly, it relates to a model, and its derivation method, that improves echo cancellation performance even under measured noise by applying independent component analysis, which can exploit statistics of second and higher order, to the adaptation algorithm of the echo cancellation filter.

As shown in FIG. 1, blind echo separation is a technique for restoring the original source signal (10), which is unknown and independent from sample to sample, when only the signal (12) observed through an unknown echo filter (11) is known.

The echo channel effect in a real environment can be modeled as a linear filter (11) with various time delays, and the observation signal (12) measured by a sensor is given by Equation 1 below.

Here, * is the convolution operator. The output signal (14) that has passed through the echo cancellation filter (13) is given by Equation 2 below.
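The two relations above (observation as source convolved with the echo filter, output as observation convolved with the cancellation filter) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the equation images are not available here, so the function names, the toy two-tap echo filter `h`, and the truncated inverse filter `w` are all hypothetical and are not taken from the patent.

```python
import numpy as np

def observe(source, echo_filter):
    """Observed signal (12): source (10) convolved with the echo filter (11)."""
    return np.convolve(source, echo_filter)

def cancel(observed, cancel_filter):
    """Output signal (14): observation (12) convolved with the cancellation filter (13)."""
    return np.convolve(observed, cancel_filter)

rng = np.random.default_rng(0)
s = rng.laplace(size=1000)        # non-Gaussian, sample-independent toy source
h = np.array([1.0, 0.5])          # assumed toy echo filter: 1 + 0.5 z^-1
# Truncated inverse of h: 1 / (1 + 0.5 z^-1) = sum_k (-0.5)^k z^-k
w = (-0.5) ** np.arange(32)

x = observe(s, h)
u = cancel(x, w)
recovery_error = np.max(np.abs(u[:len(s)] - s))
```

In this toy case the cancellation filter is known in advance; in the blind setting described in the text it has to be learned from the observation alone.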

Here, the unknown signal source (10) is assumed to be independent from sample to sample and to have a non-Gaussian probability density function. By constructing a cost function that measures how well the restored signal (14) satisfies this assumption, an adaptive algorithm (15) for learning the echo cancellation filter (13) can be obtained. Once learning has proceeded so as to minimize the cost function, the echo cancellation filter (13) takes the form of the inverse of the echo filter (11).

Different learning rules for the echo cancellation filter (13) exist for different cost functions. After approximation and modifications for more efficient learning, these learning rules take the form of Equation 3 below.

Here, Equation 3 involves the probability density function of the source signal (10) to be estimated and the number of time-delay taps of the echo cancellation filter (13).

The discussion so far has concerned learning an echo cancellation filter (13) that removes the effect of the echo filter (11) when, as in FIG. 1, the unknown source signal (10) is independent from sample to sample and non-Gaussian. When the source signal (10) is a speech signal, however, the assumption of independence between samples does not hold. If the learning method of Equation 3 for the echo cancellation filter (13) is applied directly, the resulting adaptive filter not only acquires the inverse characteristic of the echo channel but also learns to whiten the speech by removing the dependency among the speech samples themselves.

Therefore, when the signal source is speech, the noise-free speech signal is assumed, as in FIG. 2, to be produced by passing mutually independent excitation signals through a speech synthesis channel (20). Learning to extract source signals that are as independent as possible from the observed speech then yields a speech analysis channel, or whitening filter (21), which is the inverse of the speech synthesis channel (20). Applying the speech synthesis channel (20) and the speech analysis channel (21) obtained in this way to the echo filter (11) and the echo cancellation filter (13) of FIG. 1 gives, as shown in FIGS. 3 and 4, an echo filter (31) combined with the speech synthesis channel (30) and a pure echo cancellation filter (41) combined with the whitening filter (40).

Consequently, most methods that learn the echo cancellation filter (13) by exploiting the independence of the estimated source signal (14) use the echo cancellation filter (41) of FIG. 4. Indeed, in the ideal case where only the echo filter (31) of FIG. 3 is present, a blind echo cancellation experiment on a speech source using Equation 3 successfully learns an echo cancellation filter (41) that removes the echo sufficiently, and the resulting estimated speech signal (14) shows that the influence of the echo filter (31) has largely disappeared.

To verify and apply blind echo cancellation in a real environment, however, the reverberant speech signal (12) must be measured with a microphone and recording equipment in the space where the echo exists. The recorded signal (12) then inevitably contains white Gaussian noise as measurement noise. The level depends on the recording equipment, but the signal-to-noise ratio (SNR) is typically about 10 to 30 dB. The ideal model of FIG. 3 must therefore be changed, in a real application environment, to the form of FIG. 5, in which white Gaussian noise is added to the reverberant signal.
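To make the noisy observation model of FIG. 5 concrete, the following minimal sketch adds white Gaussian noise to a signal at a chosen SNR in the 10 to 30 dB range mentioned above. The function name `add_measurement_noise` and the sinusoidal stand-in for the reverberant signal are assumptions made for illustration only.

```python
import numpy as np

def add_measurement_noise(signal, snr_db, rng):
    """Return (noisy signal, noise) with white Gaussian noise at the given SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise, noise

rng = np.random.default_rng(0)
# Stand-in for a reverberant recording (assumed test signal, not speech)
reverberant = np.sin(2 * np.pi * 0.01 * np.arange(20000))
noisy, noise = add_measurement_noise(reverberant, snr_db=20.0, rng=rng)

# Empirical SNR of the generated observation, in dB
measured_snr_db = 10.0 * np.log10(np.mean(reverberant ** 2) / np.mean(noise ** 2))
```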

However, if the echo cancellation filter (41) is learned as in FIG. 4 from the modified echo model of FIG. 5, in which such noise is present, the additive noise makes learning converge very slowly, and the quality of the resulting estimated source signal is also very poor.

The present invention is intended to solve the above problem. Its object is to provide a model, and a method of deriving it, that improves blind echo cancellation performance in the presence of measured noise and allows the learning rule to be derived using independent component analysis.

The present invention proposes a blind echo cancellation model considering measured noise, characterized in that it uses a microphone and, when measurement noise is present, observes the input signal and estimates mutually independent output signal sources in order to improve echo cancellation performance.

The present invention also proposes a method of deriving a blind echo cancellation model considering measured noise, comprising: a first step of defining, in the frequency domain, a blind least squares cost function for each estimated independent signal source in a blind signal processing structure that uses a microphone and has a signal observed at the input stage and independent signal sources estimated at the output stage; a second step of defining the blind least squares cost function of the whole output stage by summing the blind least squares cost functions defined in the frequency domain for the individual independent signal sources; a third step of differentiating the blind least squares cost function with respect to each adaptive filter; and a fourth step of applying the natural gradient method in the frequency domain to the differentiated blind least squares cost function.

Hereinafter, the configuration and operation of embodiments of the present invention are described in detail with reference to the accompanying drawings.

First, a blind least squares cost function for blind echo separation, from which learning rules are simple to derive and which is applicable to a variety of structures, is presented in the frequency domain. Using this cost function, the learning rule is derived for the conventional single-microphone method of FIG. 1. In the time domain, the estimated source signal (14) is the convolution of the observation signal (12) with the echo cancellation filter (13), as in Equation 2 above. In the frequency domain, it is expressed as Equation 4 below.
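The step from time-domain convolution to a frequency-domain product can be checked numerically. The sketch below assumes zero-padding to the full linear-convolution length so that the component-by-component product of the transforms matches `np.convolve` exactly; the names `freq_domain_filter`, `x`, and `w` are illustrative and not the patent's notation.

```python
import numpy as np

def freq_domain_filter(x, w):
    """Filter x with w by multiplying their DFTs component by component."""
    n = len(x) + len(w) - 1       # zero-pad so circular convolution equals linear convolution
    X = np.fft.fft(x, n)          # frequency-domain observation
    W = np.fft.fft(w, n)          # frequency-domain cancellation filter
    U = W * X                     # component-by-component multiplication
    return np.fft.ifft(U).real    # back to the time domain

rng = np.random.default_rng(1)
x = rng.standard_normal(256)      # toy observation
w = rng.standard_normal(16)       # toy cancellation filter
u_freq = freq_domain_filter(x, w)
u_time = np.convolve(x, w)        # time-domain reference
```

Without the zero-padding the product corresponds to circular rather than linear convolution, which is the usual caveat when a block-wise frequency-domain learning rule is implemented.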

Here, the quantities in Equation 4 are the frequency-domain representations of the corresponding time-domain signals and filter, and the product is a component-by-component multiplication. The blind least squares cost function in the frequency domain is given by Equation 5 below.

Here, the transform in Equation 5 is the discrete Fourier transform. A learning rule for the echo cancellation filter (13) that maximizes the independence between samples of the estimated source signal (14) is obtained by applying the gradient descent method with respect to the echo cancellation filter so as to minimize the cost function.

Differentiating Equation 5 gives the learning rule of the echo cancellation filter shown in Equation 6 below.

Here, * is the complex conjugate operator. To improve convergence, the part of Equation 6 that corresponds to the error is preferably replaced by the Fisher score function, which gives the learning rule of Equation 7 below.

Adding the natural gradient method then yields the echo cancellation filter learning rule of Equation 8, which is based on the blind least squares cost function and has improved convergence performance.
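Because the equation images are missing from this text, the exact form of Equation 8 cannot be reproduced here. As a rough illustration of the kind of update the paragraph describes, the sketch below implements one commonly used frequency-domain natural-gradient rule for single-channel blind deconvolution, in which each frequency bin of the filter is scaled by one minus the product of the transformed score and the conjugate output. The `tanh` score, the learning rate, and the block-wise circular processing are assumptions, and this should not be read as the patent's own formula.

```python
import numpy as np

def score(u):
    """Assumed score function for a super-Gaussian (speech-like) source."""
    return np.tanh(u)

def natural_gradient_step(W, x_block, lr=0.01):
    """One block update of the frequency-domain filter W (one value per bin)."""
    n = len(x_block)
    X = np.fft.fft(x_block)
    U = W * X                              # filter output, frequency domain
    u = np.fft.ifft(U).real                # filter output, time domain
    Phi = np.fft.fft(score(u))             # transformed score of the output
    correlation = Phi * np.conj(U) / n     # per-bin score/output correlation
    W_new = W + lr * (1.0 - correlation) * W
    return W_new, u

rng = np.random.default_rng(0)
x_block = rng.laplace(size=512)            # toy observation block
W0 = np.ones(512, dtype=complex)           # start from the identity filter
W1, u = natural_gradient_step(W0, x_block)
```

The trailing multiplication by `W` is what distinguishes the natural gradient from the ordinary gradient in this form; it also means a zero filter is a fixed point, so the filter is normally initialized to the identity as above.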

Applying the learning rule derived in Equation 8 gives better convergence than the conventional method of Equation 3. In what follows, several blind echo cancellation models are presented, and the learning rule for each model is derived using the blind least squares cost function of Equation 5.

First, an echo cancellation model using one microphone is shown in FIG. 6. The conventional model of FIG. 1 estimates only one source signal (14) from one observation signal (12), whereas in FIG. 6 two mutually independent signals (67) and (68) are estimated through two echo cancellation filters (65) and (66). The problem is thus to estimate two mutually independent signals (67) and (68) from the reverberant speech signal (63) to which white Gaussian noise has been added.

For the 1 × 1 × 2CH echo cancellation model of FIG. 6, the blind least squares cost functions of the form of Equation 5 for the two output signals (67) and (68) are given by Equation 9 below.

The overall blind least squares cost function can be defined as the sum of the two cost functions of Equation 9, and the learning rules of the echo cancellation filters (65) and (66) of the 1 × 1 × 2CH model of FIG. 6 are finally derived as Equations 10 and 11 below.

When the echo cancellation filters are trained with Equations 10 and 11 in the presence of additive noise, the two estimated independent signals are, for a source signal sampled at 16 kHz, echo-cancelled speech signals with signal bandwidths of 0 to 4 kHz and 4 to 8 kHz, respectively. For the output speech signal with the 0 to 4 kHz bandwidth, which is the band mainly used in speech signal processing, the proposed method can be seen to perform well even when compared with the same band of the echo-cancelled speech obtained by the conventional method. In this case, a source signal sampled at 16 kHz yields an echo-cancelled speech signal with a bandwidth of 0 to 4 kHz. If a speech signal with a bandwidth above 4 kHz is desired, the sampling rate of the speech signal can be increased according to the desired bandwidth.
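At a 16 kHz sampling rate the representable band is 0 to 8 kHz, so the two bandwidths named above are its lower and upper halves. The sketch below is only a helper for comparing signals band by band, as the paragraph suggests; it splits a signal into those halves with an ideal FFT mask. The function name `split_bands` and the two test tones are assumptions, and the code does not reproduce how the patent's filters arrive at the two outputs.

```python
import numpy as np

def split_bands(signal, fs=16000, cutoff_hz=4000.0):
    """Split a real signal into its [0, cutoff) and [cutoff, fs/2] components."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    low = np.where(freqs < cutoff_hz, spectrum, 0.0)
    high = np.where(freqs >= cutoff_hz, spectrum, 0.0)
    return np.fft.irfft(low, n), np.fft.irfft(high, n)

fs = 16000
t = np.arange(fs) / fs                       # one second of samples
tone_low = np.sin(2 * np.pi * 1000 * t)      # falls in the 0 to 4 kHz band
tone_high = np.sin(2 * np.pi * 6000 * t)     # falls in the 4 to 8 kHz band
low_band, high_band = split_bands(tone_low + tone_high, fs)
```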

Blind echo cancellation models using two microphones are shown in FIGS. 7 and 8. Conventional blind echo cancellation models use only one microphone to estimate one source signal. The present invention also presents models that use two microphones to estimate the echo-cancelled source signal.

First, FIG. 7 shows one of the models that use two microphones. When a blind echo cancellation model is used in the real world, the speech signal recorded through a microphone inevitably has measurement noise added to the output of the echo filter (51), as in FIG. 5. It can be confirmed experimentally that this measurement noise has the characteristics of white Gaussian noise. In FIG. 7, the white Gaussian noise (73) and (74) added at the two microphones is therefore regarded as a further independent signal source, independent of the speech signal to be estimated, and the task at the output stage is treated as estimating two mutually independent sources: the speech source and the measurement noise source. Although the white Gaussian noise added at the two microphones is not identical, its statistical characteristics follow the same Gaussian distribution, so the independent source component corresponding to the measurement noise can be extracted at a single output.

For the 1 × 2 × 2CH echo cancellation model of FIG. 7, the observed reverberant signal affected by noise is expressed in the frequency domain as Equation 12 below.

Here, the index taking values in {1, 2} is the microphone index.

The two estimated echo-cancelled signals, which are the output signals, are expressed in the frequency domain as Equation 13 below.

Here, the index taking values in {1, 2} is the index of the output stage.
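The structure described by the two relations above can be sketched as array operations, assuming the usual forms: each microphone spectrum is the source spectrum times that microphone's echo response plus its own noise, and each output spectrum is a per-bin weighted sum of the microphone spectra. The equation images are unavailable, so these forms, the array layouts, and the names `observe_two_mics` and `separate` are assumptions.

```python
import numpy as np

def observe_two_mics(S, H, noise):
    """S: (F,) source spectrum; H, noise: (2, F) per-microphone response and noise."""
    return H * S[np.newaxis, :] + noise

def separate(W, X):
    """W: (J, I, F) filters from microphone i to output j; X: (I, F). Returns (J, F)."""
    return np.einsum('jif,if->jf', W, X)

rng = np.random.default_rng(0)
F = 8                                              # number of frequency bins (toy value)
S = rng.standard_normal(F) + 1j * rng.standard_normal(F)
H = rng.standard_normal((2, F)) + 1j * rng.standard_normal((2, F))
noise = 0.1 * (rng.standard_normal((2, F)) + 1j * rng.standard_normal((2, F)))
X = observe_two_mics(S, H, noise)

# Identity filters: each output simply copies one microphone
W = np.zeros((2, 2, F), dtype=complex)
W[0, 0] = 1.0
W[1, 1] = 1.0
U = separate(W, X)
```

With `J = 3` outputs and `I = 2` microphones the same `separate` call covers the layout of the 1 × 2 × 3CH model, since only the first dimension of `W` changes.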

For the 1 × 2 × 2CH echo cancellation model of FIG. 7, the blind least squares cost function of the form of Equation 5 obtained from the two output signals (82) and (83) is given by Equation 14 below.

The overall blind least squares cost function for all outputs is defined as the sum of the per-output cost functions, as in Equation 15 below.

Differentiating Equation 15 with respect to each of the 1 × 2 × 2CH echo cancellation filters and applying the natural gradient method gives the learning rule of the echo cancellation filters shown in Equation 16 below.

Here, the natural gradient term is given by Equation 17 below.

Here, the operator in Equation 17 is the Hermitian (conjugate transpose) operator.
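For orientation only, the following sketch shows a standard per-bin natural-gradient update for a matrix of frequency-domain filters, in which the filter matrix at each bin is multiplied by the identity minus the outer product of the score spectrum with the Hermitian transpose of the output spectrum. Equations 16 and 17 themselves are not recoverable from this text, so the formula, the normalization by `n`, and the name `natural_gradient_update` are assumptions rather than the patent's expressions.

```python
import numpy as np

def natural_gradient_update(W, U, Phi, lr, n):
    """W: (J, I, F) filters; U, Phi: (J, F) output and score spectra; n: block length."""
    J = W.shape[0]
    eye = np.eye(J)[:, :, np.newaxis]                       # (J, J, 1), broadcast over bins
    corr = np.einsum('jf,kf->jkf', Phi, np.conj(U)) / n     # Phi(f) U(f)^H at each bin
    return W + lr * np.einsum('jkf,kif->jif', eye - corr, W)

# Single-bin toy case that can be checked by hand
W = np.zeros((2, 2, 1), dtype=complex)
W[0, 0, 0] = W[1, 1, 0] = 1.0                               # identity filters
U = np.array([[1.0 + 0j], [0.0 + 0j]])
Phi = U.copy()
W_new = natural_gradient_update(W, U, Phi, lr=0.1, n=1)
```

In this toy case the first output already satisfies the equilibrium condition, so only the second diagonal entry grows, from 1 to 1.1.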

Applying the adaptive-filter learning rule of the 1 × 2 × 2CH model obtained in Equation 16, an echo-cancelled speech signal can be restored at one of the two output stages from the speech signals that were obtained through the two microphones, distorted by echo, and contaminated with measurement noise. White Gaussian noise is still present in the echo-cancelled speech output, but its level is lower than in the observed signal, so the speech quality is further improved. The other output stage learns the white Gaussian noise corresponding to the source of the measurement noise.

Another model that performs blind echo cancellation with two microphones is shown in FIG. 8 as the 1 × 2 × 3CH blind echo cancellation model. Compared with the 1 × 2 × 2CH model of FIG. 7, the number of output signals is increased to three: (104), (105), and (106). For the 1 × 2 × 3CH model, the observed reverberant signal affected by noise is expressed in the frequency domain as in Equation 12 above. The three estimated echo-cancelled output signals are expressed in the frequency domain as in Equation 13 above, with the output index taking values in {1, 2, 3}. The overall blind least squares cost function for the 1 × 2 × 3CH model is given by Equation 18 below.

Here, each per-output cost function is expressed as in Equation 14 above.

Differentiating Equation 18 with respect to each of the 1 × 2 × 3CH echo cancellation filters and applying the natural gradient method gives a learning rule of the same form as Equation 16 above. For the 1 × 2 × 3CH model, however, the natural gradient term differs from that of the 1 × 2 × 2CH model in Equation 17 and is given by Equation 19.

The advantage of the 1 × 2 × 3CH model over the 1 × 2 × 2CH model is that three independent signal sources are learned from two observed signals: one of the three output stages learns to output the white Gaussian noise that is the measurement noise source, and the other two output stages yield, for a source signal sampled at 16 kHz, echo-cancelled speech sources with signal bandwidths of 0 to 4 kHz and 4 to 8 kHz, respectively. For the output speech signal with the 0 to 4 kHz bandwidth, which carries the important information in speech, a very large improvement in speech quality and related measures can be expected not only over the conventional method that uses one microphone and one output signal but also over the 1 × 1 × 2CH model of FIG. 6 proposed in the present invention. Improved convergence performance and speech quality are also obtained in comparison with the 1 × 2 × 2CH model of FIG. 7.

From the above description, those skilled in the art will appreciate that various changes and modifications can be made without departing from the technical spirit of the present invention. The technical scope of the present invention is therefore not limited to what is described in the embodiments but is to be determined by the claims.

As described above, the echo cancellation model considering measured noise and its derivation method according to the present invention improve blind echo cancellation performance in the presence of measured noise and allow the learning rule of each model to be derived using independent component analysis. As a result, considerably better performance than the conventional method is obtained in terms of both convergence speed and the quality of the speech signal produced.

Claims

A blind echo cancellation model considering measured noise, characterized in that it uses a microphone and, when measurement noise is present, observes an input signal and estimates mutually independent output signal sources in order to improve echo cancellation performance.

The model according to claim 1,

wherein the number of microphones is one, the number of input signals is one, the number of output signal sources is two, and the echo cancellation model is a 1 × 1 × 2CH model.

The model according to claim 1,

wherein the number of microphones is two, the number of input signals is two, the number of output signal sources is two, and the echo cancellation model is a 1 × 2 × 2CH model.

The model according to claim 1,

wherein the number of microphones is two, the number of input signals is two, the number of output signal sources is three, and the echo cancellation model is a 1 × 2 × 3CH model.

A method of deriving a blind echo cancellation model considering measured noise, comprising:

a first step of defining, in the frequency domain, a blind least squares cost function for each estimated independent signal source in a blind signal processing structure that uses a microphone and has a signal observed at the input stage and independent signal sources estimated at the output stage;

a second step of defining the blind least squares cost function of the whole output stage by summing the blind least squares cost functions defined in the frequency domain for the individual independent signal sources;

a third step of differentiating the blind least squares cost function with respect to each adaptive filter; and

a fourth step of applying the natural gradient method in the frequency domain to the differentiated blind least squares cost function.

The method according to claim 5,

wherein, in the first step, the number of microphones is one, the number of observed signals is one, the number of independent signal sources is two, and the echo cancellation model is a 1 × 1 × 2CH model.

The method according to claim 5,

wherein, in the first step, the number of microphones is two, the number of observed signals is two, the number of independent signal sources is two, and the echo cancellation model is a 1 × 2 × 2CH model.

The method according to claim 5,

wherein, in the first step, the number of microphones is two, the number of observed signals is two, the number of independent signal sources is three, and the echo cancellation model is a 1 × 2 × 3CH model.

The method according to any one of claims 5 to 8,

wherein the energy level of each estimated independent signal source is normalized to a constant value, so that the energy level of the adaptive filter is thereby determined.

The method according to any one of claims 5 to 8,

wherein the learning rate is adapted so that the ratio of the change in adaptive-filter energy caused by an update to the adaptive-filter energy before the update is constant, thereby achieving constant convergence performance.