KR20040077661A

KR20040077661A - Method and apparatus for removing noise from electronic signals

Info

Publication number: KR20040077661A
Application number: KR10-2004-7007752A
Authority: KR
Inventors: Gregory C. Burnett
Original assignee: Aliphcom
Priority date: 2001-11-21
Filing date: 2002-11-21
Publication date: 2004-09-06
Also published as: EP1480589A1; KR100936093B1; CN1589127A; WO2004056298A1; AU2002359445A1; JP2005529379A

Abstract

A method and system for removing acoustic noise from human speech are described. Acoustic noise is removed regardless of noise type, amplitude, or direction. The system includes a processor coupled between microphones and a voice activity detection (VAD) element. The processor executes a denoising algorithm that generates transfer functions. The processor receives acoustic data from the microphones and data from the VAD. It generates different transfer functions depending on whether the VAD indicates voice activity or no voice activity. The transfer functions are used to generate a denoised data stream.

Description

METHOD AND APPARATUS FOR REMOVING NOISE FROM ELECTRONIC SIGNALS

In a typical acoustic application, speech from a human user is recorded or stored and transmitted to a receiver at a different location. The user's environment may contain one or more noise sources that contaminate the signal of interest (the user's speech) with unwanted acoustic noise. This makes it difficult or even impossible for the receiver, whether human or machine, to understand the user's speech. The problem is especially acute now that portable communication devices such as cellular telephones and PDAs are proliferating. Many methods exist for suppressing such noise, but they require too much computing time or cumbersome hardware, distort the signal of interest too much, or fall short of useful performance. Many of these methods are described in textbooks such as Vaseghi's "Advanced Digital Signal Processing and Noise Reduction" (ISBN 0-471-62692-9).

This application is a continuation-in-part (CIP) of U.S. Patent Application No. 09/905,361, filed July 12, 2001, the contents of which are incorporated herein by reference. This application claims priority from U.S. Patent Application No. 60/332,202, filed November 21, 2001.

The present invention relates to the field of mathematical methods and electronic systems for removing or suppressing unwanted acoustic noise in acoustic transmission or recording.

FIG. 1 is a block diagram of a denoising system of one embodiment.

FIG. 2 is a block diagram of a noise removal algorithm of one embodiment, assuming a single noise source and a direct path to the microphones.

FIG. 3 is a block diagram of the front end of a noise removal algorithm of one embodiment, generalized to n distinct noise sources.

FIG. 4 is a block diagram of the front end of a noise removal algorithm of one embodiment in the most general case, with n distinct noise sources and signal reflections.

FIG. 5 is a flow diagram of a denoising method of one embodiment.

FIG. 6 shows results of the noise suppression algorithm of one embodiment for an American English-speaking woman in the presence of airport terminal noise that includes many other speakers and public announcements.

FIG. 7 is a block diagram of a physical implementation of denoising that uses unidirectional and omnidirectional microphones, under the embodiments of FIGS. 2, 3, and 4.

FIG. 8 is a diagram of a denoising microphone configuration that includes two omnidirectional microphones, under one embodiment.

FIG. 9 is a plot of the required C versus distance, under the embodiment of FIG. 8.

FIG. 10 is a block diagram of the front end of a noise removal algorithm under an embodiment in which the two microphones have different response characteristics.

FIG. 11A is a plot of the difference in frequency response (percent) between the microphones (at a distance of 4 cm) before compensation.

FIG. 11B is a plot of the difference in frequency response (percent) between the microphones (at a distance of 4 cm) after DFT compensation, under one embodiment.

FIG. 11C is a plot of the difference in frequency response (percent) between the microphones (at a distance of 4 cm) after time-domain filter compensation, under an alternative embodiment.

FIG. 1 is a block diagram of a denoising system of one embodiment that uses knowledge of when speech is occurring, derived from physiological information on voicing activity. The system includes microphones 10 and sensors 20 that provide signals to at least one processor 30. The processor includes a denoising subsystem or algorithm 40.

FIG. 2 is a block diagram of a noise removal system/algorithm of one embodiment, assuming a single noise source and a direct path to the microphones. The diagram is a graphical description of the process of one embodiment, with a single signal source 100 and a single noise source 101. The algorithm uses two microphones, a "signal" microphone (MIC1, 102) and a "noise" microphone (MIC2, 103), but is not so limited. MIC1 is assumed to capture mostly signal with some noise, and MIC2 mostly noise with some signal. This is the common configuration of conventional advanced acoustic systems. The data from the signal source 100 to MIC1 is denoted $s(n)$, the data from the signal source 100 to MIC2 is denoted $s_2(n)$, the data from the noise source 101 to MIC2 is denoted $n(n)$, and the data from the noise source 101 to MIC1 is denoted $n_2(n)$. Similarly, the data from MIC1 to the noise removal element 105 is denoted $m_1(n)$, and the data from MIC2 to the noise removal element 105 is denoted $m_2(n)$. Here $s(n)$ denotes a discrete sample of the analog signal from the source 100.

The noise removal element also receives a signal from a voice activity detection (VAD) element 104. The VAD element 104 uses physiological information to determine when the speaker is speaking. In various embodiments, the VAD element includes a radio frequency (RF) device, an electroglottograph, an ultrasound device, an acoustic throat microphone, and/or an airflow detector.

The transfer functions from the signal source 100 to MIC1 and from the noise source 101 to MIC2 are assumed to be unity. The transfer function from the signal source 100 to MIC2 is denoted $H_2(z)$, and the transfer function from the noise source 101 to MIC1 is denoted $H_1(z)$. Assuming unity transfer functions does not reduce the generality of the algorithm, because the actual relationships among the signal, the noise, and the microphones are simply ratios, and the ratios are redefined in this way for simplicity.

In conventional noise removal systems, the information from MIC2 is used to attempt to remove noise from MIC1. However, an unspoken assumption is that the VAD is never perfect, so the denoising must be performed cautiously to avoid removing too much of the signal along with the noise. If instead the VAD 104 is assumed to be perfect, equal to zero when the user is producing no speech and equal to one when speech is produced, a substantial improvement in noise removal can be made.

In analyzing the single noise source and the direct path to the microphones with reference to FIG. 2, the total acoustic information coming into MIC1 is denoted $m_1(n)$. The total acoustic information coming into MIC2 is likewise denoted $m_2(n)$. In the $z$ (digital frequency) domain, these are represented as $M_1(z)$ and $M_2(z)$. Then

$$M_1(z) = S(z) + N_2(z)$$

$$M_2(z) = N(z) + S_2(z)$$

with

$$N_2(z) = N(z)H_1(z)$$

$$S_2(z) = S(z)H_2(z)$$

so that

$$M_1(z) = S(z) + N(z)H_1(z)$$

$$M_2(z) = N(z) + S(z)H_2(z) \tag{1}$$

This is the general case for all two-microphone systems. In a practical system there is always some leakage of noise into MIC1 and some leakage of signal into MIC2. Equation 1 has four unknowns and only two known relationships, so it cannot be solved explicitly.

However, there is another way to solve for some of the unknowns in Equation 1. The analysis starts by examining the case where no signal is being generated, that is, where the signal from the VAD element equals zero and no speech is being produced. In this case $s(n) = S(z) = 0$, and Equation 1 reduces to

$$M_{1n}(z) = N(z)H_1(z)$$

$$M_{2n}(z) = N(z)$$

where the subscript $n$ on the $M$ variables indicates that only noise is being received.

This leads to Equation 2:

$$M_{1n}(z) = M_{2n}(z)H_1(z)$$

$$H_1(z) = \frac{M_{1n}(z)}{M_{2n}(z)} \tag{2}$$

$H_1(z)$ can be calculated using any available system identification algorithm and the microphone outputs whenever the system is certain that only noise is being received. The calculation can be done adaptively, so that the system can react to changes in the noise.
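As a rough illustration of Equation 2, the sketch below estimates $H_1$ bin by bin from a single frame recorded while the VAD reports no speech. It is a minimal sketch and not the patent's implementation: the naive `dft` helper, the function name `estimate_h1`, and the division guard `eps` are assumptions made for this example, and a practical system would average over many frames or adapt continuously.

```python
import cmath

def dft(x):
    """Naive O(N^2) discrete Fourier transform (illustrative helper)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def estimate_h1(m1_noise, m2_noise, eps=1e-12):
    """Per-bin H1 = M1n / M2n (Equation 2) from one noise-only frame.

    Bins where |M2n| < eps are set to 0 to avoid dividing by zero
    (a hypothetical guard, not something the source specifies).
    """
    M1n, M2n = dft(m1_noise), dft(m2_noise)
    return [a / b if abs(b) >= eps else 0j for a, b in zip(M1n, M2n)]
```

The ratio is only meaningful for frames in which the VAD is zero; frames that contain speech would bias the estimate.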

A solution is now available for one of the unknowns in Equation 1. Another unknown, $H_2(z)$, can be determined using the cases where the VAD equals one and speech is being produced. When this is occurring but the recent history of the microphones indicates low levels of noise, it can be assumed that $n(n) = N(z) \approx 0$. Equation 1 then reduces to

$$M_{1s}(z) = S(z)$$

$$M_{2s}(z) = S(z)H_2(z)$$

which in turn leads to

$$M_{2s}(z) = M_{1s}(z)H_2(z)$$

$$H_2(z) = \frac{M_{2s}(z)}{M_{1s}(z)}$$

This is the inverse of the $H_1(z)$ calculation. Note, however, that different inputs are being used: now only the signal is occurring, whereas before only the noise was occurring. While $H_2(z)$ is being calculated, the values calculated for $H_1(z)$ are held constant, and vice versa. It is thus assumed that while one of $H_1(z)$ and $H_2(z)$ is being calculated, the other does not change substantially.

After $H_1(z)$ and $H_2(z)$ have been calculated, they are used to remove the noise from the signal. If Equation 1 is rewritten as

$$S(z) = M_1(z) - N(z)H_1(z)$$

$$N(z) = M_2(z) - S(z)H_2(z)$$

then $N(z)$ can be substituted into the first expression to solve for $S(z)$:

$$S(z) = M_1(z) - [M_2(z) - S(z)H_2(z)]H_1(z)$$

$$S(z)[1 - H_2(z)H_1(z)] = M_1(z) - M_2(z)H_1(z)$$

$$S(z) = \frac{M_1(z) - M_2(z)H_1(z)}{1 - H_2(z)H_1(z)} \tag{3}$$

If the transfer functions $H_1(z)$ and $H_2(z)$ can be described with sufficient accuracy, the noise can be completely removed and the original signal recovered. This holds regardless of the amplitude or spectral characteristics of the noise. The only assumptions are that the VAD is perfect, that $H_1(z)$ and $H_2(z)$ are sufficiently accurate, and that while one of $H_1(z)$ and $H_2(z)$ is being calculated the other does not change substantially. In practice these assumptions have proven reasonable.
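The recovery step in Equation 3 can be checked numerically on a single frequency bin. The sketch below is illustrative only: the function name `recover_signal`, the guard `eps`, and the complex test values are assumptions chosen for the example, with the microphone values built directly from Equation 1.

```python
def recover_signal(m1, m2, h1, h2, eps=1e-12):
    """Apply Equation 3 to one frequency bin: S = (M1 - M2*H1) / (1 - H2*H1)."""
    denom = 1 - h2 * h1
    if abs(denom) < eps:
        raise ZeroDivisionError("1 - H2*H1 is too close to zero in this bin")
    return (m1 - m2 * h1) / denom

# Synthetic single-bin check built from Equation 1 (values are arbitrary).
s, n = 1 + 2j, 0.5 - 1j
h1, h2 = 0.3 + 0.1j, 0.05j
m1 = s + n * h1
m2 = n + s * h2
s_hat = recover_signal(m1, m2, h1, h2)   # matches s up to rounding error
```

With exact transfer functions the noise term cancels regardless of its size, which is the point the paragraph above makes; in practice the quality of the result depends on how well $H_1$ and $H_2$ are estimated.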

The noise removal algorithm described here is easily generalized to include any number of noise sources. FIG. 3 is a block diagram of the front end of a noise removal algorithm of one embodiment, generalized to $n$ distinct noise sources. These distinct noise sources may be reflections or echoes of one another, but are not so limited. Several noise sources are shown, each with a transfer function, or path, to each microphone. The previously named path $H_2$ is relabeled $H_0$ so that labeling the path from noise source 2 to MIC1 is more convenient. The output of each microphone, when transformed to the $z$ domain, is

$$M_1(z) = S(z) + N_1(z)H_1(z) + N_2(z)H_2(z) + \dots + N_n(z)H_n(z)$$

$$M_2(z) = S(z)H_0(z) + N_1(z)G_1(z) + N_2(z)G_2(z) + \dots + N_n(z)G_n(z) \tag{4}$$

When there is no signal (VAD = 0), then (suppressing $z$ for clarity)

$$M_{1n} = N_1H_1 + N_2H_2 + \dots + N_nH_n$$

$$M_{2n} = N_1G_1 + N_2G_2 + \dots + N_nG_n \tag{5}$$

A new transfer function can now be defined, analogous to $H_1(z)$ above; it is written $\tilde{H}_1$ here:

$$\tilde{H}_1 = \frac{M_{1n}}{M_{2n}} = \frac{N_1H_1 + N_2H_2 + \dots + N_nH_n}{N_1G_1 + N_2G_2 + \dots + N_nG_n} \tag{6}$$

The transfer function $\tilde{H}_1$ depends only on the noise sources and their respective transfer functions, and it can be calculated any time no signal is being transmitted. Once again, the subscript $n$ on the microphone inputs indicates that only noise is being detected, while the subscript $s$ indicates that only signal is being received by the microphones.

Examining Equation 4 under the assumption that there is no noise gives

$$M_{1s} = S$$

$$M_{2s} = SH_0$$

$H_0$ can be solved for as before using any available transfer function calculation algorithm. Mathematically,

$$H_0 = \frac{M_{2s}}{M_{1s}}$$

Rewriting Equation 4 using the noise transfer function $\tilde{H}_1$ defined in Equation 6 gives

$$\tilde{H}_1 = \frac{M_1 - S}{M_2 - SH_0} \tag{7}$$

Solving for $S$ yields

$$S = \frac{M_1 - M_2\tilde{H}_1}{1 - H_0\tilde{H}_1} \tag{8}$$

This is the same as Equation 3, with $H_0$ taking the place of $H_2$ and $\tilde{H}_1$ taking the place of $H_1$. The noise removal algorithm therefore remains mathematically valid for any number of noise sources, including multiple echoes of noise sources. Again, if $H_0$ and $\tilde{H}_1$ can be estimated to a high enough accuracy, and the earlier assumption of only one path from the signal to the microphones holds, the noise can be removed completely.
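For readers who want the intermediate algebra between Equations 7 and 8, one way to write it out is shown below. It uses only quantities already defined in the text; $\tilde{H}_1$ stands for the multi-source noise transfer function of Equation 6 (the symbol itself is a notational choice made here).

```latex
\begin{aligned}
\tilde{H}_1\,(M_2 - S H_0) &= M_1 - S
  && \text{(multiply Equation 7 through by its denominator)} \\
S - S H_0 \tilde{H}_1 &= M_1 - M_2 \tilde{H}_1
  && \text{(collect the terms in } S\text{)} \\
S &= \frac{M_1 - M_2 \tilde{H}_1}{1 - H_0 \tilde{H}_1}
  && \text{(Equation 8)}
\end{aligned}
```

The last step requires $1 - H_0\tilde{H}_1 \neq 0$, which is consistent with the later working assumption that the signal path into MIC2 is small.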

The most general case involves multiple noise sources and multiple signal sources. FIG. 4 is a block diagram of the front end of a noise removal algorithm of one embodiment in the most general case, with $n$ distinct noise sources and signal reflections. Here reflections of the signal enter both microphones. This is the most general case because reflections of a noise source into the microphones can be modeled accurately as simple additional noise sources. For clarity, the direct path from the signal to MIC2 has been relabeled from $H_0(z)$ to $H_{00}(z)$, and the reflected paths to microphones 1 and 2 are denoted $H_{01}(z)$ and $H_{02}(z)$, respectively.

The inputs to the microphones become

$$M_1(z) = S(z) + S(z)H_{01}(z) + N_1(z)H_1(z) + N_2(z)H_2(z) + \dots + N_n(z)H_n(z)$$

$$M_2(z) = S(z)H_{00}(z) + S(z)H_{02}(z) + N_1(z)G_1(z) + N_2(z)G_2(z) + \dots + N_n(z)G_n(z) \tag{9}$$

When VAD = 0, the inputs become (again suppressing $z$)

$$M_{1n} = N_1H_1 + N_2H_2 + \dots + N_nH_n$$

$$M_{2n} = N_1G_1 + N_2G_2 + \dots + N_nG_n$$

which is the same as Equation 5. The calculation of the noise transfer function $\tilde{H}_1$ in Equation 6 is therefore unchanged, as expected. Examining the situation where there is no noise, Equation 9 reduces to

$$M_{1s} = S + SH_{01}$$

$$M_{2s} = SH_{00} + SH_{02}$$

This leads to the definition of a second transfer function, written $\tilde{H}_2$ here:

$$\tilde{H}_2 = \frac{M_{2s}}{M_{1s}} = \frac{H_{00} + H_{02}}{1 + H_{01}} \tag{10}$$

Rewriting Equation 9 using the definition of the noise transfer function $\tilde{H}_1$ (as in Equation 7) gives

$$\tilde{H}_1 = \frac{M_1 - S(1 + H_{01})}{M_2 - S(H_{00} + H_{02})} \tag{11}$$

Some algebraic manipulation, with $\tilde{H}_2 = (H_{00} + H_{02})/(1 + H_{01})$ as in Equation 10, yields

$$S\left(1 + H_{01} - \tilde{H}_1(H_{00} + H_{02})\right) = M_1 - M_2\tilde{H}_1$$

$$S(1 + H_{01})\left[1 - \tilde{H}_1\frac{H_{00} + H_{02}}{1 + H_{01}}\right] = M_1 - M_2\tilde{H}_1$$

$$S(1 + H_{01})\left[1 - \tilde{H}_1\tilde{H}_2\right] = M_1 - M_2\tilde{H}_1$$

and finally

$$S(1 + H_{01}) = \frac{M_1 - M_2\tilde{H}_1}{1 - \tilde{H}_1\tilde{H}_2} \tag{12}$$

Equation 12 is the same as Equation 8, with $H_0$ replaced by the signal transfer function $\tilde{H}_2$ of Equation 10 and with the additional factor $(1 + H_{01})$ on the left side. This extra factor means that $S$ cannot be solved for directly in this situation, but a solution can be generated for the signal plus all of its echoes. This is not such a bad situation, because many conventional methods exist for dealing with echo suppression, and even if the echoes are not suppressed they are unlikely to affect the intelligibility of the speech to any meaningful extent. The more complex calculation of $\tilde{H}_2$ is needed to account for the signal echoes in microphone 2, which act as noise sources.
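The claim that Equation 12 yields the signal plus its echo, rather than the signal alone, can be checked on one frequency bin. The sketch below is a hedged illustration: every numeric value and every variable name is made up for the example, and the noise ratio is taken from the same noise realization that is present during speech, which is an idealization.

```python
def recover_signal_plus_echo(m1, m2, h1_tilde, h2_tilde):
    """Equation 12 for one bin: returns S*(1 + H01), the signal plus its echo."""
    return (m1 - m2 * h1_tilde) / (1 - h1_tilde * h2_tilde)

# Arbitrary single-bin values: two noise sources plus reflected signal paths.
s = 0.8 - 0.6j
noise = [0.4 + 0.2j, -0.3 + 0.5j]        # N1, N2
h = [0.2 + 0.1j, -0.1 + 0.3j]            # H1, H2: noise paths into MIC1
g = [1.0 + 0.0j, 0.9 - 0.2j]             # G1, G2: noise paths into MIC2
h00, h01, h02 = 0.1j, 0.05 + 0.02j, 0.03 - 0.01j

m1n = sum(n * hi for n, hi in zip(noise, h))
m2n = sum(n * gi for n, gi in zip(noise, g))
m1 = s + s * h01 + m1n                   # Equation 9, MIC1
m2 = s * h00 + s * h02 + m2n             # Equation 9, MIC2
h1_tilde = m1n / m2n                     # Equation 6
h2_tilde = (h00 + h02) / (1 + h01)       # Equation 10
estimate = recover_signal_plus_echo(m1, m2, h1_tilde, h2_tilde)
```

Here `estimate` agrees with `s * (1 + h01)` up to rounding, not with `s` itself, which mirrors the statement that only the signal plus its echoes is recoverable in this case.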

FIG. 5 is a flow diagram of a denoising method of one embodiment. In operation, acoustic signals are received (step 502). In addition, physiological information associated with voicing activity is received (step 504). A first transfer function representative of the acoustic signal is calculated upon determining that voicing information is absent from the acoustic signal for at least one specified period of time (step 506). A second transfer function representative of the acoustic signal is calculated upon determining that voicing information is present in the acoustic signal for at least one specified period of time (step 508). Noise is removed from the acoustic signal using at least one combination of the first and second transfer functions, producing a denoised acoustic data stream (step 510).
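The flow of FIG. 5 can be sketched for a single frequency bin as a loop over frames. This is a simplified sketch under stated assumptions, not the patented implementation: the frame tuple layout, the function name `denoise_stream`, and the `quiet` flag (standing in for "recent microphone history shows little noise") are hypothetical, and the transfer functions are replaced outright rather than adapted gradually.

```python
def denoise_stream(frames):
    """Single-bin sketch of FIG. 5. Each frame is (m1, m2, vad, quiet).

    vad   -- 1 if the VAD reports speech, else 0 (steps 502 and 504)
    quiet -- hypothetical flag: True when recent history shows little noise
    """
    h1 = h2 = 0j
    cleaned = []
    for m1, m2, vad, quiet in frames:
        if vad == 0 and m2 != 0:
            h1 = m1 / m2                  # step 506: noise only, Equation 2
        elif vad == 1 and quiet and m1 != 0:
            h2 = m2 / m1                  # step 508: speech with little noise
        # step 510: Equation 3 with the current transfer functions
        cleaned.append((m1 - m2 * h1) / (1 - h2 * h1))
    return cleaned
```

Holding one transfer function fixed while the other is updated follows the assumption stated earlier that neither changes substantially while the other is being calculated.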

The noise removal algorithm has been described from the simplest case, a single noise source with a direct path, through to multiple noise sources with reflections and echoes. The algorithm has been shown to be viable under any environmental conditions. The type and amount of noise are inconsequential provided that good estimates are made of the noise and signal transfer functions ($\tilde{H}_1$ of Equation 6 and $\tilde{H}_2$ of Equation 10), and provided that one does not change substantially while the other is being calculated. If echoes are present in the user's environment, they can be compensated for when they come from a noise source. If signal echoes are also present they will affect the cleaned signal, but the effect should be negligible in most environments.

In operation, the algorithm of one embodiment has shown excellent results with a variety of noise types, amplitudes, and orientations. However, approximations and adjustments always have to be made when moving from mathematical concepts to a real environment. One assumption is made in Equation 3, where $H_2(z)$ is assumed to be small so that $H_2(z)H_1(z) \approx 0$. Equation 3 then reduces to

$$S(z) \approx M_1(z) - M_2(z)H_1(z)$$

This means that only $H_1(z)$ has to be calculated, which speeds up the process and greatly reduces the number of computations required. With a suitable choice of microphones, this approximation is easily realized.
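To see what the approximation costs, note that Equation 1 gives $M_1 - M_2H_1 = S(1 - H_1H_2)$ exactly, so the simplified estimate differs from $S$ by a relative error of $|H_1H_2|$. The sketch below illustrates this on one bin; the function names and values are assumptions made for the example.

```python
def recover_signal_approx(m1, m2, h1):
    """Simplified estimate S ~ M1 - M2*H1, valid when H2*H1 is close to 0."""
    return m1 - m2 * h1

def relative_error(h1, h2):
    """Relative error of the simplified estimate, which equals |H1*H2|."""
    return abs(h1 * h2)

# Arbitrary single-bin values with a small signal leak (H2) into MIC2.
s, n = 1 + 2j, 0.5 - 1j
h1, h2 = 0.3 + 0.1j, 0.02 + 0.01j
m1 = s + n * h1
m2 = n + s * h2
s_approx = recover_signal_approx(m1, m2, h1)
```

For these values the error is well under one percent, which is why keeping the signal path into MIC2 small makes it reasonable to skip the $H_2$ calculation.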

Another approximation concerns the filter used in one embodiment. The actual $H_1(z)$ will have both poles and zeros, but for stability and simplicity an all-zero finite impulse response (FIR) filter is used. With enough taps (around 60), the approximation to the actual $H_1(z)$ is very good.
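In the time domain, an all-zero model of $H_1$ is simply a list of tap weights convolved with the MIC2 samples. The sketch below shows that operation and the corresponding time-domain form of $S \approx M_1 - M_2H_1$. The function names are hypothetical, and the two-tap filter in the test values is only a toy stand-in for the roughly 60 taps mentioned above.

```python
def fir_filter(x, taps):
    """All-zero (FIR) filter: y[n] = sum_k taps[k] * x[n - k], zero initial state."""
    return [sum(taps[k] * x[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(x))]

def denoise_time_domain(m1, m2, h1_taps):
    """Time-domain form of S ~ M1 - M2*H1 using an FIR model of H1."""
    noise_in_m1 = fir_filter(m2, h1_taps)
    return [a - b for a, b in zip(m1, noise_in_m1)]
```

Because an FIR filter has no feedback, it cannot become unstable however the taps are adapted, which is the stability benefit the text refers to.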

Regarding subband selection, the wider the range of frequencies over which a transfer function must be calculated, the more difficult it is to calculate accurately. The acoustic data was therefore divided into 16 subbands, with the lowest frequency at 50 Hz and the highest at 3700 Hz. The noise removal algorithm was then applied to each subband in turn, and the 16 denoised data streams were recombined to yield the denoised acoustic data. This works very well, but other subband arrangements (for example 4, 6, 8, or 32 equally spaced subbands) can be used and have been found to work as well.
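As a small illustration of the band layout, the sketch below computes subband edges between 50 Hz and 3700 Hz. Equal-width bands are an assumption here: the text gives the band count and frequency limits, and mentions equal spacing only for the alternative arrangements. The function name `subband_edges` is hypothetical.

```python
def subband_edges(f_low=50.0, f_high=3700.0, num_bands=16):
    """Edges (Hz) of num_bands equal-width subbands spanning f_low..f_high."""
    width = (f_high - f_low) / num_bands
    return [f_low + i * width for i in range(num_bands + 1)]

edges = subband_edges()                      # 17 edges, each band 228.125 Hz wide
bands = list(zip(edges[:-1], edges[1:]))     # (low, high) pairs, one per subband
```

Each `(low, high)` pair would define one band-pass filter; the noise removal step is then run independently on each band and the outputs are summed.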

In one embodiment, the amplitude of the noise was constrained so that the microphones used did not saturate (that is, did not operate outside their linear response region). It is important that the microphones operate linearly to ensure the best performance. Even with this restriction, signals with a very low signal-to-noise ratio (SNR), down to -10 dB or less, can be denoised.

$H_1(z)$ is calculated every 10 milliseconds using the least mean squares (LMS) method, a common adaptive technique for estimating transfer functions. An explanation can be found in "Adaptive Signal Processing" (1985) by Widrow and Stearns, published by Prentice-Hall, ISBN 0-13-004029-0.
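A textbook LMS update can be sketched as follows. This is a generic illustration of the method, not the embodiment's code: the function name `lms_identify`, the step size `mu`, the tap count, and the synthetic data are all assumptions, and the sketch adapts on every sample rather than once per 10 millisecond block.

```python
import random

def lms_identify(reference, desired, num_taps, mu):
    """Adapt FIR weights w so that (w convolved with reference) tracks desired."""
    w = [0.0] * num_taps
    history = [0.0] * num_taps              # recent reference samples, newest first
    for x, d in zip(reference, desired):
        history = [x] + history[:-1]
        y = sum(wi * hi for wi, hi in zip(w, history))
        e = d - y                           # error between MIC1 and the prediction
        w = [wi + mu * e * hi for wi, hi in zip(w, history)]
    return w

# Synthetic noise-only data: MIC2 is the reference, MIC1 is MIC2 filtered by H1.
random.seed(0)
m2 = [random.uniform(-1.0, 1.0) for _ in range(3000)]
true_h1 = [0.6, -0.3]
m1 = [true_h1[0] * m2[i] + (true_h1[1] * m2[i - 1] if i > 0 else 0.0)
      for i in range(len(m2))]
weights = lms_identify(m2, m1, num_taps=4, mu=0.1)   # approaches [0.6, -0.3, 0, 0]
```

Adaptation of this kind would run only while the VAD reports no speech, so that the weights model the noise path and not the speech.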

The VAD of one embodiment is derived from a radio frequency (RF) sensor and the two microphones, yielding very high accuracy (>99%) for both voiced and unvoiced speech. The VAD of one embodiment uses an RF interferometer to detect tissue motion associated with human speech production, but is not so limited. It is therefore completely free of acoustic noise and can function in any acoustic noise environment. A simple energy measurement of the RF signal can be used to determine whether voiced speech is occurring. Unvoiced speech can be determined using conventional acoustic-based methods, in a manner similar to the voiced sections determined using the RF sensor or similar voicing sensors, or through a combination of these. Because unvoiced speech carries much less energy, its detection accuracy is not as critical as that of voiced speech.
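The "simple energy measurement" mentioned above could look something like the following frame-based threshold test. This is only a sketch: the function name, the frame length, and the threshold value are hypothetical, and the text does not say how the embodiment sets its threshold.

```python
def energy_vad(rf_signal, frame_len, threshold):
    """Return one 0/1 voicing decision per full frame from mean-square energy."""
    decisions = []
    for start in range(0, len(rf_signal) - frame_len + 1, frame_len):
        frame = rf_signal[start:start + frame_len]
        energy = sum(v * v for v in frame) / frame_len
        decisions.append(1 if energy > threshold else 0)
    return decisions

# Toy RF trace: one silent frame followed by one frame with tissue motion.
rf = [0.0] * 80 + [1.0] * 80
vad = energy_vad(rf, frame_len=80, threshold=0.1)
```

Because the decision is made on the RF signal and not on the microphone data, loud acoustic noise does not raise the measured energy.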

With voiced and unvoiced speech detected reliably, the algorithm of one embodiment can be implemented. It is useful to repeat that the noise removal algorithm does not depend on how the VAD is obtained, only that it is accurate, especially for voiced speech. If speech goes undetected and training occurs on the speech, the subsequent denoised acoustic data can be distorted.

Data was collected in four channels: one for MIC1, one for MIC2, and two for the radio frequency sensor that detected the tissue motion associated with voiced speech. The data was sampled simultaneously at 40 kHz, then digitally filtered and decimated to 8 kHz. The high sampling rate was used to reduce any aliasing that might result from the analog-to-digital conversion. A four-channel National Instruments A/D board was used with LabVIEW to capture and store the data. The data was then read into a C program and denoised 10 milliseconds at a time.
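The numbers in this paragraph imply a decimation factor of 5 (40 kHz to 8 kHz) and 80 samples per 10 millisecond block at 8 kHz. The sketch below works through that arithmetic and shows a deliberately crude decimator. The boxcar average is an assumption chosen for brevity; the text does not say which digital filter was used, and a real system would use a proper low-pass design.

```python
def samples_per_frame(sample_rate_hz, frame_seconds=0.010):
    """Number of samples in one processing block (10 ms by default)."""
    return int(round(sample_rate_hz * frame_seconds))

def decimate_boxcar(x, factor=5):
    """Average each block of `factor` samples and keep one value per block.

    The block average acts as a crude low-pass filter before downsampling.
    """
    usable = len(x) - len(x) % factor
    return [sum(x[i:i + factor]) / factor for i in range(0, usable, factor)]

raw_block = [1.0] * samples_per_frame(40000)   # 400 samples at 40 kHz
block_8k = decimate_boxcar(raw_block)          # 80 samples at 8 kHz
```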

FIG. 6 shows results of the noise suppression algorithm of one embodiment for an American English-speaking woman in the presence of airport terminal noise that includes many other speakers and public announcements. The speaker is uttering the number 406-5562 in the midst of ordinary airport terminal noise. The acoustic data was denoised 10 milliseconds at a time, and before denoising each 10 milliseconds of data was prefiltered from 50 to 3700 Hz. A noise reduction of approximately 17 dB is evident. No post-filtering was applied to this sample, so all of the noise reduction is due to the algorithm of one embodiment. It is clear that the algorithm adjusts to the noise almost instantly and can remove the very difficult noise of other speakers. Many other types of noise have been tested with similar results, including street noise, helicopters, music, and sine waves. In addition, the orientation of the noise can be varied substantially without significantly changing the noise suppression performance. Finally, the distortion of the cleaned speech is very low, ensuring good performance for speech recognition engines and human receivers alike.
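For a sense of scale, a reduction quoted in decibels can be converted to a power ratio. The helper names below are hypothetical and the conversion is the standard $10\log_{10}$ power relation; it is offered only to interpret the 17 dB figure, not as part of the described system.

```python
import math

def noise_reduction_db(noise_power_before, noise_power_after):
    """Noise reduction in dB from noise power before and after processing."""
    return 10.0 * math.log10(noise_power_before / noise_power_after)

def power_ratio_from_db(db):
    """Power ratio that corresponds to a reduction of `db` decibels."""
    return 10.0 ** (db / 10.0)

ratio_17_db = power_ratio_from_db(17.0)   # roughly a 50-fold drop in noise power
```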

The noise removal algorithm of an embodiment has been shown to function under any environmental conditions. The type and amplitude of the noise are inconsequential provided that the two transfer functions are estimated well. If the user environment is such that echoes are present, they can be compensated for if they come from a noise source. If signal echoes are also present, they will affect the cleaned signal, but the effect should be negligible in most environments.

FIG. 7 is a block diagram of a physical implementation of the denoising described under FIGS. 2, 3, and 4, using a unidirectional microphone M2 for the noise and an omnidirectional microphone M1 for the speech. As noted above, the path from the speech to the noise microphone (MIC 2) is approximated as zero. This approximation is realized through careful placement of the omnidirectional and unidirectional microphones. It works quite well (20 to 40 dB of noise suppression) when the noise is oriented opposite the signal location. However, when the noise source is on the same side as the speaker (noise source N2), performance drops to only 10 to 20 dB of noise suppression. This drop in suppression can be attributed to the steps taken to ensure that H2 is close to zero. These steps included the use of a unidirectional microphone for the noise microphone (MIC 2), so that very little signal is present in the noise data. Because a unidirectional microphone cancels acoustic information arriving from a particular direction, it also cancels noise arriving from the same direction as the speech. This can limit the ability of the adaptive algorithm to characterize and then remove noise at a location such as N2. The same effect occurs when a unidirectional microphone is used for the speech microphone M1.

However, if the unidirectional microphone M₂ is replaced with another omnidirectional microphone, a significant amount of signal is captured by M₂. This runs counter to the earlier assumption that H₂ is zero, and as a result a significant amount of signal is removed during voicing, producing both denoising and "de-signaling". This is unacceptable if signal distortion is to be kept to a minimum. To reduce the distortion, therefore, a value is calculated for H₂. However, the value of H₂ cannot be calculated in the presence of noise, or the noise will be mislabeled as speech and not removed.

The case of acoustic-only microphone arrangements suggests that a small two-microphone array can be a solution to the problem. FIG. 8 shows, under one embodiment, a denoising microphone configuration that includes two omnidirectional microphones. The same effect can be achieved with two unidirectional microphones oriented in the same direction (toward the signal source). Yet another embodiment uses one unidirectional microphone and one omnidirectional microphone. The idea is to capture similar information from acoustic sources in the direction of the signal source. The relative locations of the signal source and the two microphones are fixed and known. By placing the microphones a distance d apart that corresponds to n discrete time samples, and placing the speaker on the axis of the array, H₂ can be fixed in the form $Cz^{-n}$, where C is the difference in amplitude of the signal data at M₁ and M₂. In the discussion below n = 1 is assumed, although any integer other than zero may be used. For causality, the use of positive integers is recommended. Because the amplitude of a spherical pressure source varies as 1/r, this allows specification of not only the direction of the source but also its distance. The required C can be estimated by

FIG. 9 is a plot of the required C versus distance for the embodiment of FIG. 8. An asymptote is visible at C = 1.0; C reaches 0.9 at approximately 39 cm and, slightly farther out at approximately 60 cm, reaches 0.94. At the distances normally encountered with a handset or earpiece (4 to 12 cm), C is between approximately 0.5 and 0.75. This is a difference of about 19% to 44% from a noise source located approximately 60 cm away, and most noise sources are clearly located farther away than that. A system using this configuration can therefore discriminate between noise and signal quite effectively, even when they arrive from similar directions.
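The quoted values of C can be reproduced with a simple 1/r model. The sketch below is an assumption, not the patent's stated equation: it takes C to be the ratio of the signal amplitude at the far microphone to that at the near microphone for an on-axis source, with a microphone spacing of 4 cm (the spacing mentioned for FIGS. 11A to 11C). The function name and parameter names are hypothetical.

```python
def amplitude_ratio(source_distance_cm: float, mic_spacing_cm: float = 4.0) -> float:
    """Assumed 1/r model: amplitude at the far microphone divided by amplitude at the near one.

    source_distance_cm is the distance from the source to the nearer microphone,
    measured along the axis of the two-microphone array.
    """
    if source_distance_cm <= 0 or mic_spacing_cm <= 0:
        raise ValueError("distances must be positive")
    return source_distance_cm / (source_distance_cm + mic_spacing_cm)


if __name__ == "__main__":
    for distance in (4, 12, 39, 60, 100):
        print(f"{distance:>4} cm -> C = {amplitude_ratio(distance):.3f}")
```

Under this assumed model the handset range of 4 to 12 cm maps to C between 0.5 and 0.75, and the ratio approaches 1.0 as the source moves away, which matches the behavior described for FIG. 9.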

To determine the effect of a poor estimate of C on the denoising, assume that $C = nC_0$, where C is the estimate and $C_0$ is the actual value of C. Using the signal definition from above,

$$S(z) = \frac{M_1(z) - M_2(z)H_1(z)}{1 - H_2(z)H_1(z)}$$

It was assumed that $H_2(z)$ is very small, so that the signal is approximately

$$S(z) \approx M_1(z) - M_2(z)H_1(z)$$

This is true when there is no speech, because by definition $H_2 = 0$. However, when speech is occurring, $H_2$ is nonzero, and if it is set to $Cz^{-1}$,

$$S(z) = \frac{M_1(z) - M_2(z)H_1(z)}{1 - Cz^{-1}H_1(z)}$$

which can be rewritten as

$$S(z) = \frac{M_1(z) - M_2(z)H_1(z)}{1 - nC_0z^{-1}H_1(z)} = \frac{M_1(z) - M_2(z)H_1(z)}{1 - C_0z^{-1}H_1(z) + (1-n)C_0z^{-1}H_1(z)}$$

The last term of the denominator determines the error due to the poor estimate of C. This term is labeled E:

$$E = (1-n)\,C_0z^{-1}H_1(z)$$

Because $z^{-1}H_1(z)$ is simply a filter, its magnitude is always positive. The change in the calculated signal magnitude caused by E therefore depends entirely on $(1-n)$.

There are two possibilities for error: underestimation of C (n < 1) and overestimation of C (n > 1). In the first case, C is estimated to be smaller than it actually is, or the signal is closer than estimated. In this case (1-n), and therefore E, is positive. The denominator is then too large and the magnitude of the cleaned signal is too small, which indicates de-signaling. In the second case, the signal is farther away than estimated and E is negative, making S larger than it should be. In this case the denoising is insufficient. Because very low signal distortion is desired, the estimate should err toward overestimation of C.

This result also shows that noise located in the same solid angle as the signal (the same direction from M₁) will be substantially removed, depending on the change in C between the signal location and the noise location. Thus, when using a handset with M₁ approximately 4 cm from the mouth, the required C is approximately 0.5, whereas for noise at approximately 1 meter C is approximately 0.96. For that noise, the estimate C = 0.5 means that C is underestimated, and the noise is removed. The amount of removal depends directly on (1-n). The algorithm therefore uses both the direction and the range of the signal to separate the signal from the noise.
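The handset example can be worked through numerically. The sketch below only applies the rule stated in the text (n below 1 leads to removal, n above 1 leads to insufficient denoising); the function names and the returned labels are hypothetical.

```python
def mismatch_factor(c_estimate: float, c_actual: float) -> float:
    """Return n in C = n * C0, where C is the estimate and C0 the actual value."""
    if c_actual <= 0:
        raise ValueError("c_actual must be positive")
    return c_estimate / c_actual


def effect_of_estimate(c_estimate: float, c_actual: float) -> str:
    """Classify a source by the sign of (1 - n), following the rule described in the text."""
    n = mismatch_factor(c_estimate, c_actual)
    if n < 1:
        return "removed"          # (1 - n) > 0: denominator too large, source attenuated
    if n > 1:
        return "under-denoised"   # (1 - n) < 0: result too large, denoising insufficient
    return "matched"              # (1 - n) == 0: no error term


if __name__ == "__main__":
    # Handset tuned for speech at about 4 cm (C = 0.5), noise at about 1 m (C0 = 0.96).
    n = mismatch_factor(0.5, 0.96)
    print(f"n = {n:.3f}, 1 - n = {1 - n:.3f}, effect: {effect_of_estimate(0.5, 0.96)}")
```

For the noise source at about 1 meter this gives n of roughly 0.52, so (1-n) is roughly 0.48 and the noise falls on the removal side of the rule.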

One issue that arises is the stability of this technique. Specifically, the deconvolution of $(1 - H_1H_2)$ raises the question of stability, because the inverse of $1 - H_1H_2$ must be calculated at the beginning of each voiced segment. Computing it only there helps reduce the computation time, or the number of instructions per cycle, needed to implement the algorithm: there is no need to calculate the inverse for every voiced window, only for the first one, because H₂ is treated as a constant. However, this approach makes false positives expensive, since the inverse of $1 - H_1H_2$ must be calculated every time a false positive is encountered.

Fortunately, the choice of H₂ eliminates the need for a deconvolution. From the discussion above, the signal can be written as

$$S(z) = \frac{M_1(z) - M_2(z)H_1(z)}{1 - H_2(z)H_1(z)}$$

which can be rewritten as

$$S(z) - S(z)H_2(z)H_1(z) = M_1(z) - M_2(z)H_1(z)$$

or

$$S(z) = M_1(z) - M_2(z)H_1(z) + S(z)H_2(z)H_1(z)$$

However, because $H_2(z)$ is of the form $Cz^{-1}$, the sequence in the time domain is

$$s[n] = m_1[n] - m_2[n] * h_1 + C\,s[n-1] * h_1$$

where $*$ denotes convolution. This means that the current signal sample requires the current MIC 1 signal, the current MIC 2 signal, and the previous signal sample. No deconvolution is needed: only a simple subtraction followed by a convolution, as before. The increase in required computation is minimal, so this is easy to implement.
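The recursion described above, in which each cleaned sample is built from the current MIC 1 sample, the filtered MIC 2 signal, and previously cleaned samples, can be sketched in a few lines. This is a minimal illustration under assumptions made here, not the patent's implementation: h₁ is represented as a short causal FIR tap list, all signals are zero before the first sample, and the function names are hypothetical.

```python
def convolve_at(taps: list, x: list, k: int) -> float:
    """Return (taps * x)[k] for a causal FIR filter; samples outside x count as zero."""
    total = 0.0
    for j, tap in enumerate(taps):
        i = k - j
        if 0 <= i < len(x):
            total += tap * x[i]
    return total


def denoise(m1: list, m2: list, h1: list, c: float) -> list:
    """Sample-by-sample recursion for S = M1 - M2*H1 + S*H1*H2 with H2 = C z^-1.

    The feedback term uses only already computed output samples, so no
    deconvolution (inverse filter) is required.
    """
    s = []
    for k in range(len(m1)):
        feedback = convolve_at(h1, s, k - 1)   # (h1 * s)[k - 1], past samples only
        s.append(m1[k] - convolve_at(h1, m2, k) + c * feedback)
    return s


if __name__ == "__main__":
    speech = [1.0, -0.5, 0.25, 0.75, -1.0, 0.3]
    noise = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1]
    h1, c = [0.6, 0.2], 0.5
    # Synthetic microphones: MIC 1 = speech + filtered noise, MIC 2 = noise + C * delayed speech.
    m1 = [speech[k] + convolve_at(h1, noise, k) for k in range(len(speech))]
    m2 = [noise[k] + (c * speech[k - 1] if k > 0 else 0.0) for k in range(len(speech))]
    print(denoise(m1, m2, h1, c))
```

When the microphones follow the assumed model exactly, the recursion returns the original speech samples; with real data its behavior depends on how well h₁ and C are estimated.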

The effect of a difference in microphone response in this embodiment can be shown by examining the configuration described with reference to FIGS. 2, 3, and 4, except that this time the transfer functions A(z) and B(z) are included. These represent the frequency responses of MIC 1 and MIC 2 together with their filtering and amplification responses. FIG. 10 is a block diagram of the front end of the noise removal algorithm in an embodiment in which the two microphones have different response characteristics.

FIG. 10 is a graphical representation of the processing of an embodiment that includes a single signal source (1000) and a single noise source (1001). The algorithm uses two microphones, including but not limited to a signal microphone 1 (MIC 1) and a noise microphone 2 (MIC 2). MIC 1 is assumed to capture mostly signal with some noise, and MIC 2 is assumed to capture mostly noise with some signal. The data from the signal source (1000) to MIC 1 is denoted s(n), where s(n) is a discrete sample of the analog signal from the signal source (1000). The data from the signal source (1000) to MIC 2 is denoted s₂(n). The data from the noise source (1001) to MIC 2 is denoted n(n), and the data from the noise source (1001) to MIC 1 is denoted n₂(n).

Transfer function A(z) represents the frequency response of MIC 1 together with its filtering and amplification responses. Transfer function B(z) represents the frequency response of MIC 2 together with its filtering and amplification responses. The output of transfer function A(z) is denoted m₁(n), and the output of transfer function B(z) is denoted m₂(n). The signals m₁(n) and m₂(n) are received by the noise removal element (1005), which operates on them and outputs "cleaned speech".

Hereafter, the term "frequency response of MIC X" includes the combined effects of the microphone and any amplification or filtering that occurs during the data recording process for that microphone. Solving for the signal and the noise (suppressing z for clarity),

$$S = \frac{M_1}{A} - H_1 N, \qquad N = \frac{M_2}{B} - H_2 S$$

and substituting the latter into the former,

$$S = \frac{M_1}{A} - \frac{H_1 M_2}{B} + H_1 H_2 S \quad\Longrightarrow\quad S = \frac{\dfrac{M_1}{A} - \dfrac{H_1 M_2}{B}}{1 - H_1 H_2}$$

This seems to indicate that the difference in frequency response (between MIC 1 and MIC 2) has an effect. However, attention must be paid to what is actually measured. Previously (before the frequency responses of the microphones were considered), H₁ was measured using

$$H_1(z) = \frac{M_{1n}(z)}{M_{2n}(z)}$$

where the subscript n indicates that this calculation is performed only during windows that contain noise alone. Examining the equations above, however, note that when no signal is present the following is measured at the microphones:

$$M_{1n} = A H_1 N, \qquad M_{2n} = B N$$

Therefore, H₁ should be calculated as

$$H_1 = \frac{B\,M_{1n}}{A\,M_{2n}}$$

However, B(z) and A(z) are not taken into account when H₁(z) is calculated.

What is actually measured is therefore the ratio of the signals at each microphone:

$$\tilde{H}_1 = \frac{M_{1n}}{M_{2n}} = H_1\,\frac{A}{B}$$

where $\tilde{H}_1$ represents the measured response and H₁ is the actual response. The calculation of H₂ is analogous and results in

$$\tilde{H}_2 = \frac{M_{2s}}{M_{1s}} = H_2\,\frac{B}{A}$$

where the subscript s indicates windows that contain speech. Substituting $\tilde{H}_1$ and $\tilde{H}_2$ into the equation for S above gives

$$S = \frac{\dfrac{M_1}{A} - \dfrac{\tilde{H}_1 M_2}{A}}{1 - \tilde{H}_1 \tilde{H}_2}$$

or

$$S A = \frac{M_1 - \tilde{H}_1 M_2}{1 - \tilde{H}_1 \tilde{H}_2}$$

This is the same as the earlier case, in which the frequency responses of the microphones were not included. Here S(z)A(z) takes the place of S(z), and the measured values $\tilde{H}_1(z)$ and $\tilde{H}_2(z)$ take the place of the actual H₁(z) and H₂(z). In theory, therefore, the algorithm is independent of the microphones and of the associated filter and amplifier responses.

In practice, however, it is assumed that $H_2 = Cz^{-1}$ (where C is a constant), when in fact

$$\tilde{H}_2 = \frac{B}{A}\,Cz^{-1}$$

so the result is

$$S A = \frac{M_1 - \tilde{H}_1 M_2}{1 - \tilde{H}_1 \dfrac{B}{A}\,Cz^{-1}}$$

This depends on the unknown B(z) and A(z). It can cause problems if the frequency responses of the microphones differ substantially, which is common, especially for the inexpensive microphones that are often used. It means that the data from MIC 2 should be compensated so that it has the proper relationship to the data received from MIC 1. This can be done by recording, at MIC 1 and MIC 2, a broadband signal from a source located at the direction and distance expected for the actual signal. The discrete Fourier transform (DFT) of each microphone signal is then calculated, along with the magnitude of the transform in each frequency bin. The magnitude of the MIC 2 DFT in each frequency bin is then set equal to C multiplied by the magnitude of the MIC 1 DFT. If $|M_1[n]|$ represents the magnitude of the nth frequency bin of the MIC 1 DFT, the factor by which $M_2[n]$ is multiplied is

$$F[n] = \frac{C\,|M_1[n]|}{|M_2[n]|}$$

The inverse transform is then applied to the new MIC 2 DFT amplitude, using the previous MIC 2 DFT phase. In this way MIC 2 is resynthesized so that the relationship

$$M_2(z) = M_1(z)\,Cz^{-1}$$

holds when only speech is occurring. This transformation can also be performed in the time domain, using a filter that emulates the properties of F as closely as possible. (For example, the Matlab function FFT2.M can be used with the calculated values of F[n] to design a suitable FIR filter.)
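The frequency-domain compensation can be illustrated with a small, dependency-free sketch. It assumes the per-bin factor is C times the MIC 1 magnitude divided by the MIC 2 magnitude, as the text describes, and that the original MIC 2 phase is kept. The plain O(N²) DFT, the function names, and the `eps` guard against empty bins are choices made here for illustration only.

```python
import cmath


def dft(x: list) -> list:
    """Plain O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum: list) -> list:
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def compensation_factors(m1_cal: list, m2_cal: list, c: float, eps: float = 1e-12) -> list:
    """Per-bin factor F = C * |M1| / |M2| from a broadband calibration recording.

    eps (an assumption of this sketch) avoids division by zero in empty bins.
    """
    spec1, spec2 = dft(m1_cal), dft(m2_cal)
    return [c * abs(a) / max(abs(b), eps) for a, b in zip(spec1, spec2)]


def compensate(m2: list, factors: list) -> list:
    """Rescale the MIC 2 magnitudes by the factors while keeping the MIC 2 phase."""
    scaled = [f * value for f, value in zip(factors, dft(m2))]  # real factor: phase unchanged
    return [sample.real for sample in idft(scaled)]             # imaginary residue is rounding noise


if __name__ == "__main__":
    mic1 = [1.0, 0.5, -0.3, 0.8, -0.6, 0.2, 0.4, -0.9]
    mic2 = [0.9, -0.2, 0.4, 0.1, -0.7, 0.3, 0.05, -0.15]
    factors = compensation_factors(mic1, mic2, c=0.5)
    print(compensate(mic2, factors))
```

After compensation, the magnitude of each MIC 2 bin equals C times the corresponding MIC 1 magnitude for the calibration recording. A time-domain alternative would design an FIR filter whose magnitude response approximates these factors.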

FIG. 11A is a plot of the difference in frequency response (in percent) between the microphones (4 cm apart) before compensation. FIG. 11B is a plot of the same difference after DFT compensation. FIG. 11C is a plot of the same difference after time-domain filter compensation. These plots demonstrate the effectiveness of the compensation methods described above. Thus, even with two very inexpensive omnidirectional or unidirectional microphones, both compensation methods restore the correct relationship between the microphones.

The transformation should remain relatively constant as long as the relative amplification and filtering do not change. The compensation process therefore needs to be performed only once, at the manufacturing stage. If desired, however, the algorithm can be set to operate under the assumption that H₂ = 0 until the system is used in an environment with very little noise and a strong signal. The compensation coefficients F[n] can then be calculated at that time and used from then on. Because denoising is not needed when there is very little noise, this calculation places no undue burden on the denoising algorithm. The denoising coefficients can also be updated whenever the noise environment is favorable for maximum accuracy.
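The deferred calibration strategy can be pictured as a small piece of state: operate with H₂ = 0 until a favorable window appears, then store the compensation coefficients and keep refreshing them in later favorable windows. The sketch below is hypothetical throughout; in particular, the class name, the use of a signal-to-noise estimate, and the 20 dB threshold are assumptions of this illustration and do not come from the text.

```python
class CompensationState:
    """Tracks whether compensation coefficients F[n] are available yet."""

    def __init__(self, snr_threshold_db: float = 20.0):
        # Threshold is an illustrative assumption: "very little noise and a strong signal".
        self.snr_threshold_db = snr_threshold_db
        self.factors = None  # None means: keep operating with H2 = 0

    @property
    def h2_is_zero(self) -> bool:
        """True while the algorithm should still assume H2 = 0."""
        return self.factors is None

    def update(self, snr_db: float, candidate_factors: list) -> bool:
        """Adopt new coefficients only when the window is favorable; return h2_is_zero."""
        if snr_db >= self.snr_threshold_db:
            self.factors = list(candidate_factors)
        return self.h2_is_zero


if __name__ == "__main__":
    state = CompensationState()
    print(state.update(5.0, [1.0, 1.0]))    # noisy window: coefficients not adopted
    print(state.update(25.0, [0.9, 1.1]))   # quiet window with strong signal: adopted
```

Keeping the previous coefficients during unfavorable windows means a noisy stretch never overwrites a calibration obtained under good conditions.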

Each of the blocks and steps depicted in the figures presented herein may itself include a sequence of operations that need not be described here. Those skilled in the relevant art can create routines, algorithms, source code, microcode, and program logic arrays, or otherwise implement the invention, based on the figures and the detailed description provided herein. The routines described herein can be provided in one or more of the following ways, or in one or more combinations of them: stored in non-volatile memory (not shown) that forms part of an associated processor or processors; stored on removable media such as disks; downloaded from a server and stored locally at a client; or hardwired or preprogrammed in chips such as EEPROM semiconductor chips, application-specific integrated circuits (ASICs), or digital signal processing (DSP) integrated circuits.

Claims

A method for removing noise from an electrical signal, comprising:

receiving a plurality of acoustic signals at a first receiving element;

receiving a plurality of acoustic signals at a second receiving element, wherein the plurality of acoustic signals include one or more noise signals generated by one or more noise sources and one or more voice signals generated by one or more signal sources, wherein the one or more signal sources comprise a speaker, and wherein the relative positions of the signal source, the first receiving element, and the second receiving element are known and fixed;

receiving physiological information related to the voice activity of the speaker, including whether voice activity is present;

generating one or more first transfer functions representing the plurality of acoustic noise signals upon determining that voice activity is absent from the plurality of acoustic signals for one or more specified periods;

generating one or more second transfer functions representing the plurality of acoustic signals upon determining that voice information is present in the plurality of acoustic signals during the one or more specified periods; and

removing noise from the plurality of acoustic signals using one or more combinations of the one or more first transfer functions and the one or more second transfer functions, thereby generating one or more noise-canceled data streams.

The method of claim 1, wherein the first receiving element and the second receiving element each comprise one microphone selected from the group consisting of a unidirectional microphone and an omnidirectional microphone.

The method of claim 1, wherein the plurality of acoustic signals are received in discrete time samples, and the first receiving element and the second receiving element are spaced apart by a distance d, where d corresponds to n discrete time samples.

The method of claim 1, wherein the at least one second transfer function is fixed as a function of the difference between the signal data amplitude of the first receiving element and the signal data amplitude of the second receiving element.

The method of claim 1, wherein removing noise from the plurality of acoustic signals comprises using the direction and range of one or more signal sources from the one or more first receiving elements.

The method of claim 1, wherein the frequency responses of the at least one first receiving element and the at least one second receiving element are different, and the signal data from the at least one second receiving element is compensated to have a proper relationship to the signal data from the at least one first receiving element.

The method of claim 6, wherein compensating the signal data from the at least one second receiving element comprises recording, at the at least one first receiving element and the at least one second receiving element, a wideband signal from a source located at the direction and position expected for a signal from the at least one signal source.

The method of claim 6, wherein compensating the signal data from the at least one second receiving element comprises a frequency domain compensation process.

The method of claim 8, wherein the frequency domain compensation process comprises:

computing a frequency transform of the signal data from each of the at least one first receiving element and the at least one second receiving element;

determining the magnitude of the frequency transform in each frequency bin; and

setting the magnitude of the frequency transform of the signal data from the at least one second receiving element in each frequency bin to a value related to the magnitude of the frequency transform of the signal data from the at least one first receiving element.

The method of claim 6, wherein compensating the signal data from the at least one second receiving element comprises a time domain compensation process.

The method of claim 6, further comprising:

initially setting the one or more second transfer functions to zero; and

calculating compensation coefficients whenever the one or more noise signals are smaller than the one or more voice signals.

The method of claim 1, wherein the plurality of acoustic signals comprise one or more reflections of the one or more noise signals and one or more reflections of the one or more voice signals.

The method of claim 1, wherein receiving physiological information comprises receiving physiological data related to human speech using one or more detectors selected from the group consisting of an acoustic microphone, a radio-frequency (RF) device, an electroglottograph, an ultrasonic device, an acoustic throat microphone, and an airflow detector.

The method of claim 1, wherein generating the at least one first transfer function and the at least one second transfer function comprises using at least one technique selected from the group comprising adaptive techniques and iterative techniques.

A system for removing noise from an acoustic signal, comprising one or more receivers, one or more sensors, and one or more processors, wherein:

the one or more receivers include one or more signal receivers that receive one or more acoustic signals from a signal source and one or more noise receivers that receive one or more noise signals from a noise source, the relative positions of the signal source, the one or more signal receivers, and the one or more noise receivers being fixed and known;

the one or more sensors receive physiological information related to human voice activity;

the one or more processors are coupled between the one or more receivers and the one or more sensors and generate a plurality of transfer functions;

one or more first transfer functions representing the one or more acoustic signals are generated upon determining that speech information is absent from the one or more acoustic signals for one or more specified time periods;

one or more second transfer functions representing the one or more acoustic signals are generated upon determining that speech information is present in the one or more acoustic signals for the one or more specified time periods; and

noise is removed from the one or more acoustic signals using at least one combination of the one or more first transfer functions and the one or more second transfer functions.

The system of claim 15, wherein the one or more sensors include one or more radio-frequency (RF) interferometers that detect tissue movement related to human speech.

The system of claim 15, wherein the one or more sensors comprise one or more sensors selected from the group consisting of acoustic microphones, radio-frequency (RF) devices, electroglottographs, ultrasonic devices, acoustic throat microphones, and airflow detectors.

The system of claim 15, wherein the one or more processors:

divide the acoustic data of the one or more acoustic signals into a plurality of subbands;

remove noise from each of the plurality of subbands using one or more combinations of the one or more first transfer functions and the one or more second transfer functions, thereby generating a plurality of noise-canceled acoustic data streams; and

combine the plurality of noise-canceled acoustic data streams to produce one or more noise-canceled acoustic data streams.

The system of claim 15, wherein the one or more signal receivers and the one or more noise receivers are each a microphone selected from the group consisting of unidirectional microphones and omnidirectional microphones.

A signal processing system connected between one or more users and one or more electrical components, comprising one or more first receiving elements, one or more second receiving elements, and one or more noise canceling subsystems, wherein:

the one or more first receiving elements are arranged to receive one or more acoustic signals from a signal source, and the one or more second receiving elements are arranged to receive one or more noise signals from a noise source, the relative positions of the signal source, the one or more first receiving elements, and the one or more second receiving elements being fixed and known;

the one or more noise canceling subsystems remove noise from the acoustic signals;

the noise canceling subsystem includes one or more processors and one or more sensors;

the one or more processors are coupled between the one or more first receiving elements and the one or more second receiving elements;

the one or more sensors are coupled to the one or more processors and are configured to receive physiological information related to human voice activity; and

the one or more processors generate a plurality of transfer functions and remove noise from the one or more acoustic signals using one or more combinations of the one or more first transfer functions and the one or more second transfer functions, thereby generating one or more noise-canceled data streams.

21. The signal processing system according to claim 20, wherein the first receiving element and the second receiving element are respective microphones selected from the group consisting of unidirectional microphones and omnidirectional microphones.

21. The apparatus of claim 20, wherein the one or more acoustic signals are received in separated time samples, wherein the first receiving element and the second receiving element are spaced apart by a distance d, wherein d is n Signal processing system, characterized in that the corresponding time sample.

21. The signal processing system of claim 20, wherein at least one second transfer function is fixed as a function of the difference between the signal data amplitude of the first receiving element and the signal data amplitude of the second receiving element.

21. The signal processing system of claim 20, wherein removing noise from the one or more acoustic signals utilizes directions and ranges for one or more signal sources from the one or more first receiving elements.

The signal processing system of claim 20,

wherein the frequency responses of the at least one first receiving element and the at least one second receiving element are different,

and wherein the signal data from the one or more second receiving elements is compensated to have an appropriate relationship to the signal data from the one or more first receiving elements.

The signal processing system of claim 25, wherein compensating signal data from the one or more second receiving elements comprises receiving one or more broadband signals from a source located in the direction and at the distance expected for a signal from the one or more signal sources, and recording the one or more broadband signals at the one or more second receiving elements.

The signal processing system of claim 25, wherein compensating signal data from the one or more second receiving elements comprises frequency domain compensation.

The signal processing system of claim 27, wherein the frequency domain compensation comprises computing the magnitude of the frequency transform at each frequency bin.
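Computing the magnitude of a frequency transform at each frequency bin can be sketched with a direct discrete Fourier transform. The example is a minimal illustration using only the Python standard library; the function name and the 8-sample test frame are hypothetical, and the sketch stops at the magnitudes because the claim text here does not state how they are then applied.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return |X[k]| for every bin k of the DFT of `samples` (O(N^2), for clarity)."""
    n = len(samples)
    magnitudes = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(samples):
            # Correlate the frame with the complex exponential for bin k.
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        magnitudes.append(abs(acc))
    return magnitudes


# Hypothetical frame: one full cosine cycle across 8 samples.
frame = [math.cos(2 * math.pi * t / 8) for t in range(8)]
magnitudes = dft_magnitudes(frame)
```

For this frame the energy falls in bin 1 and its mirror, bin 7, and the remaining bins are zero up to rounding. A practical system would use a fast Fourier transform rather than this quadratic loop.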

The signal processing system of claim 25, wherein compensating signal data from the one or more second receiving elements comprises time domain compensation.

The signal processing system of claim 25, wherein the compensation comprises

initially setting the one or more second transfer functions to zero, and

calculating compensation coefficients whenever the one or more noise signals are smaller than the one or more acoustic signals.

The signal processing system of claim 20, wherein the one or more acoustic signals comprise one or more reflections of one or more noise signals and one or more reflections of one or more acoustic signals.

The signal processing system of claim 20, wherein receiving physiological information comprises receiving physiological data related to human speech using one or more detectors selected from the group consisting of acoustic microphones, RF devices, electroglottographs, ultrasonic devices, acoustic throat microphones, and airflow detectors.

The signal processing system of claim 20, wherein generating at least one first transfer function and at least one second transfer function utilizes one or more techniques selected from the group comprising adaptive techniques and iterative techniques.
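One widely used adaptive technique is the normalized least-mean-squares (NLMS) filter. The sketch below is an assumption chosen for illustration, not a method named in the claims: it estimates FIR coefficients relating one receiving element's samples to another's, and it adapts only on samples where a hypothetical `update` flag (standing in for a voice activity decision) is true. All names, the step size, and the synthetic data are illustrative.

```python
import random


def nlms_estimate(reference, target, taps=4, mu=0.5, update=None, eps=1e-9):
    """Estimate FIR weights w so that target[t] ~ sum_k w[k] * reference[t - k].

    `update` is an optional list of booleans; weights adapt only where it is True.
    Weights start at zero.
    """
    w = [0.0] * taps
    for t in range(len(reference)):
        # Most recent `taps` reference samples, newest first, zero-padded at the start.
        x = [reference[t - k] if t - k >= 0 else 0.0 for k in range(taps)]
        err = target[t] - sum(wk * xk for wk, xk in zip(w, x))
        if update is None or update[t]:
            norm = eps + sum(xk * xk for xk in x)
            w = [wk + mu * err * xk / norm for wk, xk in zip(w, x)]
    return w


# Synthetic check: the target is the reference passed through a known 2-tap filter.
rng = random.Random(0)
ref = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_h = [0.5, -0.25]
tgt = [
    sum(h * (ref[t - k] if t - k >= 0 else 0.0) for k, h in enumerate(true_h))
    for t in range(len(ref))
]
estimate = nlms_estimate(ref, tgt, taps=4)
```

On this noise-free synthetic data the first two estimated weights converge toward 0.5 and -0.25 and the remaining weights toward zero. Passing an `update` list that is false everywhere leaves the weights at their initial zeros, which is how gating by a voice activity decision would freeze adaptation.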