KR20080106625A

KR20080106625A - Apparatus and method for removing residual noise

Info

Publication number: KR20080106625A
Application number: KR1020070054258A
Authority: KR
Inventors: 이상신; 유재황; 임종태; 박성수
Original assignee: 에스케이 텔레콤주식회사
Priority date: 2007-06-04
Filing date: 2007-06-04
Publication date: 2008-12-09
Also published as: KR100890708B1

Abstract

An apparatus and a method for removing residual noise are provided that subtract, from a speaker voice prediction signal, a noise prediction signal scaled by a noise ratio, thereby removing noise-induced peaks from the speaker voice prediction signal and minimizing performance deterioration in a real environment. A peak classification unit (21) determines whether a peak in the speaker voice prediction signal Y1 is due to the speaker's voice or due to noise. A noise ratio measuring unit (23) measures the ratio of the noise magnitude contained in Y1 to the noise magnitude contained in the noise prediction signal Y2. A comparison unit (25) compares the peak value of Y2 with a preset threshold. A signal output unit (27) outputs Y1 as the output signal when the peak value of Y2 is smaller than the threshold, and outputs Y1 minus Y2 scaled by the noise ratio when the peak value of Y2 is larger than the threshold.

Description

Residual Noise Reduction Device and Method {APPARATUS AND METHOD FOR REMOVING RESIDUAL NOISE}

FIG. 1 schematically shows the internal configuration of an independent component analysis processing module that uses two microphones, as one example of a signal separation device.

FIG. 2 schematically shows a residual noise removing device according to the prior art.

FIG. 3 schematically shows the connection between a residual noise removing device according to the present invention and a signal separation device.

FIG. 4 shows an example waveform of Y₂, the noise prediction signal output from the signal separation device.

FIG. 5 schematically shows the configuration of a residual noise removing device according to an embodiment of the present invention.

FIG. 6 is a flowchart illustrating a residual noise removing method according to an embodiment of the present invention.

*** Explanation of reference numerals for the main parts of the drawings ***

10. Signal separation device, 20. Residual noise removing device, 21. Peak classification unit,

23. Noise ratio measuring unit, 25. Comparison unit, 27. Signal output unit

The present invention relates to an apparatus and method for removing residual noise, and more particularly to an apparatus and method for removing residual noise that can reduce the influence of noise peaks.

Sound quality is an important factor in voice calls. Mobile phones, for example, usually use low-bit-rate voice codecs in order to minimize network load and increase network capacity.

Accordingly, mobile telephony has so far improved delivered sound quality by improving the voice codec, or has increased capacity through wireless network expansion and engineering.

Each voice codec used in mobile phones has its own supported bit rate. WCDMA (Wideband CDMA), the European mobile phone technology, currently uses the 12.2 kbps AMR (Adaptive Multi-Rate) voice codec, and synchronous CDMA (Code Division Multiple Access) uses the 8 kbps EVRC (Enhanced Variable Rate Codec) voice codec.

QCELP (Qualcomm Code Excited Linear Prediction), which was adopted as the standard in the early days of synchronous CDMA mobile telephony, has two modes, 8 kbps and 13 kbps, but the 8 kbps codec suffered from severe sound quality degradation and was replaced by the EVRC codec.

Further voice codec technologies have been announced since the EVRC codec was adopted, but they have not been applied to commercial networks. Instead, wireless network base stations and repeaters are added and engineering is performed in order to improve voice delivery reliability, increase the number of simultaneous callers, and enable voice calls in shadow areas.

However, even if sound quality is improved and dropped calls are minimized through voice codecs and wireless network expansion and engineering, a call made in a noisy environment transmits the speaker's voice mixed with the noise, so call quality inevitably degrades.

Accordingly, signal separation techniques have been proposed to improve call quality in noisy environments. They use information from multiple microphones to focus the microphones in the direction of the speaker's voice, to separate the speaker's voice from other noise signals, or to predict and remove noise. Such techniques include beamforming, independent component analysis, and spectral subtraction.

Beamforming uses multiple directional microphones to form a microphone beam toward the speaker's mouth, minimizing surrounding noise while amplifying the speaker's voice. Independent component analysis assumes that the speaker's voice and the noise are statistically independent, and separates the two signals by processing them so that the statistical dependence between voice and noise is minimized. Spectral subtraction estimates the noise spectrum and removes noise by subtracting the estimate from the noisy signal.

Of these methods, independent component analysis has attracted attention with advances in digital signal processing. It exploits the mutual independence of the speaker's voice and the ambient noise components and separates the signals so that statistical independence is maximized.

FIG. 1 schematically shows the internal configuration of an independent component analysis processing module that uses two microphones, as one example of a signal separation device. X₁ and X₂ are the two microphone inputs, each a mixture of the speaker's voice and various ambient noises. The two signals differ in phase and gain according to the distance between the microphones. Ignoring phase, X₁ and X₂ can each be expressed as in Equation 1 (a similar form holds when phase is taken into account).

In Equation 1, j = 1, 2, and α_j and β_j are gain components determined by the distance between the microphone and the speaker's mouth and by the distance between the microphone and the noise source, respectively.
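
Equation 1 itself is not reproduced in this text. A form consistent with the definitions above might look like the following, where S (the speaker's voice) and N (the ambient noise) are symbols assumed here for illustration and do not appear in the original:

```latex
X_j = \alpha_j S + \beta_j N, \qquad j = 1, 2
```

Under this reading, each microphone picks up the same two sources with different gains, which is the linear mixing model that independent component analysis usually starts from.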

In FIG. 1, Y₁ and Y₂ are the outputs of the independent component analysis processing module, signals in which the speaker's voice and the ambient noise have been separated (here Y₁ is assumed to be the signal predicted to be the speaker's voice and Y₂ the signal predicted to be the ambient noise). W₁, W₂, C₁, and C₂ are (adaptive) filters for the independent component analysis processing, and each filter outputs Y₁ and Y₂ using a criterion that maximizes the independence of X₁ and X₂.

If W₁, W₂, C₁, and C₂ were ideal adaptive filters, the outputs Y₁ and Y₂ would be separated exactly into the speaker's voice and the noise component. In a real environment, however, the limits of the algorithm and the complexity of the hardware and software make it impossible to remove the ambient noise completely.

Therefore, Y₁ and Y₂ can be expressed as in Equation 2.

In general, α₁ ≫ β₁ for Y₁ and α₂ ≪ β₂ for Y₂.
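
Equation 2 is likewise not reproduced in this text. Given that the description reuses the gain symbols α_j and β_j for the separated outputs, one plausible form is sketched below; S and N are again assumed symbols for the speaker's voice and the ambient noise, not notation taken from the original:

```latex
Y_j = \alpha_j S + \beta_j N, \qquad j = 1, 2,
\qquad \alpha_1 \gg \beta_1, \quad \alpha_2 \ll \beta_2
```

Read this way, the term β₁N is the residual noise left in the speaker voice prediction signal Y₁, and α₂S is the small voice component left in the noise prediction signal Y₂.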

As Equation 2 shows, residual noise remains in the signal output through the filters. A residual noise removing device, a post-processor as shown in FIG. 2, is therefore generally placed after the signal separation device to remove the residual noise further. Conventionally, a spectral subtractor is used as the residual noise removing device; it estimates the noise spectrum and subtracts the noise.

A spectral subtractor may predict the noise spectrum precisely using a Wiener filter or the like, but it usually takes the simple approach of measuring the noise power during silent intervals and subtracting a constant amount.
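
The simple variant described above, measuring noise power in silent intervals and subtracting a constant amount, could be sketched roughly as follows. This is an illustrative sketch only: the function names, the per-bin power representation, and the clamping floor are assumptions, not details given in the text.

```python
def estimate_noise_power(silent_frames):
    """Average per-bin noise power over frames assumed to contain no speech."""
    n_bins = len(silent_frames[0])
    return [sum(frame[k] for frame in silent_frames) / len(silent_frames)
            for k in range(n_bins)]


def subtract_noise_power(frame_power, noise_power, floor=0.0):
    """Subtract the fixed noise estimate from one frame, clamping at `floor`."""
    return [max(p - n, floor) for p, n in zip(frame_power, noise_power)]


# Hypothetical per-bin power values, chosen only for illustration.
silent = [[1.0, 2.0, 1.0], [3.0, 2.0, 1.0]]
noise = estimate_noise_power(silent)
cleaned = subtract_noise_power([10.0, 1.5, 4.0], noise)
print(cleaned)
```

Because the noise estimate is fixed between silent intervals, a sudden noise peak that is much larger than the estimate passes through almost unchanged, which is the weakness the description goes on to address.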

Despite this additional subtraction, when the ambient noise is loud music or a sound such as a car horn, noise peaks are not removed properly when they occur and produce an effect like impulse noise, which is unpleasant to the listener.

The present invention has been devised to solve the above problem. Its object is to provide a residual noise removing apparatus and method that receive the noise prediction signal as well as the speaker voice prediction signal as inputs and, when the absolute value of the noise prediction signal exceeds a preset threshold, subtract the noise prediction signal scaled by a noise ratio from the speaker voice prediction signal, so that noise-induced peaks can be removed from the speaker voice prediction signal.

To achieve this object, a residual noise removing apparatus according to an embodiment of the present invention preferably comprises: a peak classification unit that receives, as inputs, a speaker voice prediction signal Y₁ and a noise prediction signal Y₂ from a signal separation device that separates a speaker's voice and noise from an input signal and outputs them, and determines whether a peak contained in Y₁ is a peak due to the speaker's voice or a peak due to noise; a noise ratio measuring unit that, once the peak classification unit has classified the peak contained in Y₁ as either a peak due to the speaker's voice or a peak due to noise, measures a noise ratio as the ratio of the noise magnitude contained in Y₁ to the noise magnitude contained in Y₂; a comparison unit that compares the peak value of Y₂ with a preset threshold; and a signal output unit that, according to the comparison result of the comparison unit, outputs Y₁ as the output signal when the peak value of Y₂ is smaller than the preset threshold, and outputs as the output signal the signal obtained by subtracting Y₂ scaled by the noise ratio from Y₁ when the peak value of Y₂ is larger than the preset threshold.

Furthermore, the peak classification unit preferably classifies the peak contained in Y₁ by comparing the peak value of Y₁ (|Y₁|) with the peak value of Y₂ (|Y₂|): if the difference between the peak value of Y₁ and the peak value of Y₂ is greater than 0, it judges the peak contained in Y₁ to be a peak due to the speaker's voice, and if the difference is less than 0, it judges the peak contained in Y₁ to be a peak due to noise.

Meanwhile, a residual noise removing method according to an embodiment of the present invention preferably comprises: a step in which a peak classification unit, having received as inputs a speaker voice prediction signal Y₁ and a noise prediction signal Y₂ from a signal separation device that separates a speaker's voice and noise from an input signal and outputs them, compares the peak value of Y₁ with the peak value of Y₂ and determines whether a peak contained in Y₁ is a peak due to the speaker's voice or a peak due to noise; a step of measuring a noise ratio as the ratio of the noise magnitude contained in Y₁ to the noise magnitude contained in Y₂; and a step of comparing the peak value of Y₂ with a preset threshold, outputting Y₁ as the output signal when the peak value of Y₂ is smaller than the threshold, and outputting as the output signal the signal obtained by subtracting Y₂ scaled by the noise ratio from Y₁ when the peak value of Y₂ is larger than the threshold.

A residual noise removing apparatus and method according to preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

Before the description, note that in the embodiments of the present invention, of the signals output from the signal separation device, Y₁ is assumed to be the speaker voice prediction signal and Y₂ the noise prediction signal.

FIG. 3 schematically shows the connection between the residual noise removing device according to the present invention and the signal separation device. The residual noise removing device (20) receives from the signal separation device (10) not only the speaker voice prediction signal Y₁ but also the noise prediction signal Y₂, and processes both.

The noise prediction signal Y₂ consists mainly of noise, with the speaker's voice remaining only as a small component. Therefore, subtracting the noise prediction signal from Y₁ in the interval corresponding to the interval where Y₂ has a peak can greatly reduce the impulse noise component.

That is, the interval in which Y₁, the output signal of the signal separation device (10), has a noise peak coincides with the interval in which Y₂ has a noise peak.

Accordingly, the components of Y₁ that can cause impulse noise can be predicted from Y₂.

FIG. 4 shows an example waveform of Y₂, the noise prediction signal output from the signal separation device. The peaks (maxima or minima) of the waveform mark the intervals in which noise strongly affects the speaker's voice.

These peaks also take values above a certain level in Y₁, the signal predicted to be the speaker's voice. As FIG. 4 shows, such peaks occupy only a fraction of the whole noise interval, so subtracting only when they occur still allows the speaker's voice to be delivered more naturally.

FIG. 5 schematically shows the configuration of the residual noise removing device according to an embodiment of the present invention, which comprises a peak classification unit (21), a noise ratio measuring unit (23), a comparison unit (25), and a signal output unit (27).

In this configuration, the peak classification unit (21) receives the speaker voice prediction signal Y₁ and the noise prediction signal Y₂ from the signal separation device (10) as inputs and determines whether a peak contained in Y₁ is a peak due to the speaker's voice or a peak due to noise.

A peak here means a point where the absolute value of the input signal is larger than a predetermined threshold. The peak classification unit (21) compares the peak value of Y₁ (|Y₁|) with the peak value of Y₂ (|Y₂|) and, from the maximum or minimum peaks of each signal, distinguishes peaks due to the speaker's voice from peaks due to noise.

That is, since α₁ ≫ β₁ for the speaker voice prediction signal Y₁ and α₂ ≪ β₂ for the noise prediction signal Y₂ (α_j and β_j being the gain components determined by the microphone-to-mouth and microphone-to-noise distances), comparing the peaks of Y₁ with the peaks of Y₂ makes it possible to tell whether a peak contained in Y₁ is due to the speaker's voice or due to noise.

For example, if the difference between the peak value of Y₁ and the peak value of Y₂ is greater than 0, the peak contained in Y₁ can be judged to be a peak due to the speaker's voice; if the difference is less than 0, it can be judged to be a peak due to noise.
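
The decision rule above can be sketched for a single pair of time-aligned samples. This is a minimal illustration under stated assumptions: the function name, the string labels, the per-sample comparison, and the handling of a difference of exactly 0 are all hypothetical, since the text does not specify them.

```python
def classify_peak(y1, y2, threshold):
    """Classify one time-aligned sample pair from Y1 and Y2.

    Returns "none" when |y1| does not exceed the threshold (no peak),
    "voice" when |y1| - |y2| > 0, and "noise" when |y1| - |y2| < 0.
    """
    a1, a2 = abs(y1), abs(y2)
    if a1 <= threshold:
        return "none"
    if a1 - a2 > 0:
        return "voice"
    if a1 - a2 < 0:
        return "noise"
    # The text does not say how a difference of exactly 0 is handled.
    return "undecided"


print(classify_peak(0.9, 0.2, threshold=0.5))    # peak dominated by Y1
print(classify_peak(0.6, -0.95, threshold=0.5))  # peak dominated by Y2
```

Absolute values are used so that negative-going peaks (the minima mentioned above) are treated the same way as positive ones.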

Meanwhile, once the peak classification unit (21) has classified the peak contained in Y₁ as a peak due to the speaker's voice or a peak due to noise, the noise ratio measuring unit (23) obtains the ratio of the magnitude of the noise contained in Y₁ to the magnitude of the noise contained in Y₂.

That is, the noise ratio κ, defined as this ratio, is obtained and stored.

The noise ratio (κ) can be obtained in various ways. As an approximation, an estimate of the noise ratio can be obtained as in Equation 3 in a silent interval, that is, where |Y₁| ≪ |Y₂|.

In Equation 3, n is the observation interval.
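
The exact form of Equation 3 is not reproduced in this text. One simple estimator that matches the description, offered here only as an assumption, divides the summed magnitude of Y₁ by the summed magnitude of Y₂ over a silent observation interval. The function name and the `eps` guard are hypothetical.

```python
def estimate_noise_ratio(y1, y2, eps=1e-12):
    """Estimate kappa as sum(|y1|) / sum(|y2|) over a silent observation interval."""
    if len(y1) != len(y2):
        raise ValueError("y1 and y2 must cover the same observation interval")
    num = sum(abs(v) for v in y1)
    den = sum(abs(v) for v in y2)
    if den < eps:
        raise ValueError("Y2 is (near) zero over the interval; kappa is undefined")
    return num / den


# Hypothetical silent-interval samples: Y1 holds only residual noise here.
y1_silent = [0.25, -0.5, 0.25, -0.5]
y2_silent = [1.0, -2.0, 1.0, -2.0]
kappa = estimate_noise_ratio(y1_silent, y2_silent)
print(kappa)
```

In a silent interval the speaker's voice is absent, so both signals contain essentially only noise and their magnitude ratio approximates the noise ratio.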

Meanwhile, the comparison unit (25) compares the peak value of Y₂ with a preset threshold and determines whether the peak value of Y₂ is larger than the threshold.

According to the comparison result of the comparison unit (25), the signal output unit (27) operates as in Equation 4: when the peak value of Y₂ is smaller than the preset threshold, it outputs the speaker voice prediction signal Y₁ received from the signal separation device (10) unchanged as the output signal Z; when the peak value of Y₂ is larger than the preset threshold, it outputs as the output signal Z the signal obtained by subtracting Y₂ scaled by the noise ratio from Y₁.
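
Equation 4 is not reproduced in this text, but the rule it describes can be written out from the sentence above. In the sketch below, T is an assumed symbol for the preset threshold and κ is the noise ratio:

```latex
Z =
\begin{cases}
Y_1, & |Y_2| < T \\
Y_1 - \kappa\, Y_2, & |Y_2| > T
\end{cases}
```

The description does not state which branch applies when |Y₂| equals T exactly.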

FIG. 6 is a flowchart illustrating the residual noise removing method according to an embodiment of the present invention.

First, when the peak classification unit (21) receives the speaker voice prediction signal Y₁ and the noise prediction signal Y₂ from the signal separation device (10) as inputs (S10), it determines whether the difference between the peak value of Y₁ and the peak value of Y₂ is greater than 0 (S12).

If step S12 finds that the difference between the peak value of Y₁ and the peak value of Y₂ is greater than 0, the peak contained in Y₁ is judged to be a peak due to the speaker's voice (S14); if the difference is less than 0, the peak contained in Y₁ is judged to be a peak due to noise (S16).

Next, the noise ratio measuring unit (23) obtains the noise ratio (κ), the ratio of the magnitude of the noise contained in the Y₁ signal to the magnitude of the noise contained in the corresponding interval of Y₂ (S18), and the comparison unit (25) compares the peak value of Y₂ with the preset threshold to determine whether the peak value of Y₂ is larger than the threshold (S20).

If step S20 finds that the peak value of Y₂ is smaller than the preset threshold, Y₁ is output as the output signal Z (S22); if the peak value of Y₂ is larger than the preset threshold, the signal obtained by subtracting Y₂ scaled by the noise ratio (κ) from Y₁ is output as the output signal Z (S24).
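
The output stage of the flowchart (S20 to S24) can be sketched as a per-sample loop. This is a rough illustration, not the patented implementation: it assumes sample-wise processing of time-aligned lists, a precomputed κ, and pass-through when |Y₂| equals the threshold, none of which is specified in the text. The function name is hypothetical.

```python
def remove_residual_noise(y1, y2, kappa, threshold):
    """Return Z: y1 - kappa * y2 where |y2| exceeds the threshold, else y1."""
    if len(y1) != len(y2):
        raise ValueError("y1 and y2 must be time-aligned and equally long")
    z = []
    for s1, s2 in zip(y1, y2):
        if abs(s2) > threshold:
            # S24: noise peak detected in Y2, subtract the scaled noise estimate.
            z.append(s1 - kappa * s2)
        else:
            # S22: no noise peak, pass Y1 through unchanged.
            # Assumption: the equality case is also passed through.
            z.append(s1)
    return z


# Hypothetical samples: only the middle sample of Y2 exceeds the threshold.
y1 = [0.5, 1.0, -0.25]
y2 = [0.5, 4.0, -0.5]
z = remove_residual_noise(y1, y2, kappa=0.25, threshold=1.0)
print(z)
```

Because subtraction happens only where Y₂ peaks, the rest of Y₁ is left untouched, which matches the stated aim of delivering the speaker's voice more naturally.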

The residual noise removing apparatus and method of the present invention are not limited to the embodiments described above and may be modified in various ways within the scope permitted by the technical idea of the present invention. That is, the residual noise removing device according to the present invention covers every post-processor that is installed after a signal separation device that separates signal from noise, receives the noise prediction signal together with the speaker voice prediction signal, detects the residual noise contained in the speaker voice prediction signal, and removes that residual noise to improve sound quality.

According to the residual noise removing apparatus and method of the present invention described above, the noise prediction signal is received as well as the speaker voice prediction signal, and when the absolute value of the noise prediction signal exceeds a preset threshold, the noise prediction signal scaled by the noise ratio is subtracted from the speaker voice prediction signal. Noise-induced peaks can thus be removed from the speaker voice prediction signal, minimizing the performance degradation that occurs in real environments.

Claims

A residual noise removing apparatus comprising: a peak classification unit that receives, as inputs, a speaker voice prediction signal Y₁ and a noise prediction signal Y₂ from a signal separation device that separates a speaker's voice and noise from an input signal and outputs them, and determines whether a peak contained in Y₁ is a peak due to the speaker's voice or a peak due to noise;

a noise ratio measuring unit that, when the peak classification unit has classified the peak contained in Y₁ as either a peak due to the speaker's voice or a peak due to noise, measures a noise ratio as the ratio of the noise magnitude contained in Y₁ to the noise magnitude contained in Y₂;

a comparison unit that compares the peak value of Y₂ with a preset threshold; and

a signal output unit that, according to the comparison result of the comparison unit, outputs Y₁ as an output signal when the peak value of Y₂ is smaller than the preset threshold, and outputs as the output signal a signal obtained by subtracting Y₂ scaled by the noise ratio from Y₁ when the peak value of Y₂ is larger than the preset threshold.

The apparatus of claim 1, wherein the peak classification unit

classifies the peak contained in Y₁ by comparing the peak value of Y₁ (|Y₁|) with the peak value of Y₂ (|Y₂|),

judging the peak contained in Y₁ to be a peak due to the speaker's voice if the difference between the peak value of Y₁ and the peak value of Y₂ is greater than 0, and judging the peak contained in Y₁ to be a peak due to noise if the difference between the peak value of Y₁ and the peak value of Y₂ is less than 0.

A residual noise removing method comprising: a step in which a peak classification unit, having received as inputs a speaker voice prediction signal Y₁ and a noise prediction signal Y₂ from a signal separation device that separates a speaker's voice and noise from an input signal and outputs them, compares the peak value of Y₁ with the peak value of Y₂ and determines whether a peak contained in Y₁ is a peak due to the speaker's voice or a peak due to noise;

a step of measuring a noise ratio as the ratio of the noise magnitude contained in Y₁ to the noise magnitude contained in Y₂; and

a step of comparing the peak value of Y₂ with a preset threshold, outputting Y₁ as an output signal when the peak value of Y₂ is smaller than the threshold, and outputting as the output signal a signal obtained by subtracting Y₂ scaled by the noise ratio from Y₁ when the peak value of Y₂ is larger than the threshold.