KR102156102B1 - Apparatus and method for noise reduction of bone conduction speech signal

Info

Publication number: KR102156102B1
Application number: KR1020190024152A
Authority: KR
Inventors: 김명남; 조진호; 나승대
Original assignee: 경북대학교 산학협력단
Priority date: 2018-02-28
Filing date: 2019-02-28
Publication date: 2020-09-15
Also published as: KR20190104002A

Abstract

Disclosed are an apparatus, a method, and a recording medium for removing noise from a bone conduction speech signal, which can effectively remove the noise by means of an entropy gate and a correlation gate. A method for removing noise from a bone conduction speech signal according to an embodiment of the present invention includes: collecting a bone conduction speech signal with a bone conduction vibration sensor; dividing the bone conduction speech signal into a plurality of first bands by wavelet packet decomposition; calculating an energy for each of the first bands and calculating an energy threshold based on the calculated energies; determining, based on the energy threshold, second bands that contain a speech section among the plurality of first bands; calculating an entropy for each of the second bands and generating an entropy gate based on the entropies of the second bands; calculating a correlation of the second bands and generating a correlation gate based on the correlation; removing noise from the second bands by means of the entropy gate and the correlation gate; and synthesizing the noise-removed second bands to generate a noise-removed bone conduction speech signal.

Description

Apparatus and method for noise reduction of bone conduction speech signal

The present invention relates to an apparatus and method for removing noise from a bone conduction speech signal and, more particularly, to an apparatus and method capable of effectively removing noise from a bone conduction speech signal by means of an entropy gate and a correlation gate.

In various speech signal processing applications, degradation of speech recognition caused by external environmental noise has recently been recognized as an important problem to be solved, and high-performance speech enhancement technology is required. Speech enhancement is a technique that, when a speech signal is input contaminated by ambient noise, removes the noise and reinforces the speech to improve the signal. It can improve the communication quality of voice communication devices used in extreme working environments or military operations, and can raise speech recognition or speaker recognition performance during human-device interaction in smart devices and in medical devices such as implantable hearing aids. It may also be used in audio equipment such as headsets and digital hearing aids to suppress background noise and improve sound quality.

To reduce the effect of noise, various noise reduction and speech enhancement techniques have been studied and are used across speech signal processing fields. For example, wavelet transform based algorithms are being studied for noise removal and speech enhancement. However, conventional noise removal algorithms have focused mainly on removing noise from air conduction speech signals, and such methods may not effectively remove noise from bone conduction speech signals.

A bone conduction speech signal is transmitted through an auditory path different from that of an air conduction speech signal and, because of the difference in transmission medium, has noise characteristics different from those of an air conduction speech signal. A bone conduction speech signal reaches the cochlea through the skin and skull by vibration at a specific site such as the mastoid process or the temple. Because of this difference in transmission medium, a bone conduction hearing aid transmits speech more efficiently than an air conduction hearing aid in the same noisy environment; in extreme environments, however, airborne sound vibration enters the hearing aid through the skin. For this reason, when noise removal technology designed for air conduction speech signals is applied to a bone conduction speech signal, the noise may not be removed properly, the loss of bone conduction speech information may increase, and speech continuity may be difficult to secure.

An object of the present invention is to provide an apparatus, a method, and a recording medium for removing noise from a bone conduction speech signal, capable of effectively removing the noise by means of an entropy gate and a correlation gate.

The problems to be solved by the present invention are not limited to those mentioned above. Other technical problems not mentioned will be clearly understood from the following description by those of ordinary skill in the art to which the present invention pertains.

A method for removing noise from a bone conduction speech signal according to one aspect of the present invention includes: collecting a bone conduction speech signal with a bone conduction vibration sensor; dividing the bone conduction speech signal into a plurality of first bands by wavelet packet decomposition; calculating an energy for each of the first bands and calculating an energy threshold based on the calculated energies; determining, based on the energy threshold, second bands that contain a speech section among the plurality of first bands; calculating an entropy for each of the second bands and generating an entropy gate based on the entropies of the second bands; calculating a correlation of the second bands and generating a correlation gate based on the correlation; removing noise from the second bands by means of the entropy gate and the correlation gate; and synthesizing the noise-removed second bands to generate a noise-removed bone conduction speech signal.

The generating of the entropy gate may include: calculating an average value of the entropies of the second bands and a log value of the entropy of each of the second bands; and generating the entropy gate based on the average value and the log value.

In the generating of the entropy gate, the entropy gate may be generated according to Equation 1 below.

[Equation 1]

In Equation 1, EG(t) is the entropy gate, the entropy term is the entropy of each of the second bands, and N (= k) is the number of the second bands.

The removing of the noise may include: primarily removing noise of the second bands by means of the entropy gate; and secondarily removing, by means of the correlation gate, noise of the second bands that have passed through the entropy gate.

In the generating of the correlation gate, the correlation gate may be generated based on the correlation of the second bands and on a second band that has passed through the entropy gate.

In the generating of the correlation gate, the correlation gate may be generated according to Equation 2 below.

[Equation 2]

In Equation 2, CG(t) is the correlation gate, Ψ_ck is the k-th second band among the second bands, m is the average value of the second bands, n is the number of the second bands, and Ψ_EG is a second band that has passed through the entropy gate.

In the generating of the correlation gate, the correlation gate may be generated based on the correlation of the second bands and on a feature band having the highest similarity to bone conduction speech characteristics.

In another embodiment of the present invention, the removing of the noise may include: primarily removing noise of the second bands by means of the correlation gate; and secondarily removing, by means of the entropy gate, noise of the second bands that have passed through the correlation gate.

The dividing into the plurality of first bands may include: dividing the bone conduction speech signal into a low frequency band and a high frequency band; and dividing each of the low frequency band and the high frequency band so that the bone conduction speech signal is divided into the plurality of first bands. The feature band may include the low frequency band.
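The repeated low/high split described above can be sketched as a small wavelet packet tree. This is only an illustration under stated assumptions: it uses a Haar filter pair and a uniform tree, whereas the patent does not name the wavelet and its 20-band layout is non-uniform. The names `haar_split` and `wavelet_packet` are hypothetical.

```python
import math

def haar_split(x):
    """Split an even-length signal into low- and high-frequency halves (Haar, assumed)."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return low, high

def wavelet_packet(x, levels):
    """Split every band again at each level, giving 2**levels bands.

    Bands come out in filter-bank (natural) order, which is not strictly
    ascending frequency order.
    """
    bands = [list(x)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            low, high = haar_split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands
```

For example, `wavelet_packet(signal, 2)` returns four bands, each a quarter of the input length. A non-uniform layout such as the 20-band one would split only some of the bands at the deeper levels.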

According to another aspect of the present invention, there is provided a computer-readable recording medium on which a program for executing the method for removing noise from a bone conduction speech signal is recorded.

According to still another aspect of the present invention, there is provided an apparatus for removing noise from a bone conduction speech signal, the apparatus including: a bone conduction speech signal collection unit that collects a bone conduction speech signal with a bone conduction vibration sensor; a wavelet packet decomposition unit that divides the bone conduction speech signal into a plurality of first bands by wavelet packet decomposition; an energy threshold calculation unit that calculates an energy for each of the first bands and calculates an energy threshold based on the calculated energies; a band determination unit that determines, based on the energy threshold, second bands that contain a speech section among the plurality of first bands; an entropy gate generation unit that calculates an entropy for each of the second bands and generates an entropy gate based on the entropies of the second bands; a correlation gate generation unit that calculates a correlation of the second bands and generates a correlation gate based on the correlation; a noise removal unit that removes noise from the second bands by means of the entropy gate and the correlation gate; and a synthesis unit that synthesizes the noise-removed second bands to generate a noise-removed bone conduction speech signal.

The entropy gate generation unit may calculate an average value of the entropies of the second bands and a log value of the entropy of each of the second bands, and may generate the entropy gate based on the average value and the log value.

The noise removal unit may primarily remove noise of the second bands by means of the entropy gate, and may secondarily remove, by means of the correlation gate, noise of the second bands that have passed through the entropy gate.

The correlation gate generation unit may generate the correlation gate based on the correlation of the second bands and on a second band that has passed through the entropy gate.

The correlation gate generation unit may generate the correlation gate based on the correlation of the second bands and on a feature band having the highest similarity to bone conduction speech characteristics.

In another embodiment of the present invention, the noise removal unit may primarily remove noise of the second bands by means of the correlation gate, and may secondarily remove, by means of the entropy gate, noise of the second bands that have passed through the correlation gate.

The wavelet packet decomposition unit may divide the bone conduction speech signal into a low frequency band and a high frequency band, and may divide each of the low frequency band and the high frequency band so that the bone conduction speech signal is divided into the plurality of first bands. The feature band may include the low frequency band.

According to the present invention, there are provided an apparatus, a method, and a recording medium for removing noise from a bone conduction speech signal, capable of effectively removing the noise by means of an entropy gate and a correlation gate.

The effects of the present invention are not limited to those described above. Effects not mentioned will be clearly understood from this specification and the accompanying drawings by those of ordinary skill in the art to which the present invention pertains.

FIG. 1 is a conceptual diagram of a method for removing noise from a bone conduction speech signal according to an embodiment of the present invention.
FIG. 2 is a flowchart of a method for removing noise from a bone conduction speech signal according to an embodiment of the present invention.
FIG. 3 is a block diagram of an apparatus for removing noise from a bone conduction speech signal according to an embodiment of the present invention.
FIG. 4 is a conceptual diagram showing wavelet packet decomposition of a bone conduction speech signal according to an embodiment of the present invention.
FIG. 5 is a block diagram of a speech enhancement unit of an apparatus for removing noise from a bone conduction speech signal according to an embodiment of the present invention.
FIG. 6 is a schematic diagram of an experimental model for evaluating the performance of an apparatus for removing noise from a bone conduction speech signal according to an embodiment of the present invention.
FIG. 7 is a view showing a noise removal result of an apparatus for removing noise from a bone conduction speech signal according to an embodiment of the present invention.
FIG. 8 shows spectra of a bone conduction speech signal from which noise has been removed according to an embodiment of the present invention.
FIG. 9 shows a comparison of the noise removal performance of a method for removing noise from a bone conduction speech signal according to an embodiment of the present invention with that of a conventional noise removal method.

Other advantages and features of the present invention, and methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and is defined only by the scope of the claims. Unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as generally accepted in the art to which this invention belongs. General descriptions of known configurations may be omitted so as not to obscure the subject matter of the present invention. In the drawings, the same reference numerals are used wherever possible for the same or corresponding components. To aid understanding of the present invention, some components in the drawings may be shown somewhat exaggerated or reduced.

The terms used in the present application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In the present application, terms such as "include", "have", or "comprise" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

The term "unit" used throughout this specification refers to an element that processes at least one function or operation, and may mean, for example, software or a hardware component such as an FPGA or ASIC. However, a "unit" is not limited to software or hardware. A "unit" may be configured to reside in an addressable storage medium or may be configured to execute on one or more processors.

As an example, a "unit" may include components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables. The functions provided by components and "units" may be performed separately by a plurality of components and "units", or may be integrated with other additional components.

FIG. 1 is a conceptual diagram of a method for removing noise from a bone conduction speech signal according to an embodiment of the present invention. Referring to FIG. 1, the method according to an embodiment of the present invention includes: filtering a noisy bone conduction speech signal and then dividing it into first bands by wavelet packet decomposition (S1 to S3); calculating an energy for each first band to obtain an energy threshold, determining, based on the energy threshold, second bands that contain a speech section among the first bands, and then calculating an entropy for each of the second bands to generate an entropy gate (entropy window) (S4); generating a correlation gate (correlation window) based on the correlation of the second bands (S5); removing noise from the second bands by means of the entropy gate and the correlation gate (S6); and synthesizing (recombining) the noise-removed second bands to generate a noise-removed bone conduction speech signal (S7 to S8).

FIG. 2 is a flowchart of a method for removing noise from a bone conduction speech signal according to an embodiment of the present invention. FIG. 3 is a block diagram of an apparatus for removing noise from a bone conduction speech signal according to an embodiment of the present invention. Referring to FIGS. 2 and 3, the apparatus 100 for removing noise from a bone conduction speech signal includes a bone conduction speech signal collection unit 120, a wavelet packet decomposition unit 140, and a speech enhancement unit 160.

The bone conduction speech signal collection unit 120 collects a bone conduction speech signal that contains noise (S10). In one embodiment, the bone conduction speech signal collection unit 120 may include a bone conduction vibration sensor that collects the bone conduction speech signal and/or a device that receives the bone conduction speech signal collected by the bone conduction vibration sensor. The noisy bone conduction speech signal y(n) may be expressed as in Equation (1) below.

[Equation (1)] y(n) = s(n) + v(n)

In Equation (1), s(n) is the clean, noise-free bone conduction speech signal of the n-th frame, and v(n) is the background noise of the n-th frame. When a noisy bone conduction speech signal is input from the bone conduction speech signal collection unit 120, the signal is filtered, for example by a 100 Hz high-pass filter, and is then subjected to wavelet packet decomposition by the wavelet packet decomposition unit 140 in order to extract (distinguish) signal sections and noise sections (S20). Accordingly, the bone conduction speech signal is divided into first bands in a two-dimensional time-frequency domain.
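The pre-processing described above can be sketched as follows. The text only says "for example a 100 Hz high-pass filter", so the first-order RC filter form, the 8000 Hz sampling rate, and the function names `noisy_signal` and `high_pass` are assumptions made for illustration.

```python
import math

def noisy_signal(clean, noise):
    """Additive model: each observed sample is clean speech plus background noise."""
    return [s + v for s, v in zip(clean, noise)]

def high_pass(x, cutoff_hz=100.0, sample_rate_hz=8000.0):
    """First-order RC high-pass filter (assumed filter type and sampling rate)."""
    if not x:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    y = [x[0]]
    for i in range(1, len(x)):
        # Pass changes in the input, let slowly varying components decay.
        y.append(alpha * (y[-1] + x[i] - x[i - 1]))
    return y
```

A constant (0 Hz) input decays toward zero, while rapidly alternating input passes with little attenuation. A steeper filter may be preferable in practice.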

In one embodiment, rather than relying simply on arithmetic per-band energy, the wavelet packet decomposition unit 140 may decompose the speech signal into 20 first bands, as shown in FIG. 4, according to the magnitude of the energy that stimulates the human auditory nerve under a human auditory model. The first bands decomposed from the bone conduction speech signal take the form of time-domain signals carrying 20 pieces of frequency information; because they express both time and frequency information, they can be represented as a two-dimensional matrix.

In one embodiment, the speech signal may be decomposed into wavelet coefficients w_j,m(k) with 20 subbands. Because a bone conduction speech signal is concentrated mainly in the low frequency bands, there are more subbands in the low frequency range than in the high frequency range. The wavelet coefficient w_j,m(k) denotes the k-th wavelet coefficient of the m-th wavelet band at the j-th level (j = 2, 3, 4; m = 1, ..., 20). The bone conduction speech signal may be decomposed into 20 bands at the fourth level. To process time-domain and frequency-domain information simultaneously, the wavelet coefficients w_j,m(k) may be expressed as a two-dimensional matrix as in Equation (2) below.

[Equation (2)]

In Equation (2), the matrix term is composed of a matrix of the wavelet coefficients of the m-th first band (subband) at a specific time t. Referring again to FIGS. 2 and 3, when the bone conduction speech signal has been decomposed into a plurality of first bands (subbands) by the wavelet packet decomposition unit 140, the speech enhancement unit 160 removes noise band by band and then synthesizes the noise-removed bands to generate a speech-enhanced bone conduction speech signal.

FIG. 5 is a block diagram of the speech enhancement unit of an apparatus for removing noise from a bone conduction speech signal according to an embodiment of the present invention. Referring to FIG. 5, the speech enhancement unit 160 may include an energy threshold calculation unit 161, a band determination unit 162, an entropy gate generation unit 163, a correlation gate generation unit 164, a noise removal unit 165, and a synthesis unit 166.

Referring to FIGS. 2 and 5, the energy threshold calculation unit 161 calculates an energy for each of the first bands (for example, 20 subbands) decomposed by the wavelet packet decomposition unit 140, and calculates an energy threshold based on the energies calculated for the first bands (S30). The energy threshold calculation unit 161 may, for example, calculate an energy threshold ψ_th(t) for distinguishing the speech signal from the noise signal according to Equation (3) below.

[Equation (3)]

In Equation (3), the two signal terms denote the speech signal and the noise signal, respectively, m is the index of the first band, and ψ_m(t) is the m-th first band. Equation (3) alone does not completely remove noise from the bone conduction speech signal. For effective noise removal, the band determination unit 162 first determines, based on the energy threshold ψ_th(t), second bands that contain a speech section among the plurality of first bands (S40). In one embodiment, among the 20 first bands, the first bands that contain a speech signal may be determined as the second bands. In another embodiment, among the first bands that contain a speech signal, only those in which the amount of speech information is equal to or greater than a set value may be determined as the second bands.
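The band selection step can be illustrated with a short sketch. Equation (3) is not reproduced in this text, so the threshold rule below (a scaled mean of the band energies) is purely an assumption and is not the patent's formula. The names `band_energy`, `select_speech_bands`, and `scale` are hypothetical.

```python
def band_energy(band):
    """Energy of one band: sum of squared wavelet coefficients."""
    return sum(c * c for c in band)

def select_speech_bands(bands, scale=1.0):
    """Return indices of bands whose energy exceeds the threshold, and the threshold.

    ASSUMPTION: the threshold is scale * mean band energy. The actual
    Equation (3) may define it differently.
    """
    energies = [band_energy(b) for b in bands]
    threshold = scale * sum(energies) / len(energies)
    selected = [m for m, e in enumerate(energies) if e > threshold]
    return selected, threshold
```

The returned indices play the role of the "second bands"; the later gating stages would then operate only on those bands.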

When the second bands have been determined by the band determination unit 162, the entropy gate generation unit 163 may calculate an entropy for each of the second bands and generate an entropy gate based on the entropies of the second bands (S50). In one embodiment, the entropy gate generation unit 163 may calculate the average value of the entropies of the second bands and the log value of the entropy of each second band, and may generate the entropy gate based on these values. The entropy gate generation unit 163 may generate the entropy gate according to Equation (4) below.

[Equation (4)]

In Equation (4), EG(t) is the entropy gate based on an entropy band threshold, the entropy term is the Shannon entropy of each of the second bands, and N (= k) is the number of second bands (N ≤ 20). Bone conduction speech information has a higher entropy value than noise, so the entropy gate EG(t) can perform a first stage of noise removal. With the entropy gate alone, however, it is difficult to bring the speech enhancement of the bone conduction speech signal to the target level.
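A gate built from the mean entropy and the per-band log entropy can be sketched as below. Equation (4) itself is not reproduced in this text, so the decision rule here (pass a band when the log of its entropy is at least the log of the mean entropy) is an assumption, as are the function names and the choice of normalized squared samples as the probability distribution.

```python
import math


def shannon_entropy(band):
    """Shannon entropy of a band, using normalized squared samples as probabilities."""
    total = sum(x * x for x in band)
    if total == 0.0:
        return 0.0
    h = 0.0
    for x in band:
        p = x * x / total
        if p > 0.0:
            h -= p * math.log(p)
    return h


def entropy_gate(bands, eps=1e-12):
    """Return one 0/1 gate value per band.

    Assumed rule: a band passes when log(entropy) >= log(mean entropy).
    eps avoids log(0) for a band with zero entropy.
    """
    entropies = [shannon_entropy(b) for b in bands]
    log_mean = math.log(sum(entropies) / len(entropies) + eps)
    return [1 if math.log(h + eps) >= log_mean else 0 for h in entropies]


# Hypothetical data: energy spread over all samples vs. concentrated in one sample.
gate = entropy_gate([[1.0, 1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0]])
print(gate)
```

The gate values can then be multiplied into the corresponding second bands so that low-entropy bands are suppressed in the first removal stage.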

To maximize the speech enhancement effect for the bone conduction speech signal, the correlation gate generator 164 calculates the cross-correlation of the second bands and generates a correlation gate based on that correlation (S60). In an embodiment, the correlation gate generator 164 may generate the correlation gate based on the correlation of the second bands and the second band that has passed through the entropy gate. For example, the correlation gate generator 164 may generate the correlation gate CG(t) according to Equation (5) below.

[Equation (5)]

In Equation (5), Ψ_ck is the k-th second band among the second bands, m is the average value of the second bands, n is the number of second bands, and Ψ_EG is the second band that has passed through the entropy gate.
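One way to picture a correlation gate is with a mean-removed, normalized correlation between each second band and a reference band such as the band that has passed through the entropy gate. Equation (5) is not reproduced in this text, so the zero-lag Pearson correlation, the threshold of 0.5, and the function names below are all assumptions chosen for illustration.

```python
import math


def normalized_correlation(a, b):
    """Zero-lag, mean-removed, normalized correlation of two equal-length bands."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    den_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if den_a == 0.0 or den_b == 0.0:
        return 0.0
    return num / (den_a * den_b)


def correlation_gate(bands, reference, threshold=0.5):
    """Return one 0/1 gate value per band (threshold is a hypothetical parameter)."""
    return [
        1 if abs(normalized_correlation(b, reference)) >= threshold else 0
        for b in bands
    ]


# Hypothetical data: the reference stands in for the entropy-gated band.
reference = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
similar = [0.0, 0.5, 0.0, -0.5, 0.0, 0.5, 0.0, -0.5]   # scaled copy of the reference
unrelated = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # uncorrelated at zero lag
gate = correlation_gate([similar, unrelated], reference)
print(gate)
```

In the alternative embodiment, the same function could be called with a feature band as `reference` instead of the entropy-gated band.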

The noise removal unit 165 may remove noise from the second bands using the entropy gate and the correlation gate (S70). In one embodiment, the noise removal unit 165 first removes noise from the second bands with the entropy gate, and then removes noise from the second bands that have passed through the entropy gate with the correlation gate.

Alternatively, the noise removal unit 165 may first remove noise from the second bands with the correlation gate, and then remove noise from the second bands that have passed through the correlation gate with the entropy gate. The correlation gate generator 164 may generate the correlation gate based on the correlation of the second bands and a feature band having the highest similarity to the bone conduction speech characteristics. In this case, Ψ_EG in Equation (5) may be replaced by the feature band, whose characteristics closely resemble the bone conduction speech signal, and the entropy term in Equation (4) may be the Shannon entropy of each of the second bands that have passed through the correlation gate. The feature band (see FIG. 4) may include the low-frequency band of the low-frequency and high-frequency bands into which the bone conduction speech signal is divided at the first level.

Once the noise removal unit 165 has performed the first and second noise removal on the second bands of the bone conduction speech signal based on the entropy gate and the correlation gate, the synthesis unit 166 may synthesize the noise-removed second bands by inverse wavelet packet decomposition to generate a noise-removed bone conduction speech signal.
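The decompose, gate, and synthesize flow can be sketched with a small wavelet packet transform. The sketch below assumes a Haar wavelet and a full binary tree (2^level bands in natural order); the wavelet and the 20-band tree used in the embodiment are not specified in this passage, and all function names are hypothetical.

```python
import math

_SQRT2 = math.sqrt(2.0)


def haar_split(x):
    """One Haar analysis step: return (approximation, detail), each half the length."""
    approx = [(x[i] + x[i + 1]) / _SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / _SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_merge(approx, detail):
    """Inverse of haar_split."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / _SQRT2)
        x.append((a - d) / _SQRT2)
    return x


def wp_decompose(x, level):
    """Full wavelet packet decomposition; len(x) must be divisible by 2**level."""
    bands = [list(x)]
    for _ in range(level):
        next_bands = []
        for band in bands:
            next_bands.extend(haar_split(band))
        bands = next_bands
    return bands


def wp_reconstruct(bands):
    """Inverse wavelet packet decomposition: merge sibling bands back up the tree."""
    bands = [list(b) for b in bands]
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1]) for i in range(0, len(bands), 2)]
    return bands[0]


def denoise(x, level, gate):
    """Zero the bands whose gate value is 0, then synthesize the remaining bands."""
    bands = wp_decompose(x, level)
    gated = [b if g else [0.0] * len(b) for b, g in zip(bands, gate)]
    return wp_reconstruct(gated)


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
restored = denoise(signal, 2, [1, 1, 1, 1])  # all bands kept: signal is recovered
smoothed = denoise(signal, 2, [1, 0, 0, 0])  # only the lowest band kept
print(restored)
print(smoothed)
```

With every gate value set to 1 the reconstruction returns the input up to rounding, which is a useful sanity check before applying real gate values such as those from an entropy gate or a correlation gate.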

According to the apparatus and method for removing noise from a bone conduction speech signal according to an embodiment of the present invention, external environmental noise that contaminates the speech signal while airborne sound vibration is transmitted through the skin to a bone conduction hearing aid can be effectively removed, enabling efficient transmission of the speech signal. An experiment was conducted to evaluate the performance of the bone conduction speech signal noise removal method according to an embodiment of the present invention. FIG. 6 is a schematic diagram of the experimental model used to evaluate the performance of the bone conduction speech signal noise removal apparatus according to an embodiment of the present invention. An electric condenser microphone (ECM) and a piezoelectric sensor were used in the experimental model.

The electric condenser microphone was placed close to the experimenter's mouth to collect the air conduction speech signal, and the piezo sensor was fixed to the crown of the experimenter's head, a position suited to collecting the bone conduction speech signal. A 100 dB SPL speaker was used to simulate a noisy environment. The vowel 'a' was pronounced repeatedly at 70-75 dB SPL. The experiment was conducted with a background noise level of about 30-35 dB SPL.

FIG. 7 shows the noise removal results of the bone conduction speech signal noise removal apparatus according to an embodiment of the present invention. FIG. 7(a) shows the signals before noise removal: the upper trace is the air conduction speech signal collected by the microphone, and the lower trace is the bone conduction speech signal collected by the piezo sensor. FIG. 7(b) shows the signals after noise removal by the entropy gate and the correlation gate: the upper trace is the air conduction speech signal after noise removal, and the lower trace is the bone conduction speech signal after noise removal. The vowel 'a' was pronounced repeatedly, and white noise and babble noise were introduced from 2.5 seconds onward.

It can be seen that noise is efficiently removed by the entropy and correlation thresholds according to an embodiment of the present invention. The upper waveform of FIG. 7(b) shows that, for the air conduction speech signal, whose propagation characteristics differ from those of the bone conduction speech signal, the first and second 'a' utterances were lost along with the white noise, and noise was not effectively removed from the sixth and seventh 'a' utterances.

FIG. 8 shows spectra of the bone conduction speech signal from which noise is removed according to an embodiment of the present invention: (a) is a speech signal with babble noise, (b) is a speech signal with white noise, (c) is the speech signal after babble noise removal according to an embodiment of the present invention, and (d) is the speech signal after white noise removal according to an embodiment of the present invention. Because babble noise has characteristics very similar to those of speech, it is difficult to distinguish from the speech signal with conventional speech enhancement methods. The bone conduction speech signal noise removal method according to an embodiment of the present invention nevertheless exhibits excellent noise removal performance for both white noise and babble noise.

FIG. 9 compares the noise removal performance of the bone conduction speech signal noise removal method according to an embodiment of the present invention with that of conventional noise removal methods. FIG. 9(a) shows the speech signal and spectrum after noise removal by a conventional adaptive filter, (b) shows the speech signal and spectrum after noise removal by conventional spectral subtraction, and (c) shows the speech signal and spectrum after noise removal according to an embodiment of the present invention. The spectra in FIGS. 9(a) and 9(b) show that the conventional methods do not completely remove the noise. In contrast, as shown in FIG. 9(c), the method according to an embodiment of the present invention removes the noise cleanly in both the time domain and the frequency domain.

The method according to an embodiment of the present invention can be written, for example, as a program executable on a computer, and can be implemented on a general-purpose digital computer that runs the program from a computer-readable recording medium. The computer-readable recording medium may be a volatile memory such as SRAM (Static RAM), DRAM (Dynamic RAM), or SDRAM (Synchronous DRAM); a nonvolatile memory such as ROM (Read Only Memory), PROM (Programmable ROM), EPROM (Electrically Programmable ROM), EEPROM (Electrically Erasable and Programmable ROM), a flash memory device, PRAM (Phase-change RAM), MRAM (Magnetic RAM), RRAM (Resistive RAM), or FRAM (Ferroelectric RAM); a floppy disk; a hard disk; or an optically readable medium such as a CD-ROM or DVD, but is not limited thereto.

The above embodiments are presented to aid understanding of the present invention and do not limit its scope; it should be understood that various modifications derived from them also fall within the scope of the present invention. The technical scope of protection of the present invention should be determined by the technical spirit of the claims, and it should be understood that this scope is not limited to the literal wording of the claims but extends to inventions of substantially equivalent technical value.

100: bone conduction speech signal noise removal apparatus
120: bone conduction speech signal collection unit
140: wavelet packet decomposition unit
160: speech enhancement unit
161: energy threshold calculation unit
162: band determiner
163: entropy gate generator
164: correlation gate generator
165: noise removal unit
166: synthesis unit

Claims

A bone conduction speech signal noise removal method for removing noise from a bone conduction speech signal, the method comprising:
collecting a bone conduction speech signal with a bone conduction vibration sensor;
dividing the bone conduction speech signal into a plurality of first bands by wavelet packet decomposition;
calculating an energy for each of the first bands and calculating an energy threshold based on the energies calculated for the first bands;
determining, based on the energy threshold, second bands having a speech section among the plurality of first bands;
calculating an entropy for each of the second bands and generating an entropy gate based on the entropies of the second bands;
calculating a correlation of the second bands and generating a correlation gate based on the correlation of the second bands;
removing noise from the second bands by the entropy gate and the correlation gate; and
synthesizing the noise-removed second bands to generate a noise-removed bone conduction speech signal,
wherein generating the entropy gate comprises:
calculating an average value of the entropies of the second bands and a log value of the entropy of each of the second bands; and
generating the entropy gate based on the average value and the log value.

(Deleted)

The method of claim 1,
wherein, in generating the entropy gate, the entropy gate is generated according to Equation 1 below:
[Equation 1]

where, in Equation 1, EG(t) is the entropy gate, the entropy term is the entropy of each of the second bands, and k and N are the number of the second bands.

The method of claim 3,
wherein removing the noise comprises:
first removing noise from the second bands by the entropy gate; and
second removing, by the correlation gate, noise from the second bands that have passed through the entropy gate.

The method of claim 4,
wherein generating the correlation gate comprises generating the correlation gate based on the correlation of the second bands and a second band that has passed through the entropy gate.

The method of claim 5,
wherein generating the correlation gate comprises generating the correlation gate according to Equation 2 below:
[Equation 2]

where, in Equation 2, CG(t) is the correlation gate, Ψ_ck is the k-th second band among the second bands, m is the average value of the second bands, n is the number of the second bands, and Ψ_EG is the second band that has passed through the entropy gate.

The method of claim 1,
wherein generating the correlation gate comprises generating the correlation gate based on the correlation of the second bands and a feature band having the highest similarity to the bone conduction speech characteristics.

The method of claim 7,
wherein removing the noise comprises:
first removing noise from the second bands by the correlation gate; and
second removing, by the entropy gate, noise from the second bands that have passed through the correlation gate.

The method of claim 8,
wherein dividing into the plurality of first bands comprises:
dividing the bone conduction speech signal into a low-frequency band and a high-frequency band; and
dividing each of the low-frequency band and the high-frequency band so that the bone conduction speech signal is divided into the plurality of first bands,
wherein the feature band includes the low-frequency band.

A computer-readable recording medium on which a program for executing the bone conduction speech signal noise removal method according to any one of claims 1 and 3 to 9 is recorded.

A bone conduction speech signal noise removal apparatus for removing noise from a bone conduction speech signal, the apparatus comprising:
a bone conduction speech signal collection unit that collects a bone conduction speech signal with a bone conduction vibration sensor;
a wavelet packet decomposition unit that divides the bone conduction speech signal into a plurality of first bands by wavelet packet decomposition;
an energy threshold calculation unit that calculates an energy for each of the first bands and calculates an energy threshold based on the energies calculated for the first bands;
a band determiner that determines, based on the energy threshold, second bands having a speech section among the plurality of first bands;
an entropy gate generator that calculates an entropy for each of the second bands and generates an entropy gate based on the entropies of the second bands;
a correlation gate generator that calculates a correlation of the second bands and generates a correlation gate based on the correlation of the second bands;
a noise removal unit that removes noise from the second bands by the entropy gate and the correlation gate; and
a synthesis unit that synthesizes the noise-removed second bands to generate a noise-removed bone conduction speech signal,
wherein the entropy gate generator:
calculates an average value of the entropies of the second bands and a log value of the entropy of each of the second bands; and
generates the entropy gate based on the average value and the log value.

(Deleted)

The apparatus of claim 11,
wherein the noise removal unit:
first removes noise from the second bands by the entropy gate; and
second removes, by the correlation gate, noise from the second bands that have passed through the entropy gate.

The apparatus of claim 13,
wherein the correlation gate generator generates the correlation gate based on the correlation of the second bands and a second band that has passed through the entropy gate.

The apparatus of claim 11,
wherein the correlation gate generator generates the correlation gate based on the correlation of the second bands and a feature band having the highest similarity to the bone conduction speech characteristics.

The apparatus of claim 15,
wherein the noise removal unit:
first removes noise from the second bands by the correlation gate; and
second removes, by the entropy gate, noise from the second bands that have passed through the correlation gate.

The apparatus of claim 16,
wherein the wavelet packet decomposition unit:
divides the bone conduction speech signal into a low-frequency band and a high-frequency band; and
divides each of the low-frequency band and the high-frequency band so that the bone conduction speech signal is divided into the plurality of first bands,
wherein the feature band includes the low-frequency band.