KR20060064270A

KR20060064270A - Digital audio signal preprocessing method for mobile telecommunication terminal

Info

Publication number: KR20060064270A
Application number: KR1020040103030A
Authority: KR
Inventors: 송석원
Original assignee: 주식회사 라이브젠
Priority date: 2004-12-08
Filing date: 2004-12-08
Publication date: 2006-06-13
Also published as: KR100592926B1

Abstract

The present invention relates to a method of preprocessing a digital audio signal for a mobile communication terminal. By removing low-frequency components and controlling the volume before voice compression is performed, the method prevents the loss of sound quality that occurs when a digital audio file is restored through the vocoder used in a mobile communication terminal.

The invention provides a method of preprocessing a digital audio signal for a mobile communication terminal, in which predetermined preprocessing is performed before the digital audio signal is compressed by an encoder for the mobile communication terminal. The method comprises: (a) receiving digital audio data to be used for compression; (b) removing low-frequency band components below a predetermined frequency from the digital audio data received in step (a); (c) calculating the voice energy at predetermined time intervals for the digital audio data from which the low-frequency band components have been removed; and (d) decreasing the volume by a predetermined increase/decrease rate when the calculated voice energy is greater than a predetermined reference value, and increasing the volume by the same rate when it is smaller.

Mobile communication, terminal, audio, vocoder, preprocessing, frequency, volume

Description

Digital audio signal preprocessing method for mobile telecommunication terminal

FIG. 1 is a flowchart illustrating the method of preprocessing a digital audio signal for a mobile communication terminal according to the present invention.

FIG. 2 is a response characteristic curve of the low-frequency notch filter used in FIG. 1.

The present invention relates to a method of preprocessing a digital audio signal for a mobile communication terminal, and in particular to a method that performs predetermined preprocessing before voice compression in order to prevent the loss of sound quality that occurs when a digital audio file is restored through the vocoder used in a mobile communication terminal.

Because the voice-channel bandwidth of a mobile communication system is far smaller than the 64 kbps of a wireline system, voice signals are compressed before transmission. Representative voice compression techniques in current mobile communication systems include the QCELP (Qualcomm Code Excited Linear Prediction) of IS-95 and EVRC (Enhanced Variable Rate Codec), both of which are based on LPC (Linear Prediction Coding) analysis. LPC-family compression uses a model optimized for the human vocal structure, so it is very efficient at compressing human speech at medium and low bit rates. In addition, to improve spectral efficiency and reduce system power consumption, these systems compress and transmit the signal only while a person is speaking and transmit nothing while the person is silent.

Meanwhile, as mobile communication terminals have become widespread, using multimedia on them has become commonplace. Current terminal technology, however, is limited in how much information it can transmit and how fast it can process data, and therefore in how well it can handle high-capacity, high-quality multimedia data. Digital audio data is one kind of multimedia data: the wav format, now routine on wired systems, requires large storage, and the mp3 format also requires relatively large storage and a high data processing rate. For this reason, audio data is often played back through the vocoder built into the mobile communication terminal, which offers small data size and a low processing load.

Although a phone's built-in vocoder does offer small data size and a low processing load, its compression and reconstruction are designed around speech, which makes it poorly suited to general audio signals. As a result, audio data passed through the vocoder comes out considerably distorted and degraded in quality.

The present invention was devised to solve the problem described above. Its object is to provide a method of preprocessing a digital audio signal for a mobile communication terminal that, by removing low-frequency components and controlling the volume before voice compression, prevents the loss of sound quality that occurs when a digital audio file is restored through the vocoder used in a mobile communication terminal.

To achieve this object, the present invention provides a method of preprocessing a digital audio signal for a mobile communication terminal, in which predetermined preprocessing is performed before the digital audio signal is compressed by an encoder for the mobile communication terminal, the method comprising: (a) receiving digital audio data to be used for compression; (b) removing low-frequency band components below a predetermined frequency from the digital audio data received in step (a); (c) calculating the voice energy at predetermined time intervals for the digital audio data from which the low-frequency band components have been removed; and (d) decreasing the volume by a predetermined increase/decrease rate when the calculated voice energy is greater than a predetermined reference value, and increasing the volume by the same rate when it is smaller.

Hereinafter, a method of preprocessing a digital audio signal for a mobile communication terminal according to a preferred embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a flowchart illustrating the method of preprocessing a digital audio signal for a mobile communication terminal according to the present invention. As shown in FIG. 1, to generate a digital audio signal usable on a mobile communication terminal, step S10 designates as the source wav or raw digital audio data in a format suitable as input to a voice compression encoder such as QCELP or EVRC.
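Step S10 can be pictured with a short sketch. The patent does not say how the source file is read, so the function name `read_pcm16_mono` and the restriction to 16-bit mono PCM are assumptions made here for illustration; only Python's standard `wave` module is used.

```python
import array
import sys
import wave


def read_pcm16_mono(path):
    """Read a 16-bit mono PCM wav file; return (sample_rate, samples in [-1, 1))."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    pcm = array.array("h")
    pcm.frombytes(raw)
    if sys.byteorder == "big":  # wav sample data is little-endian
        pcm.byteswap()
    # Normalize to floats so later steps are independent of the integer range.
    return rate, [s / 32768.0 for s in pcm]
```

Normalizing to floating point is a design choice of this sketch, not something the source prescribes; an implementation could equally work on the integer samples directly.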

The vocoders currently used in mobile communication terminals reduce the amount of data by means of compression algorithms suited to the human voice. For a signal that does not match the characteristics of a voice, the vocoder needs additional side-information data, and because it must maintain a constant data rate, side information for anything other than voice is sometimes discarded. The vocoder therefore cannot compress audio signals efficiently. In addition, by its nature a vocoder compresses poorly in bands below a certain frequency, for example the very low band at or below 100 Hz, and needs additional compression information there, whereas in the everyday playback environment of a mobile communication terminal it is the mid- and high-frequency content of an audio signal that sounds good to the ear. Taking this into account, step S12 uses a low-frequency notch filter to progressively attenuate and remove components at or below a predetermined frequency, for example 200 Hz. FIG. 2 is a response characteristic curve of the low-frequency notch filter used in FIG. 1. As FIG. 2 shows, the gain of frequency components at or below 200 Hz in the digital audio data passed through the low-frequency notch filter decreases progressively until they are removed.
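The low-frequency removal of step S12 can be sketched as follows. The source gives only the response curve of FIG. 2 and not a filter design, so this example swaps in a standard second-order Butterworth high-pass biquad as a stand-in; the function names, the 8 kHz sample rate used in the usage note, and the filter order are all assumptions.

```python
import math


def highpass_biquad(samples, sample_rate, cutoff_hz=200.0):
    """Second-order Butterworth high-pass (bilinear-transform biquad).

    Stand-in for the patent's low-frequency notch filter, whose exact
    design is not given in the source.
    """
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    q = 1.0 / math.sqrt(2.0)          # Butterworth quality factor
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b0 = (1.0 + cos_w0) / 2.0 / a0
    b1 = -(1.0 + cos_w0) / a0
    b2 = b0
    a1 = -2.0 * cos_w0 / a0
    a2 = (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        # Direct form I difference equation.
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def rms(xs):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))
```

With an 8 kHz input, a 50 Hz tone fed through `highpass_biquad` comes out strongly attenuated while a 1 kHz tone passes almost unchanged, which matches the qualitative shape described for FIG. 2. A steeper or gentler roll-off would need a different order or design.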

Compression algorithms such as QCELP and EVRC are designed for the human voice, so they are not well suited to coding audio signals whose level is too high or too low. Their efficiency drops even further when the audio signal is music. For this reason, an algorithm that raises or lowers the volume adaptively according to the state of the incoming audio data is applied to maximize the sound quality obtained from voice compression. The volume is raised or lowered at a fixed, predetermined time interval (T), for example every 50 msec, and whether to raise or lower it is decided from the level of the immediately preceding audio data.

In more detail, step S14 first calculates the voice energy ($E$) over the current time interval $(t-T,\ t)$, and step S16 then compares the calculated voice energy ($E$) with a reference value ($E_{\mathrm{ref}}$) predetermined as the ideal voice energy.
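The energy calculation of step S14 could look roughly like the sketch below. The source does not define how the voice energy is measured, so mean-square amplitude per interval is an assumption here, as are the function name `frame_energies` and the choice to drop a trailing partial interval.

```python
def frame_energies(samples, sample_rate, interval_s=0.05):
    """Mean-square energy of each consecutive interval of length T.

    interval_s defaults to 50 ms, the example interval given in the text.
    A trailing partial interval is ignored (an assumption of this sketch).
    """
    n = int(round(sample_rate * interval_s))  # samples per interval
    energies = []
    for start in range(0, len(samples) - n + 1, n):
        frame = samples[start:start + n]
        energies.append(sum(x * x for x in frame) / n)
    return energies
```

At 8 kHz, for instance, each 50 msec interval holds 400 samples, so one second of audio yields 20 energy values, each of which is then compared against the reference value in step S16.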

If the comparison in step S16 shows that the current voice energy ($E$) is greater than the reference value ($E_{\mathrm{ref}}$), the method proceeds to step S18. As in Equation 1 below, the predetermined increase/decrease rate ($\alpha$) is subtracted from 1, where 1 represents the magnitude of the current voice energy ($E$), and the result is multiplied by the current voice energy ($E$) to tentatively determine the voice energy ($E'$) for this time interval.

$$E' = (1 - \alpha)\,E \qquad \text{(Equation 1)}$$

Next, step S20 determines whether the tentatively determined voice energy ($E'$) is greater than the maximum allowable voice energy ($E_{\max}$). If it is, the method proceeds to step S22 to keep the voice energy within the maximum allowable value ($E_{\max}$) and, as in Equation 2 below, sets the voice energy for the interval to the maximum allowable value.

$$E' = E_{\max} \qquad \text{(Equation 2)}$$

If, on the other hand, step S20 finds that the tentatively determined voice energy ($E'$) is smaller than the maximum allowable value ($E_{\max}$), the method proceeds to step S24 and adopts the tentatively determined voice energy ($E'$) as the final voice energy.

If the comparison in step S16 instead shows that the current voice energy ($E$) is smaller than the reference value ($E_{\mathrm{ref}}$), the method proceeds to step S26. As in Equation 3 below, the predetermined increase/decrease rate ($\alpha$) is added to 1, where 1 represents the magnitude of the current voice energy ($E$), and the result is multiplied by the current voice energy ($E$) to tentatively determine the voice energy ($E'$) for this time interval.

$$E' = (1 + \alpha)\,E \qquad \text{(Equation 3)}$$

Next, step S28 determines whether the tentatively determined voice energy ($E'$) is smaller than the minimum allowable voice energy ($E_{\min}$). If it is, the method proceeds to step S30 to keep the voice energy from falling below the minimum allowable value ($E_{\min}$) and, as in Equation 4 below, sets the voice energy for the interval to the minimum allowable value.

$$E' = E_{\min} \qquad \text{(Equation 4)}$$

If, on the other hand, step S28 finds that the tentatively determined voice energy ($E'$) is greater than the minimum allowable value ($E_{\min}$), the method proceeds to step S24 and adopts the tentatively determined voice energy ($E'$) as the final voice energy.
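The decision flow of steps S16 through S30 can be condensed into one small function. This is a sketch under stated assumptions: the name `adjust_energy` and its parameter names are hypothetical, and the case where the energy exactly equals the reference value, which the text does not address, is left unchanged here.

```python
def adjust_energy(energy, reference, rate, e_min, e_max):
    """One pass through steps S16-S30; returns the final energy for the interval.

    energy    -- voice energy measured for the current interval (S14)
    reference -- predetermined ideal voice energy
    rate      -- predetermined increase/decrease rate
    e_min, e_max -- minimum and maximum allowable voice energy
    """
    if energy > reference:
        tentative = (1.0 - rate) * energy   # S18: scale down by the rate
        return min(tentative, e_max)        # S20-S24: cap at the maximum
    if energy < reference:
        tentative = (1.0 + rate) * energy   # S26: scale up by the rate
        return max(tentative, e_min)        # S28-S30: floor at the minimum
    return energy  # equal case is not covered by the source; assumed unchanged
```

For example, with a reference of 0.1, a rate of 0.1, and allowable limits of 0.01 and 0.4, an interval with energy 0.5 is first reduced to 0.45 and then capped at 0.4, while an interval with energy 0.005 is raised to 0.0055 and then lifted to the floor of 0.01. These numbers are illustrative only; the source gives no concrete values for the rate or the limits.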

Once the final voice energy has been determined in this way, the volume of the audio data is raised or lowered so that the data has this voice energy. The data is then compressed by passing it through an encoder such as QCELP or EVRC, and is either stored locally or transmitted externally through a wired or wireless transmission device. The compressed data is later restored by the vocoder provided in the mobile communication terminal and output as sound through a playback device, so that the reproduced sound has less degradation and distortion.
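Adjusting the volume so that an interval has the final voice energy can be sketched as a simple gain. Assuming energy is measured as mean-square amplitude (the source does not define it), scaling the samples by the square root of the energy ratio achieves the target; the function name `scale_to_energy` and the clipping to the range [-1, 1] are assumptions of this sketch.

```python
import math


def scale_to_energy(frame, target_energy):
    """Scale a frame so its mean-square energy matches target_energy.

    Samples are clipped to [-1, 1]; if clipping occurs, the resulting
    energy ends up slightly below the target.
    """
    current = sum(x * x for x in frame) / len(frame)
    if current == 0.0:
        return list(frame)  # silent frame: no gain can change its energy
    gain = math.sqrt(target_energy / current)
    return [max(-1.0, min(1.0, x * gain)) for x in frame]
```

Because the gain changes only once per interval, a practical implementation would probably smooth it across interval boundaries to avoid audible steps; the source does not discuss this, so it is left out here.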

The method of preprocessing a digital audio signal for a mobile communication terminal according to the present invention is not limited to the embodiment described above and may be modified in various ways within the scope permitted by the technical idea of the invention.

As described above, the method of preprocessing a digital audio signal for a mobile communication terminal according to the present invention performs preprocessing suited to the characteristics of the vocoder provided in the terminal when the digital audio signal is compressed. As a result, when the compressed audio signal is restored and played back on the mobile communication terminal, the listener hears an audio signal with less degradation and distortion.

Claims

1. A method of preprocessing a digital audio signal for a mobile communication terminal, in which predetermined preprocessing is performed before the digital audio signal is compressed by an encoder for the mobile communication terminal, the method comprising:

(a) receiving digital audio data to be used for compression;

(b) removing low-frequency band components below a predetermined frequency from the digital audio data received in step (a);

(c) calculating the voice energy at predetermined time intervals for the digital audio data from which the low-frequency band components have been removed; and

(d) decreasing the volume by a predetermined increase/decrease rate when the calculated voice energy is greater than a predetermined reference value, and increasing the volume by the increase/decrease rate when the calculated voice energy is smaller than the reference value.

2. The method of claim 1, wherein the predetermined frequency in step (b) is 200 Hz.

3. The method of claim 1 or 2, wherein step (d) further comprises setting the volume determined by the increase/decrease rate to the maximum allowable value of the voice energy when it is greater than that maximum allowable value, and to the minimum allowable value when it is smaller than that minimum allowable value.