KR100246370B1

KR100246370B1 - Adaptive orthogonal transform coding method of audio signal

Info

Publication number: KR100246370B1
Application number: KR1019920009565A
Authority: KR
Inventors: 안한준
Original assignee: 구자홍; 엘지전자주식회사
Priority date: 1992-06-02
Filing date: 1992-06-02
Publication date: 2000-03-15
Also published as: KR940001115A

Abstract

The present invention relates to digital audio signal processing. To remove inaudible signals, the input signal is transformed into frequency bands, and the frequency components that cannot be heard are removed using the auditory characteristics of the ear. This preserves high sound quality while lowering the bit rate; that is, compression coding makes it possible to transmit a large amount of data at once and to record a large amount of data per unit area.

Description

Adaptive Orthogonal Transform Coding Method of Audio Signal

FIG. 1 is a block diagram of a general audio signal orthogonal transform.

FIG. 2 is a graph of the actual audible threshold characteristic.

FIG. 3 is a graph of the approximate audible threshold characteristic.

FIG. 4 is a block diagram of audio signal processing to which the adaptive orthogonal transform coding method for audio signals is applied.

FIG. 5 is a graph of the audible threshold characteristic as modified by relative masking.

* Explanation of reference numerals for the main parts of the drawings

11: window    12: orthogonal transform unit

13: power calculation unit    14: audible threshold calculation unit

15: inaudible signal removal unit    16: quantization unit

17: encoding unit    18: subband peak value detection unit

19: relative masking level calculation unit

The present invention relates to digital audio signal processing, and more particularly to an adaptive orthogonal transform coding method for audio signals that applies relative masking so that more redundancy can be removed.

FIG. 1 is a block diagram of a general audio signal orthogonal transform. As shown, it comprises: a window (1) that divides the input signal (V_in) into blocks of the orthogonal transform block length; an orthogonal transform unit (2) that orthogonally transforms the output signal of the window (1); a power calculation unit (3) that calculates the average power of the output signal of the window (1); an audible threshold calculation unit (4) that calculates an audible threshold value from the output signal of the power calculation unit (3); an inaudible signal removal unit (5) that compares the output signal of the orthogonal transform unit (2) with the output signal of the audible threshold calculation unit (4) and removes signals outside the audible band; a quantization unit (6) that quantizes the output signal of the inaudible signal removal unit (5); and an encoding unit (7) that encodes the output signal of the quantization unit (6). The conventional coding method configured in this way is described below with reference to FIGS. 2 and 3.

The input signal (V_in) is divided into blocks of the orthogonal transform block length by the window (1) and then orthogonally transformed by the orthogonal transform unit (2). It is also supplied to the power calculation unit (3), which calculates the average power, and the audible threshold calculation unit (4) calculates the audible threshold value from that average power level.

The inaudible signal removal unit (5) compares the signal output from the orthogonal transform unit (2) with the threshold level output from the audible threshold calculation unit (4), removes the signals outside the audible band, and outputs the signals within the audible band. The signal processed in this way is quantized by the quantization unit (6) and then encoded by the encoding unit (7). The audible-band threshold characteristic curve used here is shown in FIG. 2, and the approximate audible threshold characteristic curve, scaled to the sound pressure used in FIG. 2, is shown in FIG. 3.

Here, the audible threshold curve is adaptively controlled by the average power of the input signal. To obtain the power level, the power of each transform block is first calculated by (Equation 1) below, and then the average of the power over one or more blocks is taken.

Here, P_i is the power per block, N is the length of the transform block, and a_ij is the value of the j-th frequency component of the i-th block.
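The body of (Equation 1) is not reproduced in this text, so the following Python sketch assumes one form consistent with the symbols defined above: the power of block i is taken as the mean of its squared frequency components, P_i = (1/N) Σ_j a_ij². That form and the function names are assumptions for illustration, not the patent's stated formula.

```python
def block_power(coeffs):
    """Power of one transform block.

    Assumed form: mean of the squared frequency components a_ij,
    where N = len(coeffs) is the transform block length.
    """
    n = len(coeffs)
    return sum(a * a for a in coeffs) / n


def average_power(blocks):
    """Average the per-block power P_i over one or more blocks."""
    powers = [block_power(b) for b in blocks]
    return sum(powers) / len(powers)
```

For example, `block_power([3, 4])` evaluates (9 + 16) / 2, and `average_power` then averages such values across however many blocks are supplied.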

For the scaling between FIG. 2 and FIG. 3, the maximum scaled value (2¹⁵ in the case of 16 lines) is made to correspond to 120 dB (the maximum value), and the approximate audible threshold level (TH) of the 1 kHz band for the power level obtained from (Equation 1) is calculated using (Equation 2), described below.
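A minimal sketch of this scaling, assuming a plain decibel mapping in which an amplitude of 2¹⁵ sits at 120 dB. (Equation 2) is not reproduced in this text, so the threshold level TH itself is not computed here; the function and constant names are hypothetical.

```python
import math

FULL_SCALE = 2 ** 15     # maximum scaled value given in the text
FULL_SCALE_DB = 120.0    # level the text assigns to that maximum


def amplitude_to_db(a):
    """Map a coefficient amplitude to dB (assumed 20*log10 mapping)."""
    if a == 0:
        return float("-inf")
    return FULL_SCALE_DB + 20.0 * math.log10(abs(a) / FULL_SCALE)


def power_to_db(p):
    """Map a power (squared amplitude) to dB on the same assumed scale."""
    if p <= 0:
        return float("-inf")
    return FULL_SCALE_DB + 10.0 * math.log10(p / FULL_SCALE ** 2)
```

Under this assumption, halving the amplitude lowers the level by about 6 dB, and a block power equal to the squared full-scale value maps to the same 120 dB maximum.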

A fact that cannot be overlooked here is that the human ear cannot hear frequency components whose sound pressure is below the audible threshold (absolute masking), and also cannot hear other nearby frequency components when a component with high sound pressure is present in a neighboring band (relative masking). The conventional coding method, however, removes inaudible components based on absolute masking only. As a result, many frequency components that are in fact inaudible because of relative masking are still encoded, which lowers the coding efficiency.

To overcome this shortcoming of the prior art, the present invention provides a method that removes the components made inaudible by both absolute masking and relative masking. It is described below with reference to the accompanying drawings.

FIG. 4 is a block diagram of an audio signal processing system to which the adaptive orthogonal transform coding method of the present invention is applied. As shown, it comprises: a window (11) that divides the input signal (V_in) into blocks of the orthogonal transform block length; an orthogonal transform unit (12) that orthogonally transforms the output signal of the window (11); a power calculation unit (13) that calculates the average power of the output signal of the window (11); an audible threshold calculation unit (14) that calculates an audible threshold value from the output signal of the power calculation unit (13); a subband peak value detection unit (18) that classifies the orthogonally transformed spectral signal into subbands according to the critical bands of the ear and detects the peak value of each subband; a relative masking level calculation unit (19) that calculates a relative masking level from the peak values detected by the subband peak value detection unit (18); an inaudible signal removal unit (15) that determines a modified audible threshold level from the outputs of the relative masking level calculation unit (19) and the audible threshold calculation unit (14) and removes the frequency components below that level; a quantization unit (16) that quantizes the output signal of the inaudible signal removal unit (15); and an encoding unit (17) that encodes the output signal of the quantization unit (16). The present invention configured in this way is described in detail below with reference to FIG. 5.

The input signal (V_in) is divided into blocks of the orthogonal transform block length by the window (11) and then orthogonally transformed by the orthogonal transform unit (12). It is also supplied to the power calculation unit (13), which calculates the average power, and the audible threshold calculation unit (14) calculates the audible threshold value (absolute masking) from that average power level.

Meanwhile, the subband peak value detection unit (18) classifies the transformed spectral signal output from the orthogonal transform unit (12) into subbands according to the critical bands of the ear and calculates the peak value of each subband. The relative masking level calculation unit (19) then obtains the relative masking level from the detected peak values. Relative masking means that even a component whose power exceeds the absolute masking level becomes inaudible because of a nearby component with large power.
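The subband classification and peak detection can be sketched in Python as follows. The critical band boundaries are not listed in this text, so `band_edges` is a hypothetical list of transform bin indices supplied by the caller, and the function name is likewise illustrative.

```python
def subband_peaks(levels, band_edges):
    """Find the peak component in each subband.

    levels:     per-bin level (or power) of one transformed block.
    band_edges: hypothetical bin indices [e0, e1, ..., en]; subband k
                covers bins e_k through e_(k+1) - 1.
    Returns a list of (peak_bin, peak_level), one entry per subband.
    """
    peaks = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        # Pick the bin with the largest level inside this subband.
        peak_bin = max(range(lo, hi), key=lambda j: levels[j])
        peaks.append((peak_bin, levels[peak_bin]))
    return peaks
```

With two subbands covering bins 0 to 2 and 3 to 5, the call `subband_peaks([10, 50, 20, 30, 70, 5], [0, 3, 6])` reports the largest component of each group together with its bin position.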

As shown in FIG. 5, the relative masking level is obtained by classifying the frequency range into subbands based on the critical bands (A₁, A₂, A₃, ..., A_n) that the human ear can distinguish, calculating the peak value in each subband, and setting a level that falls off from that peak with a slope of 25 dB/oct toward the low-frequency side and a slope of -10 dB/oct toward the high-frequency side.
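One way to read these slopes is sketched below in Python: the mask equals the peak level at the peak frequency, drops 25 dB for each octave below it, and drops 10 dB for each octave above it. The text does not say whether the mask starts at the peak level itself or at some offset below it, so the zero offset here is an assumption, as are the function and constant names.

```python
import math

LOW_SIDE_SLOPE = 25.0    # dB per octave, below the peak frequency
HIGH_SIDE_SLOPE = 10.0   # dB per octave, above the peak frequency


def relative_mask_db(freq_hz, peak_hz, peak_db):
    """Relative masking level at freq_hz caused by a subband peak.

    Assumes the mask equals peak_db at the peak and falls off linearly
    in dB per octave on either side.
    """
    octaves = math.log2(freq_hz / peak_hz)  # negative below the peak
    if octaves < 0:
        return peak_db + LOW_SIDE_SLOPE * octaves
    return peak_db - HIGH_SIDE_SLOPE * octaves
```

For an 80 dB peak at 1 kHz, this gives 55 dB one octave down (500 Hz) and 70 dB one octave up (2 kHz), so the mask extends further toward higher frequencies than toward lower ones.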

The inaudible signal removal unit (15) then selects the largest value among the absolute masking level output from the audible threshold calculation unit (14), the relative masking level of the subband itself output from the relative masking level calculation unit (19), and the relative masking levels of the preceding and following subbands, which yields the modified audible threshold curve shown by the dotted line in FIG. 5. From the output of the orthogonal transform unit (12), it removes the frequency components whose power is at or below that curve. The remaining components are quantized by the quantization unit (16) and then encoded and output by the encoding unit (17).
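The selection and removal step can be sketched in Python as follows. Representing "removal" by setting a coefficient to zero is an assumption, as are the function names; the masking levels are taken as already computed, per-bin values in dB.

```python
def modified_threshold(abs_db, own_db, prev_db=None, next_db=None):
    """Largest of the absolute mask and the relative masks of the
    subband itself and its neighbours (None when a neighbour is absent)."""
    candidates = [abs_db, own_db]
    candidates += [m for m in (prev_db, next_db) if m is not None]
    return max(candidates)


def remove_inaudible(coeffs, levels_db, thresholds_db):
    """Keep a coefficient only if its level is above the threshold.

    Removal is modelled as zeroing the coefficient (an assumption).
    """
    return [
        a if level > thr else 0
        for a, level, thr in zip(coeffs, levels_db, thresholds_db)
    ]
```

A component at 30 dB under a 40 dB modified threshold is dropped even if it lies above the absolute threshold, which is the extra saving that relative masking provides.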

As described in detail above, by applying the relative masking level in addition to the existing absolute masking level, the present invention removes more of the redundancy in the input signal, and can therefore lower the bit rate while keeping the degradation of sound quality to a minimum.

Claims

An adaptive orthogonal transform coding method for an audio signal, characterized in that, in orthogonal transform coding of an audio signal using auditory characteristics, inaudible frequency components are removed by applying a relative masking level.

The method of claim 1, wherein, in applying the relative masking level, the spectral components are classified into subbands based on the critical bands of the ear, the peak value of each subband is detected, and a level that decreases from the detected peak value with a constant slope is set as the relative masking level.