KR0181061B1

KR0181061B1 - Adaptive digital audio encoding apparatus and a bit allocation method thereof

Info

Publication number: KR0181061B1
Application number: KR1019950007286A
Authority: KR
Inventors: 김종일
Original assignee: 배순훈; 대우전자주식회사
Priority date: 1995-03-31
Filing date: 1995-03-31
Publication date: 1999-05-01
Also published as: KR960036723A

Abstract

A novel apparatus capable of adaptively encoding an input digital audio signal comprises: a first estimator for estimating signal-to-mask ratio data, sound pressure levels and masking thresholds for the respective subbands of the digital audio signal; a second estimator for estimating perceptual entropies for the respective frames of the input digital audio signal based on the estimated signal-to-mask ratio data, sound pressure levels and masking thresholds, and for estimating mean and standard deviation parameters for a frame group which corresponds to the estimated perceptual entropies and includes at least two frames, namely the current and previous frames; bit allocation means for adaptively determining bits for the respective subbands based on the estimated signal-to-mask ratio data, perceptual entropies, and mean and standard deviation parameters, and for generating bit allocation information corresponding to the bits determined for the respective subbands; a quantizer for quantizing each filtered subband audio signal in response to the generated bit allocation information for each subband; and a circuit for formatting the quantized audio signal and the generated bit allocation information.

Description

Adaptive Digital Audio Coding Device and Its Bit Allocation Method

FIG. 1 is a block diagram schematically illustrating a novel apparatus and bit allocation method for adaptively encoding an input digital audio signal in accordance with the present invention.

FIG. 2 is a detailed block diagram of the first bit allocation unit shown in FIG. 1.

* Explanation of reference numerals for the main parts of the drawings

10: subband filtering device

20, 30: first and second perceptual parameter estimators

40, 50: first and second bit allocation units

60: quantizer

70: formatting circuit

The present invention relates to a digital audio encoding method and apparatus; and, more particularly, to an improved adaptive digital audio encoding apparatus, and a bit allocation method thereof, for adaptively encoding an input digital audio signal having a plurality of frames based on the perceptual entropies of the frames, consistent with human auditory perception.

Digitized audio signals are transmitted in order to achieve sound quality comparable to that of a compact disc (CD) and/or digital audio tape (DAT) player. When an audio signal is expressed in digital form, particularly for use in high definition television (HDTV), a substantial amount of data must be transmitted. Because the bandwidth available for the audio signal is limited, however, such data, for example a 16-bit pulse code modulation (PCM) audio signal sampled at 48 kHz, i.e., 768 kbit/s, must be compressed in order to be transmitted over a limited audio channel of, for example, about 128 kbit/s.
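The figures in the preceding paragraph can be checked with a few lines of Python. This is only an illustrative sketch: it assumes a single audio channel and reads the 128 figure as a bit rate in kbit/s (as the later 128 kbit/s example does), and the function name `pcm_bit_rate` is hypothetical.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


raw_bps = pcm_bit_rate(48_000, 16)      # 16-bit PCM sampled at 48 kHz, one channel
target_bps = 128_000                    # assumed channel capacity in bit/s
compression_ratio = raw_bps / target_bps

print(raw_bps, compression_ratio)
```

Under these assumptions the encoder has to reduce the data by a factor of six.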

Among the various audio compression devices and techniques, the so-called MPEG (Moving Pictures Expert Group) audio algorithm, which employs a psychoacoustic algorithm, has been proposed for HDTV applications.

The MPEG audio algorithm has four main components: subband filtering, psychoacoustic modeling, quantization and coding, and frame formatting. Subband filtering is the process of mapping the input PCM audio signal from the time domain to the frequency domain. A filter bank with B (e.g., 32) subbands may be used. In each subband, 12 or 36 samples are grouped for processing, and the grouped samples from the B subbands, i.e., B x 12 or B x 36 samples, constitute one frame. Each frame is the processing unit for encoding, transmitting and decoding the audio signal. Psychoacoustic modeling generates, for each subband or group of subband samples, a set of data, e.g., signal-to-mask ratio (SMR) data, for controlling quantization and coding. In the quantization and coding step, the available bits are then adaptively allocated to each subband of a frame with reference to the generated SMR data, and the subband samples are quantized accordingly. Frame formatting arranges the frame data, together with other required side information, in a form suitable for transmission.

However, since the MPEG audio technique described above allocates a fixed number of bits to each frame, it cannot reflect statistical characteristics of the input digital audio signal, such as the means, standard deviations and perceptual entropies, which may vary continuously from frame to frame.

It is, therefore, an object of the present invention to provide a novel apparatus, and a bit allocation method thereof, which adaptively encode an input digital audio signal having a plurality of frames based on the perceptual entropies of the frames, consistent with human auditory perception, thereby improving coding efficiency and sound quality.

In accordance with one aspect of the present invention, there is provided an apparatus for adaptively encoding an input digital audio signal having a plurality of frames, each frame including a plurality of subbands, which comprises: means for filtering the input digital audio signal to divide it into a plurality of subbands; first estimation means for estimating signal-to-mask ratio data, sound pressure levels and masking thresholds for the respective subbands of the input digital audio signal; second estimation means for estimating perceptual entropies for the respective frames of the input digital audio signal based on the estimated signal-to-mask ratio data, sound pressure levels and masking thresholds, and for estimating mean and standard deviation parameters for a frame group which corresponds to the estimated perceptual entropies and includes at least two frames, namely the current and previous frames; bit allocation means for adaptively determining bits for the respective subbands based on the estimated signal-to-mask ratio data, perceptual entropies, and mean and standard deviation parameters, and for generating bit allocation information corresponding to the bits determined for the respective subbands; means for quantizing each filtered subband audio signal in response to the generated bit allocation information for each subband; and means for formatting the quantized audio signal and the generated bit allocation information.

In accordance with another aspect of the present invention, there is provided a bit allocation method for use in adaptively encoding an input digital audio signal having a plurality of frames, each frame including a plurality of subbands, the method comprising the steps of: filtering the input digital audio signal to divide it into a plurality of subbands; estimating signal-to-mask ratio data, sound pressure levels and masking thresholds for the respective subbands of the digital audio signal; estimating perceptual entropies for the respective frames of the input digital audio signal based on the estimated signal-to-mask ratio data, sound pressure levels and masking thresholds, and estimating mean and standard deviation parameters for a frame group which corresponds to the estimated perceptual entropies and includes at least two frames, namely the current and previous frames; estimating decision levels of the frame group based on the estimated mean and standard deviation parameters; determining bits for the respective frames of the frame group based on the estimated decision levels, the total number of decision levels, the perceptual entropies and predetermined mean bits, and generating bit allocation information corresponding to the bits determined for the respective frames; and determining bits for the respective subbands of each frame based on the estimated signal-to-mask ratio data and the generated bit allocation information, and generating bit allocation information corresponding to the bits determined for the respective subbands.

The above and other objects and features of the present invention will become apparent from the following description of preferred embodiments given in conjunction with the accompanying drawings.

Referring to FIG. 1, there is shown a block diagram schematically illustrating an adaptive digital audio encoding apparatus and bit allocation method in accordance with the present invention.

The adaptive digital audio encoding apparatus includes a subband filtering device (10), first and second perceptual parameter estimators (20 and 30), first and second bit allocation units (40 and 50), a quantizer (60), and a formatting circuit (70).

An input digital audio signal X(n) of an i-th frame, or current frame, which includes N samples (i.e., n = 0, 1, ..., N-1, N being a positive integer), is applied to the first perceptual parameter estimator (20) and the subband filtering device (10). The subband filtering device (10) performs subband filtering of the input digital audio signal. A frame, as used herein, denotes a portion of the digital audio signal corresponding to a fixed number of audio samples and is the processing unit for encoding and decoding the digital audio signal.

As shown, the subband filtering device (10) receives the input digital audio signal of the current frame and filters it by using a subband filtering technique well known in the art, for example, the method disclosed in the so-called MPEG audio algorithm described in ISO/IEC JTC1/SC2/WG11, Part 3, Audio Proposal, CD-11172-3 (1991). That is, the subband filtering device (10) splits the input digital audio signal having a sampling frequency fs into B (e.g., 32) subbands of equal bandwidth, each having a sampling frequency of fs/B, and provides the split subband audio samples to the quantizer (60).

Meanwhile, the first perceptual parameter estimator (20) receives the input digital audio signal of the current frame and estimates signal-to-mask ratio (SMR) data, sound pressure levels and masking thresholds for the respective subbands included in the input digital audio signal by using, for example, the psychoacoustic model discussed in the MPEG audio algorithm. The SMR data for each subband is derived as follows:

$$\mathrm{SMR}(j) = P(j) - M(j) \quad (\mathrm{dB}) \tag{1}$$

wherein j is a subband index (j = 0, 1, ..., B-1), B being the total number of subbands in a frame; SMR(j) is the signal-to-mask ratio in the j-th subband; P(j) is the sound pressure level in the j-th subband, estimated by a fast Fourier transform (FFT) technique; M(j) is the masking threshold in the j-th subband; and SMR(j), P(j) and M(j) are all expressed in decibels (dB).
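Because P(j) and M(j) are both given in decibels, the ratio of signal to mask becomes a simple difference. The following Python sketch illustrates this per-subband computation under that reading; the function name and the sample values are illustrative and do not come from the specification.

```python
def signal_to_mask_ratio(sound_pressure_db, masking_threshold_db):
    """Per-subband SMR in dB, taken as level minus masking threshold."""
    if len(sound_pressure_db) != len(masking_threshold_db):
        raise ValueError("P and M must cover the same subbands")
    return [p - m for p, m in zip(sound_pressure_db, masking_threshold_db)]


# Three hypothetical subbands: audible, masked, and exactly at threshold.
smr = signal_to_mask_ratio([60.0, 42.0, 30.0], [40.0, 45.0, 30.0])
print(smr)
```

A positive SMR means the signal in that subband rises above the masking threshold, so quantization noise there is more likely to be audible.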

The masking threshold represents an audible limit: it is the sum of the intrinsic audible limit, or threshold of hearing, and an increment caused by the presence of other tonal and non-tonal components of the audio signal. The SMR(j) data is then provided to the second bit allocation unit (50), while P(j) and M(j) are supplied to the second perceptual parameter estimator (30).

The second perceptual parameter estimator (30) estimates a perceptual entropy PE(i) for the input digital audio signal of the i-th frame based on the sound pressure levels and masking thresholds from the first perceptual parameter estimator (20). The perceptual entropy PE(i) for the input digital audio signal of the i-th frame, as is well known in the art, may be expressed as follows:

wherein i, j and B have the same meanings as previously defined. Eq. (2) is obtained by applying the so-called rate distortion theory and corresponds to the perceptual entropy based on human auditory perception. The second perceptual parameter estimator (30) also groups the perceptual entropies estimated for Q (e.g., 4) frames, namely the current frame and its previous frames, i.e., PE(i), PE(i-1), PE(i-2) and PE(i-3), so that bits can be adaptively allocated among the grouped frames by the first bit allocation unit (40), which is described in detail below with reference to FIGS. 1 and 2. It further estimates, from the perceptual entropies of the whole frame group, mean (PEm) and standard deviation (PEstd) parameters representing their statistical characteristics. The mean parameter PEm for the perceptual entropies of the frame group, as is well known in the art, may be obtained as follows:

$$PE_m = \frac{1}{Q} \sum_{p=0}^{Q-1} PE(p) \tag{3}$$

wherein p (p = 0, 1, ..., Q-1) is a frame index used in the frame group, Q being the total number of frames in the frame group; and PE(p) is the perceptual entropy of the p-th frame in the frame group.

Accordingly, the standard deviation parameter PEstd for the perceptual entropies of the frame group, as is well known in the art, may be calculated as follows:

wherein p and Q have the same meanings as previously defined.
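Eqs. (3) and (4) describe the ordinary mean and standard deviation of the Q perceptual entropies in a frame group. A minimal Python sketch follows. It assumes the population form of the standard deviation (division by Q), which this text does not state, and the function name `group_statistics` and the sample values are illustrative only.

```python
from statistics import fmean, pstdev


def group_statistics(pe_group):
    """Return (PEm, PEstd) for the perceptual entropies of one frame group."""
    if len(pe_group) < 2:
        raise ValueError("a frame group needs at least two frames")
    pe_m = fmean(pe_group)
    # Assumption: population standard deviation, i.e. division by Q.
    pe_std = pstdev(pe_group, mu=pe_m)
    return pe_m, pe_std


# Hypothetical PE values for Q = 4 frames: PE(i-3), PE(i-2), PE(i-1), PE(i).
pe_m, pe_std = group_statistics([2.0, 4.0, 4.0, 6.0])
print(pe_m, pe_std)
```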

The perceptual entropy PE(p) of the p-th frame, grouped and estimated at the second perceptual parameter estimator (30), and the mean and standard deviation parameters PEm and PEstd are applied to the first bit allocation unit (40). The first bit allocation unit (40) determines bits for the respective frames included in the frame group based on the perceptual entropy of the p-th frame and the mean and standard deviation parameters from the second perceptual parameter estimator (30), and provides bit allocation information (FBI) corresponding to the bits determined for the respective frames of the frame group to the second bit allocation unit (50) and the formatting circuit (70).

Referring to FIG. 2, there is shown a detailed block diagram of the first bit allocation unit (40) shown in FIG. 1. The first bit allocation unit (40) includes a decision level estimator (41), a subtractor (42), and a bit allocation device (43).

The decision level estimator (41) estimates optimum decision levels of the frame group based on the mean and standard deviation parameters PEm and PEstd from the second perceptual parameter estimator (30) shown in FIG. 1, so that the bit allocation device (43) can adaptively allocate bits to each frame in the frame group. In a preferred embodiment of the present invention, the k-th decision level D(k) of the frame group may be expressed as follows:

wherein k (k = -q, ..., q) is a decision level index, q being a positive integer; and NF is a normalization factor for the frame group.
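Eq. (5) itself is not reproduced in this text. The Python sketch below therefore assumes the linear form D(k) = NF * PEstd * k, which is consistent with the symbols defined for it and with the description that the spacing between adjacent decision levels depends on PEstd and NF. The function name and the numeric values are illustrative.

```python
def decision_levels(pe_std, nf, q):
    """Map each index k in -q..q to an assumed decision level D(k) = NF * PEstd * k."""
    if q < 1:
        raise ValueError("q must be a positive integer")
    step = nf * pe_std  # spacing between adjacent decision levels
    return {k: k * step for k in range(-q, q + 1)}


# With a spacing of 1.27 and q = 2 there are 2q + 1 = 5 levels.
levels = decision_levels(pe_std=1.27, nf=1.0, q=2)
print(levels)
```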

As can be seen from Eq. (5), the level interval between the k-th decision level D(k) and the (k-1)-th decision level D(k-1) of the frame group depends on the standard deviation parameter PEstd from the second perceptual parameter estimator (30) and on the normalization factor NF of the frame group, whereas the total number of decision levels is predetermined. The total number of decision levels may be chosen in view of the coding efficiency and sound quality required of the encoding apparatus. The normalization factor NF used in the decision level estimator (41) is preferably determined from the mean and standard deviation parameters supplied by the second perceptual parameter estimator (30), together with a global mean parameter PEgm and a mean of global standard deviation parameters PEgstd prestored in a memory (not shown) thereof, so as to derive optimum decision levels of the frame group that closely match actual human auditory perception. Each of the global mean and the mean of the global standard deviations can readily be obtained from the mean and standard deviation parameters estimated over a predetermined period. In a preferred embodiment of the present invention, the normalization factor NF of the frame group may be determined as follows:

In Eq. (6), PEm and PEstd are the mean and standard deviation of the perceptual entropies of the group obtained from Eqs. (3) and (4), and PEgm and PEgstd are the averages of PEm and PEstd over a long period (for example, 4 seconds). Since such long-term characteristics of an audio signal are generally stationary, prestored values may be used; alternatively, previous values of PEm and PEstd may be averaged over a long period.
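The second option, averaging earlier PEm and PEstd values over a long period, could be sketched as a sliding window over recent frame groups. The class below is a hypothetical illustration only: the window length, the class name and its interface are assumptions, and the specification does not say how the long-term average is implemented.

```python
from collections import deque
from statistics import fmean


class LongTermStats:
    """Sliding-window averages of PEm and PEstd over the most recent frame groups."""

    def __init__(self, window_groups):
        self._means = deque(maxlen=window_groups)
        self._stds = deque(maxlen=window_groups)

    def update(self, pe_m, pe_std):
        """Add one frame group's statistics and return (PEgm, PEgstd)."""
        self._means.append(pe_m)
        self._stds.append(pe_std)
        return fmean(self._means), fmean(self._stds)


stats = LongTermStats(window_groups=2)
stats.update(1.0, 0.5)
pe_gm, pe_gstd = stats.update(3.0, 1.5)
print(pe_gm, pe_gstd)
```

The window would be sized so that it spans the chosen long period, for example roughly 4 seconds of audio.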

It should be noted that, as can be seen from Eqs. (5) and (6), the decision levels of the frame group can be determined as integer multiples of the mean parameter.

Meanwhile, the subtractor (42) calculates a deviation signal E(p) for the p-th frame in the frame group by subtracting the mean parameter PEm from the perceptual entropy PE(p) supplied by the second perceptual parameter estimator (30). The decision levels D(k) estimated at the decision level estimator (41) and the predetermined total number of decision levels (i.e., 2q+1) are then provided simultaneously to the bit allocation device (43).
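The subtraction performed by the subtractor can be illustrated with a short Python sketch. It reads the deviation signal as E(p) = PE(p) - PEm for every frame of the group; the function name and the sample values are illustrative.

```python
def deviation_signals(pe_group):
    """Deviation E(p) = PE(p) - PEm for each frame p of a frame group."""
    pe_m = sum(pe_group) / len(pe_group)
    return [pe - pe_m for pe in pe_group]


# Hypothetical PE values for a group of Q = 4 frames.
deviations = deviation_signals([2.0, 4.0, 4.0, 6.0])
print(deviations)
```

Frames whose perceptual entropy is above the group mean get a positive deviation and frames below it get a negative one.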

The bit allocation device (43) determines bits for each frame of the frame group based on the decision levels and the total number of decision levels from the decision level estimator (41) and the deviation signal from the subtractor (42), and provides bit allocation information corresponding to the bits determined for each frame to the second bit allocation unit (50) and the formatting circuit (70) shown in FIG. 1, respectively. In a preferred embodiment of the present invention, the bit allocation FB(p) for the p-th frame in the frame group may be determined as follows:

wherein p has the same meaning as previously defined; FBm is the mean number of bits per frame (e.g., 3072 bits for 16-bit PCM audio data sampled at 48 kHz and encoded at 128 kbit/s); BV is a predetermined bit variation value; 2q+1 is the predetermined total number of decision levels; and I is the level index for the p-th frame.
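The 3072-bit example can be reproduced from the bit rate if the frame length is known. The sketch below assumes a frame of 32 subbands x 36 samples = 1152 samples, one of the frame sizes mentioned in the background; that frame length and the function name are assumptions made for illustration.

```python
def mean_bits_per_frame(bit_rate_bps, sample_rate_hz, samples_per_frame):
    """Mean bits available per frame: bit rate multiplied by frame duration."""
    return bit_rate_bps * samples_per_frame / sample_rate_hz


samples_per_frame = 32 * 36  # assumed: 32 subbands, 36 samples each
fb_m = mean_bits_per_frame(128_000, 48_000, samples_per_frame)
print(fb_m)
```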

As can be seen from Eq. (7), the bit allocation FB(p) for the p-th frame is obtained by adding to the mean bits FBm the variable number of bits given by its second term. The predetermined bit variation value BV may be set equal to the mean number of bits for one frame, and the level index I for the p-th frame in the frame group can be determined based on the decision levels D(k) from the decision level estimator (41) and the deviation signal E(p) from the subtractor (42). In a preferred embodiment of the present invention, the level index I for the p-th frame in the frame group may be represented as shown in the following table (assuming a decision level spacing of 1.27 and decision level indices k from -2 to 2).

As the table shows, if the deviation signal E(p) lies between the decision levels -2.55 and -1.28, for example, the level index I of the p-th frame is selected as -1; and if the deviation signal lies between -1.27 and 1.26, the level index is selected as 0. In this manner, the bit allocation for the p-th frame can be advantageously determined by using Eq. (7).
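Neither Eq. (7) nor the table is reproduced in this text, so the following Python sketch rests on two stated assumptions. First, the level index is obtained by dividing the deviation by the level spacing, truncating toward zero and clamping to -q..q, which agrees with the two ranges quoted above away from their boundaries. Second, Eq. (7) is taken to be FB(p) = FBm + BV * I / (2q + 1), a form consistent with the symbols defined for it. Function names and values are illustrative.

```python
import math


def level_index(deviation, step, q):
    """Assumed mapping of the deviation E(p) onto a level index I in -q..q."""
    if step <= 0:
        raise ValueError("level spacing must be positive")
    return max(-q, min(q, math.trunc(deviation / step)))


def frame_bits(fb_mean, bv, q, index):
    """Assumed form of Eq. (7): mean bits plus a variable term scaled by I."""
    return fb_mean + bv * index / (2 * q + 1)


q, step = 2, 1.27
for e in (-2.0, 0.5, 1.5, 9.0):
    i = level_index(e, step, q)
    print(e, i, frame_bits(3072, 3072, q, i))
```

Under these assumptions a frame with average perceptual entropy keeps the mean allocation, and level indices that sum to zero over a group leave the group's total bit budget unchanged.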

Thereafter, the bit allocation information (FBI) corresponding to the bits determined for each frame of the frame group at the bit allocation device (43) and the signal-to-mask ratio SMR(j) from the first perceptual parameter estimator (20) shown in FIG. 1 are provided simultaneously to the second bit allocation unit (50), and the bit allocation information is also provided to the formatting circuit (70).

Referring back to FIG. 1, the second bit allocation unit (50) receives the signal-to-mask ratios supplied from the first perceptual parameter estimator (20) and the bit allocation information provided from the first bit allocation unit (40), determines bits for each subband included in each frame of the frame group, and provides bit allocation information (SBI) corresponding to the bits determined for each subband simultaneously to the quantizer (60) and the formatting circuit (70). The second bit allocation unit (50) allocates bits on the basis of the signal-to-mask ratios of the whole frame, without exceeding the number of bits made available for that frame by the first bit allocation unit (40). The bit allocation information (SBI) for each subband from the second bit allocation unit (50) and the split subband audio samples from the subband filtering device (10) are then provided simultaneously to the quantizer (60).
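One way to picture SMR-driven allocation within a frame budget is a greedy loop that keeps giving one more bit per sample to the subband whose estimated mask-to-noise ratio is lowest. The Python sketch below is a simplified illustration, not the procedure of the specification or of the MPEG standard: it ignores side information and scale factors, assumes about 6.02 dB of signal-to-noise ratio per bit, and uses hypothetical names and parameters.

```python
def allocate_subband_bits(smr_db, frame_bits, samples_per_subband=36, max_bits=15):
    """Greedy per-subband allocation that never exceeds the frame's bit budget."""
    bits = [0] * len(smr_db)
    remaining = frame_bits
    while remaining >= samples_per_subband:
        candidates = [j for j in range(len(smr_db)) if bits[j] < max_bits]
        if not candidates:
            break
        # Estimated mask-to-noise ratio: SNR of about 6.02 dB per bit, minus SMR.
        worst = min(candidates, key=lambda j: 6.02 * bits[j] - smr_db[j])
        bits[worst] += 1
        remaining -= samples_per_subband  # one more bit for every sample of that subband
    return bits


# Three hypothetical subbands and a budget of 4 x 36 bits.
allocation = allocate_subband_bits([20.0, 5.0, -10.0], frame_bits=144)
print(allocation)
```

Subbands with a high SMR receive bits first, while a subband whose signal is already below the masking threshold may receive none.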

The quantizer (60) adaptively quantizes the split subband audio samples from the subband filtering device (10) based on the corresponding bit allocation information from the second bit allocation unit (50), and provides the quantized audio samples to the formatting circuit (70).

The formatting circuit (70) formats the quantized audio samples from the quantizer (60) together with the bit allocation information from the first and second bit allocation units (40 and 50) and sends the result to a transmitter (not shown) for transmission, whereby the coding efficiency and sound quality of the input digital audio signal can be improved. The principles and functions of the subband filtering device (10), the second bit allocation unit (50), the quantizer (60) and the formatting circuit (70) are basically identical to those discussed in the MPEG audio algorithm.

While the present invention has been described with reference to particular embodiments, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the appended claims.

Claims

1. An apparatus for adaptively encoding an input digital audio signal having a plurality of frames, each frame including a plurality of subbands, which comprises: means for filtering the input digital audio signal to divide it into a plurality of subbands; first estimation means for estimating signal-to-mask ratio data, sound pressure levels and masking thresholds for the respective subbands of the input digital audio signal; second estimation means for estimating perceptual entropies for the respective frames of the input digital audio signal based on the estimated signal-to-mask ratio data, sound pressure levels and masking thresholds, and for estimating mean and standard deviation parameters for a frame group which corresponds to the estimated perceptual entropies and includes at least two frames, namely the current and previous frames; bit allocation means for adaptively determining bits for the respective subbands based on the estimated signal-to-mask ratio data, perceptual entropies, and mean and standard deviation parameters, and for generating bit allocation information corresponding to the bits determined for the respective subbands; means for quantizing each filtered subband audio signal in response to the generated bit allocation information for each subband; and means for formatting the quantized audio signal and the generated bit allocation information.

2. The apparatus of claim 1, wherein the bit allocation means comprises: means for measuring decision levels for the frame group based on the measured mean and standard deviation parameters; means for generating a deviation signal indicative of the deviation between the measured cognitive information amount and the mean parameter; first bit allocation means for determining bits for each frame of the frame group based on the measured decision levels, the total number of decision levels, the cognitive information amounts, and a predetermined mean number of bits, and for generating bit allocation information corresponding to the bits determined for each frame; and second bit allocation means for determining bits for respective subbands of each frame based on the measured signal-to-mask ratio data and the generated bit allocation information, and for generating bit allocation information corresponding to the bits determined for the respective subbands.
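The frame-group statistics and the deviation signal named in this claim can be illustrated with a minimal Python sketch. The function names are hypothetical, the per-frame cognitive information amounts are taken as already measured (written PE here, borrowing the PEstd symbol used in the claims), and the choice of a population standard deviation is an assumption; the claims do not say which estimator is used.

```python
from statistics import fmean, pstdev


def frame_group_stats(pe_values):
    """Return (mean, standard deviation) of the per-frame PE values of one frame group.

    Assumption: population standard deviation (pstdev); the source does not
    specify the estimator.
    """
    if len(pe_values) < 2:
        # The claims require a group of at least two frames (current and previous).
        raise ValueError("a frame group needs at least two frames")
    return fmean(pe_values), pstdev(pe_values)


def deviation_signal(pe_frame, pe_mean):
    """Deviation between one frame's PE value and the frame-group mean."""
    return pe_frame - pe_mean


# Hypothetical PE values for a group of four frames (last one is the current frame).
group = [10.0, 12.0, 14.0, 16.0]
pe_mean, pe_std = frame_group_stats(group)
current_deviation = deviation_signal(group[-1], pe_mean)
```

A positive deviation marks a frame that carries more perceptual information than the group average, which is the situation in which the first bit allocation means would be expected to grant more bits than the mean.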

3. The apparatus of claim 2, wherein the decision levels (D) of the frame group are determined as follows:

where k (k = -q to q) is the decision level index, q is a positive integer, NF is the normalization factor in the frame group, and PEstd is the standard deviation parameter of the frame group.
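The equation for D is not reproduced in this text. As a minimal sketch, assuming the levels are spaced linearly in the level index, D(k) = NF * PEstd * k (an assumption chosen to be consistent with the symbols defined in this claim, not a quotation of it), the 2q + 1 decision levels could be computed as follows. The function name is hypothetical.

```python
def decision_levels(pe_std, nf, q):
    """Return {k: D(k)} for k = -q..q.

    Assumption: D(k) = NF * PEstd * k. The claim's own equation is missing
    from this text, so this linear form is illustrative only.
    """
    if q < 1:
        raise ValueError("q must be a positive integer")
    return {k: nf * pe_std * k for k in range(-q, q + 1)}


# Hypothetical values: PEstd = 2.0, NF = 0.5, q = 2 gives five levels.
levels = decision_levels(pe_std=2.0, nf=0.5, q=2)
```

Under this assumed form the levels are symmetric about zero, so a frame whose deviation from the group mean is zero sits at the middle level.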

4. The apparatus of claim 3, wherein the bit allocation (FB(p)) for the p-th frame in the frame group is calculated as follows:

where p is a frame index in the frame group, FBm is the mean number of bits for one frame, BV is a predetermined bit shift value, 2q + 1 is the total number of predetermined decision levels, and I is the level index of the p-th frame.
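The equation for FB(p) is not reproduced in this text. A minimal sketch, assuming the form FB(p) = FBm + (BV / (2q + 1)) * I, which uses exactly the symbols defined in this claim but is an assumption rather than a quotation, is given below. The function name is hypothetical, and the level index I is taken as an input because the rule that selects it is not stated in the claims.

```python
def frame_bits(fb_mean, bv, q, level_index):
    """Bits for one frame under the assumed form FB(p) = FBm + BV / (2q + 1) * I.

    fb_mean     -- FBm, mean number of bits per frame
    bv          -- BV, predetermined bit shift value
    q           -- positive integer; there are 2q + 1 decision levels
    level_index -- I, level index of the frame, expected in -q..q
    """
    if not -q <= level_index <= q:
        raise ValueError("level index must lie in -q..q")
    return fb_mean + bv / (2 * q + 1) * level_index


# Hypothetical numbers: FBm = 1000 bits, BV = 210 bits, q = 1 (three levels).
budgets = [frame_bits(1000.0, 210.0, 1, i) for i in (-1, 0, 1)]
```

With these hypothetical numbers the three possible budgets are 930, 1000 and 1070 bits, so frames at symmetric level indices trade bits with each other while the group average stays at FBm.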

5. A bit allocation method for adaptively encoding an input digital audio signal having a plurality of frames, each frame including a plurality of subbands, the method comprising the steps of: filtering the input digital audio signal to divide it into a plurality of subbands; measuring signal-to-mask ratio data, sound pressure levels and masking thresholds for respective subbands of the input digital audio signal; measuring cognitive information amounts for respective frames of the input digital audio signal based on the measured signal-to-mask ratio data, sound pressure levels and masking thresholds, and measuring mean and standard deviation parameters, corresponding to the measured cognitive information amounts, for a frame group comprising at least two frames including current and previous frames; measuring decision levels of the frame group based on the measured mean and standard deviation parameters; determining bits for each frame of the frame group based on the measured decision levels, the total number of decision levels, the cognitive information amounts, and a predetermined mean number of bits, and generating bit allocation information corresponding to the bits determined for each frame; and determining bits for respective subbands of each frame based on the measured signal-to-mask ratio data and the generated bit allocation information, and generating bit allocation information corresponding to the bits determined for the respective subbands.
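The final step, distributing a frame's bit budget over its subbands according to the signal-to-mask ratios, can be sketched as a greedy loop in the spirit of MPEG audio bit allocation. This is a simplified illustration, not the patented procedure: the function name, the 6.02 dB-per-bit noise model, the 12 samples per subband, the 15-bit cap and the omission of side-information costs are all assumptions.

```python
def allocate_subband_bits(smr_db, frame_bits, samples_per_subband=12, max_bits=15):
    """Greedily assign bits per sample to subbands within one frame's budget.

    Assumptions (not from the source): each extra bit per sample improves the
    SNR by about 6.02 dB, and scale-factor and header bits are ignored.
    """
    bits = [0] * len(smr_db)
    remaining = frame_bits
    # One more bit per sample in a subband costs samples_per_subband bits.
    while remaining >= samples_per_subband:
        candidates = [i for i in range(len(bits)) if bits[i] < max_bits]
        if not candidates:
            break
        # Mask-to-noise ratio = SNR reached so far minus the required SMR;
        # the subband with the lowest value has the most audible noise.
        worst = min(candidates, key=lambda i: 6.02 * bits[i] - smr_db[i])
        bits[worst] += 1
        remaining -= samples_per_subband
    return bits


# Hypothetical SMR values (dB) for three subbands and a 48-bit budget.
allocation = allocate_subband_bits([20.0, 5.0, -3.0], frame_bits=48)
```

In this example the subband with the 20 dB ratio receives three bits per sample before the 5 dB subband receives its first, and the subband whose signal already lies below the masking threshold receives none.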