KR20070025905A

KR20070025905A - Method of effective sampling frequency bitstream composition for multi-channel audio coding

Info

Publication number: KR20070025905A
Application number: KR1020060004058A
Authority: KR
Inventors: 방희석; 오현오; 김동수; 임재현; 정양원
Original assignee: LG Electronics Inc. (엘지전자 주식회사)
Priority date: 2005-08-30
Filing date: 2006-01-13
Publication date: 2007-03-08
Also published as: CN101253807A; CN101253554B; CN101248484B; CN101253806B; CN101253551A; CN101253552B; CN101253808B; CN101253552A; CN101253554A; CN101253809B; CN101253808A; CN101253553B; CN101253809A; CN101253810B; CN101253807B; CN101253810A; CN101248484A; CN101253553A; CN101253806A; CN101253551B

Abstract

Provided is a method of composing a bitstream that signals the sampling frequency efficiently in multi-channel audio coding. The spatial information bitstream uses the same sampling frequency as the core codec bitstream, so the sampling frequency index of the spatial information bitstream can be expressed in fewer bits, which improves encoding and transmission efficiency. A method for encoding a multi-channel audio signal comprises: downmixing the multi-channel audio signal and extracting spatial information from it (701, 702); and generating a core codec bitstream and a spatial information bitstream using the same sampling frequency for the downmixed audio signal and the spatial information, wherein indicator information about the sampling frequency is included in one of the core codec bitstream and the spatial information bitstream (703, 704).

Description

METHOD OF EFFECTIVE SAMPLING FREQUENCY BITSTREAM COMPOSITION FOR MULTI-CHANNEL AUDIO CODING

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a diagram illustrating how a human perceives spatial information about an audio signal, as used in the present invention.

Fig. 2 is a diagram of a method of coding a multichannel audio signal using a spatial encoder and decoder in the present invention.

Fig. 3 is a detailed diagram of the step of converting a multichannel audio signal from 2 channels to 5.1 channels within the spatial decoder in the present invention.

Figs. 4a and 4b are diagrams of the per-frame bitstream and the core codec bitstream of a multichannel audio signal according to the present invention.

Fig. 5 is a diagram showing, in syntax form, the sampling frequency index in an ADTS header according to the present invention.

Fig. 6 is a diagram showing, in syntax form, the sampling frequency index of the spatial information bitstream used in coding a multichannel audio signal according to the present invention.

Fig. 7 is a flowchart of a method of encoding a multichannel audio signal according to a second embodiment of the present invention.

Fig. 8 is a flowchart of a method of decoding a multichannel audio signal according to the second embodiment of the present invention.

*Explanation of reference numerals for the main parts of the drawings*

101: Distant sound source
102: Direct sound wave
104: Reflected sound wave
201: Multichannel audio signal
202: Downmix unit
203: Spatial parameter extraction unit
204: Spatial encoder
205: Artistic downmix audio signal
206: Mono or stereo audio signal
207: Spatial parameters
208: Spatial decoder
302: 2-channel analysis filterbank
303: 2-channel time/frequency signal
304: Upmix unit
305: 6-channel time/frequency signal
306: 6-channel synthesis filterbank
401: Frame
402: Core codec bitstream
403: Spatial information bitstream
404: Configuration bitstream
405: Spatial data bitstream
409: AAC header
410: AAC data

The present invention relates to a method of composing a bitstream of an audio signal, and more particularly to a method of composing a bitstream that signals the sampling frequency efficiently in multi-channel audio coding.

Various coding techniques and methods for digital audio signals have recently been developed, and related products are being manufactured. Coding methods for multichannel audio signals that use a psychoacoustic model are also being developed and standardized. A psychoacoustic model exploits the way humans perceive sound, for example the fact that a quiet sound following a loud sound is not heard and that only sounds between 20 Hz and 20,000 Hz are audible. Removing the parts of the audio signal that are perceptually unnecessary during coding effectively reduces the amount of data required.

Audio standards such as MPEG-1 Audio (MPEG-1 Layer III), MPEG-4 AAC (Advanced Audio Coding), and MPEG-4 HE-AAC (High-Efficiency AAC) have been developed and commercialized. Methods of coding multichannel audio signals using spatial information are also being developed. Such methods greatly improve the transmission efficiency of a multichannel audio signal by using a compressed audio signal (for example, a stereo or mono audio signal) together with a low-bit-rate side information channel (for example, spatial information).

However, these multichannel coding methods include unnecessary information when coding side information such as spatial information, so compression and transmission are less efficient than they could be.

The present invention is proposed to solve this problem. Its object is to provide encoding and decoding methods that improve the compression and transmission efficiency of a multichannel audio signal by representing side information, such as spatial information, efficiently and thereby reducing the amount of data used.

To achieve this object, a method of encoding a multichannel audio signal according to the present invention comprises: downmixing the multichannel audio signal and extracting spatial information from the multichannel audio signal; and generating a core codec bitstream and a spatial information bitstream using the same sampling frequency for the downmixed audio signal and the spatial information, wherein indicator information for the sampling frequency is included in either the core codec bitstream or the spatial information bitstream.

Another method of encoding a multichannel audio signal according to the present invention comprises: downmixing the multichannel audio signal and extracting spatial information from the multichannel audio signal; and generating a core codec bitstream using a first sampling frequency for the downmixed audio signal and a spatial information bitstream using a second sampling frequency for the spatial information, wherein identification information indicating whether the first and second sampling frequencies are the same is included in the spatial information bitstream.

When the first and second sampling frequencies are the same, the indicator information for the second sampling frequency is preferably expressed by the identification information alone. When they are not the same, the indicator information for the second sampling frequency is preferably expressed by the identification information together with an index for the second sampling frequency.

A method of decoding a multichannel audio signal according to the present invention comprises: receiving a core codec bitstream and a spatial information bitstream generated using the same sampling frequency; extracting a sampling frequency index from the received bitstream; and decoding the core codec bitstream and the spatial information bitstream using the extracted sampling frequency index. In this specification, decoding the spatial information bitstream may include converting the audio signal contained in the core codec bitstream into a multichannel audio signal using the information in the spatial information bitstream.

Another method of decoding a multichannel audio signal according to the present invention comprises: receiving a core codec bitstream generated using a first sampling frequency and a spatial information bitstream generated using a second sampling frequency; extracting the first sampling frequency index from the received core codec bitstream and decoding the core codec bitstream using the first sampling frequency index; and checking, in the received spatial information bitstream, the identification information indicating whether the first and second sampling frequencies are the same, extracting the second sampling frequency index accordingly, and decoding the spatial information bitstream using the extracted second sampling frequency index.

A method of generating an audio signal according to the present invention composes the audio signal from a core codec bitstream and a spatial information bitstream generated using the same sampling frequency. The spatial information bitstream includes a data bitstream and a configuration bitstream, and indicator information for the sampling frequency is included in either the core codec bitstream or the spatial information bitstream.

Another method of generating an audio signal according to the present invention composes the audio signal from a core codec bitstream generated using a first sampling frequency and a spatial information bitstream generated using a second sampling frequency. The spatial information bitstream includes a spatial data bitstream and a configuration bitstream, and the configuration bitstream includes identification information indicating whether the first and second sampling frequencies are the same.

According to the present invention as described above, the spatial information of a multichannel audio signal is represented efficiently, which reduces the amount of data used and thereby greatly improves compression and transmission efficiency.

The present invention is described in detail below with reference to the accompanying drawings.

Fig. 1 shows how a human perceives spatial information about an audio signal.

Coding methods for multichannel audio signals build on the fact that humans perceive an audio signal in three-dimensional space, so that the signal can be represented as three-dimensional spatial information through a number of parameter sets. These parameters, called "spatial parameters", describe the spatial information of a multichannel audio signal and include CLD (Channel Level Differences), ICC (Inter-Channel Coherences), and CPC (Channel Prediction Coefficients). CLD denotes the energy difference between two channels, ICC denotes the correlation between two channels, and CPC denotes the prediction coefficients used when generating three channels from two channels.
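The definitions of CLD and ICC above can be made concrete with a small sketch. The text only states that CLD is an energy difference and ICC a correlation between two channels; expressing CLD in decibels and ICC as a normalized zero-lag cross-correlation are assumptions made here for illustration, and the function names are hypothetical.

```python
import math


def channel_level_difference_db(x, y, eps=1e-12):
    """Energy ratio between two channels in dB (dB scale is an assumption)."""
    energy_x = sum(s * s for s in x)
    energy_y = sum(s * s for s in y)
    # eps guards against log(0) and division by zero for silent channels.
    return 10.0 * math.log10((energy_x + eps) / (energy_y + eps))


def inter_channel_coherence(x, y, eps=1e-12):
    """Normalized cross-correlation at lag zero (normalization is an assumption)."""
    energy_x = sum(s * s for s in x)
    energy_y = sum(s * s for s in y)
    cross = sum(a * b for a, b in zip(x, y))
    return cross / math.sqrt(energy_x * energy_y + eps)
```

For two channels that carry the same waveform at different gains, the coherence is close to 1 and the level difference reflects the gain ratio; for unrelated channels the coherence tends toward 0.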

Fig. 1 shows how a human perceives an audio signal spatially and how the concept of the spatial parameters arises. A direct sound wave (103) from a distant sound source (101) reaches the listener's left ear (107), while another direct sound wave (102) is diffracted around the head and reaches the right ear (106). The two sound waves (102 and 103) differ in arrival time and energy level, and these differences give rise to the CLD and CPC parameters. If reflected sound waves (104 and 105) reach both ears, or if the sound source (101) is diffuse, sound waves that are uncorrelated with each other reach the two ears, which gives rise to the ICC parameter. Spatial parameters generated on this principle are known to allow a substantial reduction in the number of bits when a multichannel audio signal is transmitted as a mono or stereo signal and then output again as multichannel.

The present invention proposes a highly efficient way of representing information about these spatial parameters within a bitstream.

Fig. 2 illustrates the principle of coding a multichannel audio signal using a spatial encoder and decoder, to which the present invention may be applied. As shown, the spatial encoder (204) first receives a multichannel audio signal (201). Here, N denotes the number of input channels. The multichannel audio signal (201) is downmixed in the downmix unit (202) to become the downmix signal (206).

The spatial information of the multichannel audio signal, that is, the spatial parameters, is extracted from the multichannel audio signal (201) in the spatial parameter extraction unit (203). Spatial information here means information about the audio signal channels that is used when a multichannel audio signal (for example, Left, Right, Center, Left surround, Right surround) is downmixed, the downmix signal (206) is transmitted, and the transmitted downmix signal is upmixed back to multichannel. The downmix signal (206) may be a mono or a stereo signal; this specification describes the case in which a stereo downmix signal is converted to multichannel. Optionally, the downmix signal (206) may be generated from a downmix signal supplied directly from outside, for example an artistic downmix signal (205).

The downmix signal (206) is encoded with a core codec (for example, MP3, AC-3, DTS, or AAC), compressed, and transmitted, and the spatial information, that is, the spatial parameters (207), is transmitted along with it. If the user's system can output only the downmix signal (206), the compressed and transmitted downmix signal (206) can be decoded and output directly (209). If the system can output a multichannel audio signal, the compressed and transmitted audio signal is decoded and then converted in the spatial decoder (208) into a multichannel audio signal (210), using the spatial information, that is, the spatial parameters (207), transmitted along with it.

Instead of transmitting the multichannel audio signal directly, downmixing it to the downmix signal (206) and transmitting that signal together with the spatial information, that is, the spatial parameters (207), is highly advantageous in terms of compression and transmission efficiency.

In transmitting the spatial information of the multichannel audio signal, that is, the spatial parameters (207), the present invention improves compression and transmission efficiency by composing the bitstream so that the spatial parameters (207) are represented more efficiently.

Fig. 3 shows in detail, for one embodiment of the present invention, the step of converting a multichannel audio signal from 2 channels to 5.1 channels within the spatial decoder. The present invention can be used to convert a downmix signal to 5.1 channels as shown in Fig. 3, and also to convert the downmix signal to multichannel formats with more than 5.1 channels. As shown, the conversion from 2 channels to 5.1 channels is generally performed in the time/frequency domain, as follows. First, a 2-channel analysis filterbank (302) converts the decoded stereo audio signal (301) into a 2-channel time/frequency-domain audio signal (303). Next, the 2-channel time/frequency-domain audio signal (303) is upmixed (304) into a 6-channel time/frequency-domain audio signal (305) using the spatial information, that is, the spatial parameters. Finally, a 6-channel synthesis filterbank (306) converts the 6-channel time/frequency-domain audio signal (305) into a 5.1-channel audio signal (307), which is output.

The spatial information of the multichannel audio signal, generated more efficiently according to the present invention, can be used in the upmix step to convert the 2-channel time/frequency-domain audio signal into the 6-channel time/frequency-domain audio signal.

Figs. 4a and 4b show the per-frame bitstream and the core codec bitstream of a multichannel audio signal according to the present invention. As shown in Fig. 4a, each frame (401), which forms part of the overall bitstream, consists of a core codec bitstream (402) for the downmix signal and a spatial information bitstream (403). The spatial information bitstream (403) in turn consists of a configuration bitstream (404) and a spatial data bitstream (405). Coding methods for the core codec bitstream (402) include AAC (Advanced Audio Coding), MP3 (MPEG Layer 3), AC-3 (Dolby Digital), and DTS (Digital Theater System).
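The nesting of Fig. 4a can be sketched as a pair of data classes. The class and field names are hypothetical, and the byte order used by `serialize` (core codec first, then configuration, then spatial data) is an assumption; the text specifies only which parts make up a frame.

```python
from dataclasses import dataclass


@dataclass
class SpatialInformationBitstream:
    """Spatial information bitstream (403)."""
    configuration: bytes  # configuration bitstream (404)
    spatial_data: bytes   # spatial data bitstream (405)


@dataclass
class Frame:
    """One frame (401) of the overall bitstream."""
    core_codec: bytes                          # core codec bitstream (402)
    spatial_info: SpatialInformationBitstream  # spatial information bitstream (403)

    def serialize(self) -> bytes:
        # Concatenation order is an illustrative assumption.
        return (self.core_codec
                + self.spatial_info.configuration
                + self.spatial_info.spatial_data)
```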

The core codec bitstream (402) contains sampling frequency information (406), that is, a sampling frequency index, for the core codec bitstream (402) of the multichannel audio signal. The configuration bitstream (404) contains sampling frequency information (407), that is, a sampling frequency index, for the spatial information bitstream (403) of the multichannel audio signal.

Each frame is most preferably configured as shown in Fig. 4a, but it may instead consist of the core codec bitstream (402) and the spatial data bitstream (405), or of the spatial data bitstream (405) alone. The present invention reduces the number of bits by representing the sampling frequency index of the spatial information bitstream (403) efficiently. When frames are configured as in Fig. 4a, the configuration bitstream that forms part of the spatial information bitstream carries the sampling frequency index information in a reduced number of bits. Because the configuration bitstream is used in every frame, the saving is obtained in every frame, so a considerable number of bits can be saved over the whole bitstream.

AAC formats used for the core codec bitstream include ADTS (Audio Data Transport Stream), LATM (Low-overhead MPEG-4 Audio Transport Multiplex), and ADIF (Audio Data Interchange Format). Fig. 4b uses the ADTS format as an example.

The ADTS (Audio Data Transport Stream) format consists of an AAC header bitstream (409) and an AAC data bitstream (410), and the AAC header bitstream (409) contains an AAC syncword (408). The AAC syncword (408) marks the start of an AAC bitstream and can be represented, for example, by the binary value 111111111111. The AAC header bitstream (409) may consist of a fixed header region and a variable header region.
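Locating the 12-bit all-ones syncword described above can be sketched as a byte-wise scan. The function name is hypothetical, and the sketch assumes that headers are byte-aligned; a real parser would also validate the rest of the header, since the same bit pattern can occur inside payload data.

```python
def find_adts_syncword(data: bytes, start: int = 0) -> int:
    """Return the byte offset of the first 12-bit all-ones syncword, or -1."""
    for i in range(start, len(data) - 1):
        # 12 ones = one full 0xFF byte plus the upper nibble of the next byte.
        if data[i] == 0xFF and (data[i + 1] & 0xF0) == 0xF0:
            return i
    return -1
```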

The present invention represents the sampling frequency index for the multichannel audio signal by means of the sampling frequency index of the core codec bitstream. The latter is contained in a core codec bitstream generated in a format such as ADTS, for example an AAC bitstream.

The present invention includes two embodiments. In the first embodiment, the sampling frequency of the spatial information bitstream for the multichannel audio signal is taken to be the same as the sampling frequency of the core codec bitstream, so the sampling frequency information (that is, the sampling frequency index) of the spatial information bitstream is not signaled at all.

In the second embodiment, the sampling frequency indicator information of the spatial information bitstream is signaled efficiently by using identification information (hereinafter, a "flag") that indicates whether the first sampling frequency of the core codec bitstream and the second sampling frequency of the spatial information bitstream are the same, together with an additional index.

Fig. 5 shows, in syntax form, the sampling frequency index of the core codec bitstream for a multichannel audio signal according to the present invention.

The sampling frequency index is located in the AAC bitstream, more specifically in the AAC fixed header region. Four bits are allocated to the sampling frequency index so that it can represent values corresponding to 96000, 88200, 64000, 48000, 44100, 32000, 24000, 22050, 16000, 12000, 11025, and 8000 Hz, plus four reserved fields. A reserved field denotes a sampling frequency that is not yet specified and may be assigned to another sampling frequency in the future.
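As an illustration of where the 4-bit index sits in the fixed header, the sketch below reads it from the first three bytes of an ADTS header. The bit positions are not given in the text; they are taken from the ADTS fixed-header layout (12-bit syncword, 1-bit ID, 2-bit layer, 1-bit protection_absent, 2-bit profile, then the 4-bit sampling_frequency_index). The function name is hypothetical.

```python
def adts_sampling_frequency_index(header: bytes) -> int:
    """Extract the 4-bit sampling frequency index from an ADTS fixed header."""
    if len(header) < 3:
        raise ValueError("need at least 3 header bytes")
    if header[0] != 0xFF or (header[1] & 0xF0) != 0xF0:
        raise ValueError("syncword not found at start of header")
    # Third byte layout: profile (2 bits) | index (4 bits) | 2 further bits.
    return (header[2] >> 2) & 0x0F
```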

The relationship between the sampling frequency index and the corresponding actual sampling frequency is defined by a table. The encoder therefore encodes and transmits the sampling frequency index that corresponds to the actual sampling frequency. The decoder then looks up the transmitted sampling frequency index in the same table used for encoding to obtain the sampling frequency, and decodes using that sampling frequency.
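The shared table lookup can be sketched as follows. The twelve frequencies are those listed in the text; assigning index 0 to 96000 Hz and counting upward in the listed order is an assumption of this sketch, as are the function names. Indices beyond the listed values are treated as reserved, as the text describes.

```python
# Frequencies in the order listed in the text; index = position (assumed).
SAMPLING_FREQUENCIES_HZ = (
    96000, 88200, 64000, 48000, 44100, 32000,
    24000, 22050, 16000, 12000, 11025, 8000,
)


def index_for_frequency(hz: int) -> int:
    """Encoder side: map a sampling frequency to its 4-bit index."""
    return SAMPLING_FREQUENCIES_HZ.index(hz)  # raises ValueError if unlisted


def frequency_for_index(index: int):
    """Decoder side: map a 4-bit index back to Hz (None for reserved values)."""
    if not 0 <= index <= 0xF:
        raise ValueError("index must fit in 4 bits")
    if index >= len(SAMPLING_FREQUENCIES_HZ):
        return None  # reserved field
    return SAMPLING_FREQUENCIES_HZ[index]
```

Because both sides use the same table, `frequency_for_index(index_for_frequency(hz))` returns `hz` for every listed frequency.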

In general there is one sampling frequency index for the core codec bitstream and one for the spatial information bitstream. The present invention uses the sampling frequency index of the core codec bitstream shown in Fig. 5 to represent the sampling frequency index of the spatial information bitstream efficiently.

In the first embodiment of the present invention, the spatial information bitstream for the multichannel audio signal is encoded with the same sampling frequency as the core codec bitstream. The sampling frequency index of the spatial information bitstream is therefore not needed, which reduces the number of bits.

Fig. 6 shows, in syntax form, the sampling frequency indicator information of the spatial information bitstream used in encoding a multichannel audio signal according to a second embodiment of the present invention.

As shown, the configuration bitstream that forms part of the spatial information bitstream includes a flag ("bsSamplingFrequenytflag") (601) and a sampling frequency index ("bsSamplingfrequencyindex") (602) of the spatial information bitstream. One bit is allocated to the flag (601), and four bits are allocated to the sampling frequency index (602). If the sampling frequency index (602) is 0xf, that value is treated as indicating a free sampling frequency, and an additional 24 bits are used to carry the sampling frequency value actually used.

In the second embodiment of the present invention, the core codec bitstream is generated using a first sampling frequency, and the spatial information bitstream is generated using a second sampling frequency. A flag indicating whether the first sampling frequency and the second sampling frequency are the same, for example the 1-bit "bsSamplingFrequenytflag" flag (601), is included in the spatial information bitstream, so that the indicator information for the second sampling frequency can be represented efficiently.

To represent the second sampling frequency indicator information efficiently, when the first sampling frequency and the second sampling frequency are the same, the flag (601) is set to 1b (or 0b), and the first sampling frequency is used as the second sampling frequency. In this case no additional bits are needed for the second sampling frequency index (602), that is, "bsSamplingfrequencyindex".

If the first sampling frequency and the second sampling frequency are not the same, the flag (601) is set to 0b (or 1b), and an additional 4-bit index, "bsSamplingFrequencyindex" (602), is used to represent the second sampling frequency indicator information. When the value of "bsSamplingFrequencyindex" (602) is in the range 0x0 to 0xb, it denotes 96000, 88200, 64000, 48000, 44100, 32000, 24000, 22050, 16000, 12000, 11025, or 8000 Hz, respectively. When the value is in the range 0xc to 0xe, it is a reserved field. When the value is 0xf, it serves as a free sampling frequency index, and the sampling frequency value is carried in additional bits (for example, 24 bits).
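The flag-plus-index scheme of the second embodiment can be sketched as a small bit writer. This is an illustration under stated assumptions: the function name is hypothetical, the flag polarity is taken as 1b for "same" (the description also allows the opposite), bits are written most significant bit first, and the 24-bit free field is assumed to hold the frequency in Hz as an unsigned integer.

```python
# Index table from the description: 0x0 -> 96000 Hz ... 0xb -> 8000 Hz.
SAMPLING_FREQUENCIES = [96000, 88200, 64000, 48000, 44100, 32000,
                        24000, 22050, 16000, 12000, 11025, 8000]
FREE_INDEX = 0xF  # escape value: an explicit frequency follows in 24 bits


def encode_indicator(core_hz, spatial_hz):
    """Return the indicator bits for the spatial stream as a '0'/'1' string.

    Assumption: flag 1 = same frequency as the core codec, flag 0 = different.
    """
    if spatial_hz == core_hz:
        return "1"  # flag only, no index transmitted
    if spatial_hz in SAMPLING_FREQUENCIES:
        index = SAMPLING_FREQUENCIES.index(spatial_hz)
        return "0" + format(index, "04b")  # flag + 4-bit index
    if not 0 <= spatial_hz < (1 << 24):
        raise ValueError("free sampling frequency must fit in 24 bits")
    # flag + escape index 0xf + 24-bit frequency value (assumed to be in Hz)
    return "0" + format(FREE_INDEX, "04b") + format(spatial_hz, "024b")
```

Under these assumptions the indicator costs 1 bit when the two frequencies match, 5 bits for a tabulated frequency, and 29 bits for a free frequency.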

In addition, the values of "bsSamplingFrequencyindex" (602) may be represented by dividing the 12 values from 96000 to 8000 Hz into three groups of four values each. For example, if the first sampling frequency and the second sampling frequency belong to the same one of the three groups, the indicator may consist of a 1-bit flag indicating whether the two frequencies belong to the same group and a 2-bit index selecting one of the four values in that group. If the first sampling frequency and the second sampling frequency do not belong to the same group, the indicator may consist of the same 1-bit flag and a 3-bit index selecting one of the eight values in the other groups.
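One possible reading of the grouped representation is sketched below. The description does not specify how the groups are formed or how the 2-bit and 3-bit indices are assigned, so the following are assumptions made purely for illustration: groups are consecutive runs of four table entries, the 2-bit index is the position inside the shared group, the 3-bit index is the position among the eight entries of the other two groups in table order, and the flag polarity is 1b for "same group". The function names are hypothetical.

```python
SAMPLING_FREQUENCIES = [96000, 88200, 64000, 48000, 44100, 32000,
                        24000, 22050, 16000, 12000, 11025, 8000]
GROUP_SIZE = 4  # three groups of four consecutive table entries (assumption)


def _other_indices(core_group):
    """Table indices of the eight entries outside the core codec's group."""
    return [i for i in range(len(SAMPLING_FREQUENCIES))
            if i // GROUP_SIZE != core_group]


def encode_grouped(core_hz, spatial_hz):
    """Return flag + index bits for the spatial frequency, relative to the core one."""
    core_group = SAMPLING_FREQUENCIES.index(core_hz) // GROUP_SIZE
    spatial_i = SAMPLING_FREQUENCIES.index(spatial_hz)
    if spatial_i // GROUP_SIZE == core_group:
        return "1" + format(spatial_i % GROUP_SIZE, "02b")  # 1 + 2 bits
    return "0" + format(_other_indices(core_group).index(spatial_i), "03b")  # 1 + 3 bits


def decode_grouped(core_hz, bits):
    """Recover the spatial frequency from the bits and the known core frequency."""
    core_group = SAMPLING_FREQUENCIES.index(core_hz) // GROUP_SIZE
    if bits[0] == "1":
        return SAMPLING_FREQUENCIES[core_group * GROUP_SIZE + int(bits[1:3], 2)]
    return SAMPLING_FREQUENCIES[_other_indices(core_group)[int(bits[1:4], 2)]]
```

The decoder can resolve the group only because it has already extracted the first sampling frequency from the core codec bitstream.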

Fig. 7 is a flowchart of a method of encoding a multichannel audio signal according to the second embodiment of the present invention.

To encode the multichannel audio signal, the multichannel audio signal, preferably a 5.1-channel audio signal, is first downmixed (701) into a mono or stereo audio signal. Optionally, the downmixed audio signal, that is, the mono or stereo audio signal, may instead be produced from a downmixed audio signal supplied directly from outside, that is, an artistic downmix audio signal. Spatial information is extracted (702) from the multichannel audio signal. As described above, the spatial information includes CLD, ICC, CPC, and the like.

Next, the core codec bitstream of the multichannel audio signal is generated (703) from the downmixed audio signal using the first sampling frequency, and the spatial information bitstream of the multichannel audio signal is generated (704) using the second sampling frequency. If the first sampling frequency and the second sampling frequency are the same (705), the second sampling frequency indicator information is indicated by the flag alone (707); that is, the first sampling frequency is used as the second sampling frequency. If the first sampling frequency and the second sampling frequency differ, the second sampling frequency indicator information is indicated using the flag and an additional index (706). The entire bitstream, including the second sampling frequency indicator information, is then transmitted (708).

Fig. 8 is a flowchart of a method of decoding a multichannel audio signal according to the second embodiment of the present invention. To decode the multichannel audio signal, the core codec bitstream generated using the first sampling frequency and the spatial information bitstream generated using the second sampling frequency are first received (801). The first sampling frequency information is extracted from the received core codec bitstream, and the core codec bitstream is decoded (802) using the first sampling frequency information.

Next, the flag indicating whether the first sampling frequency and the second sampling frequency are the same is checked (803) in the received spatial information bitstream. If the two sampling frequencies are the same, the spatial information bitstream is decoded (805) using the first sampling frequency. If they are not the same, the second sampling frequency is extracted from the second sampling frequency indicator information, which is expressed using the flag and the additional index, and the spatial information bitstream is decoded (804) using the second sampling frequency.
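The decoder-side check in steps 803 to 805 can be sketched as follows. This is a hedged illustration rather than the actual decoder: the function name is hypothetical, the input is modelled as a string of '0'/'1' characters, the flag polarity is assumed to be 1b for "same", and the 24-bit field following index 0xf is assumed to carry the frequency in Hz.

```python
SAMPLING_FREQUENCIES = [96000, 88200, 64000, 48000, 44100, 32000,
                        24000, 22050, 16000, 12000, 11025, 8000]
FREE_INDEX = 0xF


def decode_indicator(core_hz, bits):
    """Return (second sampling frequency in Hz, number of bits consumed).

    core_hz is the first sampling frequency already extracted from the
    core codec bitstream (step 802).
    """
    if bits[0] == "1":              # step 803: flag says "same"
        return core_hz, 1           # step 805: reuse the first sampling frequency
    index = int(bits[1:5], 2)       # 4-bit sampling frequency index
    if index == FREE_INDEX:
        return int(bits[5:29], 2), 29   # explicit 24-bit frequency value
    if index >= len(SAMPLING_FREQUENCIES):
        raise ValueError(f"index {index:#x} is reserved")
    return SAMPLING_FREQUENCIES[index], 5   # step 804: table lookup
```

Returning the number of consumed bits lets the caller continue parsing the rest of the configuration bitstream from the correct position.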

Then, if the user's system supports mono or stereo channels (806), a mono or stereo audio signal is output (807) using only the decoded core codec bitstream. If the system supports multichannel output (for example, 5.1 channels), the decoded core codec bitstream is converted into a multichannel signal (for example, 5.1 channels) using the decoded spatial information bitstream and is output (808).

Although the present invention has been described in detail with reference to several embodiments, these embodiments are presented only to aid understanding of the invention, and the scope of the invention is not limited to them. Those skilled in the art will understand that various modifications are possible without departing from the technical idea of the present invention, and the scope of the invention should be interpreted according to the appended claims.

As described above, to represent the sampling frequency index of the spatial information bitstream efficiently, the present invention uses the same sampling frequency for the spatial information bitstream as for the core codec bitstream. This reduces the number of bits needed for the sampling frequency index of the spatial information bitstream and thereby improves encoding and transmission efficiency.

In addition, representing the sampling frequency index of the spatial information bitstream with a flag and an additional index allows the index to be expressed efficiently, which likewise improves encoding and transmission efficiency.

Claims

A method of encoding a multichannel audio signal, comprising:

(a) downmixing the multichannel audio signal and extracting spatial information from the multichannel audio signal; and

(b) generating a core codec bitstream and a spatial information bitstream using the same sampling frequency for the downmixed audio signal and the spatial information, wherein indicator information for the sampling frequency is included in any one of the core codec bitstream and the spatial information bitstream.

A method of encoding a multichannel audio signal, comprising:

(b) generating a core codec bitstream using a first sampling frequency for the downmixed audio signal and generating a spatial information bitstream using a second sampling frequency for the spatial information, wherein identification information indicating whether the first sampling frequency and the second sampling frequency are the same is included in the spatial information bitstream.

The method of claim 2,

wherein, in step (b),

when the first sampling frequency and the second sampling frequency are the same, the indicator information for the second sampling frequency is indicated using only the identification information indicating whether the two sampling frequencies are the same.

The method of claim 2,

wherein, in step (b),

when the first sampling frequency and the second sampling frequency are not the same, the indicator information for the second sampling frequency is indicated using the identification information indicating whether the two sampling frequencies are the same and an index for the second sampling frequency.

The method of claim 4, wherein

the index for the second sampling frequency corresponds to one of values corresponding to frequencies of 96000, 88200, 64000, 48000, 44100, 32000, 24000, 22050, 16000, 12000, 11025, and 8000 Hz and a value corresponding to an unspecified frequency.

The method of claim 4, wherein

the index for the second sampling frequency is indicated by dividing the 12 values corresponding to frequencies of 96000 Hz to 8000 Hz into three groups of four and using identification information indicating whether the first sampling frequency and the second sampling frequency belong to the same group.

A method of decoding a multichannel audio signal, comprising:

(a) receiving a core codec bitstream and a spatial information bitstream generated using the same sampling frequency;

(b) extracting a sampling frequency index from the received bitstream; and

(c) decoding the core codec bitstream and the spatial information bitstream using the extracted sampling frequency index.

A method of decoding a multichannel audio signal, comprising:

(a) receiving a core codec bitstream generated using a first sampling frequency and a spatial information bitstream generated using a second sampling frequency;

(b) extracting a first sampling frequency index from the received core codec bitstream and decoding the core codec bitstream using the first sampling frequency index; and

(c) extracting a second sampling frequency index by checking, in the received spatial information bitstream, identification information indicating whether the first sampling frequency and the second sampling frequency are the same, and decoding the spatial information bitstream using the extracted second sampling frequency index.

The method of claim 8, wherein

the index for the second sampling frequency corresponds to one of values corresponding to frequencies of 96000, 88200, 64000, 48000, 44100, 32000, 24000, 22050, 16000, 12000, 11025, and 8000 Hz and a value corresponding to an unspecified frequency.

The method of claim 8, wherein

the index for the second sampling frequency is indicated by dividing the 12 values corresponding to frequencies of 96000 Hz to 8000 Hz into three groups of four and using identification information indicating whether the first sampling frequency and the second sampling frequency belong to the same group.

A method of generating an audio signal, wherein

the audio signal includes a core codec bitstream and a spatial information bitstream generated using the same sampling frequency,

the spatial information bitstream includes a spatial data bitstream and a configuration bitstream, and indicator information for the sampling frequency is included in any one of the core codec bitstream and the spatial information bitstream.

A method of generating an audio signal, wherein

the audio signal includes a core codec bitstream generated using a first sampling frequency and a spatial information bitstream generated using a second sampling frequency,

the spatial information bitstream includes a spatial data bitstream and a configuration bitstream, and identification information indicating whether the first sampling frequency and the second sampling frequency are the same is included in the configuration bitstream.