KR20070003545A

KR20070003545A - Clipping restoration for multi-channel audio coding

Info

Publication number: KR20070003545A
Application number: KR1020060030671A
Authority: KR
Inventors: 방희석; 오현오; 김동수; 임재현; 정양원
Original assignee: 엘지전자 주식회사
Priority date: 2005-06-30
Filing date: 2006-04-04
Publication date: 2007-01-05
Also published as: KR20070003544A; KR20070003547A; KR20070003543A; KR20070003546A

Abstract

A method and an apparatus for encoding a multi-channel audio signal, and a method and an apparatus for decoding the multi-channel audio signal, are provided to resolve the clipping problem that occurs in a multi-channel audio signal by applying a clipping prevention gain to the multi-channel audio signal before the downmixing process. A downmixing unit (303) generates a downmix audio signal by performing a downmixing process after applying a clipping prevention gain to the multi-channel audio signal. A spatial information generating unit (304) extracts spatial information from the multi-channel audio signal. A bitstream formatting unit (305) generates an entire bitstream by using the downmix audio signal and the spatial information.

Description

Clipping Restoration in Multichannel Audio Coding {CLIPPING RESTORATION FOR MULTI-CHANNEL AUDIO CODING}

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating how a human perceives spatial information about an audio signal in the present invention.

FIG. 2 is a diagram illustrating how clipping occurs.

FIG. 3 is a diagram of a first method for preventing clipping by applying a clipping prevention gain and a downmix gain according to the present invention.

FIG. 4 is a diagram of a second method for preventing clipping by applying a clipping prevention gain and a downmix gain according to the present invention.

FIGS. 5A and 5B are tables showing various embodiments of clipping prevention gain values according to the present invention.

FIG. 6 is a diagram of a method for preventing clipping using a clipping prevention gain, a downmix gain, and clipping restoration information according to the present invention.

FIG. 7 is a graph showing the principle of using the clipping prevention gain to prevent sound quality degradation around frame boundaries according to the present invention.

FIG. 8 is a flowchart of a method of encoding a multichannel audio signal using the first method for preventing clipping according to the present invention.

FIG. 9 is a flowchart of a method of encoding a multichannel audio signal using the second method for preventing clipping according to the present invention.

FIG. 10 is a flowchart of a method of decoding a multichannel audio signal using the first method for preventing clipping according to the present invention.

FIG. 11 is a flowchart of a method of decoding a multichannel audio signal using the second method for preventing clipping according to the present invention.

Explanation of reference numerals for the main parts of the drawings

101. Remote sound source   102. Direct sound wave

104. Reflected sound wave   301. Multichannel audio signal

303. Downmix unit   304. Spatial information generator

305. Bitstream formatter   306. Entire bitstream

307. Bitstream parser   308. Audio decoding and multichannel generation unit

311. Spatial encoder   312. Spatial decoder

The present invention relates to a method of encoding and decoding spatial information of a multichannel audio signal, and more particularly to a method of encoding and decoding a multichannel audio signal that includes a clipping restoration method.

Recently, various coding techniques and methods for digital audio signals have been developed, and related products are being produced. Coding methods for multi-channel audio signals that use a psychoacoustic model are also being developed, and standardization work on them is in progress. The psychoacoustic model exploits the way humans perceive sound, for example the facts that a quiet sound following a loud sound is not heard and that only sounds with frequencies from 20 Hz to 20,000 Hz are audible, to remove the unnecessary parts of the audio signal during coding and thereby effectively reduce the amount of data required.

Audio standards such as MPEG-1 Audio (MPEG-1 Layer III), MPEG-4 AAC (Advanced Audio Coding), and MPEG-4 HE-AAC (High-Efficiency AAC) have been developed and commercialized. Methods of coding multichannel audio signals using spatial information are also being developed. Such a method uses a compressed audio signal (for example, a mono or stereo audio signal) and a channel of low bit-rate side information (for example, spatial information) to improve the transmission efficiency of the multichannel audio signal very effectively.

However, in constructing the bitstream of the multichannel audio signal, a clipping problem has conventionally occurred when the multichannel signal is downmixed to a mono or stereo audio signal. In particular, because the coded signal must be limited in size, for example to 16 bits, clipping in the coded signal persists even after core codec encoding. This clipping also affects the output of the audio signal and has been a cause of degraded sound quality.

Accordingly, the present invention, proposed to solve the above problem, aims to provide a method and apparatus that resolve the clipping problem occurring in a multichannel audio signal by constructing the bitstream, when coding a multichannel audio signal, with a clipping prevention gain applied either to the downmix audio signal or to the multichannel signal before downmixing.

To achieve the above object, the present invention provides a method of encoding a multichannel audio signal, comprising: generating a downmix audio signal by applying a clipping prevention gain to the multichannel audio signal and then performing a downmix process; extracting spatial information from the multichannel audio signal; and generating an entire bitstream using the downmix audio signal and the spatial information.

To achieve the above object, the present invention also provides a method of encoding a multichannel audio signal, comprising: downmixing the multichannel audio signal to generate a downmix audio signal and applying a clipping prevention gain to the downmix audio signal; extracting spatial information from the multichannel audio signal; and generating an entire bitstream that includes the spatial information and the downmix audio signal to which the clipping prevention gain has been applied.

To achieve the above object, the present invention also provides a method of encoding a multichannel audio signal, comprising: applying a clipping prevention gain in the course of downmixing the multichannel audio signal; extracting spatial information from the multichannel audio signal; and generating an entire bitstream that includes the spatial information and the downmix audio signal to which the clipping prevention gain has been applied.

The encoding methods may further include applying a per-channel gain to one or more channels of the multichannel audio signal. The clipping prevention gain may be i) applied globally, ii) applied variably at regular intervals, iii) applied variably for each frame, or iv) applied as a combination of i), ii), and iii). In addition, the clipping prevention gain may be applied such that it cannot exceed a predetermined one-step value per frame, or such that only a one-step change is possible for every specific number of frames. The encoding methods may further include inserting clipping restoration information into the entire bitstream for each frame.
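
The one-step restriction on the gain can be sketched as follows. This is only an illustration under stated assumptions: the gain is assumed to be coded as an integer step index, and the restriction is read as "the index may move by at most one step from one frame to the next". The function name `limit_steps` and the index values are hypothetical and do not come from the source.

```python
def limit_steps(target_indices, max_step=1):
    """Follow the requested gain indices, moving at most max_step per frame."""
    smoothed = [target_indices[0]]
    for target in target_indices[1:]:
        prev = smoothed[-1]
        # Clamp the requested change to the allowed step size.
        delta = max(-max_step, min(max_step, target - prev))
        smoothed.append(prev + delta)
    return smoothed

# Requested per-frame gain indices (hypothetical values).
print(limit_steps([0, 0, 4, 4, 4, 0]))  # [0, 0, 1, 2, 3, 2]
```

A limiter of this kind trades reaction speed for smoothness: the applied gain lags behind an abrupt request, so an encoder would presumably need to look ahead and start changing the gain before the loud frame arrives.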

To achieve the above object, the present invention also provides a method of decoding into a multichannel audio signal, comprising: receiving a bitstream that includes a downmix audio signal and spatial information; decoding the bitstream to extract the spatial information and converting the downmix audio signal into a multichannel audio signal using the extracted spatial information; and applying a downmix gain to the multichannel audio signal.

To achieve the above object, the present invention also provides a method of decoding into a multichannel audio signal, comprising: receiving a bitstream that includes a downmix audio signal and spatial information; and extracting the downmix audio signal from the bitstream and applying a downmix gain to the extracted downmix audio signal.

To achieve the above object, the present invention also provides a method of decoding into a multichannel audio signal, comprising: receiving a bitstream that includes a downmix audio signal and spatial information; and extracting the downmix audio signal from the bitstream and applying a downmix gain in the course of converting the extracted downmix audio signal into a multichannel audio signal.

The decoding methods may further include applying a per-channel downmix gain to one or more channels of the multichannel audio signal. The downmix gain may be i) applied globally, ii) applied variably at regular intervals, iii) applied variably for each frame, or iv) applied as a combination of i), ii), and iii). In addition, the downmix gain may be applied such that it cannot exceed a predetermined one-step value per frame, or such that only a one-step change is possible for every specific number of frames. The decoding methods may further include extracting clipping restoration information for each frame contained in the bitstream and using the extracted clipping restoration information to perform clipping restoration on the multichannel audio signal to which the downmix gain has been applied. A decoding method that applies the downmix gain while converting the downmix audio signal to multichannel may apply the downmix gain in the QMF domain.

To achieve the above object, the present invention also provides a method of generating an audio signal, in which the audio signal is generated so as to include a downmix audio signal, and the downmix audio signal is generated by downmixing after a clipping prevention gain has been applied.

To achieve the above object, the present invention also provides a method of generating an audio signal, in which the audio signal is generated so as to include a downmix audio signal, and the downmix audio signal is generated such that a clipping prevention gain is applied after downmixing.

To achieve the above object, the present invention also provides a method of generating an audio signal, in which the audio signal is generated so as to include a downmix audio signal, and the downmix audio signal is generated such that a clipping prevention gain is applied in the course of downmixing.

To achieve the above object, the present invention also provides an apparatus for encoding a multichannel audio signal, comprising: a downmix unit that generates a downmix audio signal by applying a clipping prevention gain to the multichannel audio signal and then performing a downmix process; a spatial information generator that extracts spatial information from the multichannel audio signal; and a bitstream formatter that generates an entire bitstream using the downmix audio signal and the spatial information.

To achieve the above object, the present invention also provides an apparatus for encoding a multichannel audio signal, comprising: a downmix unit that downmixes the multichannel audio signal to generate a downmix audio signal and applies a clipping prevention gain to the downmix audio signal; a spatial information generator that extracts spatial information from the multichannel audio signal; and a bitstream formatter that generates an entire bitstream including the spatial information and the downmix audio signal to which the clipping prevention gain has been applied.

To achieve the above object, the present invention also provides an apparatus for encoding a multichannel audio signal, comprising: a downmix unit that applies a clipping prevention gain in the course of downmixing the multichannel audio signal; a spatial information generator that extracts spatial information from the multichannel audio signal; and a bitstream formatter that generates an entire bitstream including the spatial information and the downmix audio signal to which the clipping prevention gain has been applied.

To achieve the above object, the present invention also provides an apparatus for decoding a multichannel audio signal, comprising: a bitstream receiver that receives a bitstream including a downmix audio signal and spatial information; an audio decoding and multichannel generation unit that decodes the bitstream to extract the spatial information and converts the downmix audio signal into a multichannel audio signal using the extracted spatial information; and a downmix gain application unit that applies a downmix gain to the multichannel audio signal.

To achieve the above object, the present invention also provides an apparatus for decoding a multichannel audio signal, comprising: a bitstream receiver that receives a bitstream including a downmix audio signal and spatial information; and a downmix gain application unit that extracts the downmix audio signal from the bitstream and applies a downmix gain to the extracted downmix audio signal.

To achieve the above object, the present invention also provides an apparatus for decoding a multichannel audio signal, comprising: a bitstream receiver that receives a bitstream including a downmix audio signal and spatial information; and a downmix gain application unit that extracts the downmix audio signal from the bitstream and applies a downmix gain in the course of converting the extracted downmix audio signal into a multichannel audio signal.

Hereinafter, preferred embodiments of the present invention that can concretely realize the above object are described with reference to the accompanying drawings.

FIG. 1 illustrates how a human perceives spatial information about an audio signal in the present invention. The coding method for multichannel audio signals relies on the fact that humans perceive an audio signal in three-dimensional space, so that the audio signal can be represented as three-dimensional spatial information through a plurality of parameter sets. These parameters, called "spatial parameters", represent the spatial information of a multichannel audio signal and include ICLD (Inter-Channel Level Difference), ICC (Inter-Channel Coherence), and ICTD (Inter-Channel Time Difference). ICLD is the energy difference between two channels, ICC is the correlation between two channels, and ICTD is the time difference between two channels.
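
As a rough illustration of two of these parameters, the sketch below computes a level difference and a coherence value for a pair of channels. The formulas (energy ratio in dB, and normalized cross-correlation at zero lag) are common textbook definitions assumed here for illustration; the source only describes the parameters in words, and a real codec would typically evaluate them per frequency band rather than over the whole broadband signal.

```python
import math

def icld_db(ch1, ch2):
    """Inter-channel level difference: energy ratio of two channels, in dB."""
    e1 = sum(x * x for x in ch1)
    e2 = sum(x * x for x in ch2)
    return 10.0 * math.log10(e1 / e2)

def icc(ch1, ch2):
    """Inter-channel coherence: normalized cross-correlation at zero lag."""
    num = sum(a * b for a, b in zip(ch1, ch2))
    den = math.sqrt(sum(a * a for a in ch1) * sum(b * b for b in ch2))
    return num / den

left = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
right = [0.5 * x for x in left]  # same waveform at half the amplitude

print(round(icld_db(left, right), 2))  # 6.02
print(round(icc(left, right), 2))      # 1.0
```

Halving the amplitude quarters the energy, which is why the level difference comes out near 6 dB while the coherence stays at 1 for identical waveforms.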

FIG. 1 shows how a human perceives an audio signal spatially and how the concept of the spatial parameters arises. A direct sound wave (103) from a remote sound source (101) reaches the person's left ear (107), and another direct sound wave (102) is diffracted around the head and reaches the right ear (106). The two sound waves (102 and 103) differ in arrival time and energy level, and these differences give rise to the level-difference and time-difference parameters described above (ICLD and ICTD).

Furthermore, if reflected sound waves (104 and 105) reach both ears, or if the sound source (101) is diffuse, mutually uncorrelated sound waves reach the two ears, and this gives rise to the ICC parameter. It is known that spatial parameters generated on this principle allow a large reduction in the number of bits when a multichannel audio signal is transmitted as a mono or stereo signal and then output again as multichannel. For a multichannel audio signal that uses this spatial information, the present invention presents a method for preventing the clipping that can occur when the multichannel signal is downmixed and coded.

FIG. 2 illustrates how clipping occurs. Clipping has two main causes. The first is a high sound level in the original signal. The second is a large number of input channels in the downmix process. For example, clipping occurs more often when seven channels are downmixed to one channel than when three channels are downmixed to one channel. FIG. 2 shows the case of downmixing five channels to one channel, but the present invention is not limited to this case. FIG. 2(a) shows the sound level of an original signal consisting of five channels. Each channel can use almost the full range of the limited size (for example, 16 bits). FIG. 2(b) shows the downmix audio signal generated by downmixing the five channels. As shown, the downmix audio signal can have many clipping points. FIG. 2(c) shows the audio signal obtained by encoding and decoding the downmix audio signal with a core codec (for example, an AAC codec). Because the audio signal encoded and decoded with the core codec is also represented in a limited size (for example, 16 bits), the clipping can persist. This clipping also affects the output at the playback stage of the multichannel audio signal and can degrade sound quality.
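
The effect described for FIG. 2 can be reproduced with a few integers. The sketch below assumes a plain sample-by-sample sum as the downmix and saturation to the signed 16-bit range as the limiting step; the sample values are made up for illustration.

```python
INT16_MAX = 32767
INT16_MIN = -32768

def clip16(x):
    """Saturate a value to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, x))

def downmix_mono(channels):
    """Sum the channels sample by sample, then saturate to 16 bits."""
    return [clip16(sum(samples)) for samples in zip(*channels)]

# Five channels, each using a large part of the 16-bit range.
channels = [[20000, -15000, 3000, 30000] for _ in range(5)]
mono = downmix_mono(channels)
clipped = sum(1 for s in mono if s in (INT16_MAX, INT16_MIN))

print(mono)     # [32767, -32768, 15000, 32767]
print(clipped)  # 3
```

Three of the four samples hit the limits, even though no individual channel exceeds the 16-bit range on its own.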

FIG. 3 illustrates a first method for preventing clipping by applying a clipping prevention gain and a downmix gain according to the present invention. As shown, before the multichannel audio signal (301) is input to the spatial encoder (311), a clipping prevention gain (C) is applied (302) to the multichannel audio signal (301). Here, n denotes the number of input channels, and the clipping prevention gain is less than 1 (that is, C < 1). A per-channel gain (for example, an LFE gain or a surround gain) may be applied to one or more channels of the multichannel audio signal (301). The multichannel audio signal (301) to which the clipping prevention gain has been applied is then downmixed in the downmix unit (303) to produce a downmix signal.
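
A minimal sketch of the encoder side of this first method follows. It assumes C = 1/n purely for illustration; the text only requires C < 1, and the helper names and sample values are hypothetical.

```python
FULL_SCALE = 32767  # largest magnitude a 16-bit sample can hold

def apply_gain(channels, gain):
    """Scale every sample of every channel by the same gain."""
    return [[gain * s for s in ch] for ch in channels]

def downmix_mono(channels):
    """Sum the channels sample by sample (no limiting)."""
    return [sum(samples) for samples in zip(*channels)]

channels = [[20000, -15000, 3000, 30000] for _ in range(5)]
C = 1.0 / len(channels)  # assumed choice; the text only requires C < 1

plain = downmix_mono(channels)
protected = downmix_mono(apply_gain(channels, C))

print(max(abs(s) for s in plain) > FULL_SCALE)       # True: would clip
print(max(abs(s) for s in protected) <= FULL_SCALE)  # True: fits in 16 bits
```

With C = 1/n the downmix peak can never exceed the largest channel peak, which makes it a safe but conservative choice; a larger C keeps more resolution at the cost of possible clipping.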

In addition, the spatial information of the multichannel audio signal, that is, the spatial parameters, is extracted from the multichannel audio signal (301) in the spatial information generator (304). Here, spatial information means the information about the audio signal channels that is used when a multichannel (for example, left, right, center, left surround, right surround) audio signal is downmixed, the downmix signal is transmitted, and the transmitted downmix signal is upmixed back to multichannel.

The downmix signal is encoded with a core codec coding method to form a core codec bitstream, and the spatial information, that is, the spatial parameters, forms a spatial information bitstream. The core codec is the codec that encodes the audio signal rather than the spatial information (the spatial parameters). The core codec may be MP3, AC-3, DTS, or AAC, and may be any codec that performs a codec function on an audio signal, whether already developed or developed in the future. The bitstream formatter (305) generates the entire bitstream (306), which includes the core codec bitstream and the spatial information bitstream, and the entire bitstream (306) is transmitted to the spatial decoder (312). The entire bitstream (306) may include clipping prevention information.

The transmitted entire bitstream (306) passes through the bitstream parser (307) and may be converted into a multichannel audio signal in the audio decoding and multichannel generation unit (308). The bitstream parser (307) may separate the spatial information bitstream from the core codec bitstream. The audio decoding and multichannel generation unit (308) decodes the core codec bitstream and the spatial information bitstream to extract the downmix audio signal and the spatial information, respectively, and may use the extracted spatial information to convert the downmix audio signal into a multichannel audio signal (310). A downmix gain (1/C) may then be applied (310) to the converted multichannel audio signal. The downmix gain may be the reciprocal of the clipping prevention gain.

Because the clipping prevention gain and the downmix gain are applied to the entire signal, the method works well for sections where the signal is large and clipping occurs, but in sections where the original signal is small it can cause side effects such as lowering the signal-to-noise ratio (SNR). Therefore, the clipping prevention gain and the downmix gain may use different values at regular time intervals.
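
The SNR side effect for quiet passages can be seen in a small round trip. The sketch below is a simplification under stated assumptions: it ignores the core codec and the spatial upmix, models the limited word size as rounding to 16-bit integers, and uses C = 0.25 as an arbitrary example value.

```python
def quantize16(x):
    """Round to the nearest integer and saturate to the signed 16-bit range."""
    return max(-32768, min(32767, round(x)))

def round_trip(samples, c):
    """Scale by C, quantize to 16-bit integers, then scale back by 1/C."""
    return [quantize16(c * s) / c for s in samples]

def max_error(original, restored):
    """Largest absolute difference between two sample sequences."""
    return max(abs(a - b) for a, b in zip(original, restored))

quiet = [3, -7, 12, -1, 5]  # a low-level passage
C = 0.25                    # assumed example value, C < 1

restored = round_trip(quiet, C)
print(restored)                             # [4.0, -8.0, 12.0, 0.0, 4.0]
print(max_error(quiet, restored))           # 1.0
print(max_error(quiet, round_trip(quiet, 1.0)))  # 0.0
```

The scaled-down quiet samples lose their low-order bits before quantization, and the downmix gain 1/C cannot bring them back, which is the motivation for varying the gains over time instead of fixing one value for the whole signal.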

In addition, the clipping prevention gain and the downmix gain need not be applied only to the entire signal or to an entire section that is updated every 1 to 2 seconds. A syntax that applies the clipping prevention gain and the downmix gain frame by frame can also be defined in the bitstream, so that the gain of the downmix signal is selectively adjusted for each frame according to that syntax. Let the clipping prevention gain and the downmix gain applied per frame be called FCPG (Frame Clipping Prevention Gain) and FDG (Frame Down-mix Gain), respectively. The FCPG can then be used in one of the following ways: i) the FCPG is defined in the header and, when the header covers the entire signal or is updated at a regular period, the same value is applied to the whole section governed by that header; ii) the FCPG is applied to each frame through a separately defined syntax, so that a different gain is used for each frame; or iii) methods i) and ii) are combined, in which case a globally applied CPG is chosen and used over the entire range or over a large range of 1 to 2 seconds, and an FCPG is separately applied per frame to perform gain control over the range that the CPG cannot cover.

When such a signal is decoded, a downmix signal such as a mono or stereo signal can be decoded and played back immediately, without considering the CPG or DG. For playback as a multichannel audio signal, i) the DG is applied to all frames or to the range governed by the header, ii) an FDG or GOFDG (Group of Frame Down-mix Gain) is applied per frame or to a range smaller than that of i) (a group of frames, GOF), or iii) methods i) and ii) are combined.
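
The combined option iii) can be sketched as a global CPG plus a per-frame FCPG that is only engaged where the CPG alone is not enough. Everything specific here is an assumption made for illustration: the rule used to pick the FCPG, the function names, the sample values, and the simplification that the gains act directly on an unlimited mono downmix with no quantization in between.

```python
FULL_SCALE = 32767.0

def frame_gains(frames, cpg):
    """Pick one FCPG per frame: 1.0 unless the frame still exceeds full scale after the CPG."""
    gains = []
    for frame in frames:
        peak = max(abs(cpg * s) for s in frame)
        gains.append(1.0 if peak <= FULL_SCALE else FULL_SCALE / peak)
    return gains

def encode(frames, cpg, fcpg):
    """Encoder side: scale each frame by CPG * FCPG."""
    return [[cpg * g * s for s in frame] for frame, g in zip(frames, fcpg)]

def decode(frames, cpg, fcpg):
    """Decoder side: undo the gains with DG = 1/CPG and FDG = 1/FCPG."""
    return [[s / (cpg * g) for s in frame] for frame, g in zip(frames, fcpg)]

frames = [[1000.0, -2000.0], [90000.0, -50000.0]]  # unlimited downmix sums
CPG = 0.5
fcpg = frame_gains(frames, CPG)
sent = encode(frames, CPG, fcpg)
restored = decode(sent, CPG, fcpg)

print(fcpg[0] == 1.0)  # True: the quiet frame needs no extra gain
print(fcpg[1] < 1.0)   # True: the loud frame gets an additional FCPG
```

Only the loud frame pays the extra attenuation, so quiet frames keep more of their resolution than they would under a single global gain chosen for the worst case.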

To represent the CPG (or FCPG) and DG (or FDG) in the bitstream, the header first carries a syntax element that indicates whether the CPG (or FCPG) and DG (or FDG) are used at all, and this element determines whether they are used. If they are used, each frame carries a syntax element that indicates whether the CPG (or FCPG) and DG (or FDG) are used for that frame. If they are used for the frame, the values of the CPG (or FCPG) and DG (or FDG) are written for that frame.
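
The two-level signalling described above can be sketched as a toy bit writer and parser. The layout is entirely hypothetical: a 1-bit header flag, a 1-bit per-frame flag, and a 4-bit gain index are assumed for illustration, since the source does not give field names or widths.

```python
GAIN_INDEX_BITS = 4  # hypothetical field width, not taken from the source

def write_stream(use_gain, frame_gain_indices):
    """Serialize the header flag, then one flag (and optional value) per frame."""
    bits = [1 if use_gain else 0]
    if not use_gain:
        return bits
    for index in frame_gain_indices:
        if index is None:
            bits.append(0)  # this frame carries no gain value
        else:
            bits.append(1)  # this frame carries a gain value
            bits.extend((index >> i) & 1 for i in reversed(range(GAIN_INDEX_BITS)))
    return bits

def read_stream(bits, frame_count):
    """Parse the bits written by write_stream back into per-frame gain indices."""
    if bits[0] == 0:
        return [None] * frame_count
    pos = 1
    indices = []
    for _frame in range(frame_count):
        flag = bits[pos]
        pos += 1
        if flag == 0:
            indices.append(None)
            continue
        value = 0
        for _bit in range(GAIN_INDEX_BITS):
            value = (value << 1) | bits[pos]
            pos += 1
        indices.append(value)
    return indices

bits = write_stream(True, [None, 5, None])
print(bits)                  # [1, 0, 1, 0, 1, 0, 1, 0]
print(read_stream(bits, 3))  # [None, 5, None]
```

Frames without a gain cost a single bit, and a stream that never uses the gains costs one header bit in total, which is the point of nesting the flags.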

FIG. 4 illustrates a second method for preventing clipping by applying the clipping prevention gain and the downmix gain according to the present invention. The second method is similar to the first method shown in FIG. 3, but differs in the point at which the clipping prevention gain and the downmix gain are applied. In the second method, the multichannel audio signal (401) input to the spatial encoder (411) is first downmixed in the downmix unit (403) and the clipping prevention gain is then applied (402); alternatively, the clipping prevention gain may be applied (402) during the downmix process. The spatial decoder (412) differs correspondingly: the downmix gain is applied (409) either to the downmix audio signal extracted by the audio decoding and multichannel generation unit (408) or during the process of converting that downmix audio signal to multichannel, and the downmix audio signal is converted into the multichannel audio signal (410) using the spatial information. Here too, a per-channel downmix gain may be applied to one or more channels of the multichannel audio signal. In particular, when the clipping prevention gain or the downmix gain is applied during the downmix process or during the conversion of the downmix audio signal to multichannel, it may be applied in the QMF domain, one of the several domains in which the audio signal is processed.
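The difference between the two methods on the encoder side is only the order of operations, which a short sketch can make concrete. The code assumes a passive mono downmix (a plain sum of channels) and a single scalar gain; both are simplifications chosen for illustration, and the function names are hypothetical.

```python
from typing import List, Sequence


def downmix_mono(channels: Sequence[Sequence[float]]) -> List[float]:
    """Passive downmix (assumed here): sum the channels sample by sample."""
    return [sum(samples) for samples in zip(*channels)]


def apply_gain(signal: Sequence[float], gain: float) -> List[float]:
    return [gain * sample for sample in signal]


def encode_first_method(channels: Sequence[Sequence[float]], cpg: float) -> List[float]:
    # First method: scale every input channel, then downmix.
    scaled = [apply_gain(channel, cpg) for channel in channels]
    return downmix_mono(scaled)


def encode_second_method(channels: Sequence[Sequence[float]], cpg: float) -> List[float]:
    # Second method: downmix first, then scale the downmix signal.
    return apply_gain(downmix_mono(channels), cpg)
```

For a linear downmix and one shared gain the two orders give the same samples up to rounding, so the choice between them mainly concerns where in the processing chain the scaling is done, for example inside the QMF-domain downmix.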

FIGS. 5A and 5B show tables of various embodiments of the clipping prevention gain value according to the present invention. The clipping prevention gain may be represented so that it can take various values. For example, when a syntax element called bsFixedGains is placed in the bitstream and its value is used to signal the per-channel gains (for example, the surround gain and the LFE gain) and the clipping prevention gain, various embodiments are possible, as in the tables shown. In every embodiment shown, the surround gain and the LFE gain are 1/sqrt(2) and 1/sqrt(10), respectively. The clipping prevention gain may be 1 or 1/2 in the first embodiment; 1, 1/2, or 1/4 in the second; 1, 1/sqrt(2), or 1/2 in the third; 1, 1/sqrt(2), 1/2, 1/(2×sqrt(2)), or 1/4 in the fourth; 1, 3/4, 2/3, or 1/2 in the fifth; and 1, 3/4, 2/4, or 1/4 in the sixth.
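To compare the candidate sets, it can help to express them in decibels. The sketch below lists the clipping prevention gain values of the six embodiments as given in the text and converts them with 20·log10. The dictionary keys are simply embodiment numbers; how each value maps to a bsFixedGains code is defined in the figures and is not reproduced here, so list positions should not be read as bitstream codes.

```python
import math
from typing import Dict, List

SURROUND_GAIN = 1 / math.sqrt(2)
LFE_GAIN = 1 / math.sqrt(10)

# Candidate clipping prevention gains, keyed by embodiment number.
CPG_CANDIDATES: Dict[int, List[float]] = {
    1: [1, 1 / 2],
    2: [1, 1 / 2, 1 / 4],
    3: [1, 1 / math.sqrt(2), 1 / 2],
    4: [1, 1 / math.sqrt(2), 1 / 2, 1 / (2 * math.sqrt(2)), 1 / 4],
    5: [1, 3 / 4, 2 / 3, 1 / 2],
    6: [1, 3 / 4, 2 / 4, 1 / 4],
}


def to_db(gain: float) -> float:
    """Convert a linear amplitude gain to decibels."""
    return 20 * math.log10(gain)


def step_sizes_db(embodiment: int) -> List[float]:
    """Attenuation added between consecutive candidates, in dB."""
    values = CPG_CANDIDATES[embodiment]
    return [to_db(a) - to_db(b) for a, b in zip(values, values[1:])]
```

Seen this way, the fourth embodiment spaces its candidates about 3 dB apart (each is 1/sqrt(2) times the previous one), while the fifth and sixth use uneven steps.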

FIGS. 5A and 5B show only the case in which the surround gain and the LFE gain are fixed to specific values (for example, 1/sqrt(2) and 1/sqrt(10)), but the present invention is not limited to these embodiments. The invention also covers the case in which the surround gain and the LFE gain, like the clipping prevention gain, are selected from a plurality of values, as well as the case in which gain values are provided for channels other than the surround and LFE channels.

FIG. 6 illustrates a method for preventing clipping using the clipping prevention gain, the downmix gain, and clipping restoration information according to the present invention. The clipping restoration information (CRI) is information about clipping, including whether clipping has occurred and where it occurred, and it may be included in the bitstream on a per-frame basis. As shown, the clipping restoration information may be used together with the clipping prevention gain and the downmix gain that are applied (602 or 609) to the downmix audio signal or the multichannel audio signal. That is, the clipping restoration information may be inserted frame by frame into the overall bitstream in any of three cases: when the bitstream contains a downmix audio signal generated by downmixing the multichannel audio signal (601) to which the clipping prevention gain has been applied (602); when the clipping prevention gain is applied (602) to the downmix audio signal after downmixing; or when the clipping prevention gain is applied (602) during the downmix process. The spatial decoder (612) may decode into a multichannel audio signal using both the downmix gain and the clipping restoration information.

FIG. 7 illustrates the principle of using the clipping prevention gain so as to prevent degradation of sound quality around a frame according to the present invention. When the clipping prevention gain changes the volume, sound quality may degrade around the frame in which the gain value changes. A transition section therefore needs to be defined so that the effect of the change in the clipping prevention gain appears gradually. The smoothing process may be performed using the following equation.

$$\mathrm{CPG}(n) = a(n)\,\mathrm{CPG}_{t-1}(n-1) + \bigl(1 - a(n)\bigr)\,\mathrm{CPG}_{t}(n), \qquad n = 0, 1, 2, \ldots, N$$

Here, a(n) may be a first-order (linear) function or a general n-th order polynomial. It may also be a non-polynomial function such as a Gaussian, Hanning, or Hamming function; any function is acceptable as long as it is used to make the change in the CPG value smooth. An abrupt change in the CPG, however, can have a negative effect even after such smoothing. The encoding process may therefore be restricted so that the CPG does not change abruptly. Alternatively, even if the encoder inserts arbitrary values, the decoder may be made to interpret them so that the CPG cannot change abruptly. For example, when the CPG takes one of several stepped values, it may be prevented from changing by more than one step per frame, or allowed to change by only one step per specific number of frames (n frames).
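A minimal sketch of the smoothing and of the one-step restriction follows. It makes two simplifying assumptions that the text does not state: the previous-frame and current-frame gains are treated as constants across the transition section, and a(n) is the linear option, falling from 1 at n = 0 to 0 at n = N. The function names are hypothetical.

```python
from typing import List


def linear_weight(n: int, length: int) -> float:
    """a(n) for the linear case: 1 at n = 0, falling to 0 at n = length."""
    return 1.0 - n / length


def smooth_gain(prev_cpg: float, curr_cpg: float, length: int) -> List[float]:
    """Cross-fade from the previous gain to the current one over n = 0..length."""
    gains = []
    for n in range(length + 1):
        a = linear_weight(n, length)
        gains.append(a * prev_cpg + (1 - a) * curr_cpg)
    return gains


def limit_step(prev_index: int, requested_index: int, max_step: int = 1) -> int:
    """Clamp the change in the stepped CPG index to at most max_step per frame."""
    delta = requested_index - prev_index
    delta = max(-max_step, min(max_step, delta))
    return prev_index + delta
```

A decoder that enforces the restriction itself could pass each received index through `limit_step` before looking up the gain, so that even an arbitrary encoder choice moves the gain by at most one step per frame.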

FIG. 8 is a flowchart of a method of encoding a multichannel audio signal using the first method for preventing clipping according to the present invention. First, the clipping prevention gain is applied (802) to the multichannel audio signal (801). The multichannel audio signal to which the clipping prevention gain has been applied is downmixed (803) to generate a downmix audio signal, and spatial information is extracted (804) from the multichannel audio signal. The overall bitstream containing the downmix audio signal and the spatial information is then transmitted (805).
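The effect of applying the gain before the downmix can be illustrated with 16-bit samples. The sketch below assumes a passive sum downmix stored as saturated 16-bit integers, which is an illustrative choice and not something the patent prescribes; `downmix_int16` and `count_clipped` are hypothetical helper names.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def saturate(value: int) -> int:
    """Clamp a value to the 16-bit signed range, as a fixed-point output would."""
    return max(INT16_MIN, min(INT16_MAX, value))


def downmix_int16(channels: Sequence[Sequence[int]], cpg: float = 1.0) -> List[int]:
    """Apply cpg to every channel, sum the channels, and store as 16-bit samples."""
    output = []
    for samples in zip(*channels):
        total = sum(int(round(cpg * sample)) for sample in samples)
        output.append(saturate(total))
    return output


def count_clipped(signal: Sequence[int]) -> int:
    """Count samples sitting on either rail of the 16-bit range."""
    return sum(1 for sample in signal if sample in (INT16_MIN, INT16_MAX))
```

With two channels that both peak at 20000, the unscaled sum reaches 40000 and is clipped to 32767, whereas a clipping prevention gain of 1/2 keeps the downmix at 20000.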

FIG. 9 is a flowchart of a method of encoding a multichannel audio signal using the second method for preventing clipping according to the present invention. First, the multichannel audio signal (901) is downmixed (902) to generate a downmix audio signal, and spatial information is extracted (903) from the multichannel audio signal. The clipping prevention gain is then applied (904) to the downmix audio signal. Finally, the overall bitstream containing the downmix audio signal to which the clipping prevention gain has been applied and the spatial information is transmitted (905).

FIG. 10 is a flowchart of a method of decoding a multichannel audio signal using the first method for preventing clipping according to the present invention. First, a bitstream containing a downmix audio signal and spatial information is received (1001), and the downmix audio signal and the spatial information are extracted (1002 and 1003) from the bitstream. The downmix audio signal is then converted (1004) into a multichannel audio signal using the spatial information, and the downmix gain is applied (1005) to the multichannel audio signal.

FIG. 11 is a flowchart of a method of decoding a multichannel audio signal using the second method for preventing clipping according to the present invention. First, a bitstream containing a downmix audio signal and spatial information is received (1101), and the downmix audio signal and the spatial information are extracted (1102 and 1103) from the bitstream. The downmix gain is then applied (1104) to the downmix audio signal, and the downmix audio signal to which the downmix gain has been applied is converted (1105) into a multichannel audio signal using the spatial information.
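The two decoding flows differ only in whether the downmix gain is applied after or before the multichannel conversion. The sketch below uses a deliberately simple stand-in for that conversion: each output channel is the downmix scaled by one weight. A real spatial decoder is far more elaborate, so `upmix` and its `weights` argument are hypothetical placeholders for the spatial information, and the downmix gain is passed in as a plain scalar.

```python
from typing import List, Sequence


def upmix(downmix: Sequence[float], weights: Sequence[float]) -> List[List[float]]:
    """Placeholder conversion: one scaled copy of the downmix per output channel."""
    return [[weight * sample for sample in downmix] for weight in weights]


def decode_first_method(downmix: Sequence[float], weights: Sequence[float], dg: float) -> List[List[float]]:
    # First method (FIG. 10): convert to multichannel, then apply the downmix gain.
    channels = upmix(downmix, weights)
    return [[dg * sample for sample in channel] for channel in channels]


def decode_second_method(downmix: Sequence[float], weights: Sequence[float], dg: float) -> List[List[float]]:
    # Second method (FIG. 11): apply the downmix gain, then convert to multichannel.
    scaled = [dg * sample for sample in downmix]
    return upmix(scaled, weights)
```

In the first flow the gain could also differ per output channel, which matches the per-channel downmix gain mentioned in the description; the second flow scales only the one downmix signal.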

The present invention has been described in detail above with reference to several embodiments, but these embodiments are presented only to aid understanding of the invention, and the scope of the invention is not limited to them. Those skilled in the art will appreciate that various modifications are possible without departing from the technical idea of the invention, and the scope of the invention should be construed according to the appended claims.

As described above, in coding a multichannel audio signal according to the present invention, the bitstream is formed either by applying the clipping prevention gain to the multichannel audio signal and then downmixing it, or by downmixing the multichannel audio signal and then applying the clipping prevention gain; in the decoding process, the downmix gain is applied to the multichannel audio signal or to the downmix signal. The clipping problem that arises when a multichannel audio signal is downmixed can thereby be prevented effectively.

Furthermore, the clipping problem that arises when a multichannel audio signal is downmixed can be prevented effectively by using the clipping prevention gain and the downmix gain over the entire bitstream, over a fixed section, or per frame, or by using them together with the clipping restoration information.

Claims

A method of encoding a multichannel audio signal,

(a) generating a downmix audio signal by performing a downmix process after applying a clipping prevention gain to the multichannel audio signal;

(b) extracting spatial information from the multichannel audio signal; and

(c) generating the entire bitstream using the downmix audio signal and the spatial information.

A method of encoding a multichannel audio signal,

(a) downmixing the multichannel audio signal to generate a downmix audio signal and applying a clipping prevention gain to the downmix audio signal;

(b) extracting spatial information from the multichannel audio signal; and

(c) generating the entire bitstream including the downmix audio signal to which the clipping prevention gain is applied and the spatial information.

A method of encoding a multichannel audio signal,

(a) applying a clipping prevention gain during downmixing the multichannel audio signal;

(b) extracting spatial information from the multichannel audio signal; and

The method according to any one of claims 1 to 3,

further comprising, prior to step (a), applying a per-channel gain to one or more channels of the multichannel audio signal.

The method according to any one of claims 1 to 3,

wherein the clipping prevention gain is applied i) to the signal as a whole, ii) variably at regular intervals, iii) variably for each frame, or iv) as a combination of i), ii), and iii).

The method of claim 5,

wherein the clipping prevention gain is restricted so that it does not change by more than one predetermined step per frame, or so that it changes by only one step per specific number of frames.

The method according to any one of claims 1 to 3,

wherein step (c) comprises inserting clipping restoration information into the entire bitstream on a frame-by-frame basis.

The method of claim 3,

wherein the clipping prevention gain is applied in the QMF domain during the downmix process.

A method of decoding into a multichannel audio signal,

(a) receiving a bitstream comprising a downmix audio signal and spatial information;

(b) extracting the spatial information by decoding the bitstream and converting the downmix audio signal into a multichannel audio signal using the extracted spatial information; and

(c) applying a downmix gain to the multichannel audio signal.

A method of decoding into a multichannel audio signal,

(a) receiving a bitstream comprising a downmix audio signal and spatial information; and

(b) extracting a downmix audio signal from the bitstream and applying a downmix gain to the extracted downmix audio signal.

A method of decoding into a multichannel audio signal,

(b) extracting a downmix audio signal from the bitstream and applying a downmix gain in the process of converting the extracted downmix audio signal into a multichannel audio signal.

The method of claim 9,

wherein step (c) comprises applying a per-channel downmix gain to one or more channels of the multichannel audio signal.

The method according to any one of claims 9 to 11,

wherein the downmix gain is applied i) to the signal as a whole, ii) variably at regular intervals, iii) variably for each frame, or iv) as a combination of i), ii), and iii).

The method of claim 12,

wherein the downmix gain is restricted so that it does not change by more than one predetermined step per frame, or so that it changes by only one step per predetermined number of frames.

The method of claim 9,

wherein step (c) comprises:

extracting clipping restoration information for each frame included in the bitstream; and

performing clipping restoration on the multichannel audio signal to which the downmix gain has been applied, using the extracted clipping restoration information.

The method of claim 10 or 11,

In step (b),

The method of claim 10 or 11,

further comprising:

extracting spatial information by decoding the bitstream; and

converting the downmix audio signal to which the downmix gain has been applied into a multichannel audio signal using the extracted spatial information.

The method of claim 17,

The decoding method,

The method of claim 11,

wherein the downmix gain is applied in the QMF domain in the step of converting the downmix signal into multichannel.

In generating an audio signal,

the audio signal is generated to include a downmix audio signal,

and the downmix audio signal is generated by downmixing after a clipping prevention gain has been applied.

In generating an audio signal,

the audio signal is generated to include a downmix audio signal,

and the downmix audio signal is generated such that a clipping prevention gain is applied after downmixing.

In generating an audio signal,

the audio signal is generated to include a downmix audio signal,

and the downmix audio signal is generated such that a clipping prevention gain is applied during the downmixing process.

An apparatus for encoding a multichannel audio signal,

(a) a downmix unit configured to generate a downmix audio signal by performing a downmix process after applying a clipping prevention gain to the multichannel audio signal;

(b) a spatial information generator for extracting spatial information from the multichannel audio signal; and

(c) a bitstream formatter for generating the entire bitstream using the downmix audio signal and the spatial information.

An apparatus for encoding a multichannel audio signal,

(a) a downmix unit for downmixing the multichannel audio signal to generate a downmix audio signal and applying a clipping prevention gain to the downmix audio signal;

(c) a bitstream formatter for generating the entire bitstream including the downmix audio signal to which the clipping prevention gain is applied and the spatial information.

An apparatus for encoding a multichannel audio signal,

(a) a downmix unit for applying a clipping prevention gain in the downmixing of the multichannel audio signal;

An apparatus for decoding a multichannel audio signal,

(a) a bitstream receiver configured to receive a bitstream including a downmix audio signal and spatial information;

(b) an audio decoding and multichannel generator for extracting spatial information by decoding the bitstream and converting the downmix audio signal into a multichannel audio signal using the extracted spatial information; and

(c) a downmix gain application unit configured to apply a downmix gain to the multichannel audio signal.

An apparatus for decoding a multichannel audio signal,

(a) a bitstream receiver configured to receive a bitstream including a downmix audio signal and spatial information; and

(b) a downmix gain application unit for extracting a downmix audio signal from the bitstream and applying a downmix gain to the extracted downmix audio signal.

An apparatus for decoding a multichannel audio signal,

(b) a downmix gain application unit configured to apply a downmix gain in the process of extracting a downmix audio signal from the bitstream and converting the extracted downmix audio signal into a multichannel audio signal.